PhD Thesis

Multi-Robot Task Allocation for Service Robotics: from Unlimited to Limited Communication Range

Alejandro Rafael Mosteo Chagoyen
November 2010

Supervisor: Luis E. Montano

Grupo de Robótica, Percepción y Tiempo Real (RoPeRT)
Instituto de Investigación en Ingeniería de Aragón (I3A)
Departamento de Informática e Ingeniería de Sistemas (DIIS)
Centro Politécnico Superior - Universidad de Zaragoza

Jury:
Aníbal Ollero (Universidad de Sevilla)
José Luis Villarroel (Universidad de Zaragoza)
Rachid Alami (LAAS-CNRS, Toulouse)
Alberto Sanfeliu (Universitat Politècnica de Catalunya)
Michail G. Lagoudakis (Technical University of Crete)
Carlos Sagüés (Universidad de Zaragoza)
Luis Moreno (Universidad Carlos III, Madrid)

European Reviewers:
Pedro Lima (Instituto Superior Técnico, Lisboa)
Simon Lacroix (LAAS-CNRS, Toulouse)

Acknowledgments

Thanks to all the people that made this possible. Thanks to my advisor, Luis Montano, for his help and support during these years. Thanks to Michail and his wife for their warm reception in Greece; they were the most obliging hosts in a beautiful land that was new to me. Thanks to my family for their unending, inexhaustible trust in me. Thanks to my colleagues in the department, always ready to help. Thanks to all the friends that have at one time or another shared my joys and sorrows. Thanks to N. Ascheuer and C. Helmberg for their help with the MTSP problem. And thanks to the person that pushed the final button that triggered the start of my writing.

Best of luck to my sister, who is following in my footsteps.


Project framework

The work of this thesis has been developed within the Robotics, Perception and Real Time Group of the Universidad de Zaragoza, in the framework of the projects NRS (Network Robot Systems, EURON II Network Research Atelier, NoE-507728 RA), EXPRES (Técnicas de EXPloración avanzadas en tareas de REScate, DPI2003-07986), NERO (NEtworked RObots, DPI2006-07928), TESSEO (Sistemas multi-robot en aplicaciones de servicio y seguridad – TEams of robots for Service and Security missiOns, DPI2009-08126) and URUS (Ubiquitous Networking Robotics in Urban Settings, EC IST-1-045062-URUS-STP); and of the DGA scholarship (BOA 2004/04/21 n.46 p.3770, resolución 1091, ref. B079/2004) from the Aragon Government. The author also belongs to the Instituto de Investigación en Ingeniería de Aragón (I3A).

Some results in this thesis were the outcome of a research visit to the Intelligent Systems Laboratory, at the Department of Electronic and Computer Engineering of the Technical University of Crete, in Chania, Greece (May–August 2007). This research stay was supported by a CAI/CONSI+D grant awarded to the author, and by a Marie Curie international reintegration grant (MIRG-CT-2006-044980) awarded to Dr. M.G. Lagoudakis.


NRS - Research Atelier on Network Robot Systems
The purpose is to analyze the potential of the NRS paradigm, its constituent elements and their inter-relationships, standards, and the actions to be taken with regard to NRS in the European Community, as well as its immediate implications in the development of robots, telecommunication systems, sensors, perception systems and smart interfaces.

EXPRES - automated EXPloration techniques for REScue applications
The main objective is research into exploration strategies: a set of perception-action techniques that make it possible to obtain environment information, to plan motions that refine and complete this information (active perception), and to perform safe robot motions in unstructured scenarios. In recent years these techniques have been improved and applied in indoor environments with very good results. The goal of this project is to develop them further and apply them to novel problems and more difficult scenarios, such as rescue operations.

NERO - NEtworked RObots
The complex nature of mobile robot tasks leads to the need for systems of several coordinated robots (agents) working in cooperation. Networked robotic systems combine robotic elements connected to communication networks with sensors distributed in the workplace (static agents), exchanging and sharing information. This concept extends to interactions between robots, humans, sensors and the environment. This project is a continuation of EXPRES on subjects related to multi-robot cooperation techniques, computer vision, robot vision for motion, and communications.

URUS - Ubiquitous networking Robotics in Urban Settings
European cities have growing issues related to noise, pollution and security. Also, the average age of the population is growing, and in the near future there will be a large community of elderly people. City halls are studying solutions, e.g. reducing the areas of free car circulation. Car-free areas imply a revolution in the planning of urban settings, requiring new means for the transportation of people and goods, for security, etc. This project studies the idea of deploying a network of agents (robots, intelligent sensors, devices, etc.) in order to improve the quality of life in such urban areas.

TESSEO - TEams of robots for Service and Security missiOns
This project investigates techniques for a multi-robot team acting in coordination in realistic scenarios. For the deployment it is necessary to integrate algorithms devoted to task planning and allocation, coordinated navigation planning, and environment perception from the multiple views provided by every member of the team; all of this while communication among all the elements of the system (robots, infrastructure, supervisors, etc.) is maintained. Given that these capabilities are usually presented in the literature and in many projects independently of each other, the research in this project is also oriented towards developing integration techniques for the different subjects involved.

Resumen

Múltiples robots autónomos trabajando en un mismo entorno requieren nuevas capacidades de decisión para ser de máxima utilidad: cuál es la mejor forma de abordar un problema (planificación) y quién se encarga de cada parte del trabajo (asignación de tareas). Esta última actividad, en la que se centra esta tesis, ha sido abordada con una gran diversidad de métodos: desde la coordinación implícita en ausencia de comunicación (cooperación como resultado emergente de mecanismos predefinidos), pasando por métodos de inspiración biológica (enjambres), hasta técnicas similares a la cooperación humana donde hay comunicación explícita (contratos, subastas) o métodos fundamentalmente matemáticos (modelos Markovianos).

En soluciones donde existe comunicación es habitual el uso de redes inalámbricas por su conveniencia. Sin embargo, éstas tienen un alcance limitado por razones bien económicas (no siempre es factible instalar puntos de acceso en todo el entorno de trabajo) o físicas (la intervención se realiza en un área no preparada, como un escenario de desastre natural), entre otras. Por lo tanto, el uso de comunicaciones en áreas extensas requiere un tratamiento explícito de la conectividad en estas circunstancias.

Dentro de este amplio abanico de posibilidades hay aspectos que han sido ya objeto de abundante investigación. Sin embargo, su tratamiento es generalmente aislado y requiere por tanto una labor de investigación adicional de cara a su integración en soluciones completas. Adicionalmente, es el aspecto de limitaciones en la comunicación el más novedoso y el que está recibiendo actualmente una atención creciente dentro del contexto de la robótica de servicio.

Por ello, la presente tesis aborda estos aspectos:

- Métodos que no estén estrechamente ligados a un problema en particular, sino que sean aplicables con generalidad a tareas independientes o con pocas interdependencias, de cara a su uso en una variedad de tareas de servicio. En particular, se estudian distintas variaciones de métodos basados en subastas y se proponen mejoras para su uso en combinación con sistemas de tareas jerárquicas y en ámbitos de comunicaciones limitadas.

- Metas de optimización y métricas de evaluación precisas y contrastables dentro de la robótica de servicio, apropiadas para estos métodos de asignación de tareas. Partiendo de un modelo clásico del problema, basado en el problema del viajante, se identifican varias métricas adecuadas, sus efectos en la optimización, y se proponen métodos de implementación que permiten una rápida adaptabilidad de los algoritmos a las distintas métricas.

- Soluciones para la integración de todo lo anterior con las limitaciones de conectividad, ya sea tratándolas como una restricción adicional para la optimización, o como un elemento fundamental y determinante de la metodología de asignación.

- Demostraciones experimentales de los resultados obtenidos en simulación. Estos experimentos respaldan dos puntos interrelacionados y de relevancia creciente en sistemas complejos como es un equipo multirrobot: por una parte, la validación experimental es un paso necesario en una investigación completa; por otra, al hacer público el código fuente (la arquitectura SANCTA) de nuestro equipo robótico (llamado CODEX CONAN), se simplifica la tarea de verificación y reproducción de resultados por parte de otros investigadores, amén de posibilitar la reutilización del código con el consiguiente ahorro de esfuerzo para la comunidad robótica.

- Implementación e integración, como responsables del paquete de trabajo dedicado a asignación de tareas, de los algoritmos pertinentes dentro del contexto del proyecto europeo URUS. Este proyecto está enfocado a la integración de robots y sensores en ambientes urbanos para facilitar tareas tales como transporte de objetos y personas, seguimiento de usuarios, detección de situaciones anómalas y consiguiente intervención robótica, por ejemplo.

Conjuntamente, estos puntos contribuirán a la construcción de equipos robóticos de intervención o servicio altamente autónomos, supervisables por un reducido equipo humano y programables mediante objetivos y directivas de alto nivel. Puede decirse que este objetivo es uno de los "griales" de la robótica de servicio, dada su repercusión en múltiples ámbitos de aplicación.


Abstract

Autonomous robots working in a shared environment require new coordination capabilities in order to achieve enhanced efficiency: planning to decide the best way to tackle a problem, and task allocation to decide which robot will perform each piece of work. The latter, which is the main focus of this thesis, has been addressed by many different means: from implicit coordination without communication (cooperation as an emergent property of predefined algorithms), through biologically inspired systems (swarms) and mathematical models (e.g. Markovian ones), to human-mimicking explicit communication (contracts, auctions).

Task allocation solutions often require explicit communication, commonly carried out by means of Wi-Fi transmitters, since this is a well-established technology that does not impair the robots' mobility. However, Wi-Fi coverage may be limited, for instance for economic or infrastructure reasons (unprepared, destroyed, or very large areas). This prompts the use of MANETs and the explicit management of multi-hop messaging and network route preservation.

While there is a respectable body of work on task allocation, research is still needed towards the integration of problems that are typically treated in isolation. Furthermore, network integrity preservation is a growing concern in mobile robotics and as such is receiving increasing attention. For these reasons, this thesis provides novel research on the following subjects:

- Problem-independent, generic allocation methods for loosely coupled or uncoupled tasks (e.g. with few or no temporal restrictions), suitable for a variety of service missions. In particular, several auction-based methods are explored, and their use in combination with hierarchical task networks or limited-communication environments is studied.

- Well-defined optimization objectives and metrics that allow a fair evaluation and comparison of these task allocation methods within the context of service robotics. Starting from the classic traveling salesman model, several metrics and their effects on the optimization are identified. Methods that let the algorithms use any of these metrics interchangeably are proposed.

- Algorithms for the integration of these results and their application to constrained networking environments. Two approaches are studied: one considering networking as an additional restriction in the optimization process, and another treating it as a fundamental aspect that determines the allocation methodology.

- Experimental demonstrations of simulation results. These experiments address two related points of increasing relevance in complex systems such as a multi-robot team: on the one hand, experimental validation is a necessary step in a complete research plan; on the other, by publishing the source code (the SANCTA architecture) of our robots (the CODEX CONAN team), the task of verifying and replicating the research is simplified for other researchers. Furthermore, this enables code reuse by the robotics community, with the corresponding implementation time savings.

- Implementation, integration and deployment, as leaders of the work package devoted to task allocation, of the pertinent algorithms within the context of the European project URUS. This project focuses on the introduction of robots and sensors in urban settings, in order to enable tasks such as the transportation of people and goods, the tracking of users, and the triggering of safety alerts with subsequent robotic intervention.

By tackling all these issues as a whole, this thesis is poised to contribute to the field's goal of building versatile robotic teams that can be quickly and easily tailored to the particular objectives of a mission, performing their tasks with a high degree of autonomy and relying on humans only for supervision and the customization of high-level directives. It can be said that such teams are one of the "holy grails" of service robotics because of their impact in many contexts.
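As a toy illustration of the interchangeable-metrics idea summarized above (an illustrative sketch only, not code from the thesis: plain Euclidean travel distance is assumed as the cost model, and MinAve is simplified here to the average of per-robot route costs), the team objectives can be expressed as different reductions over the same per-robot costs:

```python
import math

def route_cost(start, route):
    """Travel distance for one robot visiting its assigned tasks in order."""
    cost, pos = 0.0, start
    for task in route:
        cost += math.dist(pos, task)
        pos = task
    return cost

# Team-level objectives as different reductions over per-robot costs.
def min_sum(costs): return sum(costs)          # total team effort
def min_max(costs): return max(costs)          # worst single robot (makespan)
def min_ave(costs): return sum(costs) / len(costs)  # simplified averaging

if __name__ == "__main__":
    starts = [(0.0, 0.0), (10.0, 0.0)]
    routes = [[(3.0, 4.0)], [(10.0, 3.0), (10.0, 6.0)]]
    costs = [route_cost(s, r) for s, r in zip(starts, routes)]
    print(costs)                                           # [5.0, 6.0]
    print(min_sum(costs), min_max(costs), min_ave(costs))  # 11.0 6.0 5.5
```

An optimizer built over such reductions can swap the objective without touching the cost model, which is the kind of adaptability the metrics contribution refers to.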


Contents

Resumen
Abstract
Table of Contents
List of Figures
List of Tables
List of Multimedia

1 Introduction
  1.1 Towards service multi-robot teams
  1.2 Goals and contributions

2 Survey of multi-robot task allocation
  2.1 Introduction
  2.2 Objectives and challenges
  2.3 Definitions
  2.4 Computational complexity
  2.5 Solution models
  2.6 Architectures
  2.7 Discussion

3 A comparison of metrics and auction algorithms
  3.1 Introduction
  3.2 Optimization criteria
  3.3 Implemented algorithms
  3.4 Simulations and comparatives
  3.5 Discussion

4 A stochastic solution for improved task allocation
  4.1 Introduction
  4.2 Problem statement
  4.3 Data replication in critical missions
  4.4 Simulated annealing solution
  4.5 Market-based solution
  4.6 Evaluation
  4.7 Proofs-of-concept
  4.8 Deployment in real robots
  4.9 Discussion

5 Reactive allocation in constrained networking environments
  5.1 Introduction
  5.2 Related work
  5.3 System overview
  5.4 Task allocation module
  5.5 Timing and implementation issues
  5.6 Simulation results
  5.7 Experimental results
  5.8 Simulations in light of optimization objectives
  5.9 Discussion

6 The ConnectTree strategy for cluttered environments
  6.1 Introduction
  6.2 Multi-robot routing under limited connectivity
  6.3 The ConnectTree algorithm
  6.4 Performance bounds
  6.5 Simulation results
  6.6 Implementation considerations
  6.7 Concurrent ConnectTree strategy
  6.8 Problem characterization
  6.9 Algorithms
  6.10 Evaluation
  6.11 Experiments
  6.12 Discussion

7 Reusable components for multi-robot research
  7.1 Introduction
  7.2 Architecture
  7.3 Implementation details
  7.4 New bindings
  7.5 SANCTA deployment in the URUS project
  7.6 Discussion

8 Conclusion
  8.1 Towards autonomous multi-robot teams
  8.2 Future work

Appendices

A Multi-robot routing cost model

B Reactive allocation complementary modules
  B.1 Cooperative navigation module
  B.2 Real-Time Wireless Multi-hop Protocol

C Robots used in this thesis
  C.1 UniZar robots
  C.2 URUS robots
  C.3 A low-cost LEGO robot for indoor research

Acronyms

Bibliography

List of Figures

1.1 Cost evaluation of metric and algorithm combinations.
1.2 A parking exploration plan expressed as a HTN.
1.3 TSP tour with connection to static base.
1.4 Deployment plan using the MST ConnectTree strategy.
1.5 Example of connected SANCTA components.
2.1 HTNs and task relationships.
3.1 Simulation scenarios for auction evaluation.
3.2 Grid map corresponding to simulated floor plan.
3.3 Average CMM and CMS costs in the random worlds.
3.4 The best MinMax solution found via auctions.
3.5 The best MinSum solution found via auctions.
3.6 Best CMM and CMS solutions and their histograms.
3.7 The four auction algorithms compared.
3.8 Objects found by the best CMM and CMS allocations.
3.9 The four algorithms in search limited to half mission time.
3.10 The four algorithms in search limited to same fixed mission time.
4.1 Task tree for a simple parking exploration.
4.2 Aerial shot of a parking like the ones modelled in simulation.
4.3 A robot equipped for vehicle search and identification.
4.4 A particular result of the MinSum criterion.
4.5 Parking and city scenarios.
4.6 Costs of solutions for the 12-, 32- and 43-lane problems.
4.7 Evolution of solution quality with simulated annealing.
4.8 Simulations of situations with the annealing algorithm.
4.9 Two robots testing the simulated annealing algorithm.
4.10 The Tibi and Dabo Segway-based robots in the Campus Nord.
5.1 Modules and information flows.
5.2 MANET and Spring-Damper System structure example.
5.3 Typical simulated degradation of signal quality.
5.4 Several task allocation related situations.
5.5 Snapshots of allocation strategies I.
5.6 Mission time.
5.7 S-cluster changes.
5.8 Histogram of concurrency.
5.9 Outdoor final experiment.
5.10 Snapshots of allocation strategies II.
5.11 Total distances traveled.
5.12 Total mission completion times.
5.13 Average task completion times.
6.1 A possible path tree visiting four goals.
6.2 Simulation scenario with overlaid path trees.
6.3 MinSum evaluation.
6.4 MinMax evaluation.
6.5 MinAve evaluation.
6.6 An example of the concurrent ConnectTree strategy.
6.7 Counterexamples of optimality of depth-first traversals.
6.8 Counterexamples for early and late splits.
6.9 Cost improvements for the MinMax objective.
6.10 Cost improvements for the MinAve objective.
6.11 Cost improvements with clusters for the MinAve objective.
6.12 Cost penalties over optimal solutions.
6.13 Randomly generated scenario.
6.14 Cost improvements for MinMax in random scenarios.
6.15 Single-robot test with internal antenna and high threshold.
6.16 Single-robot test with internal antenna and low threshold.
6.17 Single-robot test with aerial antenna and high threshold.
6.18 Single-robot test with no threshold using the 5.2GHz g band.
6.19 Single-robot test with no threshold using the 2.4GHz b band.
6.20 Paths in the single-robot comparison of bands test.
6.21 Single-robot test in Canfranc tunnel.
6.22 Canfranc tunnel.
6.23 Experiment with two robots.
6.24 Signal quality in ConnectTree experiment with three robots.
6.25 Map plot for a ConnectTree experiment with three robots.
6.26 Map plot for a combined auctions+ConnectTree experiment.
6.27 Qualities in a combined auctions+ConnectTree experiment.
7.1 Key elements in the SANCTA architecture.
7.2 A node configuration example.
7.3 Attributes in configuration files.
7.4 Actors in the URUS NRS.
7.5 Module relations.
7.6 Smartphone screen for URUS mission request.
7.7 Evolution of Taxi mission in URUS.
7.8 Experimental area in Campus Nord of UPC.
7.9 Task coupling in URUS.
7.10 Interface for URUS local and remote monitoring.
B.1 Spring-damper model.
B.2 Theoretical function of radio signal vs. distance.
B.3 Spring-damper system example by Prim-based algorithm.
B.4 Evolution of the robots' movement and links created between them.
B.5 Linear velocity and relative distances.
B.6 Snapshots of the robots during the experiment.
B.7 Linear velocity and distance during real experiment.
B.8 Linear velocity and γ(s) during quality-based test.
B.9 Evolution of the link quality in an indoor environment.
B.10 Example of network graph and corresponding LQM.
B.11 Example of a modified frame.
C.1 University of Zaragoza robot team.
C.2 Segway-based Tibi and Dabo robots.
C.3 Dala and Romeo robots.
C.4 LEGO prototype.

List of Tables

3.1 Mean costs of full allocations in random scenarios.
3.2 Costs for the large grid world.
3.3 Solution ratios to optimal and their theoretical bounds.
4.1 Ratio of mean to best solution.
4.2 Team size & SA algorithm.
6.1 Replanning occurrences for L = 25.
7.1 Some components, ordered in decreasing abstraction level.

List of Multimedia

3.1 http://robots.unizar.es/data/videos/expres/auct-sngext-mix.mp4
    Fig. 3.4: SingleExt example allocation.
4.1 http://robots.unizar.es/html/ficha_resultado.php?id=73
    Fig. 4.3: Robot performing vehicle identification.
4.2 http://robots.unizar.es/html/ficha_resultado.php?id=45
    Fig. 4.5: Real-time simulation using the annealing planner.
4.3 http://robots.unizar.es/data/videos/expres/anneal-city43.avi
    Fig. 4.5: Example allocation using auctions of HTN nodes.
4.4 http://robots.unizar.es/data/videos/expres/anneal-dynamic.avi
    Fig. 4.8: On-line addition of tasks.
4.5 http://robots.unizar.es/data/videos/expres/anneal-lost.mpg
    Fig. 4.8: Reallocation of tasks from a lost robot.
4.6 http://robots.unizar.es/data/videos/expres/anneal-changing.avi
    Fig. 4.8: Reallocation upon criterion change.
4.7 http://robots.unizar.es/data/videos/expres/anneal-demo.mpg
    Fig. 4.9: Two robots running the simulated annealing algorithm.
5.1 http://robots.unizar.es/data/videos/urus/ctree-react-dlock.mp4
    Fig. 5.4: Deadlock example.
5.2 http://robots.unizar.es/data/videos/urus/ctree-react-oscil.mp4
    Fig. 5.4: Livelock example.
5.3 http://robots.unizar.es/data/videos/urus/ctree-react-melee.mp4
    Fig. 5.4: Induced chain formation.
5.4 http://robots.unizar.es/data/videos/urus/ctree-react-nidle.mp4
    Fig. 5.4: Dangerous idle robots.
5.5 http://robots.unizar.es/data/videos/urus/ijrr-taskalloc.mp4
    Fig. 5.5: Reactive allocation strategies I.
5.6 http://robots.unizar.es/data/videos/urus/ijrr-final.mp4
    Fig. 5.9: Three robots executing the reactive outdoors experiment.
5.7 http://robots.unizar.es/data/videos/urus/ctree-react-algs.mp4
    Fig. 5.10: Reactive allocation strategies II.
6.1 http://robots.unizar.es/data/videos/urus/ctree-seq-x8.avi
    Fig. 6.2: Sequential ConnectTree simulation run in mall scenario.
6.2 http://robots.unizar.es/data/videos/auct+ctree.avi
    Fig. 6.26: Experiment combining auctions in infrastructure area with ConnectTree in uncovered area.
7.1 http://robots.unizar.es/data/videos/urus/urus-wp9.avi
    Fig. 7.8: Evacuation simulation in Campus Nord area.
B.1 http://robots.unizar.es/data/videos/urus/ijrr-lineal.mp4
    Fig. B.8: Spring-damper system validation in outdoor test.
C.1 http://urus.upc.es/NuevoRobots.html
    Fig. C.3: URUS robots.

Chapter 1

Introduction



1.1. Towards service multi-robot teams

We are on the verge of the 50th anniversary of the day when Unimate, the first industrial robot, started working on a General Motors assembly line in New Jersey. It was very much unlike the robots imagined by the fertile minds of science-fiction writers, in both appearance and features. Nonetheless, robotics has since continued its steady progress towards becoming an integral part of our lives, with increasing capabilities and embodied intelligence. Technological news outlets have eagerly embraced robotics nowadays: showing us humanoid robots of growing sophistication, closing the gap towards the universal archetype of robots as artificial copies of humans; following fascinating autonomous vehicles on the ground, in the air or under water; or showcasing domestic appliances that take mundane duties such as floor vacuuming off our hands. In particular, steady progress in mobile, field and service robotics heralds the day when robots will be ubiquitous in our lives: service robots will carry out day-to-day tasks like house cleaning and help in other settings (museum guides, increasingly autonomous intelligent vehicles); they promise help in distress situations (road accidents, natural disasters, civil threat response); and they are starting to enter the military domain (logistic support, demining, battlefield reduction of human assets).

The growing use of robots leads naturally to their use in cooperative ways; that is, deploying teams that can perform their duties more reliably (due to the multiplicity of resources), more quickly (by distributing workloads among team members), and more efficiently (by jointly cooperating on the same task), even tackling problems that would be impossible for a single robot (e.g. lifting heavy loads).
The literature offers a number of surveys on multi-robot systems, ranging from generalist approaches ([Iocchi 01, Pynadath 02b, Chlamtac 03]) to those devoted specifically to task allocation ([Gerkey 03a, Stone 97]). Here we can also find studies devoted to particularly popular subproblems, like market-based task allocation ([Dias 05]). Of special interest for any task allocation study is the theoretical taxonomy found in [Gerkey 04a], since many problems can be identified as belonging to one of the classes discussed therein.


A robot’s basic capabilities usually comprise a handful of specific, low-level actions: movement, object pushing or grasping, pattern recognition. However, specifying certain problems in terms of such low-level actions may be cumbersome, impractical or even impossible. This, in turn, favors the use of high-level tasks in order to model problems, and also to enable finding better solutions. Task recipes which aggregate task sequences, joint tasks comprising several simultaneous low-level actions [Gancet 05], or full HTNs [Erol 94], are some of the proposed solutions. Thus, the line between planning (“what” has to be done) and allocation (“who” does what) is not always clearly defined, depending on the problems to be solved and solution frameworks adopted. Furthermore, sometimes it may not be desirable to isolate both stages, but to carry them out jointly [Zlot 05]. For this reason, even when focusing on task allocation, planning aspects may arise in a particular context. The goal of task allocation in a multi-robot team is to optimize the use of the available resources, namely robots (but sometimes also computing devices or static sensors), in a way that is most convenient for the problem at hand. Heterogeneous robots with different capabilities, disparity of tasks to be performed, secondary problems like localization and navigation, are factors that affect the outcome of a mission. In order to perform the allocation, some model has to be adopted, which in turn shapes or conditions the deployed algorithms. There is a wide range of approaches, but a common starting point is to define a cost or utility [Gerkey 04b] to be minimized or maximized, respectively. This quantity somehow synthesizes all modeled aspects: time, distance, energy, quality of task execution, etc. 
Alternatively, this scalar may denote the fitness of a robot for a particular task, which is a favored approach in role-based task assignment [Vail 03], where a robot can commit to a same activity for an extended time period. Since the range of possibilities is quite large, this thesis begins in the next chapter with a study of the different approaches that can be found in the literature. There, many aspects are explored: task, cost, execution and solution models, computational complexity of typical algorithms, and the approaches taken by several representative multi-robot architectures. This overview served as input for the EURON roadmap [Bicchi 08], which in turn served to shape the EUROP strategic agenda [Liepert 09].

1.2. Goals and contributions

As already stated, the objective of task allocation is basically to optimize resource usage. In this sense, task allocation is not an objective in itself, but the means to perform a given mission in the best possible way. For this reason, there are examples in the literature where ad-hoc task allocation is performed in order to achieve some purpose. E.g., in mapping, one option is to generate goals that maximize information gain ([Yamauchi 98], [Simmons 00]). While this is a satisfactory solution for a particular problem at hand, the drawback is that switching the team to another kind of mission is not straightforward.

The first stated goal of this thesis is to find generalizable results in task allocation techniques, in order to build and deploy multi-robot teams that can be quickly tailored to different missions with several performance goals to choose from. Even if this may come at the cost of underperforming compared to a specifically custom-made solution, the expected advantage is flexibility and reusability.

The second goal is to carry these results over to the realm of range-limited communications; that is, when robots are deployed over areas where they cannot remain in contact with each other and with the monitoring station. This is a common occurrence in outdoor missions or in unprepared indoor scenarios without network infrastructure. In these cases, a common modus operandi is to establish a command station from which robots are deployed at the beginning of the mission. It is then necessary to manage the robots in order to use them as routers for their teammates. This way, a communication path can be maintained from the base to any of them, while they carry out their mission. Combining these connectivity constraints with mission execution and, moreover, doing so in such a way that the team remains flexible in its objectives and performance metrics, is a challenging field of research which is growing as the use of multi-robot teams does.

Auction-based algorithms in light of optimization objectives

Task allocation has been shown several times to be an NP-complete problem [Gerkey 04a]. For this reason, approximate or greedy methods have been used extensively. Among the most popular solutions in the literature are auction-based methods [Dias 05]. Auction methods have strong similarities with a real auction of goods, hence their name, and are likewise based on the exchange of bids amongst robots. In this case, robots compete for the right to perform a task. An initiator entity (the auctioneer) proposes a job that has to be done; robots use their local information state to compute a bid which, according to some predefined rules, measures the robot's expected cost or benefit when performing the task. After bid collection, the auctioneer decides upon a winner, which will be responsible for carrying out the task to completion.

Auction methods sometimes rely on a close analogy to a market economy [Stentz 99], in which case they are best named market-based methods. In other cases they simply mimic the announce-bid-award process using non-economic bid functions [Nanjanath 06], as many Contract Net-inspired [Smith 80] approaches do. In any case, they are favored, amongst other reasons, because they can be easily decentralized, since no central entity is mandatory, each auctioneer becoming temporarily a coordination master; they are efficient bandwidth-wise, because all relevant information is synthesized and exchanged by means of such bids, which usually take the form of single scalars; also, experimental results have shown that they perform very well in real life, often far above their worst-case bounds when known [Lagoudakis 04].

For all these reasons they have been the subject of abundant research. Numerous implementations have been described in the literature ([Viguria 07, Nanjanath 06, Dias 05, Zlot 05, Kalra 05, Lemaire 04, Gerkey 02]), but efforts towards a systematic mathematical characterization of their theoretical efficiency have been scarcer [Lagoudakis 05, Lagoudakis 04, Gerkey 04a]. Furthermore, although they are usually presented as flexible general task allocation methods not particularly bound to a definite problem, the effects of different optimization criteria are seldom thoroughly analyzed. Lastly, comparisons between implementations and algorithm variants across authors are scarce. The second chapter is aimed at filling this gap for a class of canonical auction algorithms by various authors. Since many application domains involve robots visiting a number of goals scattered over a geographical area, the mission used in this study is object searching in a large building. Two intuitive optimization objectives, minimum total resource usage and minimum total time, are evaluated by means of simulations over a model of a real-world mall. Complementarily, simulations on randomized synthetic graphs are used to extract trends not confined to a particular scenario.
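The announce-bid-award cycle just described can be sketched in a few lines. The scenario, names and Euclidean bid function below are illustrative assumptions, not the implementations evaluated in this thesis; bids here are costs, so the lowest bid wins each task.

```python
import math

# Hypothetical scenario: bids are Euclidean distances, so lower is better;
# a real system would bid planned path cost, time or energy instead.
robots = {"r1": (0.0, 0.0), "r2": (10.0, 0.0)}
tasks = [("t1", (2.0, 1.0)), ("t2", (9.0, 3.0)), ("t3", (5.0, 5.0))]

assignments = {name: [] for name in robots}

# Announce-bid-award cycle: the auctioneer offers each task in turn,
# every robot submits one scalar bid, and the lowest bidder wins the task.
for task_id, goal in tasks:
    bids = {name: math.dist(pos, goal) for name, pos in robots.items()}
    winner = min(bids, key=bids.get)
    assignments[winner].append(task_id)
    robots[winner] = goal  # the winner will end up at the task location

print(assignments)  # → {'r1': ['t1'], 'r2': ['t2', 't3']}
```

Note the greedy step the chapter alludes to: each task is awarded in isolation, given the allocations already made, which is exactly why such algorithms are fast but may miss the optimal joint allocation.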

A stochastic solution for improved task allocation

Hierarchical Task Networks (HTNs) are one of the tools used to represent problems which require a certain degree of planning. They allow the formulation of a problem in terms of a top-down hierarchy, in which higher levels in the task tree represent tasks of higher abstraction. At the bottom there are atomic or primitive tasks, which cannot be further refined into simpler tasks. This decomposition is a good fit for a robotic problem, in which atomic tasks map naturally to robot capabilities.

Figure 1.1: Cost evaluation of metric and algorithm combinations (CMM and CMS cost plots for the algorithms Lago, LagoTail, Single and SingleExt under the criteria MinSum, MinMix, MinTim and MinMax).


While auction-based task allocation is generally considered efficient, usual implementations make use of greedy steps in order to keep algorithmic complexity bounded and speed up the auctioning process. In search of a compromise to improve the performance of basic auctions, combinatorial auctions are one of the favored approaches. These arise when bundles of tasks are considered simultaneously for bid computation [Berhault 03], and also when HTNs are used in combination with auctions [Zlot 05].

Figure 1.2: A parking exploration plan expressed as an HTN.

The third chapter is devoted to an alternative solution for problems with a very large solution space, which tackles the planning and allocation issues simultaneously. It discusses a general framework for flexible mission planning, using hierarchical task networks as the descriptive model, the multiple traveling salesmen problem [Helmberg 99] as the optimization model, and decentralized simulated annealing [Kirkpatrick 83] for solution search. This proposal does not discard valid solutions, unlike greedy auctions, which means that the optimal solution always remains in the solution pool; nor is it impaired by the need to quickly compute bids for many tasks. Instead, solutions appear naturally during the annealing process, and improved results may be obtained at any point during the remaining mission time. This proposal is compared to an auction-based implementation inspired by [Zlot 05], which makes use of HTNs during bid computation, in order to highlight the benefits of each approach.
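As a toy illustration of the annealing search (the thesis planner works on HTNs with an MTSP optimization model and runs decentralized, none of which is shown here), the sketch below anneals a flat assignment of goals to two robots under a MinMax objective. All numbers and names are made up for the example.

```python
import math
import random

random.seed(1)  # deterministic run for this illustration

# Illustrative scenario (not from the thesis): two robot depots, six goals.
depots = [(0.0, 0.0), (10.0, 10.0)]
goals = [(1, 2), (3, 8), (9, 1), (8, 9), (5, 5), (2, 7)]

def route_cost(start, points):
    """Length of the route that visits `points` in order, starting at `start`."""
    total, current = 0.0, start
    for p in points:
        total += math.dist(current, p)
        current = p
    return total

def minmax(assignment):
    """MinMax objective: cost of the worst single-robot route (mission-time proxy)."""
    return max(route_cost(depots[r],
                          [g for g, a in zip(goals, assignment) if a == r])
               for r in range(len(depots)))

state = [0] * len(goals)   # start with every goal assigned to robot 0
cost = minmax(state)
temperature = 10.0
while temperature > 1e-3:
    neighbour = state[:]   # mutate one goal's assignment at random
    neighbour[random.randrange(len(goals))] = random.randrange(len(depots))
    new_cost = minmax(neighbour)
    # Always accept improvements; accept worse moves with Boltzmann probability,
    # which lets the search escape local minima while the temperature is high.
    if new_cost < cost or random.random() < math.exp((cost - new_cost) / temperature):
        state, cost = neighbour, new_cost
    temperature *= 0.995   # geometric cooling schedule

print(state, round(cost, 2))
```

The key property the chapter highlights is visible even in this sketch: the annealer always holds a complete (if possibly poor) solution, which improves for as long as computation time remains.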

Constrained networking environments

A central problem in mobile domains is multi-robot routing: the problem of coordinating a team of robots in terms of the locations they should visit and the routes they should follow in order to accomplish their common mission. A typical assumption made in early work on multi-robot routing is that robots are able to communicate uninterruptedly at all times, independently of their locations. As time goes by, real-life limitations like large areas to cover, out-of-sight operation, unstructured environments and so on have grown to be a forefront issue to be addressed. In such conditions, communication amongst entities cannot be taken for granted, but must be explicitly considered and managed.

Chapters 5 and 6 are devoted to investigating the multi-robot routing problem under communication constraints, reflecting the fact that real mobile robots have a limited range of communication, and the requirement that connectivity must be managed through relaying. These communication constraints add a new level of complexity to the task allocation problem, but on the other hand they bring the multi-robot routing problem closer to reality. Basic approaches opportunistically take advantage of connectivity when available [Burgard 05], but do not explicitly avoid network splits. A possibility is to include link quality in the motion [Mostofi 08] or goal generation functions [Rooker 07]. This idea is not readily applicable to general-purpose field or service robotics, because the tasks are provided by external sources and may be totally unrelated to those generated by the team in order to preserve communication. These approaches are, furthermore, prone to local minima problems. Some approaches do not measure the actual signal, but rely on models like a circular radius [Vazquez 04] or constant decay. There is growing evidence [Hsieh 08, Quin 03] suggesting that such models can badly misrepresent the behavior of real signals [Sridhara 07], leading to temporary connectivity losses. Even building a multi-hop chain to reach a single target is challenging, involving quality and resource compromises [Burdakov 10, Stump 08]. While some domains of application may tolerate temporary losses of communication between team members, in others it is beneficial or even mandatory to maintain uninterrupted connectivity at all times: monitored surveillance, rescue missions with teleoperation components, tightly coordinated cooperation, etc.
In these domains, uninterrupted contact with a stationary monitoring base or command center is often required. Approaches that are suitable for general-purpose service robotics, in which tasks are not tied to a particular problem domain and uninterrupted connectivity is an explicit requirement, are scarce. Two such approaches to the connectivity constraint are explored in this thesis.

In chapter 5, a reactive algorithm is presented which restricts robot mobility in response to changes in the network quality graph. At the same time, mechanisms are provided to ensure that arbitrary targets are visited by one robot with the support of others, by forming clusters of connected robots that span from the static base to the target.

Figure 1.3: Robots visit targets while continuously connected to a static base, following a pre-planned optimal TSP tour.

This solution is highly dynamic and allows robots a high degree of liberty, as long as their local signal quality is good. This translates into higher throughput and lower completion times per task. However, it is not well-suited for cluttered environments with large or complex obstacles.

Figure 1.4: Deployment plan using the MST ConnectTree strategy.

This limitation is addressed in chapter 6, where the ConnectTree strategy is presented. This solution builds upon the idea of fixed routes that originate at the deployment point, extend out to reach the mission targets, and from which robots cannot deviate. By using some robots as temporarily static relays, a best effort is made to visit every target. At the same time, by purposely building this connecting path tree, some guarantees of performance, in the form of cost ratios over optimal unconstrained tours, can be provided.

The methods in both chapters build on approaches that treat network integrity as an inviolable condition. With this premise, both solutions are, in that regard, safe (a connected graph of valid links spans all the robots at all times) and complete (all targets within reach of a chain formed by all robots are eventually visited).
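A back-of-envelope version of the relay arithmetic behind the static-relay idea, under the idealized circular-radius model that the text above notes is optimistic for real links (function names and numbers are illustrative, not the thesis algorithms):

```python
import math

def relays_needed(path_length, comm_range):
    """Robots that must park as static relays (besides the one visiting the
    target) so a chain from the base spans a path of the given length, under
    an idealized circular-radius model; real links are far less regular."""
    return max(0, math.ceil(path_length / comm_range) - 1)

def team_reach(n_robots, comm_range):
    """Farthest path distance a team can service while staying connected to
    the base: each robot contributes one hop to the chain."""
    return n_robots * comm_range

# base -(10)- relay -(10)- relay -(5)- visiting robot
assert relays_needed(25.0, 10.0) == 2
assert relays_needed(10.0, 10.0) == 0   # visitor still in direct range
assert team_reach(3, 10.0) == 30.0
```

This also makes the completeness claim above concrete: a target is reachable exactly when its path distance does not exceed the chain the whole team can form.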

Reusable components for multi-robot research and deployment

Two strategies are becoming increasingly relevant in the robotics community as efforts to improve the quality of research and simplify the comparison of results within the same topics. On the one hand, the use of standardized or de-facto community-approved data sets [Howard 03] or benchmarks [Bonarini 06], which allow comparison of implementations. On the other hand, publication of source code as a means to enable code verification, algorithm validation and full reproducibility [Bonsignorio 09a] of simulations and experiments is increasingly a concern [Bonsignorio 09b, Wawerla 09], due to the many complexities of large robotic systems, and in particular of multi-robot [McLurkin 09] systems.

Multi-robot research imposes a wide range of demands upon software developers: embedded code in the robots, which complicates debugging and deployment; remote Graphical User Interfaces (GUIs) for monitoring and control; message passing across robots, sometimes with specialized protocols to be used [Tardioli 07]; computing that is often distributed or decentralized; all of this besides the specific field of multi-robot research being tackled. Amongst published robotic software architectures, there is an emerging trend towards the use of component-based solutions [Quigley 09, Brooks 05, Gerkey 03b, Bruyninckx 01], for they provide a black-box approach to algorithms via well-defined interfaces, while hiding the complexities and internal states of particular implementations.

SANCTA (Simulated ANnealing Constrained Task Allocation, after its first allocation algorithm), the library discussed in the final chapter of this thesis, is a flexible architecture for multi-robot teams following a component-based approach that takes advantage of the high-level multitasking capabilities of Ada, not present in other languages like C or C++. The aforementioned complex multi-robot development environment positions the Ada programming language as a very suitable choice for robotic research, due to its ingrained principles of software safety, reusability, portability and maintainability. Published under the GPL open source license, SANCTA is the backbone of all simulations and experiments in this thesis. Its availability allows other researchers the close inspection of the algorithms described in this thesis, and provides a big jump-start to anyone interested in reproducing or furthering its results.

Figure 1.5: SANCTA components can be plugged into each other via inputs/outputs with matching data types.

This library is amongst the first published large pieces of software that make use of novel features of the Ada 2005 language revision (ISO/IEC 8652:1995/Amd 1:2007), particularly in academic research settings. It also brings many readily usable features to the Ada robotics community, thanks to its binding to the Player robotic library [Gerkey 03b]. For these reasons, an introductory paper about it was published in the Ada-Europe international conference on reliable software technologies [Mosteo 07b].

SANCTA has been used to provide the task allocation functionality for the URUS European project [Sanfeliu 07], of which we were responsible for the task allocation work package. This project's goal was to bring mobile robots into the city, in order to provide services like transportation of objects and guiding of people within a sensor network area.


within a sensor network area. One of the principal results of URUS was the definition of the URUS robot, as a set of services that each robot must provide via a well-defined interface. Any robot implementing this URUS protocol can be then integrated in the team and start receiving orders and performing missions. This is the largest project in which the S ANCTA library has been deployed. During final experiments, four robots of three completely different classes (two Segway-based platforms, one robotized golf cart, and one differential-drive four-wheeled robot) were running this software and concurrently executing missions.

Summary of contributions

These are the contributions and publications produced in this thesis:

- A survey of the task allocation problem from the point of view of general-purpose service robotics, including a brief review of relevant characteristics of recently proposed multi-robot architectures and teams. This survey served as input for the EURON excellence network roadmap [Bicchi 08] and the EUROP strategic research agenda [Liepert 09].

- Implementation and evaluation of canonical auctioning algorithms with respect to several metrics relevant in multi-robot routing domains. Typical metrics like minimum mission time and minimum distance traveled, as well as weighted linear combinations, are considered. This evaluation reveals particularly pathological performance for some combinations of algorithms and metrics, shedding light on some pitfalls present in these basic auctioning algorithms. These results were published in the international conference IEEE ICRA'07 [Mosteo 07a].

- Design and implementation of a new simulated annealing hierarchical planner and allocator, and its evaluation and comparison to an auction-based equivalent algorithm. This study highlights the advantages and drawbacks of each of the approaches. On the one hand, the simulated annealing solution requires extra steps to make it decentralized, as discussed in this implementation; on the other hand, results show that it is better suited for problems with very large solution spaces, and faster at achieving complete solutions, since all tasks are planned simultaneously. A beneficial side effect is that some special operations, like the evaluation of very different plans, come for free thanks to the intrinsic reheating feature of the simulated annealing process. These results were published in the workshop "Network Robot Systems: Toward intelligent robotic systems integrated with environments" at IEEE IROS'06 [Mosteo 06b].

- A new algorithm for reactive task allocation in limited-range communication operations. This algorithm works on top of a virtual spring-damper mesh which manages low-level motion control, designed to avoid network splits in the MANET formed by the robots carrying out a mission. The main characteristic of the allocation algorithm is that it has no final control over the robot movements, which is always granted to the virtual mesh. Instead, it manages the clustering of robots and the assignment of tasks to these clusters, providing guarantees over the absence of deadlocks or livelocks, and thus over eventual mission completion. Another advantageous feature of this algorithm is that it includes, as one of its subroutines, a black-box step into which several different traditional task allocation solutions can be plugged. This confers great flexibility to this proposal, as demonstrated by the various examples provided in this IEEE ICRA'08 [Mosteo 08] paper. Full details of the algorithm, with a complete evaluation and experimental demonstration, are further provided in an article [Tardioli 10] in the International Journal of Robotics Research.

- A theoretical cost analysis of the idea of using static relays to extend the range of operation of a robot. Algorithms to devise the placement of such relays are presented, which allow the statement of upper bounds on the cost of the solution over the optimal tour for an unconstrained robot. An implementation is provided and demonstrated in simulation, in a paper at the international symposium DARS'08, published as a chapter in the proceedings book by Springer [Mosteo 09b]. A further improvement to the basic ideas in this work, making use of concurrency in the algorithm, is presented in an IEEE IROS'09 paper [Mosteo 09a]. The sum of these results has been named the ConnectTree strategy.

- The source code for the algorithms behind the simulations and real experiments shown in these publications is itself published [Mosteo 10a] under an open source license, and introduced in an international conference paper [Mosteo 07b] under the name SANCTA (Simulated ANnealing Constrained Task Allocation). The proceedings of that conference were published in the Lecture Notes in Computer Science series from Springer. This library was awarded the second prize in the XVII Ada-Spain competition for academic projects. This software has been deployed in large-scale experiments in the context of the URUS European project, in which the simulated annealing, auction-based and ConnectTree algorithms were all integrated.
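The virtual spring-damper link underpinning the reactive allocator can be sketched, in one dimension, as a force law; the rest length and gains below are arbitrary illustration values, not the tuned parameters of the real mesh described in chapter 5.

```python
def link_force(dist, stretch_rate, d0=5.0, k=2.0, c=0.5):
    """Scalar force a robot feels from one neighbor: the spring term pulls the
    robots together once the link stretches beyond the rest length d0 (risking
    disconnection), and the damper term opposes the rate of stretching."""
    return -k * (dist - d0) - c * stretch_rate

# A stretched, still-separating link produces a strong attracting (negative)
# force; a compressed link produces a repelling (positive) one.
assert link_force(8.0, 1.0) == -6.5
assert link_force(3.0, 0.0) == 4.0
```

Summing such per-link forces over a robot's neighbors gives the mesh final authority over motion, which is precisely why the allocator above can only steer the mission through clustering and task assignment.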

∗∗∗

As stated at the beginning of this introduction, the ultimate goal of these results is to advance towards the realization of highly autonomous and customizable multipurpose robot teams, able to operate in unprepared environments. The pervasive use of the metrics identified in the first publications across all the work herein presented, the integration and publication of all described algorithms by means of the SANCTA library, and its ultimate deployment as part of the URUS project, illustrate this stated goal.



Chapter 2

Survey of multi-robot task allocation

This chapter offers an overview of the principal approaches to task allocation, with a strong focus on solutions used in service or field robotics. Popular methodologies are identified and their strengths and shortcomings discussed. Critical properties of algorithms are described and examined in recently proposed architectures. A focus on algorithmic complexity and quality bounds of solutions is kept throughout the survey.



2.1. Introduction

Robotic teams have several potential advantages over single-robot operations: some problems can be unsolvable for robots on their own, requiring cooperation of several agents; a team may be able to complete a mission in less time; more efficient cooperative approaches may exist; finally, teams are more resistant to failure thanks to redundancy of components. The use of multi-robot teams introduces new challenges, and a particularly critical one is how to best take advantage of the variety of available resources. Agents in the environment must communicate and coordinate to determine what jobs have to be done and who is best suited to perform them, according to some work objectives. Interaction with human observers or controllers may also be a factor to be integrated as seamlessly as possible.

The generalized use of wireless communications has broadened the interest in such technologies and has opened the gates for integration with other domains like mobile robotics. The affordability of commodity hardware in both of these fields has created a propitious climate for advancements on a common ground. This kind of research can be found under several labels like Mobile Ad-hoc NETworks (MANETs) or Networked Robotic Systems (NRSs). It may also be seen as a specialized branch of cooperative robotics where the communication is explicit via the wireless medium. Finally, it shares many elements of research with the field of Wireless Sensor Networks (WSNs). The distinguishing characteristics of each approach can be succinctly summarized as:

- MANET: mobile nodes with ad-hoc routing.
- NRS: mobile robots and static sensors wirelessly connected.
- WSN: medium/large mesh networks typically using low-power nodes.

From these salient traits, overlapping research interests are natural. The fields of application of such a convergence of technologies are abundant, and notable results can already be found.
There are examples in cooperative mapping [Burgard 05, Zlot 02], autonomous surveillance of buildings [Gerkey 05], monitoring of forest areas [Lemaire 04] and teamwork in higher RoboCup leagues [Vail 03], to cite a few. Like the already pervasive Internet, the combination of networked static appliances and mobile robots seems an inevitable path for modern society. This will require the definition of standards and common protocols for interaction between such devices to properly exploit their full potential. Resource allocation strategies making use of these standards will aim to improve group performance beyond that of its individual members in isolation.

This survey addresses the task allocation problem from the point of view of networked robotic teams, its objective being to optimize the utilization of the available resources. These resources can take the form of mobile robots with heterogeneous capabilities depending on the on-board sensors and actuators, fixed sensors already present in the environment, computational-only devices and, in general, any available networked device. The optimization criteria may be varied and tailored to the missions to perform. For example, one may want to minimize the time to explore some environment [Burgard 05], or distribute the robots in the best way to maximize the wireless coverage over some area [Gerkey 06, Batalin 02], or establish a real-time wireless chain between two distant points to relay a video stream [Tardioli 07]. Furthermore, conflicting objectives could arise in a same mission.

It has been pointed out [Gerkey 04b] that this field of research has several similarities with the Distributed Agents of classical Artificial Intelligence. Also, it is not uncommon to see the terms Agent and Robot used interchangeably, especially when not all the elements in the team have robotic characteristics (e.g. static sensors). Another inheritance from AI are HTNs, used to represent richer levels of task semantics. It has been shown (e.g. [Zlot 05]) that taking advantage of such information, i.e. addressing planning aspects together with allocation, can be advantageous.

In general, work that systematizes the resource allocation problem has been given preference in this survey. General theories, frameworks and architectures provide abstractions that can be reused across problem domains. This enables the use of generic optimizations, being thus beneficial to any problem that can be modeled with these generic tools. These factors are stressed in this survey whenever applicable.

2.2. Objectives and challenges

There are several properties that are desirable in the allocation mechanism of a robotic team. Not all of them are easily achievable, and they often entail trade-offs.

- Mathematical soundness: a well-understood underlying mathematical theory gives the advantages emanating from the properties of said theory. We may find quality bounds, known expected time of computation, rate of convergence to solution, and so on [Gerkey 04a].

- Distributability: understood as the capability to disseminate the work among the available devices, it allows a higher usage ratio of available resources, increased computational capabilities, and faster results [Gerkey 06]. It may make the system more vulnerable if there are no additional mechanisms providing redundancy, since the failure of any element might impair the proper working of the whole system.

- Decentralization: the elimination of centralizing elements or bottlenecks provides fault tolerance and resiliency [Gancet 05]. Replicating functionality across entities is a guarantee against the failure of any single element. Making roles transient or transferable [Long 03] is another way of achieving decentralization.

- Scalability: the property by which a system can function properly with arbitrary problem sizes. It is expected that the number of networked elements will go up [Konolige 04] with the adoption of these technologies. Good scalability properties are thus of paramount importance in some domains. One way to achieve scalability is by means of locality: only locally available or nearby reachable information is needed for an agent's decisional process [Kalra 05]. Local algorithms have the advantage of putting less stress on the network and are in general more scalable.

- Fault tolerance: correct operation when facing partial failures is an always desired property, and is even more relevant when the amount of hardware being managed is considerable and occupies a large environment. Seamless integration and removal of agents in the system is another tolerance aspect with implications [Long 05, Konolige 04] in real deployments.

- Flexibility: different service missions may be performed in a same environment by using heterogeneous or flexible robots [Sanfeliu 08]. It is then interesting to invest in methods that allow an easy integration of new agents, goals or other related elements.

- Responsiveness: low latency until achieving good results after (usually large) changes are introduced in the working environment or mission conditions, but also in response to task addition and removal [Ferreira 08, Scerri 05, Dahl 03]. It is directly related to good performance of the allocation strategies in dynamic environments.

2.3. Definitions

This section explores some of the fundamental ingredients of allocation strategies, and how they affect the capabilities available when using them. Some of the points discussed herein are more thoroughly analyzed in [Gerkey 04a].

2.3.1. Planning vs Allocation

There is some ambiguity in the use of the terms planning and allocation. Drawing terminology from [Zlot 05], some consider that planning is the answer to the question “what has to be done”, regardless of how this work will be distributed to the available agents. On the other hand, allocation is the answer to “who does it”. Other sources [Dias 05] prefer to refer to the first question as task decomposition, since the methodology generally involves iterative refinement of complex tasks into simpler ones. In that case, planning refers to all stages of the problem. There are several methodologies for planning [Dias 05]:


Decompose and then allocate: in this approach [Caloud 90], the isolation of both stages can simplify the problem, often with the drawback that relevant information available during allocation is not fed back to the decomposition stage.

Allocate and then decompose: complex tasks are first distributed to agents, which are responsible for the refinement of these tasks [Botelho 99]. Further reallocation may be necessary after these refinements.

Hybrid approaches: the most common approach, in which there is no rigid separation of the stages described [Zlot 03]. Since all relevant information can be considered at once, they are in general more powerful but also more complex.

2.3.2. Cost models

Several magnitudes have been used to characterize the tasks executed in the robotic domain. These are then used during the allocation or optimization process. The most common ones build on these basic concepts:

Cost: a characterization of what it takes for a robot to execute a task. Examples are time to reach a goal, distance traveled, energy consumed, etc. [Lagoudakis 05]. Special mechanisms may be necessary to model impossible (i.e. infinite-cost) tasks.

Fitness: a characterization of “how well” [Stone 98] an agent can perform a task. Usually a normalized range is used, in which 0 naturally represents tasks undoable by an agent.

Reward: a characterization of the gain of completing a task, most often found in market-based approaches [Stentz 99]. It acts as an incentive to perform tasks.

Priority: some characterization of the urgency of completing a task. Depending on how the system is designed, higher-priority tasks can preempt all lesser tasks [Antonelli 05].

A combination of these is generally used to represent the “value” of a task. The common term to refer to these combinations is utility. Examples of commonly found utilities are

    Utility = Reward − Cost
    Utility = Fitness − Cost

Another common variant is to floor the utility so it can never be negative:

    Utility = Reward − Cost   if Reward − Cost ≥ 0
    Utility = 0               otherwise
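The utility combinations above can be sketched in a few lines. The following function is only an illustration (the signature and the example reward/cost values are invented, not part of any cited system):

```python
def utility(reward: float, cost: float, floor: bool = True) -> float:
    """Combine a task's reward and an agent's execution cost into a utility.

    With floor=True, negative utilities are clamped to zero, so a task whose
    cost exceeds its reward becomes 'not worth doing' rather than penalizing
    the agent's total utility.
    """
    u = reward - cost
    return max(0.0, u) if floor else u

# Example: a task with reward 10 bid on by two robots.
print(utility(10.0, 4.0))   # robot near the goal: 6.0
print(utility(10.0, 15.0))  # travel cost exceeds the reward: 0.0
```

Substituting a fitness value for the reward yields the second combination in the text.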


This gives two opposite ways of thinking:

Minimization: when reasoning purely in terms of costs, higher values are worse and the objective is to minimize the costs resulting from the allocation. This is often found in routing approaches where the bulk of the cost is goal reaching [Lagoudakis 05].

Maximization: when thinking in terms of utilities, higher values are better and the objective is one of maximization. These are more naturally found where the underlying problem is also one of maximization, e.g. network quality coverage [Gerkey 06] or map accuracy [Yamauchi 98].

Another possibility is to associate values not to tasks, but to world states. This is notably the case in Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs) [Spaan 06, Schesvold 03, Bernstein 02, Guestrin 01, Peshkin 00], which apply this well-known mathematical framework to the multi-agent domain. In this case, a discounted (over time) or total (when there are final states) reward is computed by exploring the state space.

2.3.3. Task models

In certain domains, tasks have a well-defined set of constraints (duration, starting and final conditions). These tasks lend themselves to sequential execution, and the most common terminology for them is simply jobs or tasks. In other domains (principally in team games) predefined positions such as “attacker” or “defender” are assigned to agents. These can be better referred to as roles [Dias 05]. As such, roles name a set of related behaviors or recipes for actuation. Behavior-based approaches are more suited to this usage, since they tend to use excitation/inhibition algorithms that result in some particular behavior (i.e. role) being activated to temporarily govern the robot [Batalin 02].

2.3.4. Execution models

Depending on robot capabilities, the execution of tasks can take several forms [Gerkey 04a]. Tasks can require a single robot or multiple robots to be executed. Robots may be able to perform just a single task or multiple tasks at the same time. The common combination of tasks requiring a single robot with robots able to execute only one task at a time is known as the sequential execution model.

Primitive tasks, as seen in the context of HTNs, are defined as tasks that cannot be further simplified. Usually, these correspond to the single tasks of the sequential execution model. However, in domains with tasks requiring several robots (e.g. pushing of wide objects), they also correspond to those tasks [Lin 05]. A name found in the literature for this kind of task is joint task [Lemaire 04].

2.3.5. Task constraints

An allocation algorithm has to accommodate tasks carrying several kinds of restrictions on their execution. The most commonly found constraints are:

Partial ordering, in which a task must be completed before or after a set of others (but not necessarily immediately so, since in that case there would be no planning flexibility). An example would be opening a door before another task can be accomplished in another room.

Time windows, in which a task must be completed in a given time frame, or before a certain deadline [Larsen 04].

Coupling, in which two or more tasks must be executed at the same time (e.g. in coordinated box-pushing of wide objects).

Incompatibility, in which executing one task may preclude or obsolete the execution of others.

As already said, a common and powerful tool that allows the expression of these restrictions is the HTN [Erol 94]. Problems may exhibit only a subset of these restrictions, simplifying the necessary framework.

2.3.6. Other constraints

The constraints identified in the previous paragraphs are intrinsic to the tasks describing the problem domain. There are, however, other kinds of constraints that can adversely affect the execution of a mission. These constraints arise from the use of limited technologies, or from real-world restrictions:

Mobility interference, due to narrow spaces in relation to robot size or numbers [Zuluaga 08, Guerrero 06]. This is a problem often neglected in multi-robot simulation that nonetheless becomes apparent in real experiments.

Network range, due to limited coverage of infrastructure, ad-hoc devices with limited range, or a need for line of sight [Kalra 07]. A problem initially bypassed by assuming ideal communications, it has become one of the critical points to address in large-scale deployments [Hsieh 08].

2.3.7. Optimization objectives

Sensible objectives for allocation optimization depend on the mission at hand. The most relevant ones commonly found in the literature are described here. Note that a minimization when using costs has a dual maximization problem when using utilities. Nonetheless, both are enumerated explicitly.


Minimize the cost of the worst agent, also known as the MINMAX criterion. It is of interest in time-critical missions, since it gives the shortest mission execution timespan [Lemaire 04]. Its focus is to optimize the execution of the worst-performing agent.

Maximize the utility of the worst agent, also known as the egalitarian criterion in welfare-related studies, is the dual of the previous one. In this case we try to optimize the worst-performing agent [Mouaddib 04].

Minimize the sum of individual costs, also known as minimization of the total sum or MINSUM. It is relevant in efficiency contexts, for example optimization of fuel usage in transportation or delivery situations.

Maximize the sum of individual utilities, dual of the previous one, is especially interesting in contexts where finishing tasks gives some reward, since it gives the best possible allocation in this respect. It is usually found in market-based approaches because of the direct translation of economic profit characteristics [Stentz 99]. Stochastic models [Spaan 06] also try to obtain the maximum utility over a period of time, finite or not.

Minimize the average cost per task, or latency when cost equals time. This objective measures the average time since a task appears in the system until it is completed. It is relevant in domains where task completion matters more than aggregated global costs, e.g. in rescue situations where time to victim location may be critical.

Maximize throughput, the dual of the previous one, often used in industrial manufacturing processes.

There are interesting effects to be considered when using some of these criteria. For example, the MINMAX objective is only concerned with the worst agent, and hence all costs below this threshold are “hidden” in the final result, unless somehow included in the optimization.
Under a pure MINMAX optimization process, agents below the critical one may grossly underperform without affecting the objective. Conversely, in a pure MINSUM optimization, the bulk of the work may fall on a single agent even if a number of robots sit idle. These issues, whose root causes are not always explicitly identified, are circumvented with additional elements in the cost functions, like the equity coefficient in [Lemaire 04] or the overlapping factor in [Simmons 00].
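As a toy illustration of how the two aggregate criteria can disagree (all costs are invented), consider two allocations of the same tasks between two robots:

```python
def minmax(costs_per_robot):
    # MINMAX: cost of the most loaded robot, i.e. the mission makespan.
    return max(sum(costs) for costs in costs_per_robot)

def minsum(costs_per_robot):
    # MINSUM: total cost over all robots, e.g. total fuel consumed.
    return sum(sum(costs) for costs in costs_per_robot)

balanced = [[5, 5], [5, 5]]      # two tasks per robot
lopsided = [[5, 5, 5], [4]]      # one robot does almost everything
print(minmax(balanced), minsum(balanced))  # 10 20
print(minmax(lopsided), minsum(lopsided))  # 15 19
```

The lopsided allocation wins under MINSUM (19 vs 20) but loses under MINMAX (15 vs 10), showing how a pure MINSUM optimum can pile work on one agent while a pure MINMAX optimum ignores everything below the worst agent's cost.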

2.3.8. Optimization of multiple objectives

Optimizing several distinct criteria at the same time is not frequent. The more common approach is to reduce all interesting aspects to a single value of utility or cost, as previously described, with the associated problem of information loss. [Amigoni 05] is an example that specifically addresses this problem.


Here, the several utility functions are kept apart, and the optimum is determined in terms of Pareto optimality. Pareto-optimal solutions have the property that no criterion can be improved without worsening another one. (This has also been called the egalitarian property [Mouaddib 04].) Problems satisfying certain conditions [Fudenberg 91] have only one Pareto-optimal solution (or several equivalent ones). This is called strong Pareto optimality. Pareto solutions are interesting because they simply bypass the problem of combining magnitudes which may have no common units or meaningful conversions between them, reducing the need for “tuning parameters”. The problem remains that a Pareto solution does not guarantee any minima, and thus can be very bad for some of the magnitudes being optimized.

2.4. Computational complexity

This section explores the topic of algorithmic complexity in task allocation. Not always discussed when an algorithm or framework is presented, complexity is key to determining the scalability properties of an algorithm. Also, pathological worst cases may be identified in which apparently fast algorithms fail to provide a timely answer.

2.4.1. Deterministic approaches

The problem is NP-hard [Gerkey 04a] except for the simplest models. That is, there is no expectation of finding scalable, quick algorithms that solve the complete problem to optimality. Hence, most practical solutions have to make compromises. Nonetheless, there are known exact results in the P realm worth mentioning, which follow.

2.4.1.1. Optimal Assignment Problem

When there are at most as many tasks as agents (for example, in team games where there is a fixed number of roles to assign) and in a sequential execution context, the problem can be equated to the Optimal Assignment Problem (OAP). This problem can be solved in time O(n³) by means of the Hungarian Method [Kuhn 55] or linear programming. Also, a distributed algorithm using auctions can be constructed that converges to the optimal solution [Bertsekas 90], with rates directly proportional to the maximum utility and inversely proportional to the minimum bid increment.

When there are more tasks than agents, the OAP can still be used in its iterative form: in each computation, the optimal solution is found with one task per agent, leaving excess tasks unassigned. Assuming enough time elapses between task completions, this iterative algorithm can be used to find a new assignment upon each task completion. However, this approach is unsuitable for tasks that have an implicit cost when prematurely switched (e.g. if retooling is necessary in a machine, or a complex grasping process is ongoing).

Experimental results in [Goldman 04] suggest that valid performance for real-time assignment using a centralized solver can be expected for up to hundreds of agents/tasks, obviously subject to improvement related to the technological Moore's Law [Schaller 97].

2.4.1.2. Greedy algorithms

Greedy algorithms can be trivially constructed in many situations for fast solution finding. The basic idea is to choose at each decision step the best available candidate. Only problems that satisfy certain conditions related to matroid theory [Korte 00] will be solved to optimality by a greedy algorithm, and this is rarely the case in allocation contexts. It is nonetheless interesting to be able to characterize the quality of the greedy solution for a given problem. In this regard, the concept of α-competitiveness is useful: an α-competitive solution will never have a utility less than 1/α of that of the optimal solution. The greedy algorithm for the OAP is 2-competitive, or 3-competitive compared to the post hoc optimal solution if new tasks appear on-line and robots cannot be reassigned (in this latter case, without a model of the possible new tasks this is the best achievable bound [Kalyanasundaram 93]). As a consequence of the previous result, in the context of sequential task execution the greedy algorithm also exhibits 3-competitive performance.
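Both approaches can be contrasted on a toy cost matrix (a sketch with invented values; brute force stands in for the O(n³) Hungarian Method, which computes the same optimum):

```python
from itertools import permutations

def optimal_assignment(cost):
    """Exhaustive O(n!) search for the minimum-cost one-task-per-robot
    assignment; the Hungarian Method finds the same optimum in O(n^3)."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[r][p[r]] for r in range(n)))
    return sum(cost[r][best[r]] for r in range(n))

def greedy_assignment(cost):
    """Repeatedly commit the cheapest remaining (robot, task) pair."""
    n = len(cost)
    free_r, free_t, total = list(range(n)), list(range(n)), 0
    while free_r:
        r, t = min(((r, t) for r in free_r for t in free_t),
                   key=lambda rt: cost[rt[0]][rt[1]])
        total += cost[r][t]
        free_r.remove(r)
        free_t.remove(t)
    return total

cost = [[1, 2],   # cost[robot][task], chosen so greedy is suboptimal
        [1, 4]]
print(optimal_assignment(cost), greedy_assignment(cost))  # 3 5
```

Greedy grabs the (robot 0, task 0) pair of cost 1, forcing robot 1 onto the cost-4 task (total 5), while the optimum swaps the tasks (total 3); the greedy result still stays within the factor-2 bound for the offline case.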

2.4.2. Brief note on NP-hard models

NP-hard problems cannot be solved efficiently in the general case. A study of models exhibiting this property is available e.g. in [Gerkey 04a]. Not cited there, but subject to the same problem of complexity is the general technique of Distributed Constraint Optimization Problem (DCOP) [Yokoo 01], which can be found in task allocation in, for example, the work of [Ferreira 08, Scerri 05].

2.4.3. Stochastic approaches

This category comprises Markov Decision Processes (MDPs), Partially Observable MDPs (POMDPs), Stochastic Games (SGs) and Partially Observable SGs (POSGs). Tentative approaches for tight coordination using these techniques are presented in §2.5.2. A study of complexity is offered in [Pynadath 02b], but suffice it to say that the complexity of these frameworks is enormous (in the order of NP-completeness for centralized approaches, and NEXP-completeness for distributed ones [Bernstein 02]), so approximation strategies are being actively researched. Two representative frameworks are PEGASUS [Ng 00] and PERSEUS [Spaan 05].


Figure 2.1: A HTN with an abstract task and three primitive tasks. The horizontal link implies an AND relation; its absence, an OR relation. That is, Root could be executed either by executing subtasks 1 and 2, or by executing primitive task 3.

2.4.4. Hierarchical Task Networks

HTNs [Erol 94] have been extensively used for planning in artificial intelligence domains in the past. They have been imported to the robotic domain with success [Zlot 05, Gancet 04b, Belker 03, Zlot 03, Clement 99, Goldman 00]. Next follows a description of their characteristics and advantages, and a summary of their applications in multi-robot research. An in-depth analytical study of their properties can be found in [Erol 94]. Results therein show that, except for the simplest HTNs (totally ordered tasks and no variables), finding valid plans ranges from NP-complete to undecidable complexity.

HTNs are a particular kind of network, organized as a tree data structure (figure 2.1). The most salient components of a HTN are tasks, which are linked by means of parent-child relationships. As such, child tasks are a refined, more detailed description of the execution of their parent. Tasks that are leaves are called primitive, and completing execution of these tasks implies the completion of their parents. Tasks that are not leaves are called abstract or compound tasks. The domain knowledge is embedded not only in the tasks but principally in the methods, which are algorithms that, given an abstract task, can generate the required child tasks to complete it. These children need not be primitive, but eventually some descendants have to be for the network to be executable, since agents are only able to execute primitive tasks. Important benefits of HTNs are:

Simple visualization, which allows human operators to supervise and provide plans in an intuitive way.

Generic mechanisms to manage domain-specific knowledge, pluggable in the form of tasks and methods.

Abstraction, by means of top-down analysis given abstract goals, and reciprocally bottom-up construction starting at the agent capabilities.

Abundance of research, implementations [Nau 04, Goldman 04, Nau 99] and applications.

Some examples of application follow. HTNs were used in [Zlot 05, Zlot 03] to allow complex auctions of not only single tasks but partial subplans, augmenting the auction mechanism in a structured way. They are used to autonomously generate executable plans bridging the gap between abstract, symbolic plans and the sensed data in [Gancet 04b]. In [Clement 99] they are used for coordination and conflict avoidance at abstract levels of reasoning by using inconditions, that is, invariants that must hold during task execution. In [Goldman 00] the key ingredient is human interaction: HTNs are used as an effective way to quickly instruct teams of mobile robots, rapidly and accurately tailoring existing plans to novel situations.
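The network of figure 2.1 can be captured with a minimal tree structure like the one below (a sketch only; the planners cited above carry far richer task and method descriptions). Each method is one OR alternative, and the subtasks inside a method are an AND group:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    # Each method is one alternative (OR) decomposition; the tasks inside
    # a method must all be achieved (AND). A leaf with no methods is primitive.
    methods: list = field(default_factory=list)

    def is_primitive(self):
        return not self.methods

def plans(task):
    """Enumerate the lists of primitive tasks that would achieve `task`."""
    if task.is_primitive():
        return [[task.name]]
    result = []
    for method in task.methods:            # OR: pick one decomposition
        partials = [[]]
        for sub in method:                 # AND: achieve every subtask
            partials = [p + q for p in partials for q in plans(sub)]
        result.extend(partials)
    return result

# Figure 2.1: Root is achieved either by subtasks 1 AND 2, or by task 3.
root = Task("Root", methods=[[Task("sub1"), Task("sub2")], [Task("task3")]])
print(plans(root))  # [['sub1', 'sub2'], ['task3']]
```

Here the "methods" are just static child lists; in a real HTN planner they are procedures that generate children from the task parameters and world state.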

2.5. Solution models

This section presents different solutions to the task allocation problem, grouped by the kind of strategies used, in no particular order.

2.5.1. Centralized models

A solution is said to be centralized when a single element in the system is responsible for managing all the available resources. The strong point of centralized solutions is that they can use the best known algorithms and usually have more information available than distributed or local algorithms. This strength is in return burdened by the risk of losing contact with the controlling element, which introduces a single point of failure. Scalability is jeopardized by the network load required to gather the required inputs and by the complexity of the algorithms used. Usually, hybrid characteristics are present [Causse 95] to minimize these problems.

Centralized algorithms in pure form are sometimes seen in real robotic teams, although they are not favored. [Brummit 98] is an example of a purely centralized mission planner. In [Bowling 04] an entire RoboCup team is controlled by a central coordinator, because information is acquired globally by means of a zenithal camera. Central algorithms are also useful during simulation stages of research, since all the information is readily available [Amigoni 05, Rosencrantz 03]. Many stochastic frameworks also exhibit centralized traits since, for example, the optimal joint policy must be centrally computed [Gmytrasiewicz 04] and then relayed to the agents. Market-based approaches also advocate temporary and localized centralization as a way to obtain better deals [Dias 02]. Temporary leaders are opportunistically used to improve performance, but in their absence the system would still function properly.


2.5.2. Stochastic models

Probabilistic frameworks are a way to provide optimal control in tightly coupled domains. The main obstacle to their use is that they quickly become intractable even for small-sized problems. For this reason, approximation techniques are actively being investigated [Spaan 05, Ng 00]. In the framework of Markov Decision Processes, cooperative agents have perfect knowledge of their environment and a probabilistic model of the effects of their actions. The unrealistic assumption of perfect world observability is lifted in Partially Observable MDPs, being replaced by some probabilistic model of the sensory inputs. When the environment includes adversarial elements or agents with unknown policies we speak of Stochastic Games or Partially Observable SGs.

The principal advantage of these techniques is their promise of providing optimal control in tightly coupled missions with uncertain world dynamics and perceptions. Their drawbacks are, however, numerous: they are computationally very expensive; in most cases they include some degree of centralization to compute joint policies; simulation results in very small worlds are not straightforward to carry over to real-life missions; and finding the optimal policies is an iterative process that can be slow and may require repeated exposure to the environment in which the action takes place.

Other approaches found in task allocation are optimization methods with a stochastic step. Simulated annealing [Gerkey 06] and genetic algorithms [Alba 04] are two such methods.

2.5.2.1. Collaborative stochastic frameworks

The first advances [Peshkin 00] in multi-agent POMDP planning are of limited practical utility, being restricted to very simple models: usually small grid worlds are used, whose translation to reality is not apparent. In [Guestrin 01] factored MDPs are used to make tractable otherwise intractably sized MDPs, and are demonstrated in very small simulated discrete worlds. [Pynadath 02b] provides a generic framework for the study of the complexity of solutions based on this paradigm. It is a good starting point to fathom the huge computational complexity of this kind of approach.

For the reasons explained, (PO)MDPs are more successful in environments where decisions are discrete and to be taken amongst few options. For example, a role assignment solution is presented in [Spaan 03], where the small number of roles and agents (three in each case) makes it suitable for a MDP approach. In [Schesvold 03] a combat UAV has to choose between search or strike actions, and a POMDP is used to compute the policy. An exact algorithm for solving POMDPs has been developed in [Hansen 04], although it is practical only for very small problems. Preliminary results for the PERSEUS framework are shown in [Spaan 06], where communication is explicitly integrated in the planning framework of a decentralized POMDP solved by approximation, although experimentation is limited to small simulated grid worlds.

2.5.2.2. Adversarial stochastic frameworks

Relevant concepts in the context of stochastic games are:

Nash equilibrium: all competing agents have one (or several) policies such that no agent can improve its reward if the rest keep their policies stationary.

Rationality: if adversary policies are stationary, a learning algorithm is rational if it converges to a best-response policy against those adversarial policies.

Convergence: a learning algorithm has this property if it necessarily converges to some final policy. It may depend on the algorithms used by adversarial agents.

In [Bowling 02] results are presented that achieve both rationality and convergence, demonstrated in a small grid world. The Nash equilibrium objective is relaxed in [Gmytrasiewicz 04], where it is argued to be an incomplete assumption, because no clear way to choose amongst several equilibria exists and off-equilibrium periods must be managed anyway. Experimentation with real robots is presented in [Emery-Montemerlo 05], where a Partially Observable Stochastic Game (POSG) is used for a pursuit problem using two robots. Clustering techniques are used to make the problem tractable.

2.5.3. Auction models

This section summarizes (or hints at, since there is a very large body of research) results achieved using auction techniques. These are characterized by the use of a network protocol for the purpose of some economic-like negotiation between agents. Sometimes the negotiation closely resembles a real auction of goods, and these solutions can also be referred to as market-based. However, this is not always the case and the negotiation can follow other patterns, so the term auction-based is used for generality. The simplest working model for auction proposals is a three-step negotiation:

Step 1: A task is published to the agents by some entity, usually called the auctioneer. On many occasions the auctioneers are the agents themselves.

Step 2: Agents suited to the task reply to the auctioneer with a bid on the task.

Step 3: The auctioneer awards the task to some agent, after evaluating the received bids.
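The three steps can be sketched as a single-item auction round (a minimal sketch; the robot names, the distance-based bids and the abstention convention are invented for illustration):

```python
def auction_round(task, robots, bid):
    """One publish/bid/award cycle for a single task.

    `bid(robot, task)` returns the robot's cost to execute the task,
    or None to abstain; the auctioneer awards to the lowest bidder.
    """
    # Step 1: the auctioneer publishes the task; Step 2: agents reply.
    bids = {r: bid(r, task) for r in robots}
    bids = {r: b for r, b in bids.items() if b is not None}
    if not bids:
        return None  # unallocated; a real system would re-auction later
    # Step 3: award the task to the best (here, cheapest) bid.
    return min(bids, key=bids.get)

# Invented example: bids are 1-D distances from each robot to the task.
positions = {"r1": 0.0, "r2": 6.0, "r3": 9.0}
winner = auction_round(4.0, positions,
                       bid=lambda r, goal: abs(positions[r] - goal))
print(winner)  # r2 (distance 2 beats 4 and 5)
```

Run repeatedly over a task queue, with each winner removed from the bidder pool, this cycle reduces to the greedy assignment discussed in §2.4.1.2, which is where the 3-competitiveness results for on-line auctions come from.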


Most variations specify additional rules or steps that provide some advantage over the basic algorithm, as explained later. The advantages of this approach are several, and this is manifested in the abundance of research and experimentation with real robots:

Simplicity: the idea behind most auction-based protocols is simple and intuitive. Implementation is also not difficult for the basic algorithms.

Flexibility: as long as tasks can be represented by some utility value, the system is able to manage them.

Expandability: the basic idea lends itself to modifications for improved performance.

Fault tolerance: as long as the seller keeps track of its sold tasks, they can be resold in the event of failure of the current owner.

The basic idea behind most auction-based algorithms can already be found in [Smith 80] from the 80s, and even there the roots can be traced back to the 70s. In this seminal paper, nodes providing tasks to be solved are called managers, and nodes bidding for them are called contractors. The contract net is exemplified with distributed computers and sensors instead of mobile agents, but the ideas remain the same. [Wellman 97] provides a good argument for the use of market-aware software agents, which can be seen as the intermediate step towards proper robots. [Stentz 99] is a representative report where a free market is advocated to control mobile robots, under the premise that by maximizing individual profits the global plan profits are maximized. This is an exploratory paper where many ideas are hinted at without further development: already present are the ideas of combinatorial auctions, opportunistic centralization and some forms of learning. A first simulated implementation appears in [Thayer 00] in a mission of multi-robot exploration. Combinatorial auctions are implemented in [Hunsberger 00] for the purpose of role assignment and synchronized task execution. In this case, agents bid for combinations of tasks they are able to carry out.
The auctioneer must then decide the best combination of bidders for the mission at hand. It is noted that this is an NP-hard problem, but manageable for bid numbers in the order of thousands. Opportunistic optimization via leaders is studied in [Dias 02]. Here, a leader negotiates with groups of robots to centrally reason about their assigned tasks and reallocate them in more profitable assignments. The first embodied implementation using real robots appears in [Gerkey 02], where the terminology used is publish/subscribe. Proper working is demonstrated in a coordinated box-pushing task and in a randomly generated alarm-like scenario of uncoupled tasks. Another robotic demonstration is shown in [Zlot 02] in a distributed mapping mission. The implementation used therein, named TraderBots, is subsequently improved in [Zlot 03] to use HTNs as a way to distribute planning and manage combinatorial auctions. In these upgraded TraderBots, bids can be for task trees at any depth level, hence allowing more solutions to be explicitly evaluated and bought. TraderBots are examined again in the light of three kinds of robot malfunction in [Dias 04], with encouraging results. Tight coordination is summarily addressed in [Kalra 03], and more exhaustively explored in [Kalra 05] with a market-based framework named Hoplites. Passive and active coordination strategies are demonstrated in real robots performing a security sweep in a corridor with some obstacles. A form of learning is introduced via opportunity cost in [Schneider 05]. This opportunity cost reflects the expected earnings per second of a robot, and is used to modify the auction mechanism so that robots try to maximize it. This has applications in time-discounted environments, where tasks lose value as time goes by. In [Bererton 03] there is a departure from the purely economically inspired background: the auctions are designed to distributively solve an approximated MDP, thus uniting the auction and stochastic paradigms. An excellent survey of market-based techniques is [Dias 05]. Another milestone is the work in [Lagoudakis 05, Tovey 05, Lagoudakis 04], which addresses computational complexity and cost bounds in auction-based allocation for multi-robot routing.

2.5.3.1. Optimality

Early implementations lack a detailed study about the optimality of solutions found, besides the intuitive idea that robots try to maximize profit. Furthermore, without a proper characterization of auction timings and task commitments not much can be said about mathematical bounds, even if experimental results are satisfactory. This was partially addressed in the work [Gerkey 03a], where auctions from [Gerkey 02] are characterized as a distributed implementation of a greedy assigner, consequently exhibiting 3-competitiveness. Other implementations like [Zlot 02] do not strictly follow the greedy algorithm, allowing for the continuous bidding of already assigned tasks. This presumably improves efficiency, but it is not clear if new bounds can be established. In [Lagoudakis 04] an auction-based method is presented that is 2-competitive in static environments. This is not surprising since this is the bound already known for greedy allocations in observable contexts. Further theoretical results about auction-based allocation for the vehicle routing problem appear in [Lagoudakis 05, Tovey 05].

2.5.4. Behavioral models

Behaviors are patterns of actuation that are embedded into agents and that are enabled or disabled in response to certain stimuli. When a behavior is activated, it motivates the robot to perform certain pre-programmed actions. Usually several behaviors can be active at the same time, so there must be rules for their prioritization or combination.

Ad-hoc solutions for some multi-robot domains have been proposed using behaviors. They are easily applicable to formations. A good example with explicit descriptions of behavior rules is found in [Balch 98]. Coverage is tackled in [Batalin 02], where robot spreading is achieved with a repulsive behavior combined with an exploratory one that is attracted to unknown spaces. Potential-field-based reactions are used in a RoboCup setting in [Mitsunaga 03]. In [Hayes 03], the foraging task is performed using behaviors inspired by ant pheromone trails.

More relevant to resource allocation is the work [Parker 98], where the ALLIANCE architecture is presented, with a strong emphasis on fault tolerance. Low-level behaviors implement the usual obstacle-avoidance techniques, while high-level ones are used to perform the task allocation. The two critical elements are impatience (a robot gets impatient if it sees a task that nobody is executing, triggering the adequate behavior to perform it) and acquiescence (which makes a robot relinquish a task if it detects that its performance is below expectations). Robots periodically broadcast their current commitments to affect the behaviors of nearby teammates. L-ALLIANCE is an extension presented in [Parker 97], where the L stands for learning. The rates of impatience and acquiescence are dynamically adapted by reacting to the context and environment.

Another architecture using a behavior-based approach is Broadcast of Local Eligibility (BLE) [Werger 00]. In it, robots broadcast their fitness (“eligibility”) for a task, and inhibit the behaviors of peers with less eligibility, thus claiming the tasks. MONAD [Vu 03] aims to provide a flexible architecture with off-line scripting capabilities for team design and programming, while relying on behavior-based techniques for the on-line control of the team.
One intended goal is to allow for quick implementation of different team control architectures. The architecture is demonstrated in a robotic soccer domain.

2.6. Architectures

There is notable early research in architectures for intelligent systems in, e.g., [Saridis 79], which advocates a three-level hierarchy based on the principle of increasing precision with decreasing intelligence. Task allocation would typically fit into the organization level therein described, above the coordination and execution levels. This section reviews in chronological order some relevant architectures which aim to provide a general framework for task allocation. They have already been mentioned in the respective general sections when appropriate. Optimality and network usage results are extracted from [Gerkey 03c] or from their presentation papers.

Chapter 2. Survey of multi-robot task allocation

2.6.1. ALLIANCE (1997)

This behavior-based architecture has been described in §2.5.4 and exhibits 2-competitiveness in the case of instantaneous assignment (i.e., all tasks are known at execution start). Bandwidth usage is in the order of Robots × Tasks.

2.6.2. GRAMMPS (1998)

The GeneRAlized Mission planner for MultiPle mobile robotS is a "field-capable system which couples a general-purpose interpreted grammar for task definition with dynamic planning techniques" [Brummit 98]. It notably includes planners for the TSP and the Multiple Traveling Salesmen Problem (MTSP) but, as these are intractable for large sizes, it also includes a simulated annealing route planner. It is thus an early example of a hybrid approach, although it still depends on a Central Mission Planner.

2.6.3. Broadcast of local eligibility (BLE, 2000)

Seen in §2.5.4, this behavior-based architecture is 2-competitive when working with instantaneous assignments. Bandwidth usage is in the order of Robots × Tasks.

2.6.4. Murdoch (2002)

This first implementation of auction-based techniques in embodied robots has been described in §2.5.3. The quality of its produced allocations is 3-competitive for the on-line tasks problem (tasks that arrive at random once execution has started). Computational costs are very low and the network resources used are linear in the number of robots.

2.6.5. TraderBots (2002)

Seen in §2.5.3, this is probably the most representative implementation of early auction-based techniques. A competitiveness of 3 or better is expected, since the reallocation it introduces can only improve on any allocation produced by Murdoch.

2.6.6. MONAD (2003)

Seen in §2.5.4, its main contribution is the flexibility of its off-line scripting description language, which allows quick tailoring of new teams. However, it lacks fault tolerance capabilities [Vu 03].

2.6.7. COMETS (2004)

This project exhibits a comprehensive architecture which makes use of four planning and allocation modules and five levels of decisional autonomy. The four


modules can be switched progressively from centralized to distributed operation, which configures an incremental delegation of decisional capabilities from the operator to the agents (Unmanned Aerial Vehicles (UAVs) in this case). Also, a symbolic planner is used for reasoning capabilities. The allocation scheme is derived from the Contract-Net protocol, from which a low computational complexity is inferred. However, in order to optimize the MinMax objective instead of the MinSum one, an equity coefficient is introduced. The authors do not discuss any competitiveness bounds, although they find that this coefficient improves the results for their particular objective.

2.6.8. Hoplites (2005)

This market-based framework, seen in §2.5.3, addresses domains in which actions of robots are tightly coupled. The uncertainty inherent in these tasks also necessitates persistent tight coordination between teammates throughout execution, that is, active and high-frequency communication. There is no complexity discussion in the original paper [Kalra 05], and furthermore the planning components are left open by the authors, since they depend on the problem at hand. These issues are addressed in a later work [Kalra 07].

2.6.9. PERSEUS (2005)

PERSEUS is a probabilistic framework which uses approximation techniques in order to solve a POMDP problem. The issues that these problems present for exact solving have been introduced in section 2.5.2; the PERSEUS proposal is to use a randomized point-based value iteration algorithm. This requires random sampling and thus introduces uncertainty about the actual quality of the solution with respect to the optimal policy, which is not discussed by the authors [Spaan 05], although they note several advantages of their approach over other approximations, which allow the use of a greater number of points in the belief set.

2.6.10. URUS (2008)

The URUS project [Sanfeliu 08] defines a standard set of services to be provided by any URUS robot. This allows the integration of any robot that implements them into the NRS. The allocation architecture of the URUS project comprises two principal subsystems: a centralized (although replicated in each robot) simulated annealing planner, and an auction-based allocator. The stochastic nature of the first subsystem precludes the guarantee of any quality bounds, although its complexity per step, due to the nature of the URUS missions, is constant, involving only task permutations. This algorithm is used as an opportunistic optimizer of known plans, and to quickly integrate large task sets into the system.


The auction-based allocator makes use of an approach similar to that of Murdoch or TraderBots, and hence 3-competitiveness is the expected quality bound. Its complexity per bid is linear in the number of tasks, although more expensive heuristics like 2-opt moves [Reinelt 94] can be used when the robot plan is small.

2.7. Discussion

We have seen an overview of the principal aspects of multi-robot task allocation from a service robotics point of view. Firstly, common terminology and requirements have been presented. These requirements are often in conflict with each other, forcing solutions to make compromises in one or another direction, and to embrace hybrid or richer solutions.

Algorithmic complexity is one of the critical issues in task allocation, since most problems are of NP-complete nature. Distributed algorithms may also raise the complexity, or transfer it from computation to network load. Authors do not always make an effort to identify the complexity of their solutions. Selected studies have focused on this shortcoming, identifying the nature of typical problem models. Consequently, known significant results have been highlighted.

Several main trends of solution have been described. Initial centralized planning approaches have given way to distributed ones over time, although many architectures retain centralized planners for opportunistic use. Stochastic frameworks start from a centralized learning process, so efforts have been devoted to their decentralization. In another direction, auction-based solutions are a strong current in task allocation, which started with simple market analogies and later incorporated sophistications like combinatorial auctions, hierarchical bidding, and learning of the parameters that affect the bidding process. Finally, behavioral solutions have been proposed which are notable for their resilience against individual robot failures. However, it has been noted that there is a functional equivalence between some behavioral and auction-based architectures.

Finally, a historical review of generic, reusable architectures for task allocation has been presented.
We can differentiate between the ones devoted to proof-of-concept, which usually make use of a single algorithm or strategy to be demonstrated, and hybrid ones, in which several planning and allocation components are present. The latter tend to be more recent, as the field matures and simple novel ideas are exhausted.


Chapter 3

A comparison of metrics and auction algorithms

As seen in the previous chapter, auction-based algorithms are a popular tool for multi-robot task allocation. However, analyses of their theoretical performance with regard to common optimization objectives remained scarce until recent studies, and proposals from different authors have not been compared on common ground in light of these recent findings. This chapter is devoted to such a study: four algorithms and two intuitive optimization objectives, minimum total resource usage and minimum total time, are evaluated in three object searching missions. A method for flexible tailoring of the bidding rules is presented, and new insight is gained on the effect of using hybrid criteria for the optimization objective.



3.1. Introduction

In the previous chapter, we have seen that many interesting cases of multi-robot task allocation are NP-complete problems [Gerkey 04a]. For this reason, approximate methods have been used extensively in order to achieve real-time decisions of acceptable quality. We focus on the popular auction-based approach [Dias 05, Lagoudakis 05], which is one of the most successfully tested solutions. Auction methods rely on the exchange of bids amongst robots; bids are offered for the announced tasks to be performed, and the winner of a task takes responsibility for carrying it out to completion. Auction methods are decentralized in nature, since the auctioneer role can be adopted by any agent when necessary. They are also efficient bandwidth-wise, because all relevant information is synthesized and exchanged by means of such bids, which usually take the form of single scalars. Experimental results have shown that they perform very well in real applications, often far above their worst case bounds when these are known. For all these reasons they have been the subject of abundant research.

Implementations and performance. Numerous implementations have been described ([Zlot 05, Kalra 05, Lemaire 04, Gerkey 02] to cite a few), but efforts towards a systematic, strong mathematical characterization of their efficiency were scarce [Lagoudakis 05, Lagoudakis 04, Gerkey 04a] until recent times. Also, although they are usually presented as flexible general task allocation methods not particularly bound to a definite problem, the effects of different optimization criteria are seldom thoroughly analyzed. This chapter is devoted to shedding some light in this regard, by comparing several canonical auction algorithms that use commonly desired optimization objectives. The study is done by means of simulation runs over a model of a real-world large building in object searching missions. Also, randomized experiments are used to extract trends not confined to this scenario.
For flexible mission objective tailoring, we use a technique that we call criterion descriptors, which allows the computation of several objectives without additional computational costs. We explore the effects of the different criteria on the team performance, and highlight the interesting good performance of hybrid objectives. We start by examining the theoretical models underlying the optimization process in §3.2. Next, we present in §3.3 the auction methods we have implemented. We follow with simulations and their discussion in §3.4, to finally conclude with a discussion of the most relevant results.

3.2. Optimization criteria

To minimize or to maximize, that is (not?) the question. Auction methods can be modeled for minimization (cost-based [Lagoudakis 05]) or for maximization (utility-based [Zlot 02]). In this thesis we will adopt the first approach; hence, the

3.2. Optimization criteria

lowest bid is the best one and cost will be the metric used. Bidding on costs is similar to bidding on utilities when the reward is constant for a given task. That is, if utility is defined (as it usually is) as Utility = max(0, Reward − Cost), it follows that finding the maximum utility implies finding the minimum cost. There is a difference in that tasks with zero utility would not receive bids, and thus would not be executed by any robot. However, in our approach to service missions no task should remain undone, so we can consider negative utilities, and then there is total equivalence.

Objectives. We focus on two intuitive and useful optimization criteria for robotic teams. The MinSum objective aims to minimize the total usage of resources (sum of individual costs). It is the one most commonly found in auction-based research, because it naturally arises in market-mimicking varieties where agents try to minimize their costs in order to maximize their own gains, and the team reward is defined as the sum of individual rewards [Zlot 02]. Another interesting criterion is MinMax, where the objective is to minimize the cost of the worst performing robot. This has a direct translation to finding the shortest timespan for a given mission [Lemaire 04] when the measured cost is time. A detailed mathematical analysis of these criteria is found in the work of [Lagoudakis 05] in the context of auctions for vehicle routing. Since many robotic endeavors consist of visiting places to perform tasks of interest there (analysis of ground, deployment of sensors, surveillance), this is of direct relevance.

More formally, let us define R = {r_1, .., r_n} as the set of robots in the team and T = {t_1, .., t_m} as the set of tasks to be executed. We also define a non-negative cost function c(p, q), p ∈ R ∪ T, q ∈ T, which denotes some abstract cost (e.g., distance, energy, time, etc.). This function gives the full cost of moving from place p (which can be either a robot initial location or a task location) to place q and completing the task located there. An allocation A = {a_1, .., a_n} is a partition of T where a_i is an ordered list of tasks assigned to be executed by robot r_i. That is, we are using a sequential execution model where robots are capable of executing one task at a time. The total robot cost C(r_i, a_i) of a robot is the sum of the costs along its entire path while performing its allocation. Hence, the costs for the two team objectives are

CMM = max_i C(r_i, a_i)

CMS = Σ_i C(r_i, a_i)

where MM and MS are abbreviations for MinMax and MinSum respectively.
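As a quick illustration of these definitions (a sketch with invented helper names and a toy one-dimensional cost function, not code from the thesis), the two team costs follow directly from the per-robot path costs:

```python
# Sketch: computing the MinMax and MinSum team costs of an allocation.
# The cost function c(p, q) can be any non-negative metric (distance,
# energy, time, ...); here a toy 1-D travel distance is used.

def robot_cost(start, tasks, c):
    """Cost C(r_i, a_i): execute the ordered task list from `start`."""
    total, pos = 0.0, start
    for t in tasks:
        total += c(pos, t)
        pos = t
    return total

def team_costs(starts, allocation, c):
    """Return (CMM, CMS) for an allocation: one ordered task list per robot."""
    costs = [robot_cost(s, a, c) for s, a in zip(starts, allocation)]
    return max(costs), sum(costs)

c = lambda p, q: abs(p - q)              # toy 1-D travel cost
cmm, cms = team_costs([0, 10], [[2, 5], [11]], c)
```

With robots starting at 0 and 10, the first robot pays 2 + 3 = 5 and the second pays 1, so CMM = 5 and CMS = 6.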


With these definitions, each of the two presented criteria aims to find the optimal allocation A^O such that

MinMax: A^O_MM = argmin_A CMM
MinSum: A^O_MS = argmin_A CMS

Bidding rules. In [Lagoudakis 05], bidding rules are defined for these two criteria, in a framework for optimality analysis. Performance bounds are given in the form of competitiveness ratios (worst case cost / optimal cost) for the studied bidding rules. We have identified that several auction implementations use at least the MinSum bid construction rule, even if exact timing details differ. For flexible switching between both criteria we have defined in [Mosteo 06b] criterion descriptors as a tuple (ω1, ω2, ...) where each ωk is a weight factor for some "pure" objective. More precisely, in this study we have (ωMM, ωMS), with ωMM being the weight factor for the MinMax cost, and ωMS the weight factor for the MinSum cost. The team allocation cost is computed as

C(A) = ωMM CMM + ωMS CMS

A robotic team can use a unique bidding rule parametrized with a criterion descriptor for flexible switching between objectives. Let us assume t is a new task being auctioned. Let us denote by a_i the assignation of r_i before the task is added to its plan, and by a′_i the best assignation r_i is able to find when including t in its plan. In [Lagoudakis 05] it is shown that the bid b_i of r_i for each objective is to be:

MinMax: b_i = CT_i = C(r_i, a′_i)
MinSum: b_i = CΔ_i = C(r_i, a′_i) − C(r_i, a_i)

where CT_i stands for total cost and CΔ_i stands for incremental cost. Both quantities can be maintained and computed by a robot in constant time, so without performance losses we propose the following unique rule:

b_i = ωMM CT_i + ωMS CΔ_i

where a criterion descriptor is used to tailor the desired optimization criterion. In this study we experiment with the following descriptors:

MinMax: (1.0, 0.0)    MinSum: (0.0, 1.0)
MinTim: (1.0, δ)      MinMix: (1.0, 1.0)

We name the descriptors after the objective they pursue. Here appear two notable additions, which we call hybrid for their use of both CMM and CMS: MinTim (where δ represents a small real, e.g. 0.00001) is our attempt to provide minimum time while also improving the costs of robots not involved in the critical worst cost of a pure MinMax



optimization [Mosteo 06b]. The idea is that the dominating cost is still the CMM one, but the scaled-down CMS cost is used to prefer less costly assignations for the remaining robots. On the other hand, MinMix uses a balanced combination of both costs in an attempt to achieve good performance for both objectives. The exact meaning in physical units could be debatable or merit further discussion if, e.g., MinSum measured distance and MinMax measured time. However, in this case we always use distances, in our interest to properly ascertain whether the saying "Jack of all trades, master of none" applies here or if, on the contrary, it is a useful alternative. Note that all descriptors will be evaluated in terms of pure CMM and CMS performance, because these are the magnitudes with real meaning. In our discussions we will often refer to cost as time or resources. When the rationale is MinMax, time is a common measure of cost and refers to the mission timespan. When the rationale is MinSum, resources could represent the sum of distance traveled or power (batteries, fuel) consumed.
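A minimal sketch of this unified bidding rule (illustrative Python with invented helper names; the robots described in the thesis maintain these quantities incrementally, whereas here they are recomputed from scratch for clarity):

```python
# Sketch of the unified bid b_i = wMM*CT_i + wMS*CD_i under a criterion
# descriptor (wMM, wMS). The best insertion of the new task is searched
# exhaustively over all positions of the robot's current plan.

def plan_cost(start, plan, c):
    """Cost of executing `plan` in order from `start`."""
    cost, pos = 0.0, start
    for t in plan:
        cost += c(pos, t)
        pos = t
    return cost

def best_insertion_cost(start, plan, task, c):
    """Total cost of the best plan obtained by inserting `task` anywhere."""
    return min(plan_cost(start, plan[:k] + [task] + plan[k:], c)
               for k in range(len(plan) + 1))

def bid(start, plan, task, c, w_mm, w_ms):
    """Unified rule: CT is the robot's new total cost, CD its increment."""
    ct = best_insertion_cost(start, plan, task, c)
    cd = ct - plan_cost(start, plan, c)
    return w_mm * ct + w_ms * cd

c = lambda p, q: abs(p - q)             # toy 1-D travel cost
b_minmax = bid(0, [4], 2, c, 1.0, 0.0)  # pure MinMax descriptor (1.0, 0.0)
b_minsum = bid(0, [4], 2, c, 0.0, 1.0)  # pure MinSum descriptor (0.0, 1.0)
```

In the toy case, inserting task 2 before task 4 leaves the total plan cost at 4, so the MinMax bid is 4 while the MinSum (incremental) bid is 0.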

3.3. Implemented algorithms

In this section we present the relevant auctioning techniques we have implemented for testing. We are particularly interested in techniques with proven performance bounds or, lacking these, high popularity. For this initial testbed we have preferred methods of similar complexity, specifically in the order of the one with known performance bounds. We have thus not tested complex methods like combinatorial or hierarchical auctions [Zlot 05]. The bidding rule is the one previously described except when noted; usually what changes are the timing details and the individual robot allocation building.

3.3.1. Parallel auctions with performance bounds

This implementation follows the directions given in [Lagoudakis 05] for auctions with proven performance bounds when the triangle inequality holds, e.g. in routing problems. We call it Lago in reference to the first author. In these auctions, initially no tasks are allocated. In each round a task is allocated this way: each agent in parallel computes bids for all the remaining tasks, using an insertion heuristic. The lowest bid of all robots wins and the corresponding task is awarded. After m rounds, all tasks in T have been awarded and the auction ends. In our implementation robots and tasks have an implicit order, used to resolve ties. We also test a variant identified as LagoTail, in which a task can only be added to the tail of a robot plan (instead of considering all possible insertions). This kind of auction does not have any quality bounds that we know of, but is tested here because it is extremely cheap to compute when robots have very long task lists.
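The round structure of these auctions can be sketched as follows (a simplified reconstruction with hypothetical names, not the thesis implementation; ties here fall to the first robot-task pair examined rather than to the implicit order mentioned above, and the tail-insertion variant would simply restrict the insertion point to the last position):

```python
def plan_cost(start, plan, c):
    """Cost of executing `plan` in order from `start`."""
    cost, pos = 0.0, start
    for t in plan:
        cost += c(pos, t)
        pos = t
    return cost

def parallel_auction(starts, tasks, c):
    """Each round, every robot bids its best insertion for every pending
    task (MinMax rule: bid the new total plan cost); the single lowest
    bid wins and the corresponding task is awarded."""
    plans = [[] for _ in starts]
    pending = list(tasks)
    while pending:
        best = None  # (bid, robot index, task, new plan)
        for i, (start, plan) in enumerate(zip(starts, plans)):
            for task in pending:
                for k in range(len(plan) + 1):  # all insertion points
                    cand = plan[:k] + [task] + plan[k:]
                    bid = plan_cost(start, cand, c)
                    if best is None or bid < best[0]:
                        best = (bid, i, task, cand)
        _, i, task, cand = best
        plans[i] = cand          # award the task to the lowest bidder
        pending.remove(task)
    return plans

c = lambda p, q: abs(p - q)      # toy 1-D travel cost
plans = parallel_auction([0, 10], [1, 2, 9], c)
```

On the toy instance, the nearby tasks 1 and 2 end up chained on the first robot's plan while the second robot takes task 9.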


Figure 3.1: Left: an example of randomly generated graph for the randomized worlds. Right: a quarter of the floorplan used for the large grid simulation.

3.3.2. Single item auctions

In this case, agents do not consider bids for all tasks in each round, but only for a single task, which in our implementation is chosen at random. Again, m auction rounds are performed until no task remains unassigned. We call this Single in our tables. SingleExt, in turn, refers to the allocation found after prolonging the auction process for one extra minute. During this extra time, in each round some robot auctions one of its already owned tasks. See Fig. 3.4 for an example video. In our implementation, auctions occur at a rate of approximately 60 auctions per second, so roughly 3600 additional rounds are performed. In real robotic teams, auctions usually keep running during mission execution, in order to find better allocations. This also resembles mechanisms used when not all the tasks can be known in advance, or where tasks are generated by the robots during the mission. In this case, to prevent entering conflicting bids, robots bid for a single task at a time. A possible example of this kind of implementation is [Zlot 02].
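The extra reallocation rounds can be sketched like this (a simplified, synchronous reconstruction with invented names, using the MinSum marginal-cost bid; the real system runs such rounds continuously and asynchronously during execution):

```python
import random

def plan_cost(start, plan, c):
    """Cost of executing `plan` in order from `start`."""
    cost, pos = 0.0, start
    for t in plan:
        cost += c(pos, t)
        pos = t
    return cost

def insert_best(start, plan, task, c):
    """Plan with `task` inserted at its cheapest position."""
    return min((plan[:k] + [task] + plan[k:] for k in range(len(plan) + 1)),
               key=lambda p: plan_cost(start, p, c))

def realloc_round(starts, plans, c, rng):
    """One extended round: a random owner re-auctions one of its tasks,
    which then goes to the robot with the lowest insertion increment."""
    owners = [i for i, p in enumerate(plans) if p]
    if not owners:
        return
    i = rng.choice(owners)
    task = rng.choice(plans[i])
    plans[i] = [t for t in plans[i] if t != task]
    bids = []
    for j, (s, p) in enumerate(zip(starts, plans)):
        cand = insert_best(s, p, task, c)
        bids.append((plan_cost(s, cand, c) - plan_cost(s, p, c), j, cand))
    _, j, cand = min(bids, key=lambda b: (b[0], b[1]))  # lowest marginal cost
    plans[j] = cand

c = lambda p, q: abs(p - q)       # toy 1-D travel cost
starts = [0, 10]
plans = [[9], [1, 2]]             # a deliberately poor initial allocation
rng = random.Random(42)
for _ in range(50):               # extra rounds can only lower the total cost
    realloc_round(starts, plans, c, rng)
```

Each round the winner's marginal cost is at most the owner's cost saving from removal, so the team's total cost never increases across rounds.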

3.3.3. Allocation on availability

This method is partially inspired by [Gerkey 02], which is considered the first auction-based embodied implementation tested in real robots. Its singular characteristic is that robots can have just one task in their plan; when a new task is introduced, the best suited idle robot wins it via regular bidding and abstains from further bidding until it is idle again. At that point, it will choose among the unallocated tasks the one for which it has the lowest bid. (In [Gerkey 02], task reintroduction was not implemented.) In other words, this is an initial greedy allocation where the first robot to become idle executes the cheapest available task. We will refer to our


implementation as Fifo (because the first robot to become idle is the first one to choose a task).

Figure 3.2: Grid extracted from building floor plans. Each vertex is a reachable place. Vertices not completely surrounded by reachable space are goal places to be visited. The robots are initially positioned at the four leftmost vertices.

3.4. Simulations and comparatives

The first kind of scenario are graphs composed of a hundred randomly placed locations in a bounded area, with the restriction that not all locations are reachable from each other; instead, a locality constraint is imposed (e.g. like free spaces in an urban environment). See Fig. 3.1 for an example. Robots start at a random node and all remaining nodes are goals that must be visited. This setup aims to be a generic test case. The second scenario is a real building floor plan (see Fig. 3.1), modeled as a discrete grid of over 1200 cells (see Fig. 3.2). Each cell adjacent to a wall is considered a goal task that must be visited by some robot (there are 1023 of these cells). Real examples of application include: vehicle location in a parking lot, explosive sniffing in buildings, inventory maintenance with RFID tags. The cost of traveling from cell to cell is unity, thus implying holonomic robots. Cells have 4-vicinity as depicted in Fig. 3.2. Robots travel using the shortest path in the grid graph and start at the four leftmost vertices (building entrances). All simulations use four robots because this is the size of our real robotic team,



with which we intend to carry out real experimentation in the real building in the future. Experiments in §3.4.1 aim to extract general trends using multiple random graphs, and to observe if these trends appear also in the grid map. §3.4.2 and §3.4.3 use the best solutions found in the grid map to evaluate mission variants.

Figure 3.3: Average CMM and CMS costs in the random worlds.

3.4.1. Costs of full allocations

The objective in this experiment is to visit all given targets in a map. For the random graphs we performed a hundred runs, each in a randomly generated world like the one shown in Fig. 3.1. Average costs are shown in Table 3.1 and Fig. 3.3 (results in bold are the best ones). In the grid map (Table 3.2), only the Single and SingleExt results are averages of a hundred runs; this is because their algorithms have a random step, while the others are deterministic for a given initial world state. We have computed the allocations for all combinations of implementations and descriptors. The Fifo case is special, because its single task rule precludes the meaningful use of descriptors, so only one solution exists. In the grid map we have also computed the optimal MTSP solution for the CMS cost using the Concorde solver [Applegate 03]. The leftmost column names the combinations, while the topmost row names the allocation evaluation (i.e. team performance). As advanced before, this is not the rule evaluation, but the real team evaluation in terms of pure cost functions, which represent real physical magnitudes. We next discuss the findings in these results.

3.4.1.1. Descriptors

Clearly seen in Fig. 3.3 is that any descriptor with nonzero ωMM performs comparably well for the CMM cost, with small differences in the 5% range. For the CMS cost, differences are smaller; MinSum is the best descriptor, MinMax and MinTim are comparably worse, and MinMix is in middle ground.


Table 3.1: Mean costs of full allocations in random scenarios.

Algorithm + Descriptor    CMM    Ratio to best    CMS    Ratio to best
Best found                228    1.00              816   1.00
Lago MinMax               262    1.15              943   1.16
Lago MinTim               259    1.14              935   1.15
Lago MinMix               272    1.19              903   1.11
Lago MinSum               463    2.03              849   1.04
LagoTail MinMax           269    1.18              962   1.18
LagoTail MinTim           270    1.18              971   1.19
LagoTail MinMix           272    1.19              904   1.11
LagoTail MinSum           387    1.69              868   1.06
Single MinMax             259    1.13              998   1.22
Single MinTim             260    1.14             1000   1.23
Single MinMix             249    1.09              932   1.14
Single MinSum             399    1.75              847   1.04
SingleExt MinMax          240    1.05              949   1.16
SingleExt MinTim          235    1.03              924   1.13
SingleExt MinMix          228    1.00              869   1.06
SingleExt MinSum          385    1.68              816   1.00
Fifo                      320    1.40             1159   1.42

3.4.1.2. Worst cases

Observe that using the MinSum criterion causes a huge overcost for CMM (in the range of 42% to 200% within a same algorithm), whereas using a descriptor with nonzero ωMM gives a worst overcost for CMS in the 5%-18% range only. The huge overcost of the MinSum-CMM combination emanates from the fact that, when tasks are easily chainable one after another, it is more resource-effective to just move one robot along a path than to move several of them. Thus, it is possible for a single robot to have a high cost which is also a great percentage of the total team cost. See Fig. 3.5 as an example, where a robot is visiting only two goals.

3.4.1.3. MinMix

This descriptor ranges from being the best one in many cases to a 5% overcost for CMM, and has a penalty of 1%-10% for CMS. From the previous points, it seems clear that using a criterion with nonzero ωMM is better for general purpose robotic teams, since such criteria perform very well for both CMM and CMS solutions. Furthermore, in


Table 3.2: Costs for the large grid world.

Algorithm + Descriptor    CMM    Ratio to best    CMS    Ratio to opt.
Optimal                     —       —             1285   1.00
Best found                371    1.00             1455   1.13
Lago MinMax               449    1.21             1723   1.34
Lago MinTim               451    1.22             1789   1.39
Lago MinMix               441    1.19             1735   1.35
Lago MinSum              1331    3.59             1709   1.33
LagoTail MinMax           377    1.02             1497   1.16
LagoTail MinTim           411    1.11             1609   1.25
LagoTail MinMix           371    1.00             1467   1.14
LagoTail MinSum           560    1.51             1455   1.13
Single MinMax             458    1.23             1825   1.42
Single MinTim             452    1.22             1802   1.40
Single MinMix             433    1.17             1721   1.34
Single MinSum            1113    3.00             1627   1.27
SingleExt MinMax          448    1.21             1788   1.39
SingleExt MinTim          438    1.18             1748   1.36
SingleExt MinMix          420    1.13             1673   1.30
SingleExt MinSum         1115    3.00             1610   1.25
Fifo                      491    1.32             1783   1.39
Optimal MTSP MinSum       675    1.82             1285   1.00

our tests, MinMix wins over the other two ωMM criteria in more cases. MinSum is only justified when the only relevant metric is CMS (which leaves out missions involving time, for example).

3.4.1.4. Algorithms

We observe that the SingleExt auctions are the best ones in most cases. This is due to the extra auction runs, as evidenced by the improvement over the Single results. This strongly suggests using cost-bounded Lago auctions for the initial allocation when possible, and continuing with SingleExt rounds for the entire mission duration. Fifo is worst in all cases except when the least suited descriptor is used for an objective (e.g. MinSum for the CMM cost). Since such a configuration makes little sense on purpose, Fifo can be considered the worst allocation algorithm in setups of this kind.


Table 3.3: Lago solution cost to optimal cost ratios compared to their theoretical bounds for n robots.

Bidding criterion    Team objective CMM    Team objective CMS
MinMax               1.21 ≤ 2n             1.34 ≤ 2n
MinSum               3.59 ≤ 2n             1.33 ≤ 2

3.4.1.5. Anomalies

Lago performs notably worse than LagoTail in the grid world, even if the latter algorithm is simpler. However, the runs in randomized worlds show that this is not a general trend.

3.4.1.6. Bounds

The algorithm with theoretically proven bounds, Lago, is well within these bounds in the grid run. Table 3.3 shows the solution costs compared against their theoretical bounds, as per [Lagoudakis 05]. Note that we know the optimal CMS cost, but not the CMM one, so its column reflects the ratio against the best known solution (thus showing a too optimistic value). Also note that all runs by the other algorithms are within these bounds too.

Figure 3.4: The best MinMax solution found via auctions. Link 3.1: http://robots.unizar.es/data/videos/expres/auct-sngext-mix.mp4




Figure 3.5: The best MinSum solution found via auctions. Note that the second robot from the top (bolder) only performs two tasks.

3.4.2. Single object finding

In this and the next section we measure costs of the best solutions found for each objective in the previous simulations, in slightly different applications that can also be of interest in the context of service missions. In this one we highlight some results on the mission of finding an object randomly placed in the grid world. Now we measure the cost until the object is found instead of the total plan cost. The two box diagrams to the left in Fig. 3.6 show the distribution of costs for the best CMM and CMS solutions found via auctions (the LagoTail-MinMix and LagoTail-MinSum combinations respectively), over 1000 runs. When the team evaluation is to minimize time, the CMM solution performs patently better. However, when the team evaluation is to minimize resource usage, we observe a similar performance from both solutions. This confirms the previous findings about the convenience of descriptors with nonzero ωMM weight, even for ωMS objectives. The histograms to the right of that same figure reveal that the distribution that appears when repeating the mission 30,000 times is roughly uniform for the implemented algorithms, except for the Fifo case. We understand this by inspecting Fig. 3.7, which shows the distributions of each algorithm over 1000 runs. We see that all of them but Fifo have a uniform appearance, while Fifo has a lower median compared to its mean. This is caused by its greedy nature: Fifo tends to perform better toward the beginning, but this short-sightedness causes a higher cost at the tail of the plan, where far apart goals will remain. We will observe this again in the remaining simulation.


3.4.3. Objects found in limited time

In this experiment we place a hundred objects in random places of the grid world, limit the mission time, and count how many of the objects are found. Three variations are presented:

3.4.3.1. CMM vs. CMS

We compare the best solution of each class when the mission deadline is fixed to half the CMM time (note that in half the CMS time, the CMM solution would almost finish the exploration). As shown by Fig. 3.8, the shorter exploration timespan of CMM allows more objects to be found. Even if this seems evident, we consider it relevant to multi-robot exploration where reward maximization is used: this gives a good ratio of area explored to distance traveled, but exploration time can be highly suboptimal if our observations were confirmed in these problems.

3.4.3.2. Halved algorithm solution time

We test each algorithm taking half the time of its best solution as the mission deadline. Not surprisingly, the algorithm with the lowest median, Fifo, finds more objects (Fig. 3.9). This confirms the rationale argued in [Gerkey 02]: the instantly greedy Fifo strategy is the best when there is no a priori information on task arrival, since greedy allocation will give the lowest average cost per task and robot. Thus, for infinite-horizon problems with unbounded task arrival, Fifo would likely be the best algorithm for the CMS objective.

Figure 3.6: Left: The best CMM and CMS solutions, evaluated for both mission time and total resources objectives. Median, quartiles and extreme values are shown. Right: histogram of cost for the best CMM solution with time evaluation, the CMS solution with resource evaluation, and Fifo with time evaluation.

Figure 3.7: The four auction algorithms compared (y axes show cost). Left: mission time evaluation (MinMix criterion / MinMax evaluation). Right: resource usage evaluation (MinSum criterion / MinSum evaluation).

3.4.3.3. Fixed global time

Since our mission has a finite environment with a finite number of objects, another meaningful possibility is to make all algorithms share the same deadline: half the time of the globally best known solution. This is a fairer comparison of absolute performance, since in real problems deadlines are generally determined by the problem and not by the algorithm used. As shown by Fig. 3.10, Fifo is no longer the best, since its solution is more costly. Its short-term advantage is negated by the shorter plan of the best solution obtained by the LagoTail algorithm. It would be interesting to pursue this point to determine at which problem sizes or mission durations Fifo is surpassed by the other algorithms. As a final observation, the Gaussian curves in Figs. 3.8, 3.9 and 3.10 appear because the distributions in Figs. 3.6 and 3.7 satisfy the properties from which the central limit theorem follows. In particular, for the two roughly uniform distributions, there are k objects to be uniformly found during the total mission time; the number of objects found hence follows a Poisson distribution with λ = CMM/k.
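The central-limit effect can be illustrated numerically: if each of the k objects is found at a time roughly uniform over the mission, the count found before a fixed deadline is a sum of independent indicators, which produces the bell shapes seen in the following figures. A minimal Monte Carlo sketch (function name and parameter values are illustrative, not the thesis settings):

```python
import random

def objects_found_by_deadline(k=100, mission_time=1000.0, deadline=500.0,
                              runs=1000, seed=1):
    """For each run, count how many of k objects (each found at a time
    uniform over the mission) are discovered before the deadline."""
    rng = random.Random(seed)
    return [sum(1 for _ in range(k) if rng.uniform(0.0, mission_time) <= deadline)
            for _ in range(runs)]

counts = objects_found_by_deadline()
mean = sum(counts) / len(counts)
# With the deadline at half the mission time, the mean is near k/2 = 50 and
# the histogram of `counts` is approximately Gaussian.
```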

Figure 3.8: Objects found by the best CMM and CMS allocations (occurrences out of 1000 runs).


Figure 3.9: The four algorithms with the MinMix descriptor in search of 100 objects, limited to half their CMM time (occurrences out of 1000 runs).

3.5. Discussion

We have tested several auction techniques in a large simulated real-world scenario and in smaller random worlds. We have applied several criteria to optimize the MinMax (mission time) and MinSum (resource usage) costs. These criteria are linear combinations of both costs. Multiple runs have been performed to extract general trends, for several mission kinds like goal visiting and object finding. The principal insight is that the MinMix criterion, which combines time and resource costs equally, is very good in all cases. It is best for the time objective on a majority of occasions, better than any other time criterion for the resource objective, and very close to the MinSum criterion for the resource objective. On the other hand, MinSum is only good for its intended resource objective, being notably bad for mission time. This finding strongly suggests that multi-robot teams should carefully justify the use of MinSum auctions, due to their narrow applicability. Indeed, there are examples in the literature (e.g. the equity factor in [Lemaire 04]) where extra steps are taken in the bid to avoid precisely what we have observed here. Of all tested algorithms, the only one with a patently different performance is the myopic Fifo, which only considers one task ahead. Our experiments confirm early literature results showing that it is best suited for situations where tasks outnumber robots and task arrival is ongoing. For other kinds of problems, like finite maps and task sets, it is not a competitive alternative to long-term planning.


Figure 3.10: The four algorithms with the MinMix descriptor in search of 100 objects, limited to the same fixed mission time for all (occurrences out of 1000 runs).

We have also observed that ongoing auctions after the first full allocation noticeably improve the mission plan. Thus, an initial bound-guaranteed auction (Lago) followed by ongoing auctions of already assigned tasks (SingleExt) would combine all the observed advantages.


Chapter 4

A stochastic solution for improved task allocation

The previous chapter evaluated some basic auction methods which, albeit very popular, have limitations that this chapter first discusses. The chapter is then devoted to an application of simulated annealing to task allocation which, among other things, addresses those issues. This stochastic method presents several advantages in the planning and allocation context: we propose means to easily tailor the optimization objective and, with the use of HTNs, we are able to deal with complex plans, constraints between tasks, and large solution spaces. The approach does not discard viable solutions, hence the optimal one for the model may eventually be found. Large numbers of tasks can be integrated very quickly into the global plan, in contrast to auction methods, which need many auction rounds to do so. Other relevant related topics are discussed, like the feasibility and usefulness of full replication of data in critical missions. We present a working implementation in simulation and indoor experiments, and also an implementation using auction-based techniques for comparison against our proposal.



4.1. Introduction

While auction-based methods have been successfully demonstrated on many occasions, some characteristics of the allocation problem do not fit well in the auction paradigm. One intrinsic problem is that, in many setups, each task requires a round of bidding, which can take noticeable time in real deployments. If several tasks are auctioned simultaneously but independently, conflicts may arise, with overlapping winners or inconsistent plans. To speed up this whole process, quick bid computation functions and awarding algorithms are preferred, which in turn favors the use of computationally cheap heuristics that lead to suboptimal global solutions. In other words, there is a trade-off between allocation speed and quality of solutions.

Improving auction-based methods. This has caused the appearance of more complex auction methods, enhanced e.g. with combinatorial bundling [Zhang 05, Berhault 03], hierarchical planning [Zlot 05, Stone 98], or other information aggregation optimizations [Dias 02]. These enhancements typically take place at the auctioneer or bidder decisional steps, and are basically a centralized method opportunistically applied. Hierarchical Task Networks (HTN) [Erol 94] are also used to expand planning capabilities and to explore alternative allocations. Even with such improvements, auction-based techniques are often limited to possibly suboptimal solutions by the heuristic nature of the algorithms employed or the locality of the information used. Constraints between tasks are also a difficulty for simple auction-based methods: the partial information available at the robots may hinder the building of valid plans. Strategic planning, by its very global nature, is difficult to decentralize, as evidenced by the centralized planners present in multi-robot architectures with advanced planning capabilities [Gancet 05, Brummit 98].

Simulated annealing approach.
In this chapter we propose a solution for simultaneously solving the planning and allocation problems which addresses the shortcomings highlighted above. It allows flexible problem modeling using HTNs and arbitrary task ordering constraints. There are no explicit restrictions on the quality of the solutions found, because the complete solution space is potentially reachable (although in unbounded time). This is achieved by using the simulated annealing optimization technique, used before in robotic path planners [Brummit 98]. It is applicable to heterogeneous robots by using a general MTSP model [Helmberg 99], and to various optimization criteria via the criterion descriptors described in §3.2. Many tasks can be added simultaneously and quickly allocated, since the annealing process is not impaired by communications. All these advantages come at the price of it being an essentially centralized approach. Thus, we cannot claim it to be by itself a stand-alone solution to the task allocation problem. However, we show a decentralized (albeit not distributed)


implementation with consistency mechanisms which guarantees no single point of failure. Furthermore, this decentralization allows the use of this method at both the local robot plan level and the global team level (e.g. in computational-only agents), as demonstrated in the URUS project [Sanfeliu 08]. The simulations presented in §4.6 back these claims and also show that this is a viable solution for real-time use in physical robots. More precisely, we believe this is a complementary solution that can be used alongside any other task allocation mechanism which suffers from the described drawbacks, in order to obtain better solutions, thanks to its global search scope, that might not be found otherwise. To illustrate these points, we have implemented an HTN auction-based method in the line of [Zlot 03]; its details are given in §4.5. Both approaches are compared on problems of several sizes, which highlight the areas in which the stochastic solution is advantageous over the auction-based one.

4.2. Problem statement

The general problem we want to solve is the choice of executable plans in a fully expanded HTN and the allocation of its primitive tasks to robots in such a fashion that the solution is as good as possible according to the minimization criterion, which can be MinMax, MinSum or a combination of both, bearing in mind the following difficulties:

1. The HTN can model an exponential number of executable plans, caused by the use of OR nodes [Zlot 05].

2. The optimal assignment of any of these plans is an instance of the NP-hard MTSP [Gerkey 04a].

3. New tasks can be introduced on-line, causing replanning and reallocation of tasks.

4. The full mission is considered critical; that is, its completion depends on the completion of all tasks, and no task loss should occur.

The particular problem used in simulation to evaluate our solutions is the complete exploration of a parking lot (Fig. 4.2) and an urban area in minimum time by a robotic team, including finding the best entry point into the area to be explored for each of the available robots. Real-life applications include vehicle finding, vehicle cataloging, explosive detection, etc. (Fig. 4.3).

For the problem at hand, we make the following assumptions and simplifications:

We have a floor plan of the area to be explored. This could be obtained from an aerial photograph, a database of critical targets or the national body in charge of buildings and roads.

Chapter 4. A stochastic solution for improved task allocation

Figure 4.1: Task tree for a simplified example with a single robot. On the left there is a parking lot with two rows of cars that can be explored in one pass (segment ST). There are two entry points to choose from, A and B. In the tree, double-line boxes are abstract tasks and single-line boxes are primitive tasks. There are four possible executable plans, caused by the two OR nodes.

Entry points are not arbitrary; we have a predefined set of them to choose from.

The robots are always connected, forming a MANET (mobile ad-hoc network) [Chlamtac 03].

We are not concerned with other robotic problems like localization and navigation. Also, we presume robots are equipped with any sensors necessary for the particular mission at hand.

4.2.1. Hierarchical tasks

HTNs are a powerful and expressive planning tool; their advantages have already been summarized in §2.4.4. Fig. 4.1 shows a schema of the tasks used to solve the exploration problem in our simulations. We have a first set of tasks related to the exploration problem:

Main exploration: This represents the exploration of the entire area. This compound task can be expanded using coverage algorithms [Choset 01] (though for our purposes we have done it by hand) into a set of straight paths [Choset 97, Latimer 02] or, in the case of streets, each street entails a single task. The rationale is that a robot, while traveling down such a straight path,


is capable of detecting with its sensors some relevant features of a nearby vehicle. This has been demonstrated in our laboratory in a license plate identification mission in a parking lot (Fig. 4.3).

Explore Undirected Segment: These are each of the segments resulting from the coverage expansion. Since we do not want to discard any possibilities, we assume that a robot can explore them in either of the two directions. Each of these tasks is expanded as two OR-related Directed Segment tasks. That is, each segment has to be visited once, in either direction.

Explore Directed Segment: This primitive task means that a robot must cover the segment in an explicit direction while sensing its surroundings for relevant information.

We also have tasks whose purpose is to find the best entry point for the robots if the area has several of them, so they can be deployed optimally:

Choose Entry Point: This compound task is a collection of all the allowed entry points. There are as many tasks of this kind as robots in the team, since each one will have to be assigned to a single robot. This task is refined into the following primitive task.

Use Entry Point: This primitive task simply means that the robot has to be deployed at the given entry point.

Each compound Choose Entry Point task is trivially expanded as an OR node with an underlying Use Entry Point for each allowed entry point (see Fig. 4.1). The entry tasks carry a constraint so that they must be scheduled as the first task of any robot. For this reason, since there are as many tasks of this kind as robots, each robot will have exactly one entry point assigned to it, as is to be expected. Given the previous description, entry points can be shared by several robots.
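The OR-expansion described above can be sketched as a small tree structure. The class and function names below are illustrative, not the thesis implementation; the plan count reproduces the figures given later (2 choices per undirected segment):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    name: str
    kind: str = "primitive"              # "primitive", "and", or "or"
    children: List["Task"] = field(default_factory=list)

def undirected_segment(seg_id: str) -> Task:
    """Explore Undirected Segment expands as an OR of its two directions."""
    return Task(f"explore_{seg_id}", "or", [
        Task(f"explore_{seg_id}_forward"),
        Task(f"explore_{seg_id}_backward"),
    ])

def executable_plans(task: Task) -> int:
    """Count executable plans: an OR node picks one branch (choices add);
    an AND node needs all branches (choices multiply)."""
    if task.kind == "primitive":
        return 1
    counts = [executable_plans(c) for c in task.children]
    if task.kind == "or":
        return sum(counts)
    total = 1
    for c in counts:
        total *= c
    return total

parking = Task("main_exploration", "and",
               [undirected_segment(f"lane{i}") for i in range(12)])
# 12 two-way lanes give 2**12 = 4096 executable plans, as in the small scenario.
```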

Figure 4.2: Aerial shot of a parking lot like the ones modelled in simulation.


4.2.2. Cost model

We use the same sequential task execution model as in the previous chapter, in which a robot can only execute one task at a time but can have a list of assigned tasks of arbitrary length. The reduction to the MTSP is as follows: cities represent the end of execution of a task, and arcs represent the cost of executing it. Thus, the cost of traveling from one city to another is the cost of performing a task, coming from the previously finished one or from the starting point. Solving this MTSP in effect gives the solution to the task allocation. To define the problem, the robots have to publish a cost matrix of size m² to the solvers, where m is the number of primitive tasks. This is enough to evaluate any allocation. We use the most general variety of the problem: costs per task and robot [Helmberg 99] (since robots can be heterogeneous or have different sensed data) with asymmetric costs. A classical optimization objective for the MTSP is the MinSum one, in which the sum of costs is minimized; an example domain is transportation (e.g. fuel optimization). The homogeneous instance of the MTSP, where all agents have the same cost matrix, can be solved by transforming it into a single-traveler TSP [Reinelt 94]. However, per [Helmberg 99], the heterogeneous case is not solvable this way and is furthermore a very hard problem. This is the common case in robotics with heterogeneous robots. We have seen in the previous chapter that using MinMax or a mixed optimization is likely to be more interesting in general service robotics, since minimum time is a very desirable objective and the penalty for the sum of costs is quite small. We thus concentrate on MinMax optimization for the evaluation of the methods presented in this chapter. As has already been pointed out, in a pure MinMax optimization the agent plans that are not the worst one can be suboptimal in the MinSum sense, since they are completely "hidden" by the worst case.
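Under this sequential model, evaluating a fixed allocation is straightforward. A hedged sketch, assuming a per-robot `travel(prev, task)` cost function in place of the published cost matrix (function names and the 1-D toy usage are illustrative):

```python
def agent_cost(start, plan, travel):
    """Cost of one robot's sequential plan; travel(a, b) is the cost of
    executing task b coming from a (a previous task or the start pose)."""
    cost, prev = 0.0, start
    for task in plan:
        cost += travel(prev, task)
        prev = task
    return cost

def team_costs(starts, plans, travel):
    """Return the (MinMax, MinSum) objective values of an allocation."""
    per_agent = [agent_cost(s, p, travel) for s, p in zip(starts, plans)]
    return max(per_agent), sum(per_agent)

# Toy 1-D usage: robots at positions 0 and 10, tasks identified by position.
travel = lambda a, b: abs(b - a)
minmax, minsum = team_costs([0.0, 10.0], [[2.0, 4.0], [11.0]], travel)
# minmax is the worst robot's cost; minsum is the total traveled.
```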
Our stochastic proposal makes use of criterion descriptors as well, and in practice we do not use the pure MinMax descriptor (1, 0). Instead we resort to a mixed descriptor like (1, 0.001) (MinTim) that corresponds to a MinMax optimization where lower MinSum costs are favored. This solves the problem of suboptimal non-critical agent plans, since they are now actively taken into account during the

Figure 4.3: A robot equipped with an omnidirectional camera for color detection (center) and a pan-tilt camera for license plate identification (right). Link 4.1: http://robots.unizar.es/html/ficha_resultado.php?id=73


MaxSum: 80$ > 40$
MaxMin: min(100$, 0$ + 40$) > min(100$ + 80$, 0$)

Figure 4.4: The MaxSum criterion would give task C to R1, even if R2 is idle, whereas the MaxMin criterion will award C to R2. Note that these are the equivalents, in terms of reward, of the objectives in terms of cost.

minimization. This way, we achieve a dominating MinMax cost, where the non-worst agent costs are MinSum optimized. It is important to note that many market-based allocation schemes are implicitly implementing a MinSum optimization in terms of utility. This is usually justified by saying that a robot, by trying to maximize its profits, is maximizing the whole team profit [Zlot 03, Zlot 02]. This can be alleviated by introducing a profit discount proportional to time, although that introduces new issues, like the tuning of these parameters in a somewhat abstract "market". By working purely in terms of cost, we avoid clouding the real magnitudes being optimized. In our simulations, the cost of each primitive task is:

Explore Directed Segment: The time required to travel the segment plus the time needed to reach the starting segment pose from the current robot pose.

Use Entry Point: Zero when it is the first task in the agent plan, infinite otherwise.
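The tie-breaking effect of the mixed descriptor can be shown with a two-line cost function. The (ω_MM, ω_MS) weighting follows the descriptors of §3.2; the function and variable names are illustrative:

```python
def descriptor_cost(per_agent_costs, w_minmax, w_minsum):
    """Scalar cost of an allocation under descriptor (w_minmax, w_minsum)."""
    return w_minmax * max(per_agent_costs) + w_minsum * sum(per_agent_costs)

# Two allocations with the same worst-agent cost but different totals:
a = [100.0, 80.0]
b = [100.0, 40.0]
# A pure MinMax descriptor cannot tell them apart...
assert descriptor_cost(a, 1.0, 0.0) == descriptor_cost(b, 1.0, 0.0)
# ...while the mixed MinTim descriptor (1, 0.001) favors b, whose
# non-worst agent plan is cheaper:
assert descriptor_cost(b, 1.0, 0.001) < descriptor_cost(a, 1.0, 0.001)
```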

4.3. Data replication in critical missions

The two algorithms presented later are representatives of two opposite approaches: simulated annealing is basically a centralized algorithm, whereas auctions are prized for their decentralized nature. However, it is clear that, in most typical service or field situations, at least some central authority must exist to allow human monitoring and intervention. Thus, we will generally need a robust, efficiently computable algorithm, but also the possibility of integrating this central authority. This has implications for both approaches.


Purely distributed algorithms like auction-based allocation can incur data loss if the robot owning a task is lost. To avoid this, it has been suggested [Dias 04] that robots should track tasks they have sold, to reclaim them if circumstances deem it necessary. This ultimately means that a chain of ownership is formed back to some initial seller, so a kind of centralization is introduced into an otherwise distributed solution. For small teams like ours, and taking advantage of the broadcast nature of the wireless medium, we believe that total replication of the critical data (like pending tasks and the best solution known) amongst all the connected devices is a solid answer to avoid any data loss (also suggested in [Dias 04]). The goal is to have a copy of each piece of shared information in each node; copies are synchronized during normal operation. In abnormal circumstances, nodes can have old, non-updated data, but this only means that, temporarily, sub-optimal solutions could be computed due to missing updates. There are obvious concerns about bandwidth and latency constraints with this solution. Results in geographic routing [Stojmenovic 04] suggest that efficient broadcast in large networks can be achieved with latency costs O(n log n) and resource costs O(n), where n is the number of nodes. To ensure the integrity of the shared copies, we have implemented a lightweight, decentralized, data-replicating object-oriented database with version control features. Further details are given in the section on implementation architecture and in chapter 7, devoted to our software architecture. This database is obviously not the solution to all problems. In particular, it relies on assumptions about data access patterns: either there are few concurrent write accesses to the same object, or many accesses are distributed across different objects.
Also, access to obsolescent data (which has not been updated within the latency previously indicated) should not compromise the algorithm results. This, however, is precisely the usage pattern of our simulated annealing solution, which makes use of this database. To summarize, these are the perceived advantages and disadvantages of this decentralized replicating database:

Advantages: No dependence on a central authority. Transparency for the data user (i.e. robot code), who simply sees a regular local database. Immediate availability of all required data, which simplifies algorithm development.

Drawbacks: Bandwidth and latency constraints. It would require specialized broadcast techniques for "large" teams (we have not actively researched this point to properly characterize this "large" size).
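The version-guarded, full-replication idea can be sketched as a toy model; this is not the actual Ada implementation described in chapter 7, and all class and method names are invented for illustration:

```python
class ReplicaStore:
    """Toy sketch of full replication: every node keeps a complete copy of
    the shared objects; each write bumps a version and is broadcast, and
    receivers discard updates older than their local copy."""

    def __init__(self):
        self.data = {}                  # key -> (version, value)

    def local_write(self, key, value):
        version = self.data.get(key, (0, None))[0] + 1
        self.data[key] = (version, value)
        return key, version, value      # message to broadcast to peers

    def apply_remote(self, key, version, value):
        if version > self.data.get(key, (0, None))[0]:
            self.data[key] = (version, value)   # newer: accept
        # else: stale update from a lagging node, keep the local copy

# Usage: node A's write propagates to node B; a stale replay is ignored.
node_a, node_b = ReplicaStore(), ReplicaStore()
msg = node_a.local_write("best_solution_cost", 1400)
node_b.apply_remote(*msg)
node_b.apply_remote("best_solution_cost", 0, 9999)  # stale, ignored
```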


4.4. Simulated annealing solution

Our main proposal relies on simulated annealing [Kirkpatrick 83], a probabilistic meta-heuristic commonly used for problems which are intractable by traditional means and which fits well with some combinatorial problems. It has been successfully used to solve traditional NP-hard problems like the TSP [Cerny 85] and also in robotic path planners [Brummit 98]. Simulated annealing is inspired by the metallurgical process of annealing, which involves heating and cooling cycles during which matter tends to settle into low-energy states. In simulation, an initial solution is built by any means, as long as it is valid. From this point onwards, the current solution is replaced by some variation of it, which is accepted with a certain probability p(C_old, C_new, T). This probability depends on the difference in quality of the solutions and on the current temperature T. The acceptance probability typically is

    p = 1                            if C_new < C_old
    p = e^((C_old - C_new) / T)      if C_new >= C_old

albeit there is no hard rule about this. When the temperature reaches zero, only better solutions are accepted (greedy behavior), whereas during hot stages worse solutions can replace the current one. This allows the search to escape from local minima. A fundamental element of the annealing problem is the generation of neighbor solutions. We can leverage this point not only to solve the MTSP problem, but also to explore alternative plan expansions; hence we are simultaneously solving the planning and allocation problems. This is the key idea that leverages the HTN expressive power in this approach. Another advantage of the annealing approach is that we can address the most complex and general m-cost asymmetric MTSP model with arbitrary constraints on task ordering or execution time windows. This is easily handled by avoiding the generation of invalid solutions when feasible or, failing that, by giving infinite cost to any solution violating constraints. The generality of our approach has two principal drawbacks: we lack solution quality bounds, unless the particular problem at hand could provide them; and although the optimal solution is reachable, it can take any amount of time to find it, given the probabilistic nature of the annealing process and the exponential size of the solution space. While the annealing process is centralized in principle, by using the replicating database this can be turned into a partial advantage by running an identical algorithm with different random seeds in every robot (or connected computing device). This way, more available agents mean a larger chance of finding a good solution.
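The acceptance rule above translates directly into code. The loop below is a generic sketch: the cooling constants and the toy one-dimensional usage are illustrative, while the thesis uses HTN-aware mutations and a cyclic exponential schedule:

```python
import math
import random

def accept(cost_old, cost_new, temperature, rng=random):
    """Metropolis-style acceptance: always take improvements; accept a
    worse solution with probability exp((C_old - C_new) / T)."""
    if cost_new < cost_old:
        return True
    if temperature <= 0:
        return False
    return rng.random() < math.exp((cost_old - cost_new) / temperature)

def anneal(initial, cost, mutate, t0=100.0, cooling=0.995, iters=20000, seed=0):
    """Generic annealing loop: exponential cooling, best-so-far tracking."""
    rng = random.Random(seed)
    current = best = initial
    temperature = t0
    for _ in range(iters):
        candidate = mutate(current, rng)
        if accept(cost(current), cost(candidate), temperature, rng):
            current = candidate
        if cost(current) < cost(best):
            best = current
        temperature *= cooling
    return best

# Toy usage: minimize (x - 3)^2 with Gaussian neighbor moves.
best_x = anneal(0.0, lambda x: (x - 3.0) ** 2,
                lambda x, rng: x + rng.gauss(0.0, 1.0))
```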

4.4.1. Execution stage

In order to start the mission, an initial solution has to be computed. Since first solutions could be highly suboptimal (especially if the entry point is badly chosen), one minute of "warm-up" is given in our tests for the computation of the first solution. Robots are then committed to the best solution found and start task execution. However, the annealing algorithm can continue, taking into account the current robot commitments. Executed portions of the plan are "frozen" and only the remaining tasks are used for the generation of new solutions. This way, better plans can be found at any moment. We also benefit from the decreasing size of the remaining problem, which means that the final stages will likely be solved to optimality. Also, any deviation of the sensed environment from the a priori data can be integrated and taken into account. In essence, each robot and its connected resources can continually work on the search for a better solution. It is not critical to communicate complete solutions until a robot is about to finish its current task, so network usage can be minimized by just broadcasting the cost of the best solution known.

4.4.2. Neighborhood generation

In HTN parlance, methods encapsulate knowledge about the specific tasks to be solved. By providing different methods to expand the same abstract task, several alternative plans can be created in the hope of finding efficient solutions. The simulated annealing process includes a step in which a new solution must be derived ("mutated") from the current one. Randomly choosing one of the available methods is a natural option in this context. There are no alternative methods in our simulations, since the coverage is manually given. However, there are other mutators, which are also not problem specific (that is, we can generally use them with any HTN):

Random reinsertion. This movement randomly chooses a task from any agent, removes it from the owner's plan and inserts it again at a random point in the plan of a random agent.

Random swap. Two tasks are chosen and their positions swapped.

Random plan switch. In this movement, an OR branch is chosen and replaced randomly by one of its siblings. Again, this is a natural fit between HTNs and simulated annealing.

Worst-case movements. These apply one of the previous movements, but choosing a task or branch from the most costly agent plan, in a guided attempt to improve the worst case.

4.5. Market-based solution

Heuristics. In this class of movements we have two cheap TSP heuristics, based on greedy insertion [Reinelt 94].

We have found that this mix of purely random and heuristic movements works well. On the one hand, pure random movements allow the exploration of far-apart areas of the solution space. On the other hand, guided and heuristic movements quickly find good solutions in a given local area. With the use of appropriate data structures, random and worst-case movements can be performed with cost O(log max(n, m)), where n is the number of robots and m the number of tasks. Heuristics are computationally more expensive and can cause the search to "sink" to a fixed solution state from a large number of states, so we give them a very small probability of being performed. Our implementation, using the GNAT Ada [AdaCore 04] compiler, explores around 2000-6000 solutions per second on a standard PC, depending on the problem size. As always, these numbers are to be taken with a grain of salt and are subject to Moore's law [Schaller 97]. Another advantage of this approach is that the probabilities of the mutations can be tweaked at will. This allows having a large stock of mutations, maybe some highly specialized, that can be enabled or disabled as the particular problem requires.
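The two generic, problem-independent mutators (random reinsertion and random swap) can be sketched over a plans-per-agent representation; this is an illustrative list-based version, not the O(log max(n, m)) data structures mentioned above:

```python
import random

def random_reinsertion(plans, rng):
    """Remove a random task from a random non-empty plan and reinsert it
    at a random position in the plan of a random agent."""
    plans = [p[:] for p in plans]               # mutate a copy
    src = rng.choice([i for i, p in enumerate(plans) if p])
    task = plans[src].pop(rng.randrange(len(plans[src])))
    dst = rng.randrange(len(plans))
    plans[dst].insert(rng.randint(0, len(plans[dst])), task)
    return plans

def random_swap(plans, rng):
    """Swap the positions of two randomly chosen tasks, in any agents."""
    plans = [p[:] for p in plans]
    slots = [(i, j) for i, p in enumerate(plans) for j in range(len(p))]
    (a, i), (b, j) = rng.sample(slots, 2)
    plans[a][i], plans[b][j] = plans[b][j], plans[a][i]
    return plans
```

Both mutators return a modified copy, leaving the current solution intact, which matches the accept/reject step of the annealing loop.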

4.4.3. On-line replanning

If new tasks appear on-line, that is, during mission execution, they can be quickly planned in several ways. The one we chose is a simple insertion heuristic: each task is tried sequentially in the task lists of the agents, and left in the first valid slot found. Then, the annealing process will optimize this likely bad plan. In practice this is an almost instantaneous operation, of at most O(m_current · m_new) complexity. In contrast, an auction-based solution needs an auction round for each new task, which may take some time to complete.
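The first-valid-slot insertion might look like the following sketch, where `valid(agent, plan)` stands in for whatever constraint check the problem imposes (all names are illustrative):

```python
def insert_new_tasks(plans, new_tasks, valid):
    """Leave each new task in the first valid slot found, scanning agents
    and positions in order; annealing later optimizes this rough plan.
    `valid(agent, plan)` is an assumed constraint-checking predicate."""
    for task in new_tasks:
        placed = False
        for agent, plan in enumerate(plans):
            for slot in range(len(plan) + 1):
                candidate = plan[:slot] + [task] + plan[slot:]
                if valid(agent, candidate):
                    plans[agent] = candidate
                    placed = True
                    break
            if placed:
                break
    return plans

# Toy constraint: no agent may carry more than three tasks.
valid = lambda agent, plan: len(plan) <= 3
plans = insert_new_tasks([["a", "b", "c"], []], ["x"], valid)
# "x" cannot fit in the first (full) plan, so it lands in the second.
```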

4.5. Market-based solution

We use a solution akin to that in [Zlot 03] for comparison with our annealing proposal, but using the bidding rules described in [Tovey 05]. This modification, in practice, turns the marketplace into a heuristic for the MinMax criterion. In our implementation, robots auction subtrees entirely owned by them. Since in this particular problem all OR nodes consist of single primitive tasks (see Fig. 4.1), the auctions will always be for OR nodes and never for primitive tasks. A bidder considers each branch of the OR node and bids using the best one for its current plan. The strategies tried to insert a new task are:

Best insertion point.

Chapter 4. A stochastic solution for improved task allocation

Optimal TSP solution, when the plan has fewer than 20 tasks.

Complete greedy replanning of all owned nodes.

Finally, the Choose Entry Point subtrees are not used in our market simulations, since robots need to know where they are in order to bid properly. Hence, this solution does not address the choosing of entry points.
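The bidder's evaluation of an OR node, bidding with the best branch at its best insertion point, can be sketched as follows. This covers only the first of the strategies listed above, and the 1-D `route_len` cost function is an illustrative stand-in:

```python
def route_len(plan, start=0.0):
    """Toy cost of a 1-D plan: tasks are positions visited in order."""
    total, prev = 0.0, start
    for pos in plan:
        total += abs(pos - prev)
        prev = pos
    return total

def insertion_cost(plan, task, cost_of):
    """Best marginal cost of inserting `task` into `plan`, and its slot."""
    best = None
    for slot in range(len(plan) + 1):
        delta = cost_of(plan[:slot] + [task] + plan[slot:]) - cost_of(plan)
        if best is None or delta < best[0]:
            best = (delta, slot)
    return best

def bid_for_or_node(plan, branches, cost_of):
    """Bid with the cheapest branch of an OR node for the current plan,
    returning (marginal cost, slot, chosen branch)."""
    return min(insertion_cost(plan, b, cost_of) + (b,) for b in branches)
```

For instance, a robot already heading to position 5 can take branch 2 "on the way" at zero marginal cost, so that branch wins its bid over branch 9.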

4.6. Evaluation

This section presents the results obtained by the two proposed algorithms. There are three simulated scenarios whose different complexity translates into solution space size. In every one of them, the tasks for the market-based execution are initially auctioned in random order. After completing the auctioning, full rounds of auctions are run by each robot until no plan improvements occur; in other words, we run the auctions to exhaustion of improvement. In the annealing planner, we simulate the complete running period, with the restriction that a started task cannot be removed from the robot that is executing it. Multiple planners are not run in parallel; a single one is used for the simulations. The cooling schedule is an exponential one, cyclically applied over the complete simulation time. The initial solution is constructed using greedy allocation after a random executable plan is expanded. The first scenario is a small parking lot roughly inspired by the one at our workplace, which has 12 lanes to explore, as seen in Fig. 4.5. This gives 2^12, that is

Figure 4.5: Left: The small parking scenario, with Explore Undirected Segment tasks shown as arrows. Right: The city schema used for the large scenario, with dots signaling entry points. Link 4.2: http://robots.unizar.es/html/ficha_resultado.php?id=45 Link 4.3: http://robots.unizar.es/data/videos/expres/anneal-city43.avi




Figure 4.6: Costs of solutions for the 12-, 32- and 43-lane problems.

4096 possible executable plans. Two entry points are considered for this parking lot, one at the center top and another at the center bottom. The second scenario is a bigger model with the same structure, with a total of 2^32 possible plans, which we consider big enough to be regarded as intractable by full enumeration of solutions. The third scenario (Fig. 4.5) is a model of the streets in a small area of our city, with a total of 2^43 possible plans, and some additional difficulties like dead-ends and a more irregular disposition of the segment tasks. This scenario has seven possible entry points. We shall call these three scenarios small, medium and large. Note that even if the two parking scenarios have apparent symmetries in the disposition of tasks, there are small differences that suppress them. We have done this on purpose to avoid multiplicity of solutions and problem simplification, although in a real implementation we would of course want to leverage such symmetries when possible. Fig. 4.6 shows the minimum, maximum, median and quartile costs over 10 runs of each algorithm with a team of four robots, which is the size of our real robotic team. The SA column gives the cost computed by the annealing planner when it has to choose entry points; the SAFix column gives the cost computed by the annealing planner using from the start the best entry points previously found. The Market column gives the cost computed by the auction-based simulation with the same entry points as SAFix. Finally, the Opt column gives the optimal solution cost, which we know for the parking scenarios by full solution enumeration. These costs do not include the time spent in the start-up stage or the auction rounds; they are the final solutions computed by the algorithms. As we can see, the simulated annealing solutions are consistently better than the auction-based ones, which is to be expected from a centralized, theoretically complete (given infinite running time) algorithm compared to a heuristic one. SA solutions are within 0%-4% of the known optimal solution, whereas the market ones are about 15%-18% off, which is, nonetheless, a good result, especially for problems of this complexity. Note that the small scenario has been solved to optimality in all runs by the annealing algorithm, while only a few times in the medium scenario and probably never in the large one.
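The annealing planner's cyclically applied exponential cooling schedule can be sketched as follows. All constants here are illustrative assumptions, as the exact parameters are not listed; the acceptance rule is the standard Metropolis criterion used by simulated annealing.

```python
import math
import random

def temperature(elapsed, t0=100.0, alpha=0.999, cycle=30.0, steps_per_s=100):
    """Exponential cooling T = t0 * alpha^k, restarted every 'cycle'
    seconds so that periodic reheating lets the search escape local
    minima over a long mission (all constants are illustrative)."""
    k = (elapsed % cycle) * steps_per_s
    return t0 * alpha ** k

def accept(old_cost, new_cost, temp, rng=random):
    """Metropolis rule: always accept improvements; accept a worse
    neighbor with a probability that vanishes as temperature drops."""
    if new_cost <= old_cost:
        return True
    return rng.random() < math.exp((old_cost - new_cost) / temp)
```

A planner loop would call `temperature` with the elapsed mission time, generate a neighbor plan, and keep it whenever `accept` returns true.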




Figure 4.7: Evolution of the best solution found over a complete run of the algorithm in the large parking problem with 4 robots. The vertical line indicates execution stage start at second 60. Better solutions found after this point can still be applied.

Since we do not know the optimal solution of the urban scenario, there are no guarantees on solution quality, but again we see a better performance from the annealing solver. Table 4.1 gives the ratio mean/best for each algorithm and problem size. These ratios also indicate a decrease in the quality of solutions when the problem grows in size for the annealing solver, whereas the gap with the auction-based solution shrinks. Further experimentation should be carried out to see whether this trend reaches a point where greedy heuristics like first-round auctions are as good as our annealing proposal.

Table 4.1: Ratio of mean to best solution.

Scenario   SA     SAFix   Market
Small      1.00   1.00    1.18
Medium     1.04   1.02    1.15
Large      1.05   1.06    1.13

Table 4.2 shows the quality of solutions for the different scenarios when considering different numbers of robots for the SA algorithm. The values are the ratio of mean to best solution cost. Columns show the already noted decrease in quality of solutions as the problem gets larger, whereas rows highlight that adding robots enlarges the solution space, and so the performance degrades accordingly.


Table 4.2: Team size & SA algorithm.

Scenario   2 robots   4 robots   8 robots
Small      1.00       1.00       1.00
Medium     1.00       1.04       1.05
Large      1.03       1.05       1.06

In Fig. 4.7 we can see the progress of the best solution cost obtained by the SA algorithm in a typical run. We observe a quick evolution in the first seconds, which is to be expected since the first plan was chosen at random and can be highly suboptimal. After that, slow improvements are obtained over the whole mission duration, especially when the problem size shrinks in the final period.

4.7. Proofs-of-concept

A series of simulations were performed in order to illustrate some particular advantages of the simulated annealing proposal, and to validate it prior to experimentation on real robots. These simulations used a setup equivalent to the one deployed later on the robotic team, which is presented in §4.8.1. Namely, there are four robots (detailed in §C.1), which in these simulations translate into four independent processes running on different computers, each of them controlling a simulated Pioneer 3 robot within the Player/Stage simulator [Gerkey 03b]. These computers use the replicated database described in §4.3. One such simulation, showing a small parking lot exploration, is linked from Fig. 4.5. Two kinds of tasks were used in these simulations: goal visiting (represented by circles), in which a single spot has to be visited by a robot, and segment traversal (represented by arrows), in which an undirected segment such as the ones already seen in the evaluation section has to be traversed in either direction. The arrow shows the traversal orientation chosen in the current plan, but does not mean that the traversal must necessarily happen in that direction. The background color identifies the robot which found the best allocation, which is propagated to all the robots via the replicated database. The same colors are used to represent each robot's plan. The simulations can be downloaded from the links shown in Fig. 4.8.

4.7.1. On-line addition of tasks

The purpose of this simulation was to demonstrate that new tasks created during the ongoing mission could be quickly and satisfactorily integrated into the robot plans, just as in any auction algorithm (Link 4.4).


Figure 4.8: Simulations of situations with the annealing algorithm. The global plan is shown in the left pane, while the simulator view is shown in the right one. Link 4.4: http://robots.unizar.es/data/videos/expres/anneal-dynamic.avi Link 4.5: http://robots.unizar.es/data/videos/expres/anneal-lost.mpg Link 4.6: http://robots.unizar.es/data/videos/expres/anneal-changing.avi

4.7.2. Robot failure

In this simulation, a robot (displayed in red) was forcefully taken down in order to simulate a malfunction. The purpose was to observe the detection of this failure and the subsequent replanning of the failed robot's tasks. While auctions can take a while to reallocate a large number of tasks, the annealing algorithm does this very quickly by inserting them at random and leaving the plan optimization to the annealing process (Link 4.5).
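The recovery step described above can be sketched as follows. This is an illustrative sketch with assumed names, not the thesis code: the orphaned tasks are dropped at random positions into the surviving robots' plans, and the ever-running annealing is left to optimize the result.

```python
import random

def reinsert_orphans(plans, failed_robot, rng=random):
    """Move the failed robot's pending tasks to random positions in the
    surviving robots' plans; the annealing process will then optimize
    the resulting allocation. 'plans' maps robot id -> list of tasks."""
    orphans = plans.pop(failed_robot)
    survivors = list(plans)
    for task in orphans:
        r = rng.choice(survivors)
        plans[r].insert(rng.randint(0, len(plans[r])), task)
    return plans
```

Because the insertion positions are arbitrary, the immediate post-failure plan may be poor, but it is available instantly and improves as annealing continues.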

4.7.3. On-line change of criterion descriptor

This simulation illustrates another variant of quick replanning. In this case, an initial MinMax descriptor is later changed to a MinSum one. Using auctions, this would involve auctioning all the tasks again in order to obtain updated bids for them. With the simulated annealing solution, the change of descriptor immediately translates into new cost evaluations for the candidate plans. As shown in the video, a solution using the new descriptor becomes immediately available, and is subsequently updated with improved ones as the annealing continues (Link 4.6).


Figure 4.9: Two robots testing the simulated annealing algorithm Link 4.7: http://robots.unizar.es/data/videos/expres/anneal-demo.mpg

4.8. Deployment in real robots

4.8.1. EXPRES project

Our simulated annealing algorithm has been tested in the framework of two projects. The first one is the EXPRES project (Técnicas de EXPloración avanzadas en tareas de REScate, DPI2003-07986), which is framed around Urban Search And Rescue (USAR) principles and aims to provide robotic help in distress situations where human intervention is dangerous. The objective was to build a cooperative robot team which can perform complex tasks in civil aid operations with a high degree of autonomy. Examples of missions studied in the EXPRES project are victim location in road tunnel accidents (with emphasis on maintaining a completely connected mobile MANET) and the one examined in this chapter: complete exploration of a parking lot or, more generally, an urban area for quick location of interesting vehicles, objects or people. One motivation for such a task is that it is a common tactic of some terrorist groups in our country to advertise the planting of a car bomb at some location, often supplying critical data about the vehicle such as the license plate or the model. The rationale seems to be that this modus operandi creates distress amongst the civil population, while only police lives are put at risk in the deactivation mission. We have implemented the algorithm for use with our team of Pioneer 3-AT robots (described in §C.1). A proof-of-concept experiment using two robots was carried out in a simple vehicle-routing mission: alerts were triggered at random intervals of time at random spots of the indoor experimental area, which one robot had to visit. During the mission, both robots were running the same code and performing the simulated annealing optimization in parallel, sharing results via the replicated database. This is reflected in the background color of the plan, which changes according to the robot that found it. Fig. 4.9 shows two snapshots of the experiment, which demonstrated that the simulated annealing approach is a viable solution for real-time use with our hardware.



Figure 4.10: The Tibi and Dabo Segway-based robots in the Campus Nord of the Universitat Politècnica de Catalunya.

4.8.2. URUS project

The URUS project (Ubiquitous Networking Robotics in Urban Settings, EC IST-1-045062-URUS-STP) is the second one in which this method has been demonstrated. This project studies the idea of deploying a network of agents (robots, intelligent sensors, devices, etc.) in order to improve the quality of life in urban areas. Four such robots (see §C.2) were fitted with this algorithm: two Segway platforms, used for goods-delivery missions (Fig. 4.10); one instrumented golf cart, used as an autonomous taxi for transportation of people; and a four-wheeled differential-drive low-profile robot, used for guiding people in cooperation with tracking cameras at fixed environment locations. More details about the software architecture used to deploy our algorithms are provided in §7.5. In this project, the simulated annealing planner was used as an opportunistic optimizer, always running in the background, whereas the initial allocation of mission requests was done by a first-round single-item auction algorithm. The final experiments focused on the proper integration of all the software modules by the different partners and on an effective demonstration of the heterogeneous robots working in the experimental area. Because of this, and the limited duration of such experiments, we could not gather enough data samples to verify the results described in §4.6. We could, however, demonstrate that the algorithms can be used in real time on a variety of robots besides ours. Furthermore, these robots were running a full assortment of algorithms for localization, perception, navigation and, in general, all capabilities required of a realistic service robot, making CPU sharing a non-negligible issue.



4.9. Discussion

This chapter started by discussing some of the limitations of auction-based allocation, and continued with a proposal addressing them. This simulated annealing algorithm simultaneously tackles the planning and task allocation problems. It performs well in the examples tried, often finding the optimal solution for small- and medium-sized problems. In larger problems, it still outperforms the auction-based solution for the examples studied. The use of a replicated implementation, as demonstrated in simulation and experiments, proves that it is a valid method for real service teams. Since it also has its drawbacks, namely the need for as much global information as possible, we think it is best suited to be used as demonstrated during the URUS project: as a companion for opportunistic optimization alongside another, naturally decentralized method like auctions. When compared to such auction methods, we have seen that this proposal addresses their shortcomings quite naturally. The fast solution exploration is well suited for the integration of large quantities of tasks in a short timespan. Furthermore, the global scope of the annealing method, which is not impaired by the need for quick bid computation, is able to explore a large number of candidates from the solution space, even finding the optimal solution in respectably sized problems. The use of HTNs offers good planning and allocation opportunities by taking advantage of OR nodes in the neighbor generation function and by accounting for constraints in the solution cost evaluation. Both solutions, annealing and auction-based, require a network spanning the robot team. For the first, it is mandatory for information propagation (cost matrices and solution assignments); for the second, to allow auctioneers and bidders to communicate. Until this point, we have assumed that the network is a given, but this is not always the case, as we advanced in the introduction.
From this point onwards, this thesis focuses on task allocation methods that take networking constraints into account. We will see that some results can be reused, while new algorithms specifically tailored to this problem will also be presented.


Chapter 5

Reactive allocation in constrained networking environments

A typical assumption made in prior work on multi-robot teams is that robots are able to communicate uninterruptedly at all times, independently of their locations. In this chapter, we investigate the multi-robot task-allocation problem under communication constraints, reflecting the fact that real mobile robots have a limited communication range, together with the requirement that connectivity must remain intact (even through relaying) during the entire mission. A principal factor impacting link quality is robot routing and, more precisely, the relative positions between robots. This chapter presents a reactive task-allocation framework that works on top of a low-level navigation module. The latter has ultimate control over robot movements, and is tasked with preserving a connected network at all times. This navigation module restricts robot movements when doing so is mandatory to avoid a network split, and can thus impair task execution. The allocation framework presented herein takes advantage of the network and navigation configuration in order to perform assignments that are compatible with both the mission tasks and the navigation configuration, ensuring steady mission progress. Furthermore, this algorithm includes a configurable step in which traditional allocation algorithms can be reused.


5.1. Introduction

Teams of robots can be advantageous over single robots: they offer greater flexibility through dynamic team coordination and reorganization, greater efficiency through parallel task execution, and greater reliability through resource redundancy. Recent research progress in robotics has allowed the use of robot teams in various real-world applications, such as search-and-rescue missions, area exploration, and field demining. Robots have to move to complete their tasks while maintaining communication among themselves and with their human operators, in many cases without the aid of communication infrastructure (hostile environments, emergencies and rescues, disaster recovery, special events, battlefields, space and marine applications, and so on). Guaranteeing connectivity enables robots to explicitly exchange information needed in collaborative task execution, and allows operators to monitor or manually control any robot at all times. Interruptions of connectivity may imply impaired team performance, loss of robots and, ultimately, mission failure. Network paths can be multi-hop, so as not to unnecessarily restrict the team range. This fundamental issue is now an emerging field for robotic missions in real scenarios; a recent example is the DARPA Landroids initiative to autonomously cover areas with Wi-Fi [Mandelbaum 07]. An integral part of mobile robot missions is the visiting of places, when the tasks comprising the mission are spatially distributed over a geographical area. The problem of routing robots over available paths between target locations (corresponding to specific tasks) is known as multi-robot routing.
The objective is to find a route for each robot, so that each target location is visited exactly once by exactly one robot (no waste of resources), all target locations are eventually visited by some robot (mission completeness), and the entire routing mission is accomplished in the best possible way (optimization of performance). More generally, in a task allocation context, the cost of a task comprises the cost of reaching the task location plus any task-specific cost. Early research on multi-robot routing and allocation, implicitly or explicitly, commonly made the assumption that the robots of the team are able to communicate at all times, independently of their present locations. This assumption holds when the mission takes place over a small geographical area, for example inside a building. However, many interesting real-world missions extend over large geographical areas, where network connectivity cannot be taken for granted. Mobile robots typically use a wireless connection to communicate with the other team members; it is, therefore, natural to assume that the communication range of each robot extends over an approximately circular area around its current location up to a certain radius, and that any communication outside this area is not possible. Maintaining full connectivity between team members does not necessarily imply that each robot communicates directly with all other robots, but rather that any robot can reach any other robot either directly or by relaying messages through other robots. Therefore, it is required at all times that, at least, a


spanning tree of valid network links spans all robots. In this chapter, we take limited-communication constraints explicitly into consideration during the task allocation process. Such an approach is deemed necessary because each target may not be reachable independently by a single robot without the support of other robots acting as relays; in effect, each distant target is eventually served by a group of robots. The task allocation framework is part of a three-component approach to the subject, each component tasked with a critical part of a multi-robot team. First, a network subsystem is in charge of hard real-time communications under changing topologies, enabling precise real-time cooperative control of robots, and providing link-quality measurements between node pairs. Second, a cooperative navigation subsystem generates coordinated speed commands in order to move the robots. The navigation takes into account goals, obstacles and network integrity by means of virtual spring-dampers that link robots. Finally, our task allocation module takes advantage of the network and navigation topology in order to continually choose goals for each robot which ensure autonomous mission completion. This chapter focuses on the task allocation subsystem, although a brief introduction to the communications and navigation subsystems is necessary to properly understand the complete system. Full details on these modules can be found in their respective appendices. The following section presents a brief review of related work on the techniques involved in this chapter. In §5.3 a system overview is outlined, where the whole picture and the interactions between subsystems are explained. §5.4 presents the task allocation algorithm. Finally, simulations and real experiments are discussed.

5.2. Related work

Limited communications introduce new complexity to the already NP-hard [Gerkey 04a] task allocation problem, although they bring the multi-robot routing problem closer to reality. Basic approaches opportunistically take advantage of network connectivity when available [Burgard 05], but are not specifically intended to avoid network splits. One possibility is to include link quality in the motion [Mostofi 08] or goal generation functions [Rooker 07]. For instance, in exploration, goals may be decided as the result of cost functions that depend on signal quality [Rooker 06]. This is difficult to carry over to more flexible service missions, since tasks are conditioned by external requirements (e.g. visiting an injured person and, in general, visiting arbitrary goals), and may be totally unrelated to those generated by the team in order to preserve communication. These approaches are, furthermore, prone to local-minima problems. Approaches where task generation is not controlled and connectivity is an explicit requirement are scarce. In the past a behavioral approach has been proposed in which connectivity maintenance is addressed, but not guaranteed [Wagner 04]. Other approaches do not measure the actual signal, but rely on models like a circular radius [Vazquez 04] or constant decay. There is growing evidence [Hsieh 08, Quin 03] suggesting that such models can badly misrepresent the behavior of real signals [Sridhara 07], leading to temporary connectivity losses. Even building a multi-hop chain to reach a single target is challenging, involving quality and resource compromises [Burdakov 10, Stump 08]. A recent auction-based proposal appears in [Viguria 08], where robots can subcontract the placement of relays in order to increase their network range. While an explicit mechanism is given for circular deadlock prevention, complex interactions that can lead to resource starvation are not discussed. The allocation method presented in this chapter couples two advantageous aspects: reactive allocation based on observed signal conditions, considered the critical constraint, and allocation of externally given tasks, suitable for a general-purpose team not tied to a particular problem. Additionally, as part of our algorithm, we use switchable strategies derived from unconstrained approaches such as the Hungarian method [Kuhn 55], auctions [Mosteo 07a] and traveling salesman heuristics [Reinelt 94]. Most of the related work cited above presents results based on simulation with a fixed-radius signal model. Our proposal, although also evaluated primarily by means of simulations, has been tested in real experiments as well, as described in §5.6.5.

5.3. System overview

In this chapter we use the same cost model presented in previous chapters and detailed in appendix A. To summarize, our system consists of a set of robots equipped with wireless interfaces. Tasks are modeled as goals to be visited by one robot. The mission is to visit all goals with any robot, minimizing one of the team objectives, while the network remains permanently connected. The main assumption driving the system is that maintaining network integrity is a primary constraint, never to be violated. Network integrity is defined as the existence of, at least, a spanning tree of links of sufficient quality connecting all nodes, that is, robots and base. Since the underlying network protocol supports changing topologies and multi-hop routes, the existence of such a spanning tree in effect guarantees that any node can communicate with any other node. Thus, a measure of the communication link quality is needed; based on this measure, the robot movements are restricted if necessary. This approach relies on a modular solution (Fig. 5.1), replicated in each robot:

Cooperative Navigation Module (CNM): generates velocity commands for the robots, based on link qualities and robot goals. It controls the motions to ensure that no robot becomes disconnected.

COmmunication Module (COM): provides multi-hop, real-time communication among robots, and also measures the communication link qualities among robots.

Task Allocation Module (TAM): assigns tasks to robots in such a way that ongoing mission progress is achieved while honoring network constraints.

Figure 5.1: Modules and information flows.

The information flows between modules are depicted in Fig. 5.1. The CNM is responsible for robot motion and, principally, for preventing connectivity losses. It keeps the network connected using a coordinated motion strategy for all the robots. The solution is based on virtual Spring-Damper Systems (SDSs) among robots that change dynamically according to the quality of the communication network links. A set of virtual pulling forces, produced by the current goals, by the SDSs as a function of the link quality between nodes, and by the environment, acts on the system in a coordinated manner, controlling its motion. In total, there are three possible forces acting upon a robot: a pulling force towards its assigned goal, a pulling force resulting from any SDSs necessary to maintain network integrity, and a force pushing the robot away from nearby obstacles. The resulting aggregated force is translated into velocity commands in an overdamped, smooth behavior. Full details of the navigation subsystem are described in appendix B.1. Note that, by definition of the constants involved, the network integrity forces dominate the system if necessary, and take precedence over goal reaching. This has implications for task allocation which will be discussed later. The COM fulfills the role of providing information transport. This module provides real-time traffic support. It also supports frequent topology changes while offering multi-hop capability. Finally, it provides both point-to-point delivery and efficient broadcast to network users. The network layer, however, is not location-aware, and the movement of the robots could cause connectivity losses due to the


limited range of the wireless devices. This is a typical characteristic even in relatively small environments. Indeed, the 802.11b range used in the experiments is usually considered to be about 150 m outdoors, although that distance can only be covered at 1 or 2 Mbps; at the maximum data rate, the distance is 45 meters at most [Kapp 02]. Consequently, if an application requires a certain bandwidth, this fact must be taken into account. The link quality among nodes is continuously monitored by the COM module, which provides the CNM with a link-quality matrix. Full details of the COM used in this work are described in appendix B.2. We assume in this work that the robot team moves in workspaces in which no sudden drops or changes in signal strength occur. In open and uncluttered environments, such as those selected for the experiments, the signal strength changes smoothly, and so the proposed techniques are able to enforce connectivity. In cluttered and confined environments (e.g. tunnels), complex propagation patterns occur and serious signal fading can appear; additional strategies would be needed to move the robots so as to recover good signal quality. The TAM module assigns tasks to the robots so that they visit all goals as efficiently as possible. Since the CNM module never generates movements that would cause a network split, regardless of the given robot goals (e.g. goals too far apart), traditional allocation techniques cannot be readily applied. Hence, the TAM algorithm takes the CNM state into account to ensure compatible task allocations that will eventually lead to mission completion, as detailed in §5.4. The COM, besides providing the usual networking services, acts as a data source, supplying the navigation module with link qualities. The CNM additionally obtains the local robot pose and propagates it using the communication broadcast service.
The spring-damper systems and poses are supplied by the navigation module to the TAM, which computes goals that are sent back to the CNM. The modules are identical in all the robots; in this sense the system is decentralized, although the efficient COM broadcast (O(4n), with n the number of nodes) propagates the same data to all nodes.
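The network-integrity condition, a spanning tree of sufficient-quality links covering all nodes, can be tested directly on the link-quality matrix with a simple graph traversal. The sketch below is illustrative (symmetric matrix and threshold value are assumptions), not the COM implementation:

```python
from collections import deque

def is_connected(quality, threshold):
    """True if links with quality >= threshold form a connected graph,
    i.e. a spanning tree over all nodes exists and every robot can
    reach every other one, possibly relaying through teammates."""
    n = len(quality)
    if n == 0:
        return True
    seen = {0}
    frontier = deque([0])
    while frontier:  # breadth-first search from node 0
        i = frontier.popleft()
        for j in range(n):
            if j not in seen and quality[i][j] >= threshold:
                seen.add(j)
                frontier.append(j)
    return len(seen) == n
```

With the chain topology of a relayed team, the check passes as long as every consecutive link stays above the threshold, even when the endpoints cannot hear each other directly.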

5.3.1. Outline of operation

Robots forming the MANET are allowed to move only in ways that do not break network connectivity. This is achieved by continuously monitoring all robot-to-robot signal qualities (done as part of the real-time RT-WMP protocol [Tardioli 07] used by the COM) and building a virtual route along the spanning tree of maximum quality. The objective is to enable communication between any two robots at all times through this spanning tree. Motion is governed by the simulation of a physical spring-damper mesh model (Fig. 5.2), continually computed by the CNM. Any link in the network spanning tree that falls below a safe quality level (Fig. 5.3) becomes a virtual Spring-Damper System (SDS) that exerts an attracting force between the robots that are about to cause a network split. As long as the link is above the safety level, no SDS and, thus,



no attracting force between the two robots appears. Robots move in reaction to the sum of forces exerted by SDSs upon them. These may appear because of insufficient link quality, but also because of goals, which act with a fixed attracting force, and because of obstacles, which exert a repelling force. These forces are translated into velocities that match the real capabilities of the robots (including non-holonomicity). This model enforces (a) smooth, jerk-free robot motion, (b) MANET connectivity maintenance, as there is always a spanning tree covering the entire network, and (c) maximal freedom of movement, as the spanning tree contains the minimum number of links (and, ultimately, SDSs) required to maintain a connected graph.

Figure 5.2: An example of a MANET with virtual springs over links of low quality pertaining to the maximal-quality spanning tree. Note that any single robot is not necessarily within the communication range of all the other robots. The robot clusters for task allocation derived from the connected components over springs are {R1, R4}, {R2, R3, R5}, {R6}.
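The composition of forces described above can be sketched in 2-D as follows. Gains and the safety threshold are illustrative assumptions, not the CNM constants, and obstacle repulsion is omitted for brevity:

```python
import math

def spring_force(pos, teammate, quality, safety=0.5, k=2.0):
    """Attraction toward a teammate, active only while the link
    quality is below the safety threshold (virtual SDS); the pull
    grows as the quality degrades. Gains are illustrative."""
    if quality >= safety:
        return (0.0, 0.0)
    gain = k * (safety - quality)
    return (gain * (teammate[0] - pos[0]), gain * (teammate[1] - pos[1]))

def goal_force(pos, goal, magnitude=1.0):
    """Constant-magnitude pull toward the assigned goal."""
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    d = math.hypot(dx, dy)
    if d < 1e-9:
        return (0.0, 0.0)
    return (magnitude * dx / d, magnitude * dy / d)

def total_force(pos, goal, neighbors):
    """Sum of goal attraction and SDS attractions. 'neighbors' is a
    list of (position, link_quality) pairs on the spanning tree."""
    fx, fy = goal_force(pos, goal)
    for npos, q in neighbors:
        sx, sy = spring_force(pos, npos, q)
        fx, fy = fx + sx, fy + sy
    return (fx, fy)
```

In the real system the aggregated force would then be converted, through the overdamped spring-damper dynamics, into velocity commands compatible with each platform.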

5.4. Task allocation module

This section details our reactive task allocation algorithm, implemented as a software component responsible for allocating pending tasks to robots. The inputs available to the TAM are the poses of all robots and the current SDS set in use by the CNM (see figure 5.2). In turn, the TAM outputs a set of robot-goal pairings to be used by the CNM as attractors. Note that the TAM module has no influence over which SDSs exist. Instead, it reacts to SDS changes by generating compatible allocations. In the following discussion we will use the terms task/goal and cost/distance interchangeably.

Figure 5.3: Typical degradation of the radio signal quality as a function of the distance between transmitter and receiver. When quality falls below a safety threshold (st), a spring is activated to prevent network disconnection.

Chapter 5. Reactive allocation in constrained networking environments

5.4.1. Context

The use of SDSs introduces constraints that traditional allocation methods do not face. A key issue is that of deadlock states, in which no goal is ever reached due to SDS forces and faulty allocation. Deadlocks must be avoided because they can render the entire team useless (total deadlock involving all robots) or highly degrade its performance (partial deadlock involving some robots).

We can distinguish between two classes of deadlocks. On the one hand, equilibrium deadlocks happen when a standstill is reached between SDS forces and goal attractors (Fig. 5.4, R1 and R2). In this case, no robots are moving and no goal can be reached. On the other hand, dynamic deadlocks (or livelocks) appear when robots are moving, yet indefinitely fail to complete any task (Fig. 5.4, R3). There are at least two approaches for solving this issue. The on-line approach would try to detect a deadlock once it occurs and adopt measures to escape from it. The off-line approach, which is used in this work, implements an allocation method that is, by design, deadlock-free.

Another point of concern, when there is a static base, is that maximum reachability is achieved with chain configurations (Fig. 5.4, R4 and R5). Inducing these configurations ensures the completion of tasks that are within team range.

5.4.2. Definitions

In order to explain the TAM, some concepts are defined:

• A cluster is each set of robots that are connected, at any hop distance, by SDSs as currently reported by the CNM. Robots without SDSs attached form a trivial mono-robot cluster.

• A supercluster (S-cluster henceforth) is a set of clusters which are aggregated by the TAM algorithm, as we will see. A trivial S-cluster is formed by a single cluster.

• An intertask timespan is the time elapsed between the consecutive removal of two tasks (either by completion or ultimate abandonment if unreachable).
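The cluster definition above amounts to computing connected components of the robot set over the current SDS links. A minimal sketch using union-find, exercised on the example of Fig. 5.2:

```python
def clusters(robots, sds_links):
    """Connected components of the robot set over the current SDS
    links; robots without any SDS form trivial mono-robot clusters."""
    parent = {r: r for r in robots}

    def find(x):                        # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in sds_links:
        parent[find(a)] = find(b)       # merge the two components

    groups = {}
    for r in robots:
        groups.setdefault(find(r), set()).add(r)
    return list(groups.values())
```

With the links of Fig. 5.2 (R1-R4, R2-R3, R3-R5) this yields the clusters {R1, R4}, {R2, R3, R5} and {R6} quoted in the caption.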

5.4.3. The allocation algorithm

We now present the complete TAM subsystem (see algorithm 1). The intuition behind this algorithm is that robots linked by SDSs (clusters) have the same goal in order to avoid competing forces and thus equilibrium deadlocks. However, clusters change often, growing and shrinking as a result of robot movements and link quality changes. Changing the allocation of tasks to clusters could cause cyclic behaviors (e.g. Fig. 5.4, R3). In order to prevent this, monotonically growing sets of clusters (S-clusters), which also share the same goal, are maintained within an intertask timespan: this ensures the absence of cyclic cluster formations and thus of dynamic deadlocks.

Figure 5.4: Several task allocation related situations:
• R1 and R2 are deadlocked by forces in equilibrium.
• Let us suppose a faulty allocation policy, with robots abandoning their goal if a SDS is attached to them. R3 could attempt to reach G3, but once its SDS appears, the goal would be abandoned and R3 would move to R′3 due to the SDS pulling force. At R′3, the SDS is not needed anymore and disappears, so G3 could be attempted again by R3. This cycle could repeat indefinitely.
• Robots linked by SDSs in chain form (R4 and R5) have maximum reachability.
• The resulting force of goal assignation and SDSs causes R7 and R8 to move to R′7 and R′8. At that point, one of the two is able to move forward using the other as a relay: a chain will have been formed.
Link 5.1: http://robots.unizar.es/data/videos/urus/ctree-react-dlock.mp4
Link 5.2: http://robots.unizar.es/data/videos/urus/ctree-react-oscil.mp4
Link 5.3: http://robots.unizar.es/data/videos/urus/ctree-react-melee.mp4
Link 5.4: http://robots.unizar.es/data/videos/urus/ctree-react-nidle.mp4

Algorithm 1 TAM algorithm. Inputs: PendingTasks
 1: Plan ← ∅, StTAM ← StCNM
 2: while PendingTasks ≠ ∅ do
 3:   if GoalsReached(Plan) ≠ ∅ then
 4:     PendingTasks ← PendingTasks \ GoalsReached(Plan)
 5:     StTAM ← StCNM
 6:     Plan ← ∅
 7:   else if UnreachableTask(Plan) ≠ ∅ then
 8:     PendingTasks ← PendingTasks \ UnreachableTask(Plan)
 9:     StTAM ← StCNM
10:     Plan ← ∅
11:   else
12:     StTAM(t) ← StTAM(t−1) ∪ StCNM(t)
13:     if StTAM(t) ≠ StTAM(t−1) then
14:       Plan ← ∅
15:     end if
16:   end if
17:   if Plan = ∅ then
18:     Plan ← Allocate(PendingTasks, StTAM)
19:     Plan ← GreedyFill(PendingTasks, Plan, StTAM)
20:   end if
21:   SendGoalsToCNM(Plan)
22: end while

Variables

Plan is the robot-goal allocation. PendingTasks is the set of unvisited goals. StCNM is the SDS set in use by the CNM at that moment. StTAM is a SDS set local to the TAM that is used for cluster aggregation into S-clusters. Clusters, being directly derived from the StCNM set, are each maximal SDS-connected subtree. S-clusters are conversely derived from the StTAM set.

Initialization and main loop (lines 1–2)

The one-time initialization prepares an empty plan and a StTAM set of SDSs matching the one reported by the navigation module. Then the main loop starts and only exits once all goals have been visited or discarded.

Goal completion (lines 3–6)

When a goal is reached, the pending tasks are updated, the SDS set of the TAM is reset, and the plan is voided. This is done in order to build a fresh plan with the new S-cluster configuration. Note that goal removal (either visited or abandoned) marks the end of the current intertask timespan, and is the only instance in which S-clusters are reset.

Discarding of unreachable goals (lines 7–10)

The same actions as in goal completion must be performed when a goal is considered unreachable. This can be detected because the necessary conditions are that a single S-cluster comprising all the robots exists, that the network topology is a chain, and that all SDSs are at maximum elongation.

S-cluster maintenance (lines 11–15)

If no goal has been reached, the StTAM is updated by merging any new links appearing in StCNM. That is, StTAM stores growing SDS sets that, in the worst case, will span the full network as a single connected graph. By the StTAM updating (line 12), S-clusters are always a non-strict superset of clusters. From this it follows that a robot belongs to a single cluster and S-cluster, and that a cluster is always fully contained in some unique S-cluster. Note that SDSs memorized in the TAM have no dynamic effect for the CNM, which moves robots according to its instantaneous SDS set StCNM.

Replanning (lines 17–20)

When the plan is void (either at the start or because of goal completion or StTAM growth), a new one is computed by the Allocate subprogram. This is a swappable piece of code (conferring useful flexibility to the TAM) that must conform only to the following postcondition: robots in a same S-cluster share the same goal. Otherwise, there are no limitations on this particular algorithm. Details about the specific allocation strategies we have implemented and tested are described in §5.6. Chains (Fig. 5.4, R4 and R5) are naturally formed as a result of using the same goal for all robots within a cluster (Fig. 5.4, R6, R7 and R8). GreedyFill ensures that no robots are idle when there are fewer pending tasks than S-clusters: any S-cluster that has no goal after Allocate is assigned its closest goal in this subroutine. Thus, these properties hold for each plan sent to the CNM:

1. Robots in the same S-cluster have the same goal.
2. Robots are never idle (without a goal).
We reset the S-cluster set (lines 5, 9) on goal completion because otherwise, once a S-cluster had covered the full team, it would remain so forever. This, coupled with the S-cluster goal sharing, would mean that only one task at a time would be executed. Resetting the S-clusters on task completion allows a new reallocation to be started with one S-cluster per cluster, and more tasks to be executed in parallel.

Goal transmission (line 21)

Finally, the goals are sent to the CNM for use in its subsequent movement generation.
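The loop body of algorithm 1 can be sketched as a pure function over the TAM state. Here `allocate` and `greedy_fill` stand for the swappable subprograms, and the reached/unreachable sets are assumed to be supplied by the caller; this is an illustration of the control flow, not the thesis code:

```python
def tam_step(pending, plan, st_tam, st_cnm, reached, unreachable,
             allocate, greedy_fill):
    """One iteration of the TAM loop; returns the updated
    (pending, plan, st_tam) triple."""
    if reached:                             # lines 3-6: goal completion
        pending = pending - reached
        st_tam, plan = set(st_cnm), {}
    elif unreachable:                       # lines 7-10: discard goal
        pending = pending - unreachable
        st_tam, plan = set(st_cnm), {}
    else:                                   # lines 11-15: S-cluster growth
        merged = st_tam | set(st_cnm)       # S-clusters only ever grow
        if merged != st_tam:
            plan = {}                       # new link: void the plan
        st_tam = merged
    if not plan and pending:                # lines 17-20: replanning
        plan = allocate(pending, st_tam)
        plan = greedy_fill(pending, plan, st_tam)
    return pending, plan, st_tam            # line 21: plan goes to the CNM
```

Note how the S-cluster set `st_tam` is only ever reset when a task is removed, matching the intertask-timespan argument above.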


5.4.4. Boundedness of intertask timespan

In order to show that the TAM algorithm achieves mission completion, we show that, from any configuration, at least one goal is always eventually reached or permanently discarded; i.e., the intertask timespan is always bounded. Since one task is removed after each intertask timespan, a bounded intertask timespan implies a bounded mission timespan.

1. Equilibrium deadlocks cannot occur because of the use of a same goal within each cluster (Allocate postcondition).

2. Dynamic deadlocks cannot occur because of the use of a same goal within each S-cluster (Allocate postcondition) and the monotonically growing size of S-clusters within an intertask timespan (algorithm line 11), which prevents the repetition of S-cluster configurations.

3. All robots are moving towards some goal (GreedyFill postcondition). Two cases are then possible:

a) All robots and the base belong to a same S-cluster. Owing to SDS forces (Fig. 5.4, R6, R7 and R8), eventually they will form a chain if necessary. This chain has maximum reachability; either the single goal will be reached (algorithm line 3) or it will be eventually discarded (algorithm line 7). In either case, the intertask timespan ends and we are done. 

b) There are two S-clusters or more. Only one S-cluster can include the static base (no node can belong to two S-clusters). Thus, all but this S-cluster are free to move unconstrainedly away from the base at this time in pursuit of their goals. At this point only two events can develop:

1) An S-cluster reaches its goal without growing (i.e. without new network constraints appearing), and we are done. 

2) Two S-clusters are sufficiently far apart to force a new SDS to appear. These S-clusters are merged, becoming a new larger S-cluster. At this moment we are back to point 3, but note that the monotonically growing size of S-clusters within an intertask timespan makes this kind of occurrence bounded by the number of robots. 
Intuitively, either some S-cluster reaches its goal, or eventually an S-cluster that comprises all robots plus the base is formed. This final S-cluster, which in the worst case is a chain, is either able to reach its goal or to diagnose an unreachable goal to be discarded. In summary, our task allocation strategy is characterized by the following properties: (a) it is reactive, as it reallocates tasks whenever the spring mesh changes; (b) it degrades gracefully, as it simultaneously attempts one task per S-cluster and guarantees execution of at least one task at each time; and (c) it is complete, as it eventually finishes the whole mission.


5.5. Timing and implementation issues

To compute the cooperative movement based on the spring-damper analogy, nodes have to share their kinematic information (absolute pose and velocities) so that the forces acting on the system are computed from the same set of data. However, due to the distributed nature of the system and the communication needed to share these data, perfect synchronization is not possible, and thus robots have to work with slightly different information. The time displacement among kinematic data on different robots must be sufficiently short to guarantee a correct system dynamic. The task of propagating the shared data is carried out by the broadcast service of the COM. In each iteration of the control loop, the CNM publishes its kinematic information, which is propagated through the network. This introduces a (known) propagation delay. Full details are available in appendix B.2.4.

In our system, the hardware provides new pose information every 100 ms and the control loop iterates with the same period. For our real robot team (a 4-node network, 11 Mbps data rate, a maximum data unit of 1500 bytes for unicast messages, and 14 bytes per robot of kinematic information), the maximum time displacement is approximately 112 ms. The maximum time displacement between hardware data and control loop data is, instead, about 100 ms for the local node and 212 ms for the remote nodes, in the worst case. These values are entirely tolerable by the system, since the dominant dynamic time constant of the system is about one second.

We have experimented with two execution modes for the TAM. In the first, only the TAM of one robot is active in providing goals for the whole team, while the rest act as backups; in case of failure of the robot that executes the allocation algorithm, any other robot would assume its role. In the second, every local TAM computes the allocation with the data locally available (which, by network broadcast, should be the same in all nodes except for time skew) to be used by the local CNM.

5.6. Simulation results

We describe here four tested implementations for Allocate (algorithm line 18). For the following descriptions, let SCt be the number of S-clusters at time instant t, when the subroutine is invoked. The term gathering refers to the fact that a strategy tries to keep robots together by assigning close tasks.

5.6.1. Hungarian-based allocation

The well-known Hungarian method [Kuhn 55] computes the optimal pairing of tasks/workers. It only assigns one task per worker, leaving any excess tasks pending for future allocation. We applied it to S-clusters instead of robots to respect the Allocate postcondition.


1. The Hungarian method is applied using all robots.
2. Within each S-cluster, the robot with the least costly goal is kept for the next step; the assignments themselves are discarded.
3. The Hungarian method is applied again with the SCt kept robots.
4. The goal assigned to each kept robot in the third step is copied to its S-cluster mates.
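The four steps above could be sketched as follows. For brevity, a brute-force optimal assignment stands in for the Hungarian method (adequate only for tiny instances; a real implementation would use e.g. scipy.optimize.linear_sum_assignment), and at least as many tasks as robots are assumed:

```python
from itertools import combinations, permutations

def optimal_assignment(workers, tasks, cost):
    """Minimum-cost one-to-one assignment by brute force; a stand-in
    for the Hungarian method, usable only for tiny instances."""
    k = min(len(workers), len(tasks))
    best, best_cost = {}, float("inf")
    for ws in combinations(workers, k):
        for ts in permutations(tasks, k):
            c = sum(cost(w, t) for w, t in zip(ws, ts))
            if c < best_cost:
                best, best_cost = dict(zip(ws, ts)), c
    return best

def hungarian_over_sclusters(sclusters, tasks, cost):
    """The four-step allocation: assign all robots, keep the cheapest
    robot per S-cluster, reassign the kept robots, and copy each goal
    to the S-cluster mates."""
    robots = [r for sc in sclusters for r in sc]
    first = optimal_assignment(robots, tasks, cost)            # step 1
    reps = [min((r for r in sc if r in first),
                key=lambda r: cost(r, first[r]))
            for sc in sclusters]                               # step 2
    second = optimal_assignment(reps, tasks, cost)             # step 3
    return {r: second[rep]                                     # step 4
            for sc, rep in zip(sclusters, reps)
            for r in sc if rep in second}
```

The resulting plan honors the Allocate postcondition: every robot in an S-cluster ends up with the goal won by its cheapest representative.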

a) Hungarian method. b) MinMix. c) Greedy with gathering. d) TSP with gathering.

Figure 5.5: Snapshots of simulation runs for the allocation strategies. Solid lines linking robots indicate S-clusters, while dashed lines from robots to goals indicate assignations. X marks absent in some snapshots are already visited goals. Hollow squares are obstacles. In a) it can be seen that robots do not attempt to remain close to one another, and only one task per robot is allocated. In b) the complete MinMix plans for each S-cluster can be seen. In c) the greedy pairing P can be seen, and how all the remaining allocated tasks are the ones closest to the task in P. In d) the global TSP solution can be seen, and how its first tasks are assigned to S-clusters. Link 5.5: http://robots.unizar.es/data/videos/urus/ijrr-taskalloc.mp4


5.6.2. MinMix allocation

Given that the MinMix objective showed good properties for general-purpose allocation in some of our previous research, we tried it here. It uses a balanced linear combination of the MinMax and MinSum costs to drive the assignation. In this work the algorithm was adapted to the insertion heuristic [Reinelt 94], applied using S-clusters instead of robots: when a robot receives its first task, each S-cluster teammate receives the same one and is removed from further allocation (in order to honor the single-task-per-S-cluster postcondition).
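One insertion step driven by a MinMix score could be sketched as below. The even 0.5/0.5 weighting and the helper names are assumptions for illustration, not the thesis' exact formulation:

```python
def minmix(route_costs, alpha=0.5):
    """Linear combination of MinMax (longest route) and MinSum (total
    cost) over the per-robot route costs; alpha=0.5 is an assumption."""
    return alpha * max(route_costs) + (1 - alpha) * sum(route_costs)

def insert_task(routes, task, route_cost):
    """One step of the insertion heuristic: place `task` at the
    position, over all routes, that minimizes the resulting MinMix
    score of the whole team plan."""
    best_score, best_routes = None, None
    for r, route in routes.items():
        for i in range(len(route) + 1):
            cand = dict(routes)
            cand[r] = route[:i] + [task] + route[i:]
            score = minmix([route_cost(rt) for rt in cand.values()])
            if best_score is None or score < best_score:
                best_score, best_routes = score, cand
    return best_routes
```

Repeating this for every pending task builds the full plan; the MinSum term favors cheap insertions while the MinMax term discourages overloading a single route.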

5.6.3. Greedy allocation with gathering

Let P be the robot-goal pairing of minimum cost, which is allocated first (hence the greediness). S-cluster mates of the robot in P also receive that goal. Then, the previous Hungarian-based method (see §5.6.1) is applied to the remaining S-clusters, considering only the SCt − 1 goals closest to the one in P.
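The gathering part can be sketched as finding the seed pairing P and the restricted goal set offered to the remaining S-clusters (names and tie-breaking are illustrative):

```python
import math

def gathering_candidates(sclusters, robot_pos, goal_pos):
    """Return the minimum-cost robot-goal pairing P and the goals the
    remaining S-clusters are restricted to: the |S-clusters| - 1 goals
    closest to the goal in P."""
    d = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    _, r0, g0 = min((d(robot_pos[r], goal_pos[g]), r, g)
                    for sc in sclusters for r in sc for g in goal_pos)
    nearby = sorted((g for g in goal_pos if g != g0),
                    key=lambda g: d(goal_pos[g], goal_pos[g0]))
    return (r0, g0), nearby[:len(sclusters) - 1]
```

The Hungarian-based step of §5.6.1 would then be run on the remaining S-clusters using only the returned candidate goals, which is what keeps the team gathered.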

5.6.4. TSP-based allocation with gathering

This algorithm is the only one with a persistent component. While the others always reallocate the remaining tasks from scratch, this one computes on its first run a traveling salesman solution starting at the robot with the closest goal. This solution, computed with the insertion heuristic, is kept. Then, in each subsequent call, the first SCt pending goals of this persistent solution are allocated to the SCt S-clusters using the Hungarian-based method. The idea behind this algorithm is to investigate how useful global planning is in this highly dynamic and reactive system.
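The persistent plan could be built with a cheapest-insertion heuristic along these lines (an illustrative sketch, not the thesis code):

```python
import math

def insertion_tsp(start, goals, pos):
    """Open traveling-salesman path from `start` through all goals,
    built with the cheapest-insertion heuristic: each goal is placed
    at the position that increases the path length the least."""
    d = lambda a, b: math.hypot(pos[a][0] - pos[b][0],
                                pos[a][1] - pos[b][1])
    tour = [start]
    remaining = list(goals)
    while remaining:
        best = None
        for g in remaining:
            for i in range(1, len(tour) + 1):
                inc = d(tour[i - 1], g)          # cost of the new edges
                if i < len(tour):
                    inc += d(g, tour[i]) - d(tour[i - 1], tour[i])
                if best is None or inc < best[0]:
                    best = (inc, g, i)
        _, g, i = best
        tour.insert(i, g)
        remaining.remove(g)
    return tour[1:]          # goal visiting order, start excluded
```

At each reallocation, the first SCt entries of this order would be handed to the S-clusters via the Hungarian-based method, as described above.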

5.6.5. Evaluation setup

For the simulations, we chose the Player/Stage platform [Gerkey 03b], in which it is possible to simulate the Pioneer P3 robots that were used in the real experiments. Initial simulations devoted exclusively to verifying the proper operation of the spring-damper systems were performed and are available in appendix B.1. The purpose of the following set of simulations, which comprised several tests, was to evaluate the allocation strategies described in §5.6. The tests had in common a set of eight mobile robots and a static base, located at the bottom-left origin of the scenario. In all cases, the robots were initially placed abreast in line formation on top of the base, and fifty goals were randomly placed in the XY-positive quadrant. The mission to accomplish was to visit all the goals, and total time was measured.

Three scenarios were tested: a long-range one, in which goals were placed at a maximum range of nr (n being the number of mobile robots and r the SDS rest length), in order to test performance in extreme conditions; a short-range one, where the maximum goal range was nr/2, in order to test performance when robot supply is ample with respect to scenario size; and an intermediate one, where goals were placed at a maximum range of 3nr/4. In all cases, the SDS rest length was set at four meters, as if using low-power radios while requiring good signal quality. In addition, these tests were also run without any SDS constraints, as if the network range were infinite; in this case, since no SDSs appear, each robot is a trivial S-cluster. Additional runs with sparse random obstacles were also performed to evaluate their impact on execution time. Fig. 5.5 shows a snapshot of the four strategies running in the simulator in the medium range test. Ten runs were performed for each scenario, using unique random seeds. In each scenario, all strategies were tested using the same seed and thus on equal grounds.

Figure 5.6: Mission time. Box plots show quartiles of the regular simulations. Stars and squares show the average of simulations without springs and with obstacles, respectively, for comparison.

Figure 5.7: S-cluster changes give an idea of conflicts between navigation and task allocation goals, while preemptions characterize the longevity of task assignments.

Time

Fig. 5.6 summarizes the time results. In the short-range test we observed a small time penalty due to network constraints when compared to unconstrained runs. However, the strategies followed the same trends in all cases, the non-gathering ones (Hungarian and MinMix) being the best performers in both constrained and unconstrained runs. This is understandable, since in the short-range scenario there is less area to cover, and thus fewer SDS occurrences. Hence, strategies without gathering do not artificially constrain robot spreading.

The time penalty was greater in the middle-range test. It affected the non-gathering strategies in particular, in a trend that was even more pronounced in the long-range test. We saw that the strategies that performed better in the unconstrained runs no longer did so in the medium and long range tests. When range to base is greater, it becomes more difficult to maintain many S-clusters executing tasks in parallel. Gathering strategies are able to keep more S-clusters for longer periods, speeding up mission execution. This is evidenced by the lower count of S-cluster changes in Fig. 5.7. Gathering strategies also incur a smaller time penalty when compared to the unconstrained runs.

The same figure offers insights into other properties of the strategies. It evidences that the greedy strategy may wildly change the tasks assigned from one allocation to the next (high count of preemptions). This does not translate into decisive time penalties, which suggests that preemptions are a secondary factor in mission time when compared to S-cluster changes. The Hungarian method, being locally optimal, is less prone to cause highly differing allocations. Using a long-term plan (TSP strategy) has the additional effect of noticeably reducing preemptions. This is a consequence of always trying to allocate the same tasks in time-nearby allocations. Any time penalty linked to preemptions is thus mitigated with a global plan.
We can also note that the TSP strategy has the lowest count for both preemptions and changes, apart from being the quickest (median-wise) once the range is not short. This suggests that other kinds of global plans are worth investigating in conjunction with the spring-damper scheme.

Finally, in order to get some characterization of the amount of concurrency during task execution, we cumulatively sampled one run of each algorithm in the medium sized scenario, which is the underlying data for the histogram shown in Fig. 5.8. As this figure shows, the team degenerated into a single S-cluster (that is, moments where there is no longer task concurrency since all robots are assigned the same target) only below 10% of the mission time. Indeed, the most common occurrence was the pursuit of 4 or 5 targets simultaneously. A full study and comparison to the small and large simulation scenarios, in order to get a clearer understanding of concurrency performance, could be a subject of future work.

Figure 5.8: Percentage of mission time vs. number of tasks in simultaneous execution.


5.7. Experimental results

This real experiment was intended to test the behavior of the whole system in a real scenario, using the real link quality signal to maintain connectivity. Twelve goals were placed to be visited by three Pioneer 3-AT robots equipped with Global Positioning System (GPS) receivers and supervised by a fixed base station (in this case, a laptop). For the TAM layer we chose the Hungarian strategy, because we expected the signal quality to place us in the short/medium range scenario of the simulations; in these ranges, either Hungarian or TSP were expected to perform comparably well. Previously, exhaustive experiments designed to test the COM and CNM modules were performed, which are detailed in §B.1.

Fig. 5.9.a shows the paths followed by the robots, and three superimposed snapshots of the existing SDSs at different times: t1 shows the activation of the first SDS, t2 reflects the first instant in which two SDSs exist, and t3 shows the moment in which the maximum number of SDSs is active for the first time. Fig. 5.9.b shows the quality of the links that form the Minimum Spanning Tree of the communication network. The tree is composed of the minimum number of links with the maximum quality that are indispensable for maintaining network connectivity. The three plots of the figure correspond to the three links necessary in this case. As can be seen, the links in the spanning tree change during the experiment depending on their relative quality (note the changes in the line types in all the plots). When the quality of a link of the spanning tree falls below the safety threshold, a SDS is created (instants t1, t2, and t3). No active link ever exceeded the forbidden threshold; thus, as intended, no active link was ever broken, which fulfills the objective of enforcing connectivity. Finally, Fig. 5.9.c shows the distances between all nodes. Here we can observe that distance and quality do not perfectly correlate, which suggests that using distance as a substitute for quality will not always provide the desired results.
Studying the three graphs at once, we can dive into the details of how things happened. Prior to t1, there was no need for SDSs, and there were several link switches that did not affect the team movements. At t1 the first SDS appeared, linking the base to R2. However, since all the robots were at a comparable distance to the base, link quality changes prevented any strong preference for one of them as a relay, and thus we can observe that the base used any of the three robots for short periods of time as a relay for the other two. Eventually, R1 lagged behind and consolidated as the stable relay between the base and R2, R3. When these robots moved away from R1, a new relay eventually became necessary: at t2, R2 was linked with a SDS to R1. Finally, at t3 a third SDS appeared when R3 took the lead towards the farthest goal.

We can observe that, with the parameters used, SDSs were quite short when created (circa 10 m), but the robots still retained a good level of mobility as the SDSs elongated. It was not until shortly before t2, as link quality decreased, that R1 was stopped by the SDS force pulling from the base. Also, R2 did not stop moving (Fig. 5.9.c) until the very end of the mission, which reveals that its SDS to R1 was not yet at equilibrium. This shows that using distance as the SDS trigger would be difficult: either we could err by being too conservative, or we could try to allow more mobility and lose connectivity due to signal oscillations. Since different environment conditions could greatly change the expected maximum length of a link, this again confirms that using link quality is a better approach.

Figure 5.9: a) Paths followed by the robots and SDSs at the time of their creation. b) ϒ of the links composing the Minimum Spanning Tree of the network during the complete experiment. c) Distances to base and between robots. Link 5.6: http://robots.unizar.es/data/videos/urus/ijrr-final.mp4

5.8. Simulations in light of optimization objectives

A further study of additional strategies for the Allocate subroutine (algorithm line 18) was carried out, in a slightly different simulation setup. In this case, the focus is on the evaluation of performance with respect to the already familiar optimization criteria described in previous chapters (MinMax and MinSum), and a new addition: the MinAve objective, which considers the average cost per task (referred to as latency in certain contexts). A detailed formulation can be found in appendix A. The new allocation strategies are:

5.8.1. Greedy allocation without gathering

Greedy algorithms are the simplest approach to the allocation problem. In our implementation, the task closest to any robot is chosen and is propagated/allocated to all its S-cluster mates. This process is repeated until all robots have a goal. This algorithm is the only one presented in this section that makes no attempt at long-term, global planning, and it serves as the comparison baseline for the distributed algorithms. It differs from the one described in §5.6.3 in that it does not use tasks close to the first one chosen, but looks for the best purely greedy robot-task pair.
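This baseline can be sketched as below; distinct tasks per S-cluster and Euclidean costs are assumptions of the sketch:

```python
import math

def greedy_no_gathering(sclusters, robot_pos, goal_pos):
    """Repeatedly pick the globally closest robot-task pair and
    propagate the task to the whole S-cluster, which then drops out
    of the allocation; repeats while S-clusters and tasks remain."""
    d = lambda r, g: math.hypot(robot_pos[r][0] - goal_pos[g][0],
                                robot_pos[r][1] - goal_pos[g][1])
    plan, free_goals = {}, set(goal_pos)
    pending = list(range(len(sclusters)))
    while pending and free_goals:
        _, i, g = min((d(r, g), i, g)
                      for i in pending
                      for r in sclusters[i]
                      for g in free_goals)
        for r in sclusters[i]:
            plan[r] = g          # propagate to all S-cluster mates
        pending.remove(i)
        free_goals.discard(g)
    return plan
```

Note that no attempt is made to pick tasks near previously chosen ones, which is exactly what distinguishes it from the gathering variant of §5.6.3.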

5.8.2. TSP-based allocation

The motivation behind this algorithm is to avoid robot spreading, which leads to spring appearance and performance degradation. As a first step, a single-robot TSP solution is computed. (In this work, the problem instance sizes permit the use of an optimal solver [Applegate 03]; otherwise we would use any of the many known heuristics that provide good approximate solutions.) Let SCt be the current S-cluster count at time t. At each reallocation, the first SCt goals in the global TSP plan are allocated to the SCt S-clusters using the well-known Hungarian method. This way, tasks are consumed by the team in the order dictated by the global TSP plan. This implementation differs from the one in §5.6.4 in that it uses an optimal solver instead of an insertion heuristic.

5.8.3. Clock allocation

Previous simulations show that, many times, the robots naturally follow a sweeping behavior while maintaining an approximate abreast formation, as will be seen in §5.8.6. The Clock Allocation algorithm tries to explicitly induce this behavior by precomputing a plan in which tasks are ordered by their angle in polar coordinates, starting at the one closest to any robot. The origin is placed at the middle of the working area, where the base is located. This plan is subsequently allocated to S-clusters and targets are consumed as explained in the TSP-Based Allocation algorithm.
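The angular ordering described above could be sketched as follows (the sweep direction and tie-breaking are assumptions of the sketch):

```python
import math

def clock_plan(base, robot_positions, goal_pos):
    """Order tasks by polar angle around the base, starting the sweep
    at the goal closest to any robot."""
    ang = {g: math.atan2(p[1] - base[1], p[0] - base[0])
           for g, p in goal_pos.items()}
    start = min(goal_pos, key=lambda g: min(
        math.dist(p, goal_pos[g]) for p in robot_positions))
    a0 = ang[start]
    # Sweep from the start angle; the modulo wraps the circle around.
    return sorted(goal_pos, key=lambda g: (ang[g] - a0) % (2 * math.pi))
```

The resulting order is then consumed set-by-set by the S-clusters exactly as in the TSP-based strategy.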

5.8.4. Auction allocation

With this algorithm we aim at directly influencing one of the three metrics presented in appendix A. Drawing from past literature on auction-based multirobot routing methods [Lagoudakis 05], we use auctions to preplan the order of tasks, according to an appropriate bidding rule that relates to the desired metric [Tovey 05]. While, in principle, we could use an auction to generate a single plan, like TSP-Based Allocation, this algorithm goes further and builds several smaller plans to be executed in parallel by different S-clusters, according to a schedule geared towards optimizing the desired metric.

This parallelism must be carefully exploited. Some tasks may be so distant that it would be impossible to reach them in parallel; an attempt would merely force the degradation of the team into a single S-cluster and the completion of only one task, while the others would be postponed. In order to tackle this issue, the auction algorithm attempts to predict the number of robots needed to reach each goal from the initial (base) location. This prediction is based on the assumption that signal quality remains satisfactory within a known distance L, which when exceeded causes a spring to appear if no other network route exists. With d denoting the goal distance to the base, tasks are classified in sets defined by the predicted number N = ⌊d/L⌋ + 1 of robots required to reach them from the base. It should be noted that this prediction, based on expected spring length, does not invalidate the properties of the basic task allocation strategy; it merely acts as a guess, which can only degrade performance if wrong. Tasks are auctioned sequentially, set after set, using increasing N = 1, 2, . . . , n, where n is the number of robots. For each set, at most SN = ⌊n/N⌋ plans can be carried out simultaneously in parallel for reaching targets in that set, since otherwise we would need more robots than available.
Initially, for N = 1, the auction will result in the formation of up to n partial parallel plans. From those, only up to n/2 plans will participate in the auction set for N = 2. In general, the number of partial plans eligible to win tasks during the auction decreases with each farther set of tasks; within an auction set N, as soon as SN different partial plans from the previous set win tasks, the remaining partial plans are finalized. Only one plan will survive for the tasks in auction set N = n. By construction, these plans can be carried out in parallel during execution; as short plans are completed, robots are freed up to join another robot cluster assigned to a longer plan, and so on, until all robots form a single cluster assigned to the longest plan.
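The classification into auction sets and the per-set cap on parallel plans, following the N = ⌊d/L⌋ + 1 and SN = ⌊n/N⌋ rules of the text, can be sketched as:

```python
def auction_sets(goal_dists, L, n):
    """Group tasks by the predicted robot count N = floor(d/L) + 1
    needed to reach them from the base, and cap the parallel plans
    for each set at S_N = floor(n/N). Returns {N: (tasks, S_N)}."""
    sets = {}
    for g, d in goal_dists.items():
        N = int(d // L) + 1          # predicted robots (relays) needed
        sets.setdefault(N, []).append(g)
    return {N: (sorted(gs), n // N) for N, gs in sets.items()}
```

Tasks would then be auctioned set after set for increasing N, with at most S_N partial plans allowed to keep winning tasks in each set.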

Chapter 5. Reactive allocation in constrained networking environments

Figure 5.10: Two snapshots of each algorithm in action, at initial (top) and intermediate (bottom) mission execution: (a) Greedy, (b) TSP-Based, (c) Clock, (d) Auction MinSum. Lines indicate task ordering. Link 5.7: http://robots.unizar.es/data/videos/urus/ctree-react-algs.mp4


5.8. Simulations in light of optimization objectives.

5.8.5. Simulation setup

In this study, we consider multi-robot missions where robots are deployed from a single base point and mission tasks cover a circular geographical area around this point. A single robot (or a station) remains at the central point and serves as the communication base, whereas all other robots can freely move around the area, constrained only by the connectivity requirement. We consider only uniformly random distributions of tasks and no obstacles in the physical environment, representing exploration or sample collection scenarios over a large terrain. These choices are an attempt to filter out bias coming from structured task distributions and/or specific obstacle layouts in this study of the feasibility of multi-robot routing under communication constraints. Even though there are no physical, static obstacles in the environment, our robots are equipped with obstacle avoidance capabilities to dynamically avoid collisions with each other. For any given mission, our software measures the actual team performance with respect to all metrics defined.

More specifically, in our experimental setup, there are n = 8 mobile robots and m = 100 targets. The targets are uniformly randomly distributed over a circular area with a radius of 24 units. We study four communication ranges between the robots: (a) L = ∞, where there is no constraint and any robot can reach any target; (b) L = 8 units, where at most 3 robots are needed for reaching each target; (c) L = 4 units, where at most 6 robots are needed to reach the most distant targets; and (d) L = 3 units, where all 8 robots are necessary for reaching the most distant targets.
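As a quick arithmetic check (ours, not part of the thesis), the robot counts quoted for each range follow from the number of relay hops needed to span the 24-unit radius:

```python
import math

radius = 24  # radius of the circular task area, in units

# Robots needed for the most distant targets: one relay (spring) every
# L units from the base, hence ceil(radius / L) robots in the chain.
for L, expected in [(8, 3), (4, 6), (3, 8)]:
    assert math.ceil(radius / L) == expected
```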

Figure 5.11: Total distances traveled by all robots for each algorithm. Mean values and the 95% confidence intervals are shown in each case.


5.8.6. Simulation results

Performance for each of the proposed algorithms is measured in terms of the criteria previously described. These metrics are (a) total distance traveled by all robots combined (MinSum), (b) mission timespan (MinMax), and (c) average task completion time (MinAve). In addition, for comparison purposes, we assessed the performance of classical multi-robot routing with auctions in the absence of communication constraints.

Fig. 5.10 shows two snapshots of each algorithm execution. Greedy has a characteristic spreading-out behavior, while TSP-Based and Clock exhibit the sweeping motion already described. Four parallel plans generated by Auction (with the MinSum bidding rule) are visible at startup.

Figs. 5.11 to 5.13 show the average performance of each algorithm with respect to each metric. We observe that TSP-Based and especially Clock are the least sensitive to the connectivity range L, with almost no penalty for L = 8. On the other hand, Greedy and Auction (for all bidding rules) quickly degrade with decreasing L. The plans built by Auction (using any of the three bidding rules) are difficult to adhere to in our reactive setup, because spring appearance often disrupts the predicted task execution order. We notice that good performance under no constraints (L = ∞), for example in the Greedy and Auction case, does not necessarily carry over when communication is limited. TSP-Based and Clock are worse in the absence of constraints, but degrade at a lesser rate than the rest of the algorithms as constraints come into play. This is seemingly due to the characteristic sweeping pattern these algorithms induce. Furthermore, in its favor, Clock is computationally cheap (O(m log m)),

Figure 5.12: Total mission completion time for each algorithm. Mean values and the 95% confidence intervals are shown in each case.


unlike TSP-Based (NP-hard) or Auction (O((n + m)³)). The good performance of the sweeping behaviors suggests that partial parallel sweeps may improve the obtained results. In contrast, the parallel plans of Auction require more precise control over the spring spanning tree configuration to be of better use. In terms of objectives, MinMax and MinAve behave similarly, as has also been observed in other works on robot routing without constraints. This is due to the influence of the maximum time on the average time. It is worth noting that there is little impact on MinSum for all tested values of L in the case of sweeping behaviors like TSP-Based and Clock; this fact may be useful in scenarios where predicting power consumption is important.

5.9. Discussion

This chapter has presented a complete system that safeguards network connectivity in multi-robot teams. In order to do so, it relies on three modules: the network module, which provides link quality measurements; the cooperative navigation module, which generates low-level control commands in order to steer the robots and forcibly prevent network splits; and, finally, the presented reactive task allocation module, which gathers information from the navigation module in order to generate assignments that ensure steady mission progress, without deadlocks or livelocks caused by conflicting interests between the navigation and task fronts.

The main advantages of the spring-damper system are that it guarantees both smooth control and network integrity. Its main drawback is that it is not readily suited for use in cluttered environments. In such working areas, network links change their quality more erratically and, furthermore, competing forces between obstacle avoidance and connectivity maintenance may generate conflicts that would require high-level planning such as that found in [Urcola 08], which, in turn, does not address task allocation.

There are several notable characteristics of the presented task allocation module. Of particular note is that it acts as an "advisor" for the robot activity, since it has no binding authority to generate robot movements, which are the responsibility of the navigation module. On the one hand, this is the basis of the guarantee that network splits should not happen. On the other hand, the allocation module is designed so as to also guarantee steady progress towards mission completion, as has been discussed in the proof section. The combination of these two aspects is particularly remarkable and possibly the most relevant contribution of this method.

Another significant aspect is the combination of the reactive foundation of the algorithm with a pluggable generic allocation algorithm for customization. The former makes the algorithm especially appropriate for the highly dynamic conditions of network links, taking advantage of the clusters naturally formed by the dynamics of the cooperative navigation module. The latter enables the reuse of allocation algorithms and some degree of tailoring towards specific optimization objectives. We have evaluated common metrics used in multi-robot routing problems, such as team energy consumption, mission timespan, and average task completion time. Several high-level allocation strategies were tested, concluding that for medium and long-range missions a global plan and keeping robots together are advantageous. This minimizes changes in network topology and task preemption, thus allowing better task parallelization.

Figure 5.13: Average task completion time for each algorithm. Mean values and the 95% confidence intervals are shown in each case.
The whole system has been implemented and tested by means of simulations and real multi-robot experiments. Results show that the proposed solution fulfills the objective of maintaining network connectivity at all times while completing the assigned tasks. From the experiments we verify that the use of link quality information, together with a virtual spring mesh, is a valid solution for maintaining network connectivity, especially in open spaces. The penalty inflicted by the communication constraints is within one order of magnitude in all studied cases, and increases on most occasions with the inverse of the communication range. These findings show that effective multi-robot routing can be achieved even under limited communication range, with moderate loss compared to the case of infinite communication range.

We have identified weaknesses and strengths of our algorithms with respect to these metrics, and possible avenues for improvement. In particular, the TSP-Based and Clock algorithms exhibit low sensitivity to the actual communication range, which makes them appropriate for situations where the communication range is not known in advance. However, our attempts at using auctions to optimize particular metrics did not achieve competitive results. Some runs of the auction algorithm indicate that improvements over the TSP-Based and Clock ones are possible, mainly with a high communication range, as auctions take into account


explicitly the team performance metric, as well as the distribution of targets. Nevertheless, to successfully exploit this advantage, some degree of control over the topology of the underlying connectivity spanning tree is necessary, which is not yet available in this work.

The next chapter presents a method which addresses a notable drawback of this one: it is specifically designed to be used in environments with arbitrarily complex obstacles. In such environments, the technique presented in this chapter might fail to complete the mission due to a lack of strategic planning with regard to large obstacles, local minima and varying signal behaviors.




Chapter 6

The ConnectTree strategy for cluttered environments

Robots have a limited communication range in real-world missions over large geographical areas, and signal propagation conditions may be unpredictable due to obstacles or interferences. This chapter is devoted to a multi-robot routing algorithm well-suited for cluttered environments under limited communication range. The focus is placed on the aspect of route planning, in order to guarantee continuous connectivity and mission completeness, while offering a guarantee on performance for the obtained allocation under the team objectives already seen in previous chapters. A sequential execution strategy is explored first, in which performance bounds can be identified. Next, a concurrent execution enhancement is presented which offers improvements while retaining these bounds.



6.1. Introduction

As we have introduced in the previous chapter, there is a growing trend to drop the assumption that the robots of a team are able to freely communicate since, in many interesting real-world applications, network connectivity must be explicitly considered and preserved. Furthermore, while some domains may tolerate temporary losses of communication between team members, in others it is mandatory to maintain connectivity at all times: monitored surveillance, rescue missions, tightly coordinated cooperation, etc. These include domains where uninterrupted contact with a stationary monitoring base or command center is commonly required.

The method presented in the previous chapter addressed such needs, but was not well suited for use in cluttered environments. In such working areas, network links change their quality more erratically, due to e.g. interferences, multipath reflections or fading. Furthermore, competing forces due to obstacle avoidance and connectivity maintenance might generate navigation conflicts that would require a higher level of planning. This chapter presents an approach tailored to these concerns, which provides a routing solution suitable for dense environments and prevents network splits with minimal assumptions on signal propagation. In effect, each target is eventually served by a group of robots without breaking their connectivity requirement to a stationary base. We study algorithms that yield efficient routes and we prove that the quality of these routes is within a bounded factor of the optimal ones using graph-theoretic arguments. The proposed approach is studied in a large realistic scenario using the Player/Stage robot simulator. Experiments with real robots validate this proposal and give insight into the behavior of real signals.

The chapter is organized as follows: §6.2 reviews the issue of limited connectivity and how other researchers have addressed it.
§6.3 describes the proposed strategy, whose theoretical properties and quality bounds are proved in §6.4. Empirical results from simulation are shown and discussed in §6.5. Implementation details are discussed next. Sections 6.7 to 6.10 introduce and study concurrency improvements over the sequential strategy. Real experiments are detailed in §6.11. Finally, we discuss future work and conclude in §6.12.

6.2. Multi-robot routing under limited connectivity

The limited connectivity issue is a growing concern in multi-robot research. We have already alluded in the previous chapter to related work in this field. Here we briefly recall the approaches most similar to the one presented in this chapter. In particular, work from Pezeshkian [Pezeshkian 07, Nguyen 04, Nyugen 04] is grounded in the same basic concept of using static relays, although focusing more on hardware issues, whereas we are more concerned with allocation-related performance issues.

In this work we extend this idea, adding a key ingredient: robots are forced to


follow fixed entry and exit paths. By using this premise we can focus on route planning and execution ordering in a way that offers some cost bounds over baseline, unbounded solutions. This approach offers a solution for field robots in scenarios with arbitrarily shaped obstacles. Reactive operation based on the measured signal enables us to drop most assumptions on the signal propagation model. In practice, we only require that the decay function is reasonably continuous, meaning that, between two consecutive measurements of signal quality made by a moving robot, the signal will not skip a controlled zone that serves as a buffer between the safe and the forbidden zones (Fig. 5.3). As a bonus, our approach in this work is readily applicable to tethered robots [Marques 06] and teams of robots.

The mission begins with all robots located at some base location and completes when all targets are visited and the robots return to base. The problem of multi-robot routing is to find an allocation of targets to robots and a path for each robot that begins at the base, visits all targets allocated to it, and ends back at the base, so that the mission is completed and a team objective is minimized. Three intuitive team objectives, corresponding to minimizing energy, makespan, and latency respectively, have already been described in previous chapters and are detailed in §A.

Figure 6.1: A possible path tree visiting four goals. Any depth-first traversal gives minimum backtracking and thus minimum mission timespan. However, the order 3-2-1-4 would minimize average cumulative cost.
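The three team objectives can be computed directly from per-robot routes. The following is an illustrative sketch of our reading of the standard definitions (helper names are ours):

```python
import math

def path_cost(route):
    """Length of a polyline route given as a list of 2-D points."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))

def minsum(routes):
    """Total distance traveled by all robots (energy objective)."""
    return sum(path_cost(r) for r in routes)

def minmax(routes):
    """Longest individual route: mission makespan at unit speed."""
    return max(path_cost(r) for r in routes)

def minave(arrival_times):
    """Average task completion time (latency objective)."""
    return sum(arrival_times) / len(arrival_times)
```

For instance, two robots driving 5 and 10 units respectively give MinSum = 15 and MinMax = 10.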

6.3. The ConnectTree algorithm

This generic strategy for multi-robot routing under limited communication range in dense environments is based on the following ideas. Robots move together as a whole towards a target, monitoring the network link to the base; once this link falls under the safety threshold and into the controlled zone, one robot is stopped at that point to act as a relay, and the rest continue advancing while now monitoring their link to the stopped relay robot. Using this pattern repeatedly, the robots can reach far away from the base without ever breaking the link to the base.
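A one-dimensional sketch of this advance pattern may help fix ideas. This is our simplification (a straight corridor, a single moving group, and an effective range R that the robots do not know; they only detect when the link to the last relay weakens):

```python
def advance_with_relays(n, target_dist, R, step=0.01):
    """Move a group of n robots from the base (position 0) towards a
    target at target_dist. When the link from the moving group to the
    last relay is about to be exhausted, drop one robot as a relay.
    Returns the relay positions (base included), or None if the team
    runs out of robots before reaching the target."""
    relays = [0.0]   # the base acts as the root of the relay chain
    free = n         # robots still moving
    pos = 0.0        # position of the moving group
    while pos < target_dist:
        pos = min(pos + step, target_dist)
        if pos - relays[-1] >= R:   # link to last relay exhausted
            if free == 1:           # last moving robot: cannot relay
                return None
            relays.append(pos)      # stop one robot here as a relay
            free -= 1
    return relays
```

With R = 3, a target at distance 8 needs ⌈8/3⌉ = 3 robots (two relays plus one moving robot), and a team of 2 fails, matching the ⌈d/L⌉ prediction used elsewhere in this thesis.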


Note that the robots do not need to know their communication range; the decision to stop a robot as a relay is made dynamically as dictated by the locally measured signal quality. Once the target is visited, the robots are allowed to retreat from the target only along the path they used to reach it, to avoid potential deadlocks due to the density of obstacles and the connectivity constraint. This retreat mirrors the deployment, and a relay can only start retreating when all moving robots reach the relay location.

In practice, the set of paths used by the robots is a tree rooted at the base station (Fig. 6.1). Note that two paths to different targets may share a portion in the upper levels of the tree. Therefore, after visiting a target, robots do not need to retreat all the way to the base to move on to the next target. Instead, it is sufficient to retreat up to a common ancestor point between the two paths. Thus, during the entire process, critical network links lie within the tree, and the relative robot ordering along any path remains the same when going towards a target and when retreating back. This would allow the use of this approach with conveniently tethered robots, as well as with a single robot capable of deploying and collecting static relays. Alg. 2 shows the pseudo-code for the ConnectTree algorithm.

It can be argued that it would be more MinSum-efficient to begin moving only one robot, and move relays one by one on demand once a weak signal is detected between any robot pair. This scheme is not recommended because of network safety issues; it involves two moving endpoints in the network links, instead of just one, increasing the risk of sharp network dropouts. Moreover, it does not improve the MinMax and MinAve costs.

There are two key subroutines affecting performance in the ConnectTree algorithm. AllocateFrom (line 2) decides the next target to visit.
One advantage of using a tree structure is the fact that any depth-first traversal has minimum backtracking along the branches, and thus gives minimum MinSum and MinMax cost for the given tree. We used depth-first traversals in this work, while future research may specifically address traversal orderings for the MinAve objective.

The most influential subroutine, however, is BuildTree (line 1), which builds the tree routes of the robots. Depending on the depth of the tree from the base, the robots may or may not be able to traverse it in its entirety. If the depth exceeds the maximum range they can reach collectively, the remaining target(s) along that branch will be abandoned. To avoid such potential failures, one may use a straightforward scheme: construct a star-like tree where each target is attached directly to the root (base) using the least-cost path between them. Despite its simplicity and the potential inefficiency of the resulting tree, this scheme guarantees that the mission will be accomplished, if it can be accomplished. No target will be abandoned, unless it is beyond the ability of the team to reach it. We refer to this simple variation of the ConnectTree algorithm as STAR.

Further improvements studied in this chapter focus solely on tree construction, with the goal of achieving provably good performance. It should be clear that the total cost of the constructed tree is indicative of the team performance during the mission. Additionally, the depth of the tree plays a key role in the success of the


Algorithm 2 Description of the ConnectTree strategy. Variable i defines the range i..n of moving robots, while 1..i−1 represents the stopped relays. r0, for simplicity, also denotes the Base. ωi,j denotes link quality between robots i, j (or between the Base and rj for ω0,j). UT is the upper threshold of the controlled zone. Variations are denoted by s ∈ S = {STAR, TSPT, MST, DLST}.

Inputs: Base, T, n, s
 1: PathTree ← BuildTree(s, Base, T)  // Construct the initial tree of paths.
 2: CurrentTarget ← AllocateFrom(PathTree)  // Choose first target.
 3: i ← 1  // Initially, all robots move.
 4: while PathTree ≠ ∅ do  // Main mission loop, targets remaining.
 5:   while ¬Reached(CurrentTarget) do  // Advancing towards target.
 6:     StepTowards(who ⇒ {ri . . . rn}, where ⇒ CurrentTarget)
 7:     if ωi−1,i ≤ UT then  // Link exhausted, ri must stop.
 8:       if i = n then  // Check if ri is our last available robot.
 9:         PathTree ← BuildTree(s, Base, T ∩ RemainingTasksIn(PathTree))
10:         exit  // Tree rebuilt, backtrack to new path.
11:       else  // Stop ri to act as relay.
12:         StopRobot(i); i ← i + 1
13:       end if
14:     end if
15:   end while
16:   if Reached(CurrentTarget) then
17:     PathTree ← PathTree \ CurrentTarget  // Remove reached target.
18:   end if
19:   if PathTree ≠ ∅ then  // Select next target from remaining ones.
20:     NextTarget ← AllocateFrom(PathTree)
21:     BackPoint ← CommonAncestor(CurrentTarget, NextTarget)
22:       // Deepest common node in the retreat path and the new path.
23:   else
24:     NextTarget ← BackPoint ← Base  // Final comeback to base.
25:   end if
26:   while ¬Reached(BackPoint) do  // Retreating from current target.
27:     StepTowards(who ⇒ {ri . . . rn}, where ⇒ BackPoint)
28:     if Reached(ri−1) then  // Retreat starts for deepest stopped relay.
29:       i ← i − 1
30:     end if
31:   end while
32:   CurrentTarget ← NextTarget  // Start advancing to next target.
33: end while



mission. Therefore, it makes perfect sense to try to keep both the total cost and the depth of the tree low. Towards this end, we suggest three different ways of building such trees, based on solving a Traveling Salesman Problem (TSP), finding a Minimum Spanning Tree (MST), and finding a depth-limited spanning tree (DLST), respectively.

TSPT: The first proposed algorithm is based on computing or approximating an optimal open TSP tour, which visits all targets beginning from the base exactly once, but does not return to the base. Despite the NP-hardness of the problem, there are several efficient algorithms and heuristics for finding optimal or near-optimal TSP tours for real-world problems in reasonable time. An open TSP tour is a degenerate tree with just a single branch, therefore ConnectTree can easily traverse it. However, its depth may be well beyond the reaching ability of the team, and the mission may fail even though the robots might have the ability to accomplish it. To guarantee mission completeness, we propose an iterative replanning scheme: as soon as an unreachable target is detected during tree traversal, a new open TSP tour is constructed from the base to the remaining non-visited targets. The robots will retreat back, in the worst case all the way to the base, to traverse the new tree. The first target on the new tree will be connected to the base with the least-cost path, therefore at least one target will be visited at each replanning and the mission will progress steadily. Under this replanning scheme, the mission will always be completed, if it can be completed. We refer to this variation of the ConnectTree algorithm as TSPT (TSP tree).

MST: The second proposal is based on finding a minimum spanning tree over all target nodes and the base. Using Prim's algorithm [Prim 57] with the base as the seed node, a minimum spanning tree is easily constructed in polynomial time and can be traversed by the ConnectTree algorithm. Even though its total cost is kept low, its depth can be arbitrary, though certainly no more than the depth of the TSP tree. Again, to guarantee mission completeness, we adopt the same replanning scheme. As before, the mission will always be completed, if it can be completed. We refer to this variation of the ConnectTree algorithm as MST.

DLST: Our last proposal is based on finding a depth-limited spanning tree, with the goal of minimizing the possibility of a costly replanning step. If the robots have some estimate of the maximum depth D they can collectively reach, then we can construct a spanning tree whose depth will not exceed D. This is achieved by a simple modification to Prim's algorithm: at each step we only consider those edges between the partial tree and the remaining nodes that do not result in a tree depth greater than D. If targets are abandoned due to an optimistic estimate of D, we proceed with the replanning scheme as above. We refer to this variation of the ConnectTree algorithm as DLST (Depth-Limited Spanning Tree).

In summary, all four variations of the ConnectTree algorithm are safe (network integrity is maintained at all times) and complete (the mission is successfully accomplished) for any number of robots, any number of targets, and any communication range per robot, unless the mission is a failure by design (some target is out of range even using the optimal path with all robots). Note that, nonetheless,


targets out of range can be safely abandoned if they are the first target in the plan, since that means they are not reachable even using the optimal path. The mission can then continue for the remaining targets.
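Of the three constructions, DLST is the only nonstandard one. The following is an illustrative sketch of the modified Prim step (our Python, over Euclidean costs, whereas the thesis uses A*-derived path costs on a grid):

```python
import math

def depth_limited_spanning_tree(points, D):
    """Prim-like tree rooted at points[0] (the base). An edge (u, v) is
    admissible only if depth(u) + cost(u, v) <= D; with D = infinity
    this reduces to a plain minimum spanning tree. Returns a dict of
    parent links, or None if some point cannot be attached within D."""
    n = len(points)
    cost = lambda i, j: math.dist(points[i], points[j])
    parent = {0: None}
    depth = {0: 0.0}
    while len(parent) < n:
        best = None
        for u in parent:                     # nodes already in the tree
            for v in range(n):
                if v in parent:
                    continue
                c = cost(u, v)
                if depth[u] + c <= D and (best is None or c < best[0]):
                    best = (c, u, v)
        if best is None:
            return None       # D too small: some target is unattachable
        c, u, v = best
        parent[v] = u
        depth[v] = depth[u] + c
    return parent
```

An optimistic D makes the construction fail (return None here), which in the full algorithm triggers the replanning scheme described above.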

6.4. Performance bounds

The problem of finding the "best" tree to be traversed by ConnectTree is, from a graph-theoretical point of view, equivalent to a minimum Steiner tree problem [Hwang 92]: given a weighted graph G = (V, E, w) and a subset U ⊆ V, find a tree in G that interconnects all vertices in U with the least cost. Notice the difference with an MST problem over U; a Steiner tree may contain additional vertices not in U that help reduce the cost. The Euclidean Steiner tree problem considers the interconnection of |U| points on the plane with minimum cost using any intermediate points on the plane. In our case, the Steiner tree would be rooted at the base with all targets at the leaves. The minimum Steiner tree problem is known to be NP-hard [Garey 97], therefore we abandoned the idea of finding an optimal Steiner tree in favor of approximation algorithms.

In this section, we analyze the proposed algorithms and relate their performance to a baseline performance determined as the MinMax cost (or the MinSum cost; they are the same for a single robot) of a closed TSP tour of a single robot that begins at the base, visits all targets, and returns to the base without any communication range limits. We refer to this ideal solution as CTSP (Closed TSP) and we define the performance ratio of any algorithm as the ratio of its performance according to some metric over the baseline performance. Our main results are stated below.

Theorem 1. The performance ratio of the TSPT and MST variations of ConnectTree without replanning is at most 2n for the MinSum, at most 2 for the MinMax, and at most 2 for the MinAve objectives.¹

¹As of this time, we have not been able to obtain bounds on the performance of DLST.

Proof. Under our tree traversal conventions, the mission cost according to the MinMax metric of any variation will be exactly twice the total cost of the underlying tree. Clearly, the open TSP tour cannot be worse than the closed TSP tour, and the MinMax performance of TSPT is exactly twice the cost of an open TSP tour. Therefore, MinMax(TSPT) ≤ 2 MinMax(CTSP) and the MinMax performance ratio of TSPT is bounded by 2. Similarly, it is known that in metric spaces, like ours, the total cost of a minimum spanning tree does not exceed the cost of a closed TSP tour. The MinMax performance of the MST variation is exactly twice the cost of a minimum spanning tree. Therefore, MinMax(MST) ≤ 2 MinMax(CTSP), and the MinMax performance ratio of MST is also bounded by 2. The MinSum cost of any variation cannot be determined precisely; however, it will be within a factor n of the MinMax metric (all robots traverse the entire tree in the worst case),


that is MinSum(X) ≤ n MinMax(X). Similarly, the MinAve metric for any variation will not exceed the MinMax metric (almost all targets are placed at the leaf of the tree visited last in the worst case, therefore each target is visited just before the end of the mission), that is MinAve(X) ≤ MinMax(X). The remaining bounds are now obvious.

Theorem 2. The performance ratio of the TSPT and MST variations of ConnectTree with replanning is at most 2nm for the MinSum, at most 2m for the MinMax, and at most 2m for the MinAve objectives.

Proof. Each planning or replanning of a tree eliminates at least one target, therefore in the worst case m trees will be built. For each of these m trees the previous bounds hold, therefore it suffices to include a factor of m.

Despite the fact that these results represent a worst-case, pessimistic analysis that yields loose bounds, it is striking that the additional cost introduced by the limited communication range cannot really grow arbitrarily. If the need for replanning is avoided, then performance is guaranteed to be very close to the ideal CTSP performance. Our experimental results demonstrate that performance in practice is well below the theoretical bounds.
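The metric-space inequality used in the proof (the MST cost never exceeds the closed TSP cost) can be checked numerically on small random instances. This is our own sanity check, not part of the thesis experiments:

```python
import itertools
import math
import random

def mst_cost(points):
    """Total cost of a minimum spanning tree (Prim, Euclidean costs)."""
    in_tree, total = {0}, 0.0
    while len(in_tree) < len(points):
        c, v = min((math.dist(points[u], points[w]), w)
                   for u in in_tree for w in range(len(points))
                   if w not in in_tree)
        in_tree.add(v)
        total += c
    return total

def closed_tsp_cost(points):
    """Brute-force optimal closed tour starting and ending at points[0]."""
    best = math.inf
    for perm in itertools.permutations(range(1, len(points))):
        tour = [0, *perm, 0]
        best = min(best, sum(math.dist(points[a], points[b])
                             for a, b in zip(tour, tour[1:])))
    return best

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(7)]
assert mst_cost(pts) <= closed_tsp_cost(pts)
```

Since MinMax(MST) is exactly twice the MST cost, the assertion above is the numeric counterpart of MinMax(MST) ≤ 2 MinMax(CTSP).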

6.5. Simulation results

The simulation map is a 135 m × 86 m realistic scenario (Fig. 6.2), modeled after a shopping mall close to our lab, where the obstacles are the product stands. A collaboration agreement has been reached with the owners of the shopping mall to conduct real experimentation on-site in the future. This scenario is interesting not only because of the amount, size and density of obstacles, but also because of the large central obstacle: its particular configuration and its relative position to the deployment point would trap robots in deadlock situations (which cannot happen in our approach) when using methods based on local utility functions. Visiting randomly distributed targets in such a scenario may represent tasks of object inspection or delivery, service requests, or sample collection missions, for example.

The simulation environment was the Player/Stage [Gerkey 03b] robot simulator, which offers realistic mobile robot kinematics. The robot team consisted of nine simulated Pioneer 3-AT robots, one of which served as the communication base and stayed at the initial point of deployment. High-level path planning was achieved by applying standard A∗ search to an 8-connected grid of fixed size over the mission area. Paths obtained this way were used for constructing trees over the grid.

There were n = 8 mobile robots and m = 50 targets in the mission. The targets were uniformly randomly distributed over the free cells of the map. The communication range was modeled as a circular area of fixed radius around the robot. The value of this radius was not known to the robots at execution time. Instead, they


Figure 6.2: Simulation scenario with overlaid path trees for (a) STAR, (b) TSPT, (c) MST and (d) DLST. Twenty targets are shown for clarity, although fifty were used in simulations. Link 6.1: http://robots.unizar.es/data/videos/urus/ctree-seq-x8.avi



queried a function that returned a simulated measurement of link quality (ωi,j in Alg. 2) derived from this radius. More specifically, the radius L was set to 25 m and 50 m, values typical of Wi-Fi transmitters. The shortest range is barely enough to reach the farthest places in the map, thus making the abandonment of targets a significant concern, and was chosen to gain information on performance under extreme conditions. The longest range, by contrast, provided insight into the improvements that can be gained when we are not too tight on resources. For the DLST variant, we set D = nL. Finally, the CTSP cost of the ideal closed TSP tour was measured to serve as a comparison basis with respect to the theoretical bounds; CTSP was obtained using a single robot without any network restrictions.

Ten runs were performed for each algorithm and each value of L, using a different seed for each run; nevertheless, the same set of seeds was used across the different algorithms, so that they are compared on the same basis.
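Path costs in this setup come from A∗ search on an 8-connected grid. The following is a minimal illustrative version (our own sketch, with diagonal moves costing √2 and an octile-distance heuristic):

```python
import heapq
import math

def astar_8(grid, start, goal):
    """A* over an 8-connected occupancy grid (True = free cell).
    Returns the path cost from start to goal, or None if unreachable."""
    def h(a, b):
        # Octile distance: admissible for 8-connected moves.
        dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
        return (dx + dy) + (math.sqrt(2) - 2) * min(dx, dy)

    rows, cols = len(grid), len(grid[0])
    open_set = [(h(start, goal), 0.0, start)]
    g = {start: 0.0}
    while open_set:
        _, gc, cell = heapq.heappop(open_set)
        if cell == goal:
            return gc
        if gc > g.get(cell, math.inf):
            continue                 # stale queue entry, skip it
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx == dy == 0:
                    continue
                nx, ny = cell[0] + dx, cell[1] + dy
                if 0 <= nx < rows and 0 <= ny < cols and grid[nx][ny]:
                    ng = gc + (math.sqrt(2) if dx and dy else 1.0)
                    if ng < g.get((nx, ny), math.inf):
                        g[(nx, ny)] = ng
                        f = ng + h((nx, ny), goal)
                        heapq.heappush(open_set, (f, ng, (nx, ny)))
    return None
```

Costs obtained this way, rather than Euclidean distances, are what the tree constructions operate on in the simulated scenario.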

6.5.1. Results

Fig. 6.2 shows the initial trees for 20 targets only (for clarity). It can be easily seen that the STAR tree is more balanced than the MST tree; the MST tree might eventually require a replanning to complete the mission. Figs. 6.3 to 6.5 show mean values and the 95% confidence intervals for each of the three performance metrics in all cases. We observed that DLST with L = 50 generated the same solutions as MST. This means that the branch depth limit was not hit, and thus both are conflated in the figures. We highlight the following observations:

• Traversal of the STAR tree requires a noticeable additional amount of backtracking, which results in the worst performance in all cases.

• MST is the best performer for MinMax and MinAve. Of special interest is the fact that it outperforms DLST for L = 25, despite DLST making use of the D limit. This is a consequence of the pessimistic prediction of DLST: paths along the branches of the tree are usually longer than the Euclidean distance between stopped relay robots (the signal propagates in straight lines). Consequently, DLST discarded branches during tree construction that were in fact valid.

• In the MinSum case, we observed that TSPT and MST performed comparably well. Together with the previous point, this makes MST the best choice among the tested strategies.

• Comparing the performance of the algorithms to the ideal C_TSP baseline (shown only in Fig. 6.4), we see that all instances with known bounds stay well below those bounds. Thus, the typical performance is noticeably better; the worst case corresponds to rather pathological instances.

• Note that the MinSum performance is especially affected in the ConnectTree strategy; we could say that it is its “Achilles’ heel” and the price it pays for network integrity. Since all robots not acting as static relays are moving together all the time, the MinSum cost grows by a factor of n in the worst case.

• To allow for a fair comparison, we have plotted the actual MinAve cost of C_TSP in Fig. 6.5 instead of the baseline cost. Similarly, in Fig. 6.3, the baseline C_TSP cost would correspond to a single robot performing the closed route. However, to make the comparison more meaningful, in this figure the C_TSP cost is n times that of the single robot.

• We did not run the algorithms with L = ∞ because no replannings were necessary for L = 50. Consequently, we would have observed the same MinMax and MinAve costs, and an even higher MinSum cost caused by larger numbers of moving robots.

Figure 6.3: MinSum evaluation (bars for L = 25 m, L = 50 m and L = ∞; the C_TSP bar is ×8 that of the single robot).
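For reference, the three team objectives compared in Figs. 6.3 to 6.5 can be computed from per-robot path lengths and per-task completion times roughly as follows (a simplified sketch; the precise definitions are those given in the thesis):

```python
def min_sum(robot_path_lengths):
    """MinSum: total distance traveled by the whole team."""
    return sum(robot_path_lengths)

def min_max(task_completion_times):
    """MinMax: mission span, i.e. when the last task is completed."""
    return max(task_completion_times)

def min_ave(task_completion_times):
    """MinAve: average task completion time (service latency)."""
    return sum(task_completion_times) / len(task_completion_times)
```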

Figure 6.4: MinMax evaluation.

Tree  AvgRep  NumRep  MaxRep
TSPT  1.5     10      2
MST   0.6     3       3

Table 6.1: Replanning occurrences for L = 25.

6.5.1.1. Replanning occurrences

TSPT and MST might need replannings if some targets could not be reached. Table 6.1 shows results in the L = 25 case, which is the only case that triggered replannings. Data shown is the average number of replannings per run (AvgRep), the number of runs (out of 10) that required a replan (NumRep), and the maximum number of replannings in any single run (MaxRep). As expected, TSPT suffered more from this problem, since it generates a single long branch. Note also that the maximum number of replannings in any given run is well below the pessimistic bound (m = 50) used in the theoretical analysis.

6.6. Implementation considerations

In order to implement the ConnectTree strategy in realistic simulations and for use with our robotic team, certain practical issues had to be accommodated. Some of them arise from the fact that, while the original algorithm considers the pack of moving robots as a single entity, in reality these robots need their own space to move freely without conflicts. Early experiments showed that navigation in confined spaces, like corridors and doorways, required explicit management of the relative locations between robots. Otherwise, obstacle avoidance maneuvers caused by the robots themselves quickly led to conflicts, collisions and general confusion.
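A minimal sketch of the kind of spacing rule used to manage these relative locations follows; the 1-D curvilinear-position abstraction, the function name and the default thresholds are illustrative assumptions, not the actual controller:

```python
def convoy_speed(own_s, pred_s, succ_s, cruise,
                 d_min=1.0, d_max=4.0):
    """Speed command for one robot in a convoy, using positions (in
    meters) along the shared path.

    The robot halts if it gets within d_min of its predecessor, or if
    its successor lags more than d_max behind; otherwise it cruises.
    pred_s/succ_s are None for the head/tail of the convoy.
    """
    if pred_s is not None and pred_s - own_s < d_min:
        return 0.0  # too close to the robot ahead: wait
    if succ_s is not None and own_s - succ_s > d_max:
        return 0.0  # successor falling behind: wait for it
    return cruise
```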

Figure 6.5: MinAve evaluation.

We tackled this issue by keeping the robots in a convoy, in which a robot never approaches its predecessor or successor closer than a certain minimum distance threshold. Similarly, and in order to maintain the abstract concept of a single pack of robots from the algorithm, two consecutive robots not acting as relays can never separate by more than a given maximum distance. (This is visible in the simulation linked from Fig. 6.2.) This kind of controller works very well in practice, as we have seen in our trials with real robots. Typical limits that we have used are in the range of 1–4 meters.

Another practical aspect is that this strategy is best suited to holonomic robots or, at least, to differential-drive ones (such as the ones we use). Since, after each visited target, robots have to retreat along the same path, the ability to turn in place is a must. Because of this modus operandi, a large number of robots are able to operate in confined spaces, where general multi-robot approaches would inevitably require space negotiation algorithms.

Localization was accomplished in our case either with the stock Player grid-based particle filter (in the indoor experiments presented later in §6.11.1) or with EKF localization [Castellanos 99] over a segment map (in the tunnel experiment described in §6.11.3). The second was necessary because a grid map of the necessary precision in such a featureless space as a tunnel was prohibitively memory-hungry. The initial fixed-size grid used for path planning in the simulations of §6.5 was later replaced, when moving to real robots, by an adaptive quadtree-based grid. While either is valid for the purposes of tree building in ConnectTree, the latter provided the same or better precision than the fixed grid with much lower memory requirements. Memory is scarce in the embedded computers of our Pioneer robots, which “only” carry 512MB of RAM.

Another key requirement of the ConnectTree approach is the link quality measurement.
While the RT-WMP protocol (see Appendix B.2) offers all the features we needed, it is perhaps overkill, since hard real-time capabilities are not really necessary for the ConnectTree strategy (although applications running on top of it could very well benefit from them, e.g. remote manipulation). However, it is noticeably difficult to find protocols readily usable with commodity Wi-Fi that offer individual peer-to-peer link qualities, instead of a general quality value per node. One such protocol is the OLSR [Force 03] implementation at http://www.olsr.org, which can be easily set up for use. However, we found that the link information was updated quite slowly, in the order of many seconds, and certainly too slowly for ConnectTree operation. Furthermore, it is derived from lost packet detection, which is in itself something we do not want to happen. On the other hand, RT-WMP offers sub-second updates based on hardware RSSI measurements, but is limited to small numbers of nodes (around a score). The bottom line is that ConnectTree is not tied to a particular network protocol, but in practice there is limited choice at the time of this writing.


6.7. Concurrent ConnectTree strategy

From this point onwards, we enhance the basic ConnectTree strategy by analyzing the implications of concurrent tree traversal, while preserving the advantageous features of the basic sequential algorithm, which only attempted to visit one target at a time. To do so, §6.8 studies the properties of the new problem, §6.9 describes algorithms to approximate this complex optimization challenge, and statistical results are shown and discussed in §6.10. Finally, we discuss future work and conclude in §6.12. In the sequential model, only the head robot is used to visit targets. We now want to visit several targets simultaneously. To do so, we allow the mobile pack of robots to split at any branching point of the tree. The pack thus divides into two (or more) packs of some number of robots, each of these packs now monitoring its link to the previous common relay (or the base). Each pack then proceeds down the tree towards a different target, and robots in the pack may stop to act as relays as needed, as in the sequential case, exclusively for the sake of the connectedness of its own branch. This split operation may happen recursively as long as there are robots available. When one pack retreats after visiting its target, it must stop and wait at the common relay for the other pack(s) to come back. When all packs sharing a relay are reunited at the relay point, a conceptual merging occurs, and the original pack is reformed. This pack can now resume its way back to the upper areas of the tree. See Fig. 6.6 for an example. The need for all packs to wait at the deepest common relay for a merge is due to the same considerations detailed in the sequential case, which allow ConnectTree to provide cost bounds and mission completeness. In consequence we can see that, from the point of view of robots going down a

Figure 6.6: An example with four robots. In a), one robot has already stopped to act as a relay (square). At the branching point, it has been decided that two robots will continue along the upper branch and one along the lower branch (triangles). Their links to the previous relay are represented by thin lines. In b), the robot below has reached its target (circle) and is starting its way back; meanwhile, above, another relay has been stopped. In c), the small pack of one robot is waiting for the other pack of two to come back, in order to merge and continue up the tree towards the base as a four-pack.

branch, nothing has changed with regard to the sequential case: they move within the tree, each monitoring its link to the relay immediately above. Also, as in the sequential case, the retreat phase requires all robots below a relay to be reunited before that relay can return to its mobile status in order to move up, even if robots now come back in different packs. With the introduction of the split action, we have several decisions to make at each branching point: Should the pack split here? If so, how many robots should go down each branch? We study these questions in §6.9.
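The split and merge bookkeeping just described could be sketched as follows (an illustrative sketch, not the thesis implementation):

```python
class Pack:
    """Bookkeeping for pack splits and merges in the concurrent
    ConnectTree traversal."""

    def __init__(self, robots, relay):
        self.robots = list(robots)  # mobile robots currently in the pack
        self.relay = relay          # relay (or base) whose link is monitored

    def split(self, sizes, branching_relay):
        """Divide the pack at a branching point into sub-packs of the
        given sizes, each monitoring the common relay left there."""
        assert sum(sizes) == len(self.robots), "every robot goes somewhere"
        packs, i = [], 0
        for s in sizes:
            packs.append(Pack(self.robots[i:i + s], branching_relay))
            i += s
        return packs

    @staticmethod
    def merge(packs, relay):
        """Reform the original pack once all sub-packs are reunited at
        their common relay."""
        return Pack([r for p in packs for r in p.robots], relay)
```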

6.8. Problem characterization

In this section we address the algorithmic complexity of obtaining optimal solutions for each of the team objectives. In order to analytically evaluate the cost of the obtained solutions, we assume a known and fixed value for the maximum length L of the link between two robots. This is the simplest and most used model for wireless links, and the most tractable for a first approach. Promising results would warrant a more sophisticated stochastic model with detailed simulation of wireless propagation as, for example, in [Sridhara 07]. Our earlier problem was the building of the trees themselves, which is an instance of the (also NP-hard) Steiner problem [Hwang 92]. Thus, in this part we start from an already constructed, fully traversable tree and try to minimize the objectives for that tree. Concurrency cannot in any case improve the MinSum cost, since the total weight of a tree is fixed and the optimal solution (polynomial in the size of the tree) is any sequential depth-first traversal. Therefore, for the remainder of this study, we focus exclusively on the other two objectives. The counterexamples in Fig. 6.7 show that depth-first traversals are no longer always optimal. This particular problem of concurrent traversal of a tree has ties with optimal covers (for the selection of compatible branches that can be visited concurrently) and with scheduling (because of the timing aspects of concurrent execution). We next show, using a reduction, that the problem is at least NP-hard for the MinMax objective. We start from the NP-hard problem of multiprocessor scheduling (SS8) in [Garey 79]. Its inputs are a set T of tasks, for each task t in T its execution time C_t, and a global deadline D. The question asked is: is there a 2-processor non-preemptive schedule for the tasks that completes within deadline D? That is, is there a way of mapping tasks to start times such that no more than two tasks overlap and all of them finish before D?
Figure 6.7: Let the circles be the places where relays stop and, for evaluation purposes, let L = 1. The root is at the top and targets are located at the labeled leaves. White circles represent nodes that violate the depth-first property in the optimal solution. In i), with n = 5, the optimal solution is to visit a and b concurrently, merge at the root, and visit c before returning to base, for a MinMax cost of 16. The best depth-first solution requires a cost of 18, since b cannot be visited concurrently with any other target. A similar reasoning gives optimal costs of 18 for ii) with n = 7, and 30 for iii) with n = 6, respectively.

We can transform this problem into an instance of our concurrent tree traversal as follows: we set n = 2 (two robots that take the role of processors), L = ∞ (no practical network restriction), and construct a tree shaped as a star (every leaf connected to the root) where each leaf contains a target t from T, with edge length set to C_t/2 (so visiting each leaf and coming back takes exactly C_t). By solving this instance, we could give a solution to the original problem, since each departure of a robot from the root is equivalent to the scheduling time of a task. The transformation from optimization to decision problem is trivial. For the second part of the proof, given a solution to our problem, we can verify in polynomial time (by simply simulating it) that the constraints and deadline D are honored, thus completing the proof of NP-hardness. □

A conservative upper bound for the brute-force algorithm would be: at each branching point, we can divide our robots into p(n) integer partitions² [Hardy 18]; the tree contains at worst m − 1 branching points. Each one may split into at most m branches leading to a task. These branches may be visited in m! different orders. At most, if one task is accomplished per visit, m complete retries have to be performed at each branching point. This gives a pessimistic upper bound of O(p(n)(m − 1)m!m) or, simplifying, O(p(n)m!). Indeed, determining a tighter bound for the brute-force case is not trivial and has recently been studied for a related problem in [Baker 07].
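The construction used in the reduction can be sketched as follows (function and variable names are illustrative):

```python
def scheduling_to_traversal(task_times):
    """Build the concurrent-traversal instance used in the reduction:
    two robots (the processors), unlimited range, and a star tree whose
    leaf edge for task t has length C_t / 2, so a round trip to that
    leaf takes exactly C_t."""
    n = 2                 # robots playing the role of processors
    L = float('inf')      # no practical network restriction
    star_edges = {f't{i}': c / 2.0 for i, c in enumerate(task_times)}
    return n, L, star_edges
```

A 2-processor schedule meeting deadline D then exists exactly when the MinMax cost of this traversal instance is at most D.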

6.9. Algorithms

We call a target internal when it is not located at a leaf, by analogy with internal tree nodes. These targets are relevant because, while they are as important as the rest for the MinAve objective, they can be ignored for the MinMax objective. In the latter case, it can be intuitively seen that visiting an internal target only to immediately retreat, without visiting any leaf targets below it, is suboptimal: to reach the leaf targets below at a later time, we need to pass again over the node of such a target, and thus the first visit is wasted. We will take advantage of this fact in the design of our heuristics, presented in §6.9.2.

² p(n) ∼ e^{π√(2n/3)} / (4n√3) as n → ∞ [Hardy 18].
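To give a feel for how quickly p(n) grows, the following sketch computes p(n) by a standard recursion and compares it against the asymptotic estimate from the footnote (illustrative code, not part of the thesis):

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def partitions(n, largest=None):
    """p(n): number of integer partitions of n (parts at most `largest`)."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    return sum(partitions(n - k, k) for k in range(1, min(n, largest) + 1))

def hardy_ramanujan(n):
    """Asymptotic estimate p(n) ~ exp(pi*sqrt(2n/3)) / (4*n*sqrt(3))."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * n * math.sqrt(3))
```

For n = 100 robots the estimate is already within a few percent of the exact count.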

6.9.1. Tree terminology

In order to better describe our heuristics, we define here some terminology about trees. A tree is a well-known class of graph G = {V, E}, defined by its vertex set V (the nodes) and its edge set E, and characterized by the absence of cycles. For our purposes, we are mostly interested in the former set V. Let C(v) be the set of direct children of a node v. Let B(v) be the set of descendant nodes below v, at any depth, that are also branching nodes. Let T(v) be the set of leaf target nodes below v. Let v̄ be the closest branching node above v (that is, the first ancestor of v with more than one child), or the root node as fallback. Let v̂ be the farthest (in branch length) target below v, or v itself if v is a leaf. For simplicity, we also define the following functions that provide values of direct interest for the ConnectTree strategy. Let l(v) be the length of the branch from the root to v. Let then d(v) = ⌈l(v)/L⌉ be the number of robots needed to reach v from the root, comprising one mobile robot and d(v) − 1 robots that at some point will become relays.
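The functions l(v) and d(v) can be sketched on a toy tree as follows (the dictionary representation and names are illustrative assumptions):

```python
import math

# Toy tree: node -> (parent, length of the edge up to the parent).
TREE = {'root': (None, 0.0),
        'a':    ('root', 10.0),   # branching node
        'b':    ('a', 12.0),      # leaf target
        'c':    ('a', 30.0)}      # leaf target

def l(v):
    """Branch length from the root down to v."""
    length = 0.0
    while TREE[v][0] is not None:
        length += TREE[v][1]
        v = TREE[v][0]
    return length

def d(v, L):
    """Robots needed to reach v from the root with link range L:
    one mobile robot plus d(v) - 1 eventual relays."""
    return math.ceil(l(v) / L)
```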

6.9.2. Deterministic heuristics

Given the previous complexity result, it is reasonable to look for ideas that could provide good heuristics. The most obvious question, perhaps, is when to split a robot pack. Two possible options are:

a) Early split: split as soon as we know that we can reach the farthest targets in all descendant branches concurrently, even if this implies visiting the targets within each of these branches sequentially. More formally, we split the team if, when k mobile robots arrive at branching node v,

∑_{c ∈ C(v)} max(1, d(ĉ) − d(v)) ≤ k

The minimum of one robot per branch is necessary to model that at least one robot has to go towards each child c. From the splitting condition we can also see that we need to send max(1, d(ĉ) − d(v)) robots towards each child branch c.

b) Late split: keep together until the robots can visit all remaining targets below the current split point concurrently. That is, we split the team when, for a branching node v,

∑_{w ∈ B(v) ∪ T(v)} max(1, d(w) − d(w̄)) ≤ k

Again, we have the 1 term for the minimum of one robot per branch, while the rest of the expression represents any additional relays needed due to connectivity constraints.


Thus, even for a fully balanced tree, there are enough robots to supply all the needed relays and visit all the targets at the same time. It is necessary to send ∑_{w ∈ {c} ∪ B(c) ∪ T(c)} max(1, d(w) − d(w̄)) robots towards each child c ∈ C(v). There is no clear winner between these strategies; see the optimal solutions for each case in Fig. 6.8. Another idea concerns which branch to visit next when there is no split. Inspired by TSP heuristics [Reinelt 94], we may choose the farthest target next or the closest target next. The combination of these two ideas gives the following four variants: farthest-first with early split (FarEarly in the figures), closest-first with early split (NearEarly), farthest-first with late split (FarLate), and closest-first with late split (NearLate). One particularly interesting aspect of the sequential ConnectTree strategy is that it offers cost bounds with respect to the optimal unconstrained TSP solution. We can retain these bounds in our heuristics if we impose some restrictions on the solutions attempted. As we have seen in previous examples (Figs. 6.7 and 6.8), cost gains are obtained when two (or more) branches are executed concurrently. If we thus allow only depth-first traversals, while opportunistically looking for other branches that can be concurrently executed in less time than the current depth-first branch, it follows that we cannot do worse than the original sequential traversal; hence, any bounds that it had are retained. This group of heuristics implements this property. Also, they run in O(m) time, with no perceptible delay for the amounts of tasks tested. For comparison, the basic sequential depth-first closest-first traversal (SeqDF) is used as the baseline for measuring improvements in the evaluation runs.
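As an illustration, the early-split condition of a) can be evaluated on a toy tree; the tree representation, names, and the use of the farthest target per branch are assumptions of this sketch, not the actual implementation:

```python
import math

# Toy tree rooted at 'v': node -> (parent, edge length). L is the link range.
TREE = {'v': (None, 0.0),
        'c1': ('v', 10.0), 't1': ('c1', 15.0),
        'c2': ('v', 5.0),  't2': ('c2', 10.0)}
L = 10.0

def l(v):
    total = 0.0
    while TREE[v][0] is not None:
        total += TREE[v][1]
        v = TREE[v][0]
    return total

def d(v):
    return math.ceil(l(v) / L)  # robots needed from the root to v

def children(v):
    return [w for w, (p, _) in TREE.items() if p == v]

def farthest_target(v):
    """The farthest target below v (v itself for a leaf)."""
    kids = children(v)
    return v if not kids else max(map(farthest_target, kids), key=l)

def early_split_ok(v, k):
    """True when k mobile robots at branching node v suffice to reach
    the farthest target of every descendant branch concurrently."""
    return sum(max(1, d(farthest_target(c)) - d(v))
               for c in children(v)) <= k
```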

6.9.3. Randomized traversal with timeout

Figure 6.8: In i), with n = 6, the optimal solution involves sending all robots down the leftmost branch, in order to visit e.g. a and b concurrently (late split), then c, and finally d and e concurrently, for a MinMax cost of 14. If we attempted an early split at the root, the four robots remaining for the leftmost branch would require 16 time units to return. Conversely, in ii) with n = 5, an early split gives the optimal cost of 8 against the 10 units of a late split in the leftmost branch to visit a and b concurrently.

Figure 6.9: Mean and 95% confidence interval of cost reduction for the MinMax objective over the sequential depth-first closest-first traversal.

Random10/Random60: This heuristic has the potential of finding the optimal solution, at the price of sacrificing any cost guarantees. At each split point, we randomly choose the next branch and how many robots to send, between the minimum needed for the closest leaf target (an early split) and the maximum number of robots available (a late split). This process is repeated iteratively with any remaining robots at the same split point, unless a coin toss dictates to leave them unused. In other words, at each juncture we randomly choose one of many possible actions; thus, with sufficient running time, this algorithm could theoretically explore a large portion of the solution space. The best known cost serves to discard the complete exploration of solutions that are definitely known to be suboptimal. The number in the name is the timeout in seconds for the algorithm; the best solution found on timeout is reported.
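The randomized choice and the anytime outer loop can be sketched as follows (names and the exact sampling scheme are illustrative assumptions):

```python
import random
import time

def random_allocation(k, min_robots, rng):
    """One random split decision: branch sizes between the minimum
    needed for the closest leaf target and all remaining robots,
    stopping early on a coin toss."""
    remaining, sizes = k, []
    while remaining >= min_robots:
        sizes.append(rng.randint(min_robots, remaining))
        remaining -= sizes[-1]
        if rng.random() < 0.5:   # coin toss: leave the rest unused
            break
    return sizes

def anytime_best(rollout, timeout_s):
    """Keep sampling random traversals until the timeout expires and
    report the best cost seen (the Random10/Random60 outer loop)."""
    best = float('inf')
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        best = min(best, rollout())
    return best
```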

6.9.4. Optimal solution by brute force

Optimal: In our effort to characterize the results obtained, we implemented a brute-force algorithm in order to obtain some optimal solutions, even if only for small problem instances. This algorithm explores every possible choice, given enough time and memory. A best-first search is performed in which every known state is ordered by either its current MinMax or MinAve cost, depending on the objective. When the earliest state is final, it corresponds to the optimal solution. In practice, we were able to solve problems ranging from 12 to 17 targets, depending on the complexity of the solution path tree.
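The core of such a solver is a cost-ordered search, sketched here generically (illustrative; the real solver orders ConnectTree traversal states rather than graph nodes):

```python
import heapq
import itertools

def best_first(start, expand, is_final):
    """Explore states in order of current cost; with non-decreasing
    costs along any branch, the first final state popped is optimal."""
    tie = itertools.count()               # breaks ties between equal costs
    frontier = [(0.0, next(tie), start)]
    while frontier:
        cost, _, state = heapq.heappop(frontier)
        if is_final(state):
            return cost, state
        for step_cost, child in expand(state):
            heapq.heappush(frontier, (cost + step_cost, next(tie), child))
    return None
```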


Figure 6.10: Mean and 95% confidence interval of cost reduction for the MinAve objective over the sequential depth-first closest-first traversal.

6.10. Evaluation

6.10.1. Analytical setup

In order to implement and evaluate these heuristics, we constructed an exact grid simulator. The parameters of the test missions, unless explicitly stated otherwise, were as follows. The simulation map is the same 135 m × 86 m realistic scenario from the sequential tests (Fig. 6.2), modeled after a shopping mall close to our lab, where the obstacles are the product stands. Visiting randomly distributed targets in such a scenario may represent tasks of object inspection or delivery, service requests, or sample collection missions, for example. There were n = 8 mobile robots and m = 50 targets in the mission. The targets were uniformly randomly distributed over the free cells of the map. The communication range was modeled as the branch length between robots, with a default of L = 35 m, which is not far from the minimum needed to reach the farthest points in the map using all robots. Ten runs were performed for each algorithm, using a different seed for each run; nevertheless, the same set of seeds was used across the different algorithms, so that they are compared on the same basis. With these starting points, in each test we modified one variable of interest, as detailed in the following subsections.


Figure 6.11: Mean and 95% confidence interval of cost reduction for the MinAve objective over the sequential depth-first closest-first traversal, with both non-clustered and clustered random targets.

6.10.2. Network range influence

This test was set up to explore the influence of the network range on mission performance. It is to be expected that requiring more robots for the farthest targets reduces the opportunities for concurrency. Hence, the link length L was set to the values of L = 25 m and L = 50 m (typical of Wi-Fi transmitters). The shortest range is barely enough to reach the farthest places in the simulation map. The longest range, on the contrary, is quite generous, thus allowing us to find which improvements can be expected in such favorable conditions. This test was performed for both the MinMax and MinAve objectives. The results are, respectively, in Figs. 6.9 and 6.10. In these and similar figures, the Combined values were obtained by taking the best result in each run obtained by any heuristic. Therefore, they represent our best realistic expectations of improvement over the sequential strategy. Fig. 6.9 already shows some interesting results: even in such tight conditions, some modest improvements are possible. However, it is with L = 50 that we see to which extent we can expect improvements, with instances running close to 20%. Furthermore, it became patently clear that the late split strategy is better than the early split one. This is also interesting because in future real experiments, with the possibility of requiring replannings, late splits will minimize the penalty for a replanning. Turning to Fig. 6.10 for the task averages, we see less optimistic results. This is not entirely surprising, since we are not, in truth, attempting any MinAve-specific


optimizations, besides greedily going for the nearest targets. This is patent in the L = 50 portion of the figure, where both near-first heuristics fared better. Another point of interest is that the random heuristic seemed lost when given a larger solution space to choose from, and in fact could not perform better than the sequential approach.

Figure 6.12: Mean and 95% confidence interval of cost overhead over the optimal solution found by brute force, for both cost objectives.

6.10.3. Clustering influence

Clustering of targets has a definite importance for MinAve costs. In effect, algorithms that fail to detect target clusters (e.g. greedy strategies) can be dramatically outperformed by more potent optimizers. In this test we tried to ascertain whether the random strategy, in particular, could take advantage of the wider solution space it explores. Tasks were either uniformly distributed or followed a Gaussian random distribution around a random but small (no fewer than one, no more than four) number of clusters. Looking at Fig. 6.11 we can see that, again, the short time allotted to the random heuristics was not enough to observe any particular advantage for them. However, the existence of clusters did increase the chances of improving on the sequential results, because late splits resulted in large numbers of targets being visited in short periods of time.
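The two target distributions can be generated, for example, as follows (parameter names, the standard deviation, and the clamping to the map bounds are illustrative assumptions; the free-cell check of the real scenario is omitted):

```python
import random

def generate_targets(m, width, height, rng, clusters=0, sigma=8.0):
    """m targets, either uniform over the area (clusters == 0) or
    Gaussian around `clusters` random centers, clamped to the map."""
    if clusters == 0:
        return [(rng.uniform(0, width), rng.uniform(0, height))
                for _ in range(m)]
    centers = [(rng.uniform(0, width), rng.uniform(0, height))
               for _ in range(clusters)]
    pts = []
    for _ in range(m):
        cx, cy = rng.choice(centers)
        pts.append((min(max(rng.gauss(cx, sigma), 0.0), width),
                    min(max(rng.gauss(cy, sigma), 0.0), height)))
    return pts
```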

6.10.4. A peek into optimality

For the sake of our brute-force solver, we set m = 12 in this trial, allowing us to obtain truly optimal mission costs for both objectives.

Figure 6.13: Randomly generated scenario.

Albeit with a reduced solution space, the results are quite revealing (Fig. 6.12). We confirmed the tendencies seen in §6.10.2 that favor the late split for the MinMax case. Also, we see that the small problem size allows the randomized solver, just like in the L = 25 case, to come out ahead of the other heuristics. Results are worse in the MinAve case but, still, by using the combined results of all heuristics, we can get acceptable results given the irregular performance of the individual heuristics. These results indicate that further work on MinAve-specific algorithms is warranted, since average task latency is arguably as potentially useful for service robotics as

minimum mission time, and there is a larger gap to close.

Figure 6.14: Mean and 95% confidence interval of cost reduction for the MinMax objective over the sequential depth-first closest-first traversal in both scenarios.

Figure 6.15: Single-robot test with internal antenna and high threshold (link quality (%), X position (m), threshold, and cumulative packets lost versus time (s)).

6.10.5. Random scenarios

In this final test, we tried to rule out any evident artifacts in the results due to the particular simulation scenario used. To this end, we constructed random scenarios such as the one shown in Fig. 6.13, by randomly placing crosses into an empty area of the same size as the mall until it was saturated. The results, shown in Fig. 6.14, albeit less generous, illustrate that the same trends can be observed in both kinds of scenario.

6.10.6. Final remarks

We can extract several concluding remarks from these batteries of tests. Firstly, we see that by using the combined results of all algorithms we can expect noticeable improvements over the sequential strategy, which is an encouraging result. However, we also saw that there is room for improvement, especially with regard to the MinAve objective. We also observed that late splitting is, in general, the more successful strategy. This may be of use for the design of improved heuristics, and is definitely advantageous for the real case in which the network range is not known.

6.11. Experiments

We performed several experiments with our robotic team, composed of differential-drive Pioneer 3 robots (see §C.1), in order to experimentally validate the ConnectTree strategy. These experiments used the sequential variant that has


been presented first, while the concurrent improvements are scheduled for future work.

Figure 6.16: Single-robot test with internal antenna and low threshold.

The purpose of the tests performed was to verify that working with real signals is feasible using this deployment strategy, and to learn about the behavior of said signals in particular environments, starting with our laboratory as a first step towards a larger experiment located at the shopping mall used for the simulations. In general, and unless noted otherwise, these experiments consisted of sending the robots away towards a goal out of reach, in order to force the exhaustion of all the links between robots and a total standstill once all robots had become relays and no remaining robot could proceed forwards. At such a point, and after a ten-second delay, the team automatically detected that the target was unreachable and aborted the mission, beginning its retreat towards the base as if the goal had already been reached. The scenario of the experiments is an L-shaped corridor (Fig. 6.20), with a narrow long leg, at the end of which the robots are deployed, and a wider and shorter leg, which has the target at its farther end. This provides insights into out-of-sight link behaviors. This corridor is also used in the video linked from Fig. 6.26, where the author can be seen sitting at the intersection between the legs. The initial long leg is oriented along the X-axis, which is the reason why in some of the following figures only the X position of a robot is plotted.

6.11.1. Single-robot experiments

Our initial tests were done using a single mobile robot and the static base (a laptop) communicating in the 5.2GHz band by means of the RT-WMP protocol. Results from one of such tests are shown in Fig. 6.15. One particular circumstance of this test is that, by mistake, we used an internal antenna from a PCM121

Chapter 6. The C ONNECT T REE strategy for cluttered environments

Quality (%), X position (m)

100

80

60

40 Quality X Threshold Packets lost

20

0

0

50

100

150

Time (s)

Figure 6.17: Single-robot test with aerial antenna and high threshold.

PCMCIA card, instead of the aerial antenna that has proper clearance in all directions. Nonetheless, as sometimes happens, this provided some insights into the signal behavior. The most relevant issue, caused by the internal antenna, is that robot orientation had a large influence on signal quality. In practice, when the robot faced towards the base, a sharp drop was evident (around t=80s), presumably because the robot body interfered with the Wi-Fi signal. During the moments of lowest quality (around 20%), a few packets could not be delivered by the network layer. Note that the network protocol implements retransmission as long as the packets can satisfy their real-time requirements [Tardioli 07]. Hence, these failures could have implied missed real-time deadlines, which could in turn have implications for hard real-time (HRT) control applications. Nonetheless, the number of lost packets is quite small, considering that four messages per second were being sent. Packet loss is depicted cumulatively in order to improve plot readability; this implies that losses were detected at the vertical rises in the series. Also, for a loss to be detected, a subsequent packet carrying a sequence counter had to be received. In practice, this means that large jumps in packet loss are in reality a steady loss that was only detected later; in other words, the sharper the jump, the longer the silence period from the sender. Another noteworthy point is that signal quality actually improved when the robot was a few meters away from the base, as evidenced by the peaks at approximately t=25s and t=125s. We did not observe this effect later with the aerial antenna (Fig. 6.17). Finally, an engineering issue not previously mentioned in the algorithm description is highlighted by the robot stop and quality drop after t=50s. As described previously, a robot stops as soon as its quality drops below the threshold.
However, a robot does not become a relay unless it has been continuously under the threshold for five seconds. Otherwise, it resumes moving

6.11. Experiments

[Plot: quality (%), silence, cumulative packets lost and X/Y position (m) vs. time (s).]

Figure 6.18: Single-robot test with aerial antenna and no threshold using the 5.2GHz g band.

normally. This filters out small dips below the threshold due to the noisy nature of the signal. The threshold in this first test was quite conservative, set at 50%. Another test with the same setup but with a 30% threshold is shown in Fig. 6.16. The same effects observed previously were aggravated here by the lower threshold, which translated into a larger number of lost packets. Indeed, at the final stopping point of the robot, signal quality was too irregular to be considered satisfactory.

Once the issue with the antenna was identified, we repeated this experiment using the unobstructed aerial antenna. Fig. 6.17 shows results that are much more satisfactory: not a single packet was lost, and the signal fell controllably under the safety threshold. One particular observation we made in this test was that, even when the robot had started to retreat, the signal did not improve for some time (t=105s–125s). This cannot be attributed to the quality smoothing filters, which operate on much shorter time scales (around 1s), and once again illustrates that real signals exhibit complex behaviors that simplistic simulations fail to capture.

The next tests focused on determining good threshold values for the two main classes of frequencies used by the Wi-Fi standard, 802.11b/g. In order to do so, the threshold was set at 1%, causing the robot to move forward until total loss of signal. Figs. 6.18–6.20 show the results for these runs. In addition to packet loss, these graphs show the time between successful communications (silence in the plots). This value is important because, while the b band shows a higher packet loss count, it is a steady loss that impairs neither communications nor the distance the robot can travel away from the base. Several conclusions can be extracted from these results: the b band provides more range and a steadier signal degradation.
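The stop-and-convert hysteresis described above can be sketched as follows. This is an illustrative Python sketch, not SANCTA code; the five-second delay is from the text, while class and field names are invented.

```python
# A robot halts whenever link quality dips under the threshold, but only
# converts into a static relay after five continuous seconds below it;
# shorter dips merely pause the robot, filtering out signal noise.

RELAY_DELAY = 5.0  # seconds continuously under threshold before converting


class RelayFilter:
    def __init__(self, threshold):
        self.threshold = threshold
        self.below_since = None  # start of the current under-threshold period

    def step(self, quality, now):
        """Return 'move', 'stopped' or 'relay' for the current sample."""
        if quality >= self.threshold:
            self.below_since = None  # recovered: reset the conversion timer
            return "move"
        if self.below_since is None:
            self.below_since = now
        if now - self.below_since >= RELAY_DELAY:
            return "relay"
        return "stopped"
```

A dip shorter than five seconds yields "stopped" and then "move" again once the signal recovers, so only a sustained degradation converts the robot into a relay.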


[Plot: quality (%), silence, packets lost (÷10) and X/Y position (m) vs. time (s).]

Figure 6.19: Single-robot test with aerial antenna and no threshold using the 2.4GHz b band.

While both bands suffer from a noticeable drop in quality when the line-of-sight is broken (around the point at which the Y value starts to rise), the drop is much sharper in the g case, which would indicate that it is less suited to our strategy. The steady loss of packets in the b case could be caused by interference from other networks in our laboratory. A last experiment was performed in another environment: the old Canfranc train tunnel that crosses from Spain to France (Fig. 6.22). This is a challenging en-

[Plot: X–Y paths (m) for the 5.2GHz and 2.4GHz runs.]

Figure 6.20: Paths followed in the single-robot comparison of bands experiment.


[Plot: quality (%), packets lost and X position (m) vs. time (s).]

Figure 6.21: Single-robot test in Canfranc tunnel.

vironment because signals exhibit particular fading phenomena [Masson 09]. The setup was the same and we let the robot advance until signal exhaustion. Subsequent hardware problems prevented the successful return of the robot. Nonetheless, some significant observations can be made from the data depicted in Fig. 6.21. For starters, the tunnel favors communications at longer ranges than in the other experiments, as the X series shows, with more than 300 meters traversed until signal loss. A noisier signal and the fading spots are also visible as large quality oscillations. This single run is not sufficient to draw solid conclusions, beyond the fact that this particular environment merits more experimentation. It is possible that the fading oscillations could be taken advantage of in order to place relays in favorable spots.

Figure 6.22: Canfranc tunnel.


6.11.2. Two-robot experiment

An experiment using two robots was performed using the g band in order to minimize packet loss, as previously observed. The results are shown in Fig. 6.23. There are no plots for lost packets or silences in communications, since everything performed as expected with the chosen safety threshold of 40% signal quality. As the trajectory plot shows, the tail robot stopped just at the junction of the two perpendicular sections, probably due to the loss of line-of-sight with the base. The head robot then continued for a third of the length of the remaining section. Quality remained acceptable throughout the experiment. Note that the head robot's quality with respect to the tail robot remained reasonably stable during all the time that the tail robot was close, that is, as long as it did not need to stop to act as a relay. Another noticeable circumstance is that the Tail–Base quality started higher than the Head–Tail quality, although both were very close at the beginning of the experiment. We attribute this to differing performance of the radio hardware (since it was configured with the same parameters, particularly transmission power) or to the influence of the robot chassis (since one of the robots was an indoor two-wheeled 3-DX while the other was an outdoor 3-AT). It is remarkable how closely the two robots followed the same overlapping paths (Fig. 6.23). This is due to the underlying adaptive grid used for navigation (see §6.6): robots are localized with a particle filter with very good precision, since the corridor is narrow and has sufficient features in the form of inset doors and benches. The laser map is divided into cells using a quadtree partition, which is used for goal generation. Robots are given goals for the next cell they have to visit, which are taken from the path tree generated by the ConnectTree strategy. Consequently, all robots are given the same close goals (albeit not at the same

[Plots: Head and Tail X–Y paths (m), and Head/Tail link qualities (%) vs. time (s) with the threshold.]

Figure 6.23: Test using two robots. Qualities depicted are between each robot and its predecessor (Head–Tail, Tail–Base), as per the ConnectTree algorithm.


[Plot: Head–Middle, Middle–Tail and Tail–Base qualities (%), threshold, and robot positions (m) vs. time (s).]

time), which does not leave them any option but to follow the same path.

Figure 6.24: Signal quality in a ConnectTree experiment with three robots.

6.11.3. Three-robot experiments

The first experiment involving three robots was performed in the same indoor setup. Results are shown in Figs. 6.24 and 6.25. While the experiment was completed successfully, there were some puzzling results that we expect to understand with further experimentation. For example, the downward spikes in the tail robot's quality have no clear origin, since it never lost line-of-sight with the base, nor was it particularly close to losing it. Since each robot carries a different assortment of mounted hardware (cameras, GPS receivers, pan-tilt units), this is a possible source of differences and interference. Packet loss is not represented in the figures since it was very moderate, with 0, 22 and 35 packets lost for the head, middle and tail robots, respectively. In this particular run, the tail robot stopped before turning the corner, as evidenced by its X coordinate. The middle robot went around the corner but, after losing line-of-sight with the tail robot, its quality quickly fell under the threshold. The head robot was able to advance half-way to the end of the wider corridor leg. A final experiment was performed in the indoor scenario, this time in the context of the URUS project, modeling the possibility of working in areas both with network infrastructure and without coverage. In particular, the upper-left wider leg was


[Plot: Head, Middle and Tail X–Y paths (m).]

Figure 6.25: Map plot for a ConnectTree experiment with three robots.

considered an area with available infrastructure, in which random tasks could be triggered requiring a robot to visit their locations. The lower-left narrower leg was considered an area in which no network coverage was available. The mechanics of the experiment dictated that robots could execute random tasks (allocated by means of auctions) until a target in the uncovered area was requested. At that point, robots would finish any pending tasks in their plans for the infrastructure area, gather at the limit of the infrastructure area (the intersection of the legs) and begin a ConnectTree deployment to reach the out-of-infrastructure target. In practice, we manually set this latter target at the far end (bottom-right of Fig. 6.26) of the corridor. Fig. 6.26 shows the paths followed by the robots (with a slight Y offset in order to better visualize overlapping parts). It can be appreciated that in the covered area each robot naturally acquired nearby targets, so that three approximate regions were formed, each served by one robot. Naturally, further tasks appearing in this area could at any moment have intermixed the robot plans. In any case, after the task in the uncovered part was triggered, the robots gathered at the juncture and started the ConnectTree deployment. Again, no significant packet losses were recorded, so they are not shown in Fig. 6.27. It can be seen that only one relay was necessary this time, stopping slightly past the midpoint of the leg. The middle and head robots were able to reach the target at the end of the corridor. Since they never separated, the head robot's signal remained stable throughout. The middle robot, on the contrary, experienced a steady loss of quality once it started to separate from the


[Plot: Head, Middle and Tail X–Y paths (m), with the infrastructure and non-infrastructure areas marked.]

Figure 6.26: Map plot for a combined auctions+ConnectTree experiment. Link 6.2: http://robots.unizar.es/data/videos/auct+ctree.avi

stopped relay. However, since the end of the corridor was reached before quality fell under the safety threshold, this robot never became a relay. In conclusion, this experiment satisfactorily demonstrated operation in both covered and uncovered environments, with no communication losses and using commodity radio hardware.

[Plot: Head, Middle and Tail qualities (%), threshold, and X positions (m) vs. time (s).]

Figure 6.27: Signal qualities in a combined auctions+ConnectTree experiment. Only the ConnectTree part of the experiment is shown.


6.12. Discussion

This chapter addressed the problem of multi-robot routing under communication constraints, when network integrity is a primary concern and dense obstacles make accurate modeling of signal propagation too difficult. The proposed strategy relies on the careful construction of a tree of paths, rooted at a monitoring station, that reaches all targets. Robots traverse this tree in a network-safe manner which requires only local link information. Servicing a target may require any number of robots (even all of them), even along the optimal route; all such targets are guaranteed to be serviced. At present, these paths are built considering only geometrical information. We presume that the use of signal maps [Hsieh 08] might translate into better ranges, although the basic algorithm would remain the same. In any case, the strategy is sound from a network-safety point of view, whatever the paths used. Differences between any expected signal and reality are mitigated by the replanning feature, which enables the use of best-effort paths while minimizing backtracking distances. We focused on tree construction algorithms backed by theoretical bounds on their performance with respect to a baseline: the cost of an optimal closed TSP tour in an unrestricted networking environment. Four algorithms were studied: a simple star tree of least-cost paths from the base to each target, an optimal open TSP tour, a minimum spanning tree, and a depth-limited spanning tree. Simulations in a large realistic scenario empirically demonstrated improvement over the simple star solution, and identified the minimum spanning tree (MST) as the best choice for our network-safe approach. We also observed that the empirical performance ratios are well below the worst-case bounds. Also studied was the concurrent traversal of trees with the objective of increased performance.
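As an illustration of the tree construction step, the minimum spanning tree identified above as the best choice can be sketched with Prim's algorithm over pairwise path costs, rooted at the base. This is illustrative Python, not the thesis code; the function name and cost-matrix representation are assumptions.

```python
# Prim's algorithm: grow a tree of paths rooted at the base (node 0).
# cost[i][j] is the symmetric path cost between nodes i and j.

def mst_parents(cost):
    """Return parent[i] for every node, defining the tree of paths."""
    n = len(cost)
    in_tree = [False] * n
    in_tree[0] = True
    parent = [None] * n
    best = [cost[0][i] for i in range(n)]  # cheapest link into the tree
    for _ in range(n - 1):
        # Pick the out-of-tree node with the cheapest connection.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        # Its parent is the in-tree node providing that cheapest link.
        parent[u] = min((i for i in range(n) if in_tree[i] and i != u),
                        key=lambda i: cost[i][u])
        for v in range(n):
            if not in_tree[v] and cost[u][v] < best[v]:
                best[v] = cost[u][v]
    return parent
```

For targets laid out along a corridor, the MST naturally chains them one after another instead of connecting each directly to the base as the star tree would.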
While the initial design of this strategy only visited targets sequentially, we have laid the foundations to use concurrency whenever possible in order to reduce mission costs. We have done so in such a way that the cost bounds demonstrated for the sequential strategy can, in some cases, be retained for the concurrent strategy, as long as the rules described are adhered to. As part of our approach to concurrency, we studied the computational complexity of the problem at hand, showing that it is NP-complete and thus ruling out the immediate availability of an efficient optimal algorithm for the general problem. Instead, we described several heuristic approaches and studied the improvements that these heuristics can offer. We saw that noticeable improvements can be obtained for both the makespan and latency objectives and, by comparing against optimal solutions for small problem instances, we quantified to some extent the gap still to be covered. Preliminary experiments using the ConnectTree strategy were performed, showing that the strategy remains valid with the much noisier real signals in our testing area. For now we have implemented only the sequential strategy, the implementation of the concurrent deployment being the focus of future work. There are several other open fronts to pursue in conjunction with the Connect-


Tree strategy. Signal quality maps, as already mentioned, are one of them. We also plan to combine the spring-damper mechanism described in Chapter 5 with the fixed paths of ConnectTree. In this way, we hope to allow the application of this strategy in even noisier environments, where static relays would require high safety thresholds of reduced efficiency. Using mobile relays instead of fixed ones may enable the use of tighter thresholds, perhaps at the expense of reduced data rates for brief periods of time while adjustment takes place. Finally, theoretical research into the latency optimization objective is needed in light of the results presented herein.



Chapter 7

Reusable components for multi-robot research and deployment

This last chapter is devoted to the implementation backbone of the simulations and experiments in this thesis. Due to the many complexities of large robotic systems, the robotics community is increasingly interested in the publication of source code as a means to enable code verification, algorithm validation and full reproducibility of simulations and experiments. In particular, multi-robot systems impose a wide range of demands on software developers: embedded code, remote GUIs, message passing across robots, computing that is often distributed or decentralized, etc. SANCTA is a flexible architecture for multi-robot teams that follows a component-based approach, which makes it easily customizable and expandable. Published under the GPL open source license, it allows other researchers the close inspection of the algorithms described in this thesis, and provides a big jump-start to anyone interested in reproducing or furthering its results. In-house experiments and a realistic deployment within the context of the URUS project further illustrate the applicability of this library.


Chapter 7. Reusable components for multi-robot research

7.1. Introduction

Multi-robot teams present new challenges: coordination needs among robots, long-term mission planning, communication with human operators in manageable ways that are neither overwhelming in feedback nor too demanding on inputs, etc. Such broad necessities, ranging from embedded software to distributed computation and interaction, require a disciplined approach to software development. Researchers, albeit principally interested in the validation of research concepts, can also benefit from rigorous methodologies and processes. While reusability and flexibility are always beneficial, matters like early error detection and bug minimization are of capital importance in order not to compromise research results. Hard real-time software, found in most control applications; embedded code in robots, needing special debugging tools; distributed algorithms with communication and synchronization needs: these are no trifling topics, and they can hinder proper experimental validation. In multi-robot teams, these issues multiply, given the possibly heterogeneous nature of the robots and the larger number of elements involved. Interactions and complexities grow rapidly as larger numbers of robots and other agents are integrated in an NRS. These difficulties are increasingly recognized as an issue affecting both research progress and result validation, as evidenced, for example, by recent workshops at leading international conferences [Bonsignorio 09b] and other publications [Takayama 09]. Of the many available programming languages, Ada is a natural choice for a researcher due to its ingrained principles of software correctness, both for the developer (early error detection, runtime error detection, reusability, portability) and for the maintainer or reviewer (readability, separate specifications, maintainability). The library presented in this chapter, SANCTA, is mostly programmed in the Ada language, except for glue code interfacing with C and C++ libraries.
SANCTA is a flexible component-based architecture for multi-robot teams that takes advantage of the high-abstraction multitasking capacities of Ada, not present in other languages like C or C++. Published under the GPL open source license, SANCTA is the backbone of the simulations and experiments in this thesis. Its availability allows other researchers the close inspection of the algorithms described herein, and provides a big jump-start to anyone interested in reproducing or furthering its results. This library was introduced in [Mosteo 07b] and was awarded the second prize in the XVII Ada-Spain competition for Ada-related academic projects.

7.2. Architecture

SANCTA was initially born as just the testing implementation of our research,

being named after one of our first implemented allocation methods: Simulated ANnealing Constrained Task Allocation. Soon it became evident that, rather than having stand-alone programs for each element of research,


a reusable solution was more desirable. We could have integrated our code into one of the many existing robot control libraries. However, most are oriented towards single-robot control, so we preferred to create a new Ada layer on top of one of them, namely Player [Gerkey 03b]. This way we retain more control over our research, while minimizing dependencies on other software. We chose Ada because we believe in its advantageous properties over other popular languages; we also had previous exposure to it, since it is actively used for teaching in our university. Two principles are paramount in the design of our system:

Simplicity: Albeit always desirable, a tool that may potentially be used by several researchers benefits from being kept as simple as possible. This is achieved in SANCTA by reducing the elements to their simplest expression. The approach taken is modularity, and the building blocks are components. Each component operates as a black box, except for its defined inputs and outputs. Several components can thus interact without having to be aware of each other's internals. Data-passing facilities are provided that ensure multithreading safety. Table 7.1 shows a selection of relevant components available for our architecture.

Flexibility: In order to impose minimum constraints on component programmers, components must conform to a very simple interface. Additionally, some basic execution scheduling is offered by default: programmers can choose between periodic, asynchronous and event-driven execution (or a combination of these). Direct component interaction is achieved by connecting matching inputs and outputs, allowing the construction of powerful flows of data.

Figure 7.1: Key elements in the SANCTA architecture. Components are interconnected as determined by the configuration file. Some important components are explicitly shown. Grayed blocks are C code corresponding to the Player library, accessed through the Player-Ada binding.
Fig. 7.2 exemplifies this chaining construction, which corresponds to a pipes-and-filters flow pattern [Avgeriou 05]. Indirect interaction is also available through a shared repository approach.


Figure 7.2: A node configuration example. Odometry and Laser are sensor readers. Scan matching, Aligner and Transform are processing algorithms. Planner is a task allocation component (e.g. Annealer or Bidder). Executor determines the robot commands necessary to perform a task, while Go to goal directly commands the robot hardware.

Notable components of our architecture are shown in Fig. 7.1. Nodes can be robots, laptops showing a GUI or, more generally, any connected computing device. A node configuration is defined and constructed with just a collection of component instances and their interconnections. In order to simplify configuration, some components are predefined and always created automatically, though this could easily be changed if necessary due to hardware constraints.

7.2.1. Configuration definitions

A node configuration is maintained in a simple XML file. This configuration is read (using XmlAda [Briot 00]) at start-up by the SANCTA executable launched in each robot or device. Fig. 7.2 shows a visualization of a superset of the configuration snippet shown in Listing 7.1.



Listing 7.1: Snippet of a configuration file. [Listing content not recovered in this copy.]

Several relevant characteristics are apparent:


- The top-level element names the configuration.

- Each node (robot or not) is defined in an “agent” element. Assuming that every node has synchronized copies of the configuration file (see §7.2.2), this allows the whole team to be configured in one file. Each node can have a different configuration according to its capabilities.

- Each component instance is defined in a “component” element. Attributes are used to configure specific component behavior.

- Component inputs/outputs are defined in “requires”/“provides” elements, respectively. “data” attributes are predefined and detailed in the component documentation; “as” attributes are an arbitrary identifier (Fig. 7.3). “data” names need not be unique among components, but an external name can only appear once as an output in each node (an unlimited number of times as an input). Matching external names create an output → input connection. “type” attributes are informative and optional, but the data types of connected outputs/inputs should match (more precisely, belong to the same class). Unused outputs can be omitted to avoid clutter; warnings about unused data are optionally available at startup.

Following a factory method pattern [Gamma 95], what the configuration does is instance a collection of components, each one providing and requiring a series of data that the user must connect appropriately. Components can be data filters, hardware drivers, network modules or any necessary algorithm. Some predefined components naturally act as singletons and are automatically created:

Network: Provides messaging among nodes. A straightforward implementation uses UDP packets. A predefined abstract interface exists for network components, so it can be seamlessly replaced with another network/messaging layer. Available layers are UDP, TCP and RT-WMP [Tardioli 07]. Furthermore, abstract layers can be stacked to gain additional functionality. For example, early RT-WMP versions did not support messages longer than a

Figure 7.3: data/as attributes in configuration files correspond to internal/external names of the same value.


Table 7.1: Some components, ordered in decreasing abstraction level.

Global database (inputs: Network; outputs: Database): Globally accessible database for data sharing among nodes, with built-in replication.
Local database (inputs: none; outputs: Database): Local data storage and sharing among components.
Annealer (inputs: Pose, Database; outputs: Task allocation): Computes a best-effort task allocation for a multi-robot team using simulated annealing techniques [Mosteo 06b].
Bidder (inputs: Pose, Network; outputs: Task allocation): Bids on auctioned tasks using market-based techniques [Dias 05].
Map (inputs: Pose, Laser scan; outputs: Map): Builds an environment grid map.
Network (inputs: none; outputs: Network): Provides messaging between nodes.
Transformer (inputs: Pose; outputs: Pose): Transforms a pose in robot coordinates to world coordinates.
Scan matching (inputs: Pose, Laser scan; outputs: Pose): Uses MBICP [Minguez 05] to improve odometry using laser readings.
Aligner (inputs: Pose, Laser scan; outputs: Pose): Corrects the pose angle when the robot is in an environment with known principal orientations (i.e. orthogonal walls).
GPS (inputs: Pose; outputs: Pose): Combines an odometry pose with GPS readings to produce global localization.
Executor (inputs: Task list; outputs: Robot commands): Determines the robot actions needed to perform a task.
Go to goal (inputs: Pose, Goal; outputs: Robot commands): Issues Player [Gerkey 03b] movement calls.
GUI relay (inputs: Robot state, Network; outputs: Network): Relays information to remote GUIs.
Logger (inputs: Any; outputs: none): Logs some input to disk.
Watchdog (inputs: Any; outputs: none): Aborts execution if its input remains unchanged for some time.
Player Ada (inputs: Robot commands; outputs: Robot sensors, Pose): Proxy to robot hardware [Mosteo 06a].

fixed size of around 1400 bytes. Two abstract network layers are available that allow friendlier usage without any modification of the RT-WMP component: a compression layer (which transparently compresses and uncompresses messages) and a splitting layer (which transparently splits messages larger than a configurable size and reassembles them at the receiver).

Local database: Enables communications amongst the rest of the components.

Player-Ada: Interfaces with robot hardware. An abstract interface exists, so a change of platform should have an isolated impact.
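Since the content of Listing 7.1 is not reproduced in this copy, the following fragment is a hypothetical illustration of the format just described. The element and attribute names (“agent”, “component”, “requires”, “provides”, data/as/type) follow the text; the top-level element name, component names and values are invented for illustration.

```xml
<sancta name="example_mission">
  <agent name="robot1">
    <component name="odometry">
      <provides data="pose" as="odom_pose"/>
    </component>
    <component name="scan_matching">
      <requires data="pose"  as="odom_pose"/>
      <requires data="laser" as="front_laser"/>
      <provides data="pose"  as="corrected_pose" type="pose2d"/>
    </component>
  </agent>
</sancta>
```

Here the matching external name odom_pose connects the odometry output to the scan-matching input, as described above.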


7.2.2. Configuration synchronization and team startup

As described in the previous section, a single file is enough to describe the full team configuration. In order to start execution, this file should be synchronized to all robots, and then the SANCTA executable must be launched in every robot. In order to simplify this operation, a script is distributed with SANCTA which can perform these and other operations automatically:

./sancta-launch [ sync | start | stop ]+

The parameters instruct this command, used from any node (usually the base station), to perform one or more of the following operations:

sync: Synchronize configuration and executable files in all nodes.

start: Launch the node executable simultaneously in all nodes.

stop: Kill execution in all nodes.

7.2.3. Execution modes

Components have two ways of receiving execution time that are supported by the architecture, plus a third that is self-managed. More precisely, the modes are:

Periodic: When created, the component receives an initial call. Each component can specify, at that moment and at each subsequent run slice, the next time it is to be invoked. There is a single Ada task (thread) managing these calls, so this is the mode indicated for soft real-time execution of brief calls.

Event-driven (or synchronous): A component can subscribe to database records of interest, using an observer pattern [Avgeriou 05, Riehle 97], and will be notified each time a monitored record changes. Event chains can be constructed this way. It is the responsibility of the user to avoid infinite recursive calls.

Asynchronous: Computationally intensive components should create their own Ada task. This should also be done when hard real-time scheduling is desired. The predefined components are tasking-safe, so they can be used in a typical client-server approach for Rate-Monotonic Analysis (RMA) or similar techniques.
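The periodic mode above can be sketched as a priority queue of components sorted by their requested next-call time, serviced by a single scheduler. This is a Python stand-in for the single Ada task described in the text; all names are illustrative, and each component's run call plays the role of the Run procedure filling its Next time.

```python
import heapq

# One scheduler services all periodic components; each run returns the
# time at which the component wants to be invoked again.

class Scheduler:
    def __init__(self):
        self.queue = []  # heap of (next_time, sequence, component)
        self.seq = 0     # tie-breaker so the heap never compares components

    def add(self, component, first_time=0.0):
        heapq.heappush(self.queue, (first_time, self.seq, component))
        self.seq += 1

    def run_until(self, horizon):
        """Run every component due up to the given time."""
        while self.queue and self.queue[0][0] <= horizon:
            when, _, comp = heapq.heappop(self.queue)
            next_time = comp.run(when)  # component picks its next slice
            heapq.heappush(self.queue, (next_time, self.seq, comp))
            self.seq += 1
```

Because every call is expected to be brief, a single thread suffices for soft real-time behavior; long-running work belongs in the asynchronous mode instead.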

7.3. Implementation details

This section explores some details of the Ada code, highlighting fragments related to the aspects of the architecture explained above.


type Object is abstract tagged limited null record;
--  Root type.

type Object_Access is access all Object'Class;

type Object_Creator is access
  function (Config : in Xml.Node) return Object_Access;
--  For a factory approach.

procedure Run (This : in out Object;
               Next :    out Ada.Calendar.Time) is abstract;
--  Invoked on component creation.
--  Next should be filled with the next call time.

Listing 7.2: Root component type.

7.3.1. Component specification

Listing 7.2 shows that components, as anticipated, are really simple. They are just a code container, with an added facility for periodic soft real-time calls, which are managed by a priority queue sorted by time. A creation function must be registered (for example on component body elaboration) following a typical factory pattern. This function receives the XML element corresponding to the component configuration. The factory will then be used to create the components as specified in the configuration file.

7.3.2. Local database specification

The database component exports a type that is actually a protected wrapper over a standard Containers.Indefinite_Sorted_Maps.Map. The keys are strings, and stored values are descendants of a root type, so the database can store any variety of non-limited objects. Thus, asynchronous data sharing is simply a matter of using the functions in listing 7.3. A simple view renaming allows comfortable use of typed data.

function Get (This : in Database;
              Key  : in Object_Key) return Object_Data'Class;

procedure Put (This  : in out Database;
               Key   : in     Object_Key;
               Value : in     Object_Data'Class);

Listing 7.3: Some database accessing functions.


type Key_Listener is abstract tagged limited null record;

type Key_Listener_Access is access all Key_Listener'Class;

procedure On_Key_Stored (This  : in out Key_Listener;
                         Key   : in     Object_Key;
                         Value : in     Object_Data'Class)
  is abstract;

procedure Listen (This     : in out Database;
                  Key      : in     Object_Key;
                  Listener : not null Key_Listener_Access);

Listing 7.4: Database data observing.

Now we present the facilities for event registration, following an observer pattern. Components can automatically obtain from the configuration file the equivalence between their internal keys (the “data” attribute) and the particular name bound to each (the “as” attribute), in effect allowing the connection of inputs and outputs. The user must define a descendant type of Key Listener and register an instance in the database with the Listen subprogram (listing 7.4). Every time the key receives an update, the On Key Stored dispatching procedure will be invoked for each listener. Using a tagged type instead of a function pointer has the advantage that the listener can store any necessary contextual data right within it. We also provide a default empty component with a listener member and convenient functions for registration (listing 7.5). This could have been done with the new language interfaces, but we encountered compiler problems. A typical event-driven component will register itself on creation to the inputs it needs. This is particularly useful for filtering components. Fig. 7.2 depicts a

type Listener (Parent : access Base_Component) is
  limited new Key_Listener with null record;
--  Base_Component has been declared in the public part.

type Base_Component is limited new Component.Object with record
   Listener : aliased Listener (Base_Component'Access);
end record;

not overriding
procedure Subscribe (This : in out Base_Component;
                     Key  : String);

Listing 7.5: Component with embedded listener.


typical chain of components used to correct the robot pose.
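The event-driven machinery just described (protected database plus observer pattern) can be sketched in Python; the Blackboard class and its method names are illustrative stand-ins for the Ada Database, Listen and On Key Stored facilities, not the actual API:

```python
import threading

class Blackboard:
    """Protected key/value store with observers: listeners subscribed to
    a key are notified on every update (illustrative stand-in for the
    architecture's database component, not its real API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}
        self._listeners = {}   # key -> list of callbacks

    def listen(self, key, callback):
        with self._lock:
            self._listeners.setdefault(key, []).append(callback)

    def put(self, key, value):
        with self._lock:
            self._data[key] = value
            observers = list(self._listeners.get(key, []))
        for callback in observers:   # notify outside the lock, so a
            callback(key, value)     # listener may chain another put

    def get(self, key):
        with self._lock:
            return self._data[key]

# An event chain as in the pose-correction example: a listener on the
# raw pose publishes a corrected pose.
bb = Blackboard()
bb.listen("pose.raw", lambda key, raw: bb.put("pose.corrected", raw + 1))
bb.put("pose.raw", 41)
assert bb.get("pose.corrected") == 42
```

Notifying outside the lock is what lets such filtering chains re-enter put without deadlocking.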

7.3.3. Fixed-point Ada types

The simulated annealing algorithm (chapter 4) does not require high floating-point precision, but takes advantage of exact repeatability of results when operating on task costs. This repeatability is a key factor in making our algorithm O(n log n) instead of O(n^2). This can be naturally handled in Ada by means of fixed-point types instead of floating-point ones or integer workarounds.
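The point is easy to demonstrate outside Ada as well; the following Python sketch contrasts binary floating point with a scaled-integer (fixed-point) representation, using an illustrative resolution of 1/1000:

```python
SCALE = 1000  # hypothetical fixed-point resolution: 1 unit = 1/1000

def to_fixed(x):
    return round(x * SCALE)

costs = [0.1] * 10

# Binary floating point: 0.1 is not exactly representable, so sums can
# depend on evaluation order and incremental updates may drift from a
# from-scratch re-evaluation of a plan's cost.
float_sum = sum(costs)
assert float_sum != 1.0

# Fixed point (scaled integers): arithmetic is exact, so incremental
# updates and full re-evaluations always agree -- the repeatability that
# keeps the annealer's cost bookkeeping cheap.
fixed_sum = sum(to_fixed(x) for x in costs)
assert fixed_sum == to_fixed(1.0)
```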

7.4. New bindings

As part of our work we have implemented some bindings of interest for the Ada and robotics community.

7.4.1. Player-Ada binding

Our choice for low-level robot interaction is the Player [Gerkey 03b] project. Its advantages are many: portability across robot platforms, open source nature, dedicated volunteers and developers, acknowledgment within the robotics community, and the availability of a quite realistic multi-robot simulator, to name a few. Many client libraries existed for it, but unfortunately none written in Ada. Our binding is called Player-Ada [Mosteo 06a]. The C library uses object-oriented (OO)

#include "libplayerc/playerc.h"

size_t get_device_t_size ()
{
  return sizeof (playerc_device_t);
}

Listing 7.6: C auxiliary file in Player-Ada.

function Get_Device_T_Size return Interfaces.C.size_t;
pragma Import (C, Get_Device_T_Size);

type Pad_Array is array (Positive range <>) of
  System.Storage_Elements.Storage_Element;
pragma Pack (Pad_Array);

type Position2d is record  --  A new Device_T
   Reserved : Pad_Array (1 .. Natural (Get_Device_T_Size));
   --  Note the function call to Get_Device_T_Size.
   --  Other fields here.
end record;
pragma Convention (C, Position2d);

Listing 7.7: Player-Ada specification detail.


techniques, and so our Ada 95 binding uses the support for OO programming in the form of tagged and controlled types. Our binding uses some auxiliary C functions to improve the portability of the Ada part in the event of changes in the underlying C library. For example, some constant-sized structures that are opaque to the user may change in size between versions. The construct shown in listings 7.6 and 7.7 abstracts the size of the C structure away from the Ada binding, thanks to Ada's type elaboration capabilities. A single recompilation takes care of this aspect when the Player version changes. The same technique has been used for other relevant constants. We believe this approach is preferable to a pure Ada binding, especially while the C library is still evolving. We have been able to successfully use our binding with three different Player versions over time with only minor modifications, and indeed this technique has proven useful in at least one version change.

7.4.2. Concorde binding

Concorde [Applegate 03] is an advanced TSP solver, free for research purposes. Although only suited to the single-traveler problem with symmetric costs, with graph transformations it can be used for the asymmetric and multi-robot cases. We have implemented a thick binding for it, together with the appropriate transformation functions. With our binding, asymmetric, open and MTSP problems of moderate size2 can be solved. 2 This

is due to the use of large integers in the transformations involved and the use of integers in the C library. We have solved problems involving several hundred cities. The current ascendancy of 64-bit operating systems will mitigate this limitation.
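The asymmetric-to-symmetric reduction applied before calling a symmetric solver such as Concorde can be sketched as follows; this is a Python illustration of the classical ghost-node transformation (the constants and the brute-force check are ours, not part of the binding). Note the large constant M: accumulating n of them is what pushes transformed instances toward the integer limits mentioned in the footnote.

```python
from itertools import permutations

INF = 10**6   # forbidden edges
M = 10**4     # large bonus forcing each city to pair with its ghost

def atsp_to_symmetric(C):
    """Ghost-node transformation: city i gets a ghost n+i; the only cheap
    edges are (i, n+i) at -M and (n+i, j) at C[i][j], so a symmetric tour
    i, n+i, j, n+j, ... encodes the asymmetric tour i, j, ..."""
    n = len(C)
    S = [[INF] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        S[i][n + i] = S[n + i][i] = -M
        for j in range(n):
            if i != j:
                S[n + i][j] = S[j][n + i] = C[i][j]
    return S

def tour_cost(D, tour):
    return sum(D[tour[k]][tour[(k + 1) % len(tour)]]
               for k in range(len(tour)))

def brute_tsp(D):
    """Exact closed-tour cost by enumeration (small instances only)."""
    return min(tour_cost(D, (0,) + p)
               for p in permutations(range(1, len(D))))

C = [[0, 5, 1],
     [7, 0, 2],
     [4, 9, 0]]          # asymmetric cost matrix

S = atsp_to_symmetric(C)
# The symmetric optimum is the asymmetric optimum minus one -M edge per city.
assert brute_tsp(C) == brute_tsp(S) + len(C) * M
```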

Figure 7.4: Actors in the URUS NRS.


7.5. S ANCTA deployment in the URUS project

This section details the task allocation subsystem for the URUS project, implemented using the S ANCTA architecture. We were responsible for work package 6, which had to provide the planning and allocation of missions and their constituent tasks to the available resources within the URUS environment. The project goal was to bring mobile robots into the city, in order to provide services like transportation of objects and guiding of persons within a sensor network area. One of the principal results of URUS was the definition of the URUS robot, as a set of services that each robot must provide via a well-defined interface. Any robot implementing this URUS protocol can then be integrated in the team and start receiving orders and performing missions. More information about the whole scope of the project was published in e.g. [Sanfeliu 08, Sanfeliu 07]. Fig. 7.4 shows the principal actors in the URUS system. A variety of interaction means is available to users, from direct human-to-robot interaction to smartphone interfaces. Depending on the area of operation, Wi-Fi or GSM coverage could be available to robots, which is transparently managed by a networking layer built upon the YARP [Metta 06] library. Missions can be initiated by direct request of a user using one of these means, or triggered by the camera network upon detection of certain events, like a user waving to a camera. Upon mission creation, task allocation was accomplished using an auction-based solution [Dias 05, Gancet 04a], in order to achieve maximum robustness and scalability, derived from our implementations in chapter 3. Additionally, the stochastic optimizer described in chapter 4 was running in each node, looking for opportunistic optimizations. For all these functions, identical pieces of software run in each agent involved. However, not all agents run every module. The schema in Fig. 7.5 shows internal and external relations for the implemented modules.
Robots run Plan Manager, Task Allocator and Optimizer modules. Central station runs Plan Manager, Watchdog and Task Allocator modules.

Figure 7.5: Module relations.


Figure 7.6: Smartphone screen for URUS mission request.

7.5.1. Missions

The experimental applications took place in an urban scenario, namely the Campus Nord of the Universitat Politècnica de Catalunya (Fig. 7.8). Missions start by requesting (via smartphone, see Fig. 7.6) a meeting with a robot at a certain place, or by physically intercepting a robot and interacting with its on-board tactile GUI. These were the missions finally demonstrated:

Transportation Cargo is loaded at the meeting point and transported to a requested unload location. The user signals the completion of the load/unload stages to the robot through a GUI.

Taxi A specific kind of robot is sent to the meeting point, capable of carrying one passenger (see Romeo robot in §C.2). The user has to climb into the robot, use the interface to start the robot, and then the passenger is transported to the destination autonomously. See an example of full mission evolution in Fig. 7.7.

Guiding A person is led by a robot to a desired location, or transferred to another robot that will continue the guiding until the final destination is reached. This segmented guiding is due to the possibility of certain robots being constrained to particular areas of operation. The planner determines the number of robots needed and their meeting points, and generates tasks to dispatch them to the appropriate goals.

7.5.2. Plans

We define a plan as an annotated task tree containing all the relevant information: task owner, task parent-child relationships, task couplings. Plans are managed within the Plan Manager component.


Figure 7.7: Evolution of Taxi mission in URUS.


Central plan The full collection of missions requested of the system is tracked as the full system plan in the central station. This plan is under the control of the central plan manager, which tracks finished, running and pending tasks and allows manual operator intervention.

Robot plan Each robot, upon receiving tasks for execution, starts to build a partial view of the central plan. In the extreme, this robot plan could be a replica of the central plan, but more likely it will be just a view of the data relevant to the robot: its owned subtrees of tasks and their completion status.

According to the coupling in the execution of tasks, we have:

Uncoupled tasks, which have no constraints on one another. Transport and taxi missions belong to this category, since the object or person is transported by a single robot to its destination.

Loosely coupled tasks, which have partial ordering constraints derived from the premise that one must finalize before the other can start. Multiple-segment guiding tasks belong to this category, since they can potentially be split into several stages linked by meeting points.

Tightly coupled tasks, which have to be executed in parallel. Evacuation missions would fit into this category (see Fig. 7.8).
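For loosely coupled tasks, feasibility of an execution order reduces to checking the partial-order constraints; a minimal sketch (task names hypothetical):

```python
def respects_precedence(order, before):
    """True iff the execution order honours every (a, b) constraint,
    meaning task a must be completed before task b starts."""
    position = {task: k for k, task in enumerate(order)}
    return all(position[a] < position[b] for a, b in before)

# A guiding mission split into two segments linked by a meeting point:
constraints = [("guide-seg-1", "guide-seg-2")]
assert respects_precedence(["guide-seg-1", "transport", "guide-seg-2"],
                           constraints)
assert not respects_precedence(["guide-seg-2", "guide-seg-1"], constraints)
```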

7.5.3. Task evaluation

Task evaluation in the URUS project was done in terms of cost. More precisely, the unit chosen was time, and each robot had a defined cruise speed that was used to convert the paths into estimated times to arrival. This time would then be used

Figure 7.8: Experimental area in Campus Nord of UPC. Link 7.1: http://robots.unizar.es/data/videos/urus/urus-wp9.avi


Figure 7.9: Task coupling. In URUS, guiding subtasks are potentially loosely coupled by an ordering constraint: if two or more segments by different robots are needed for a full route, each segment cannot begin before the previous one is finished.

for both auctions and local annealing plan optimization. Since each robot had unique characteristics (e.g. the taxi robots were limited in their navigation abilities, compared to the smaller robots), these costs were computed from approximate paths provided by each robot's navigation subsystem.

7.5.4. Software modules

The figure in subsection 7.5 already presented the components and their interactions with other subsystems. In more detail, the main software elements are:

Task Allocator The task allocator must communicate with allocators in other agents to decide the best owner for a given task in the context of the agent plans. This was done using a single-item first-round auction algorithm (Fig. 7.7). Robot task allocators apply different optimization criteria to find a suitable point of insertion for the task in the agent plan, according to the criterion descriptor in use (see §3.2). Insertion heuristic, two-opt permutations and optimal routes (for plans of reduced size) are attempted. Task allocator responsibilities include:

Receiving offers from remote allocators (i.e. auctioned tasks).
Asking the robot for information to compute task costs (i.e. compute bids).
Checking the viability of task insertion in the local plan.
Initiating auctions for tasks, either because they are new and not doable by the local robot, or by operator command.

Plan Manager The plan manager is responsible for plan transactions. It maintains the consistency of the plan by providing a set of safe plan-manipulation functions.


An instance of the plan manager runs in each robot. Plan manager interactions consist of:

Inter-plan management communication to ensure global consistency of robot plans, e.g. when global optimizations are found by the annealing algorithm.
Robot supervisor: to indicate the task to be executed.
Task allocation system: to ensure safe insertion of new tasks in the current robot plan.

Watchdog The watchdog runs in the central station in order to verify that task completions occur as expected. Situations that may cause the watchdog to trigger task reallocations are:

Tasks taking an unexpectedly long time to complete, according to the agent plan.
The loss of a robot with assigned tasks.

Optimizer The optimizer checks the current plan for possible improvements in the ordering of tasks that were not found at the time of task allocation, which has to be a necessarily quick operation. The optimizer constantly runs the simulated annealing algorithm described in chapter 4. In the final system, each robot carried this optimizer, which could modify the local robot plan (except for its currently executing task). Base stations, which had a global picture of all the robot plans, could consequently find global optimizations.
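The insertion-heuristic bidding used by the task allocator can be sketched as a simplified single-item, single-round auction on Euclidean costs; names and the cost model are invented for illustration, not the deployed code:

```python
def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def route_cost(start, route):
    cost, prev = 0.0, start
    for point in route:
        cost += dist(prev, point)
        prev = point
    return cost

def insertion_bid(start, route, task):
    """Marginal cost of the cheapest insertion point for 'task'."""
    base = route_cost(start, route)
    best = min(route_cost(start, route[:k] + [task] + route[k:])
               for k in range(len(route) + 1))
    return best - base

def auction(robots, task):
    """Single-item, single-round auction: the lowest bidder wins."""
    return min(robots, key=lambda name: insertion_bid(*robots[name], task))

robots = {
    "r1": ((0, 0), [(2, 0)]),   # (current position, pending route)
    "r2": ((10, 10), []),
}
assert auction(robots, (3, 0)) == "r1"   # appending after (2, 0) costs only 1
```

In the real system, bids were route costs converted to times via each robot's cruise speed, and two-opt or exact reordering could further refine the insertion.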

7.5.5. Operator interface

Operators could connect to any URUS agent (robot, station) using a standard web browser. A unified user interface is then loaded (see Fig. 7.10). Due to the distributed characteristics of the system, the available information depends on the node contacted. Robots have full information about themselves, and limited information about other robots and stations they have interacted with at some point. However, they can relay orders to any such known entity if commanded via the interface. Stations, on the other hand, have full knowledge of missions and selected status information about the robots. The operator can jump from node to node following the known-entity lists of the currently connected node, providing high awareness of the whole system status and allowing fast localization of desired information.


Figure 7.10: Interface for local and remote monitoring of robots and stations.

7.6. Discussion

Software engineering and development is a discipline in itself, with many pitfalls and difficulties. This has been noted in recent times as an additional hindrance for researchers in disciplines with a strong dependency on software, such as robotics and, therefore, multi-robot systems. In consequence, there is rising interest in software disclosure from researchers in order to replicate results and simplify ongoing research. Having experienced these issues, we devoted this chapter to introducing the software architecture used for our simulations and experiments. It makes use of modularity in the form of black-box components to keep complexity manageable and enable easy cooperation among developers. This modularity has also allowed easy integration with other robotic libraries, like the Player/Stage [Gerkey 03b] and YARP [Metta 06] platforms, which are mainly concerned with single-robot hardware drivers and algorithms. In contrast, S ANCTA provides many components devoted to multi-robot issues, like replicated databases, auction algorithms, and the C ONNECT T REE building blocks, making them complementary tools that have enabled more efficient development and deployments in complex heterogeneous robotic systems like the URUS project. The code has been published under the open source GPL license, which enables interested parties to inspect, reuse and improve it. As we have seen, not only has the S ANCTA library been published, but the complete code used in the URUS project is also available from its own repositories. Furthermore, we are undertaking


ongoing efforts in order to integrate selected parts of S ANCTA, like the Ada-Player binding, into the Debian repositories, which will enable even greater exposure and simpler usage of this code. The URUS project offered a notable opportunity for the deployment of our algorithms in a very realistic experimental environment, with several heterogeneous robots and a complete sensor network. This was a challenging experience in which many valuable lessons about real integration and deployment were learned. Each robot executed many subsystems programmed by various authors, which introduced complex interactions. In this regard, our usage of S ANCTA simplified noticeably the adaptation of our code to the URUS protocol, built upon the YARP communication architecture. This allowed us to concentrate on the dynamics of interactions with the subsystems provided by other partners.


Chapter 8

Conclusion

This thesis has covered several topics with the goal of advancing the construction of versatile multi-robot teams. It began by studying the popular auction-based paradigm in light of optimization objectives of broad application. Next, it explored more advanced auctions and proposed a complementary stochastic algorithm that addresses shortcomings of these methods. The network is a critical element in these and many other approaches, so the hot topic of limited connectivity was addressed next, with two different approaches suited to open and cluttered environments respectively. Across all of our research, a common set of optimization objectives of general usefulness was used to measure the performance of our proposals. In this closing chapter we review the main contributions of this thesis and outline the next steps for this work.


8.1. Towards autonomous multi-robot teams

In the introduction of this thesis we stated two ambitious complementary goals: finding general results of relevance for general-purpose multi-robot teams, and carrying these results into the realm of limited communications. Chapters 3 and 4 were most relevant to the first goal: by comparing allocations in terms of total sum and worst case evaluations, we found that for the auction algorithms studied, the latter was a better approach in the general case, given the relative penalties involved in each direction. Chapters 5 and 6 were directed to the second of our stated goals, that is, robot teams working in limited communication environments. Chapter 5 presented a reactive task allocation method that takes into account the current network topology and navigation constraints in order to assign targets that are compatible with mission progress and communication integrity. In chapter 6 we developed an alternative strategy that is better suited for cluttered environments with complex signal propagation conditions, based on the idea of temporarily static relays over a fixed tree of paths that provides certain cost bounds over the optimal unconstrained solution. In more detail, our contributions per chapter can be summarized as follows:

A survey of the task allocation problem from the point of view of general-purpose service robotics, including a brief review of relevant characteristics of recently proposed multi-robot architectures and teams. This survey served as input for the EURON excellence network roadmap and the “Strategic Research Agenda for Robotics in Europe, 07/2009” (EUROP).

Implementation of three canonical auction-based task allocation algorithms by several authors, and their evaluation using a deterministic cost model (MTSP) under the total sum of costs and worst agent cost metrics. An implementation technique (criterion descriptors) was proposed for efficient computation and switching of bidding rules.
The evaluation revealed the higher suitability of worst case criteria for general-purpose missions, given the lower penalties involved towards the total team cost, and in general the good performance of criteria that included both kinds of objectives simultaneously. Additional auction rounds after the initial allocation proved to be useful, from which we concluded that starting with bound-guaranteed allocation methods followed by additional improvement rounds was the best approach.

Design and implementation of a stochastic planner based on simulated annealing and hierarchical task networks, as well as an equivalent (in planning capabilities) auction-based solution, based on other authors' work. Evaluation and comparison in problems of increasing size shed light on the relative merits of each approach. Our proposal often finds better allocations even in real-time execution, for all the problem sizes studied. It also does not suffer from noticeable delay in producing complete allocations for large numbers


of new tasks, which can be a drawback of auction methods. A real implementation of both approaches was showcased in a complementary way in the real deployment of the URUS project, where auctions were used for initial allocations and simulated annealing for opportunistic optimization of local plans.

A reactive task allocation method that takes into account the current network topology and navigation constraints in order to assign targets that are compatible with mission progress and communication integrity. This method can be considered a meta-algorithm, in the sense that one of its steps includes the use of another, connectivity-unaware, allocation algorithm. This enables the easy adaptation and reuse of already known algorithms, as we did, and their evaluation within our framework. We provided the description of seven such adapted algorithms and evaluated them under the same total sum and worst case cost metrics, as well as an additional average cost per task metric. We identified the network conditions best suited for each of the algorithms evaluated. Long-term planning algorithms, like TSP and circular sweeping tours, proved better for the more limited network ranges, while the (instantaneously optimal) Hungarian method demonstrated superior performance in less restricted situations. Outdoor experiments with a team of three robots using real signal qualities validated the proper working of the whole integrated system.

In our C ONNECT T REE research, we developed an alternative strategy that is better suited for cluttered environments with complex signal propagation conditions. We started from the idea of deploying static relays, and added two key ingredients: making them only temporarily static, and forcing the robots to follow fixed entry and exit paths. These two premises let us focus on route planning and execution ordering in a way that offers some cost bounds over baseline, unbounded solutions like the optimal TSP tour.
These bounds can be obtained from the relation between optimal TSP tours and minimum spanning trees for a given graph. We tested several route construction algorithms in these conditions, and found that minimum spanning trees offered the best results, being also a good compromise compared to an optimal Steiner tree, which can be too difficult to compute exactly. A further enhancement for concurrent task execution was presented, in which several branches of said path trees are visited simultaneously in order to speed up task execution. We characterized its complexity, showing that it belongs to the NP-hard class of problems. Because of this, we provided several heuristics and evaluated them, again under the same previously described team objectives. These heuristics relied on ideas like the branching factor of a tree, target distance, and random cost-bounded solution exploration. The improvement margins that can be expected over the sequential initial algorithm were found to be in the range of 5%-20%, depending on the


network conditions. As a whole, the C ONNECT T REE strategy offers all the key ingredients needed by a multi-robot team working in severely limited networking conditions in very complex environments. We finished this thesis by presenting the component-based software architecture that underlies our research, as a way of cooperating with both the research community and the open source movement. The largest deployment of this architecture took place during the experimental part of the URUS project. Heterogeneous robots (two Segway platforms, an instrumented golf cart and a differential-drive field robot) were used for transport and guidance missions, in cooperation with infrastructure cameras and control stations within the whole NRS system. Our S ANCTA library proved to be a valuable tool for us during the integration process, which involved a large number of software partners, by enabling us to concentrate on the proper interaction issues.
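The relation between minimum spanning trees and optimal tours that underpins the C ONNECT T REE cost bounds can be checked on small instances: under the triangle inequality, the MST weight lower-bounds the optimal closed tour, and doubling the tree upper-bounds it. A brute-force Python sketch (ours, for illustration only):

```python
from itertools import permutations

def _dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def mst_weight(points):
    """Prim's algorithm on the complete Euclidean graph."""
    in_tree, total = {points[0]}, 0.0
    while len(in_tree) < len(points):
        w, p = min((min(_dist(p, q) for q in in_tree), p)
                   for p in points if p not in in_tree)
        in_tree.add(p)
        total += w
    return total

def optimal_tour(points):
    """Exact optimal closed tour by enumeration (small instances only)."""
    best = float("inf")
    for perm in permutations(points[1:]):
        tour = (points[0],) + perm + (points[0],)
        best = min(best, sum(_dist(tour[i], tour[i + 1])
                             for i in range(len(tour) - 1)))
    return best

targets = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 5)]
mst, opt = mst_weight(targets), optimal_tour(targets)
assert mst <= opt <= 2 * mst   # the double-tree style bound
```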

8.2. Future work

There are several open fronts which require additional effort. On the theoretical side, the latency objective has proven more difficult to tackle from a bounds point of view than the time span or total sum, which are better understood both in general and for the MTSP framework we have used in our research. Consequently, we would like to find some results in this direction, especially given the relevance of this objective in rescue-related missions. We also want to expand our list of criteria with new ones derived from human involvement, like perceived fairness by the users of the system. These kinds of criteria are acquiring greater importance as missions involving human-robot interaction become more common. Our focus is now on our work in limited network connectivity. In this regard, the interaction with the physical layer is proving to be a difficult topic which merits further research. Signal quality maps as an alternative to geometric ones have to be evaluated; there are still open questions about the reproducibility of signal steadiness which would hinder that approach. To help with such signal fluctuations, we also plan to combine the spring-damper mechanism described in chapter 5 with the fixed paths of C ONNECT T REE. More complex experiments, in larger environments, with more robots, and with a higher variety of them, are also a short-term objective on which we are already working. One of the outcomes of the URUS project was the demonstration of a complex, heterogeneous Networked Robotic System, in which we deployed a twofold task allocation system, combining an auction-based algorithm with the simulated annealing planner. Detailed simulations of their interaction are in the pipeline as a complement to the results already shown in chapter 4.


Appendices


Appendix A

Multi-robot routing cost model


A multi-robot routing problem is formally specified by a set of robots, R = {r1, r2, ..., rn}, a set of targets, T = {t1, t2, ..., tm}, the locations of all robots and targets on the two-dimensional plane, and a non-negative cost function c(i, j), i, j ∈ R ∪ T, which denotes some abstract cost of moving between locations i and j in either direction (e.g., distance, energy, time, etc.). For simplicity, robots are assumed to be identical, therefore the same cost function applies to all of them. Typical cost measures, such as travel distance, travel time, or energy consumption between locations satisfy these assumptions in any typical environment. The objective of multi-robot routing is to find an allocation of targets to robots and paths for all robots, so that all targets are visited and a team objective function is minimized1. In general, a team objective is expressed as

min_A f( g(r1, a1), ..., g(rn, an) ),

where function g measures the performance of each robot, function f measures the performance of the team, and A = {a1, a2, ..., an} is a partition of the targets, such that targets in ai are allocated to robot ri. In this thesis, we have considered three intuitive team objectives [Tovey 05]:

MinSum: Minimize the sum of the robot path costs over all robots.
MinMax: Minimize the maximum robot path cost over all robots.
MinAve: Minimize the average target path cost over all targets.

The robot path cost of a robot r is the sum of the costs along its entire path, from its initial location to the last target on its path. The target path cost of a target t is the total cost of the path traversed by robot r (the unique robot assigned to visit t) from its initial location up to target t, including any previous targets visited along its path. The team objectives above can be expressed in terms of our generic team objective structure. Let RPC(ri, ai) denote the robot path cost for robot ri to visit all targets in ai from its current location. Similarly, let TPC(ri, tk, ai) be the target path cost of a target tk in the assignment ai of robot ri. Finally, let CTPC(ri, ai) denote the cumulative target path cost of all targets in ai, again, if robot ri visits all targets in ai from its current location:

CTPC(ri, ai) = Σ_{tk ∈ ai} TPC(ri, tk, ai)

Then, the three team objectives can be expressed as

1 Although we assume that robots are not required to return to their initial locations, our algorithms and results apply also to the case of closed tours. Similarly, they apply to maximization of a utility function.

160

M IN S UM : min ∑ RPC(r j , a j ), A

j

M IN M AX : min max RPC(r j , a j ), j

A

M INAVE : min A

1 CTPC(r j , a j ). m∑ j

Solving the multi-robot routing problem optimally under any of the above objectives is NP-hard [Lagoudakis 05].
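To make these definitions concrete, the following sketch evaluates the three objectives for one fixed allocation A. The planar locations, the Euclidean cost and all function names are illustrative assumptions; the model only requires a generic non-negative cost c(i, j).

```python
# Evaluate MinSum, MinMax and MinAve values for a fixed allocation.
# Locations and the Euclidean cost are illustrative assumptions.
import math

def cost(p, q):
    """Travel cost between two planar locations (Euclidean distance)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def robot_path_cost(start, targets):
    """RPC: cost of visiting all assigned targets in order from `start`."""
    total, pos = 0.0, start
    for t in targets:
        total += cost(pos, t)
        pos = t
    return total

def target_path_costs(start, targets):
    """TPC of each target: cost accumulated up to (and including) that target."""
    costs, acc, pos = [], 0.0, start
    for t in targets:
        acc += cost(pos, t)
        costs.append(acc)
        pos = t
    return costs

def objectives(robots, allocation):
    """Return the (MinSum, MinMax, MinAve) values of a given allocation."""
    rpcs = [robot_path_cost(r, a) for r, a in zip(robots, allocation)]
    m = sum(len(a) for a in allocation)
    ctpc = sum(sum(target_path_costs(r, a)) for r, a in zip(robots, allocation))
    return sum(rpcs), max(rpcs), ctpc / m

robots = [(0.0, 0.0), (10.0, 0.0)]
allocation = [[(3.0, 0.0), (5.0, 0.0)], [(10.0, 4.0)]]
print(objectives(robots, allocation))  # (9.0, 5.0, 4.0)
```

Finding the allocation that minimizes any of these values is the hard part; evaluating a given allocation, as above, is straightforward.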


Chapter A. Multi-robot routing cost model


Appendix B

Reactive allocation complementary modules



B.1. Cooperative navigation module

Introduction and related work

For the work described in chapter 5, in addition to a real-time communication protocol, a system is needed to ensure that the links among robots are of sufficient quality. To that end, cooperative robot navigation must be employed to prevent robots from moving too far apart. There are several proposals for robot formation movement, some of which use a spring model for motion. In [Reif 95] restricted potential fields were used to simulate spring forces, and [Gulec 05] used graph theory, where the links between nodes are springs. These previous works only aimed at maintaining a topology or formation among the robots, not at maintaining network connectivity as proposed in our work. To deal with this problem, we have developed a spring-damper model to maintain connectivity among robots while simultaneously enabling them to perform the mission tasks.

Spring-Damper Systems for cooperative navigation

In this section we detail the Cooperative Navigation Module (CNM) for a team of robots. As described in §5.3, the aim of this layer is to provide a control for the robots to achieve their objectives while maintaining the communication restrictions at all times. We have developed a motion model based on a Spring-Damper System (SDS) analogy. The forces generated on the SDS structure are responsible for the coordinated movement of the robots. The model is based on that presented in [Urcola 08], adapted to the communication connectivity problem dealt with here.

Spring-damper model

Firstly we give an overview of the model. Fig. B.1 presents a simple structure of our system: a team of four robots linked by SDSs. There are two types of robots, mobile and fixed. A fixed robot is, for example, a base station. We introduced the concept of a fixed robot to support applications that need a static base station, such as a computer that collects information from all robots or sends commands to them. The objective is to move the robots Ri to a goal using a virtual force Gi, whose module is computed as a function of the leader's maximum desired velocity, and so is always bounded. Due to the SDS which links two robots, a new force SDi appears affecting both of them. The force generated by the SDSs on each robot is defined as:

SDi = ∑_{j=1..N} sdij aij    (B.1)

where A is a matrix whose elements aij represent the links between robots, and the force of a spring-damper link sdij = (sdijx, sdijy) is computed as:


Figure B.1: Spring-damper model to maintain connectivity and motion coordination.

sdij = ks (γ − γ0) du + kv vij    (B.2)

where ks and kv are the spring and damping coefficients, chosen to obtain a slightly overdamped behavior, du is the unit vector linking robots i and j, and vij is the relative velocity between robots i and j. These values are calculated from the kinematic information provided by every robot through the broadcast service of the COM layer (see §B.2). The value of γ is computed as a function of the link quality, s, so it does not represent the actual distance between robots, but a measurement used to establish the connection between each pair of robots when that quality decreases. The lower the link quality, the higher γ is. The value γ0 represents the rest value of γ. The link quality values are computed by the COM module from a Link Quality Matrix (LQM, explained in §B.2) describing the topology of the network. A median filter is applied to obtain s from the LQM values. After several tests we verified that a good function for γ is γ(s) = ks · (smax − s), smax being the maximum link quality possible between two physically-coupled nodes, and ks a constant computed from the maximum number of robots and the link quality thresholds explained below. Since we want to model the behavior of a real system, we have to introduce a damping term Di in the robot forces, defined by

Di = fd vi    (B.3)

where fd is the damping coefficient and vi = (ẋi, ẏi) the velocity vector of the robot.



Figure B.2: Theoretical function of the radio signal versus the distance between the transmitter and the receiver. When the signal drops below the safety threshold (st), it enters the Controlled zone, where the spring-damper analogy is used to avoid network disconnection.

Moreover, the obstacles in the environment can cause the robots to modify their relative locations. This fact is included in the model by means of an external force Ei on robot Ri, always bounded to a maximum value. Summarizing, the total force Fi on each robot is calculated as:

Fi = Gi + SDi + Di + Ei    (B.4)
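As an illustration of Eqs. (B.1)–(B.4), the sketch below computes the total force on one robot. The coefficient values, the γ parameters and the sign convention for the damping term (chosen here so that it opposes the robot's velocity) are assumptions for the example, not the tuned constants of the real system.

```python
# Minimal sketch of the total force F_i = G_i + SD_i + D_i + E_i (Eq. B.4),
# with the spring-damper term of Eqs. (B.1)-(B.2) and damping of Eq. (B.3).
# All parameter values below are illustrative assumptions.
import math

KS, KV, FD = 0.5, 0.3, 0.2          # spring, damper and robot-damping coeffs
GAMMA0 = 1.0                        # rest value of gamma

def gamma(s, smax=100.0, ks=0.05):
    """gamma(s) = ks * (smax - s): grows as the link quality s decreases."""
    return ks * (smax - s)

def sd_force(pi, pj, vi, vj, s):
    """Spring-damper force on robot i due to its link with robot j (Eq. B.2)."""
    dx, dy = pj[0] - pi[0], pj[1] - pi[1]
    d = math.hypot(dx, dy) or 1.0              # guard against coincident robots
    du = (dx / d, dy / d)                      # unit vector from i towards j
    vij = (vj[0] - vi[0], vj[1] - vi[1])       # relative velocity
    k = KS * (gamma(s) - GAMMA0)               # attraction when the link weakens
    return (k * du[0] + KV * vij[0], k * du[1] + KV * vij[1])

def total_force(gi, pi, vi, neighbours, ei=(0.0, 0.0)):
    """Eq. (B.4): goal force + spring-damper sum (B.1) + damping (B.3) + env."""
    sd = [sd_force(pi, pj, vi, vj, s) for (pj, vj, s) in neighbours]
    sdx, sdy = sum(f[0] for f in sd), sum(f[1] for f in sd)
    di = (-FD * vi[0], -FD * vi[1])            # damping opposes the velocity
    return (gi[0] + sdx + di[0] + ei[0], gi[1] + sdy + di[1] + ei[1])

# An isolated, motionless robot feels only its goal force.
print(total_force((1.0, 0.0), (0.0, 0.0), (0.0, 0.0), []))  # (1.0, 0.0)
```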

For this kind of application, this SDS model has several advantages over other approaches to coordinating the motion of robot teams. Firstly, this kind of model allows the robot team structure to be maintained while considering the real kinodynamic constraints of the robots; that is, realistic and feasible motions are computed, as we will see in the experiments. Secondly, the link to a task allocation system is direct, because it can assign tasks (i.e. goals) as attractive forces in a natural manner. Thirdly, influences of the environment are also very naturally included in the model as repulsive forces acting on each robot, adapting the motion to the dynamism or shape of the environment. Approaches based on graph models do not incorporate the management of the system dynamics in real situations, making it necessary to handle dynamic behaviors with additional models. The SDS proposed here allows management of cooperative motion, taking into account the real dynamics of the robots and their connectivity, with a low computational burden.

In order to apply the forces computed by the SDS to each real robot, we use a Motion Generator for differential-drive mobile robots (the kind of robots used in the experiments). The Motion Generator transforms these forces (Fi) into linear and angular velocities according to the equation:

ẋi = P xi + Q Fi    (B.5)

By solving this differential equation we obtain the linear and angular velocities xi = (v, ω), complying with the kinodynamic constraints of the robots. The parameters of the P and Q matrices are tuned to obtain an overdamped behavior and to generate feasible trajectories for the real robots. Details about the model, stability issues, dynamic behavior, parameter tuning and adaptability to the shape and size of the environment can be found in [Urcola 08], whose results are applicable to the adapted model explained here.

Figure B.3: Spring-damper structure generated by the Prim-based algorithm, with the matrix of links generated for the minimum spanning tree.

From the point of view of the applied forces, the worst-case situation when carrying out a mission with the proposed techniques occurs when all the robots involved form a chain from the base to a goal, as described in §5.4. If the goal is unreachable, the spring-damper structures suffer the maximum forces, the highest being the force of the SDS connecting the base and the first robot. As mentioned above, the parameter ks is computed for this situation, to prevent the link quality between the first robot and the base from decreasing below a given threshold.
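The Motion Generator equation (B.5) can be integrated numerically; the sketch below uses explicit Euler steps with placeholder stable gain matrices (not the tuned values of [Urcola 08]). Under a constant force, x = (v, ω) converges to the steady state −P⁻¹QF.

```python
# Illustrative integration of Eq. (B.5), x_dot = P x + Q F, with explicit
# Euler steps. The 2x2 matrices are placeholder gains, and x = (v, w) holds
# the linear and angular velocity commands driven by the force F.
def step(x, F, P, Q, dt=0.1):
    """One Euler step of x_dot = P x + Q F for 2-vectors and 2x2 matrices."""
    def mv(M, v):
        return (M[0][0] * v[0] + M[0][1] * v[1],
                M[1][0] * v[0] + M[1][1] * v[1])
    px, qf = mv(P, x), mv(Q, F)
    return (x[0] + dt * (px[0] + qf[0]), x[1] + dt * (px[1] + qf[1]))

P = ((-1.0, 0.0), (0.0, -1.0))   # stable decay (overdamped-style behavior)
Q = ((0.5, 0.0), (0.0, 0.5))     # force-to-velocity coupling
x = (0.0, 0.0)
for _ in range(100):             # constant force: x converges to -P^-1 Q F
    x = step(x, (2.0, 0.0), P, Q)
print(round(x[0], 3), round(x[1], 3))  # 1.0 0.0
```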

Building the virtual structure

When the link quality between two nodes is high, there is no reason to put a SDS between them. But when the link quality starts to decrease, it is necessary to act to prevent a possible link loss. To that end, SDSs are created in our system before the link quality reaches unsafe values. However, using a virtual spring-damper structure to link two nodes restricts the mobility of both, and it should therefore be used only when really necessary. In fact, SDSs should be created only for those links whose quality is less than the safety threshold (st) (Fig. B.2), and only when really necessary to maintain network connectivity. To select the set of necessary SDSs, we use graph theory. Treating the network as a graph where the nodes are the vertices, we fix the weight of the edges as follows: if a link has a quality greater than the safety threshold, its weight is zero; otherwise, the weight is a function of the link quality — the lower the link quality, the higher the weight of the link. Then, we apply an algorithm based on Prim's Minimum Spanning Tree (MST) algorithm [Prim 57] to this graph, obtaining a spanning tree that contains the maximum possible number of zero-weight edges, and only the lightest among the remaining edges. The latter correspond to the links for which it is necessary to create a SDS.

Figure B.4: Evolution of the robots' movement and the links created between them.

Movement of the robots can imply changes in the shared link-quality information and, consequently, a change in the resulting spanning tree. However, the position of the robots is a continuous function of time (i.e. there are no jumps), and the springs are always created at their rest length. This means that the force generated at the creation of a SDS is null, which guarantees that the system does not suffer sudden forces, so smooth and feasible movements are produced. The process is completely decentralized: since all the robots have the same information about the link qualities in the network, each robot can autonomously calculate the tree and, consequently, the structure of the resulting SDS.
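The selection step can be sketched as follows: edges whose link quality is above the safety threshold get zero weight, weaker links get weights that grow as quality drops, and a Prim-style minimum spanning tree over the link-quality matrix yields the links that actually need a SDS. The threshold, the weight function and the example matrix are illustrative assumptions.

```python
# Prim-style MST over a link-quality matrix; MST edges with non-zero weight
# (i.e. links below the safety threshold) are the ones that need a SDS.
# Threshold and weight function are illustrative assumptions.
import heapq

ST = 60.0  # safety threshold on the link quality

def weight(quality):
    """Zero above the threshold; grows as the link quality drops."""
    return 0.0 if quality > ST else ST - quality

def sds_links(lqm):
    """Return the MST edges (as (i, j) pairs) that need a SDS [Prim 57]."""
    n = len(lqm)
    visited, links = {0}, []
    frontier = [(weight(lqm[0][j]), 0, j) for j in range(1, n)]
    heapq.heapify(frontier)
    while frontier and len(visited) < n:
        w, i, j = heapq.heappop(frontier)
        if j in visited:
            continue
        visited.add(j)
        if w > 0.0:                 # links above the threshold need no SDS
            links.append((i, j))
        for k in range(n):
            if k not in visited:
                heapq.heappush(frontier, (weight(lqm[j][k]), j, k))
    return links

# Three nodes in a chain: 0-1 is strong, 1-2 has dropped below the threshold.
lqm = [[0, 90, 20],
       [90, 0, 55],
       [20, 55, 0]]
print(sds_links(lqm))  # [(1, 2)]
```

Since the LQM is shared by all nodes, every robot computes the same tree locally, which is what makes the process decentralized.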

Communication and cooperative navigation experiments

The objective of this set of tests was to verify the correct behavior and the validity of the proposed spring-damper model, without involving the task allocation module at all. The idea was to place a base station (BS) and three robots (R1, R2 and R3, the latter being the head robot) in a chain a few meters apart from each other, and to assign a goal to the robot at the head of the chain. The correct behavior of the system is that in which the head of the chain moves freely until the link quality between it and its successor reaches the threshold of the controlled zone (Fig. B.2). At that moment, a SDS should be created by the system between these two robots. The second robot then starts to move, pulled by the SDS, until the same situation occurs with the third robot. On the other hand, the base station is fixed, so the system should either stretch enough to allow the head robot to reach the goal, or stall if the head robot cannot reach it (due to insufficient lengthening of the SDSs). We reproduced this experiment in simulation and in a real environment. The results of these experiments are presented in the following sections.

Figure B.5: Linear velocity during the simulation and evolution of the relative distances between consecutive robots.

Figure B.6: Snapshots of the robots during the experiment.

Distance-based simulation experiment

The Player/Stage platform allows the simulation of robots in a simulated environment using our independent code in each robot; the user code is the same as that used on the real robots. However, since simulation implies the use of virtual robots running on desktop machines (and therefore not mobile), the use of a real LQM was not possible. As a consequence, the simulation was performed using a function of the distance between the simulated robots instead of the link quality. These tests allowed us to verify the correct implementation of the system. Results are shown in Fig. B.4 and Fig. B.5. At t1 the distance between R3 and R2 reached the rest length of the virtual spring and the first SDS was created. Consequently, the speed of the head robot decreased and R2 started to move. At t2 the same event occurred with R2 and R1, and the speed of R3 also changed. Finally, at t3 the third SDS, between R1 and the base station (fixed robot), was created. All the robots then began to decrease their speed until reaching a complete stop.

Distance-based real experiment

We reproduced the same experiment already performed in simulation, in a real environment using three Pioneer 3 robots (Fig. B.6). The experiment is shown in Extension 1. The robots were equipped with an on-board PC with a Pentium III processor at 800MHz and an 802.11b Cisco 350 Series wireless card. The robots were also equipped with differential GPS to provide accurate knowledge of their own positions. Results are shown in Fig. B.7. Despite the noise introduced by the GPS and other equipment (e.g. the compass for orientation), the results were similar to those obtained in simulation.

Figure B.7: Linear velocity and distance during the real experiment and evolution of the relative distances between consecutive robots.

Quality-based real experiment

The same experiment was performed using the sensed link qualities. In this case we used the function γ of the link quality (the better the link quality, the lower the function). The results are shown in Fig. B.8. As expected, the graphs are noisier than the two preceding ones, due to the fluctuation of the radio signal among nodes. Fig. B.8 clearly shows the moments at which the SDSs were created by the system. At t1 the value of γ(s) between R3 and R2 increased beyond the threshold, causing the first SDS to be created, and robot R2 started to move. At t2 robot R1 started to move after the link R2−R1 overcame the threshold. The delay between this event and t2 was due to a temporarily good link between R2 and the base station (not reported in the graph to avoid confusion) that caused the creation of an alternative multi-hop route and allowed R2 to move a little before the creation, precisely at t2, of the SDS linking it with R1. This shows clearly that in some situations the link quality among nodes is not directly related to the distance. Between t2 and t3 both links were stable (that is, γ(s) was over the threshold for both SDSs) and the velocities of R1, R2 and R3 were more or less constant, despite a little oscillation. At t3, γ(s) for the R1−base link (which grew rapidly during the movement of the system) reached the threshold, causing the third SDS to be created and R1 to stop. At the same time, links R2−R3 and R2−R1 reached a similar γ value and thus a null resultant force. This caused the slowdown of R2 until a complete stop at around t4, which in turn caused a slowdown of R3. After that, fluctuation of the radio signal among robots caused further movements until a complete stop occurred at t5 due to force equilibrium. A temporary signal quality improvement between R1 and the base caused an advancement of R1, allowing R3 to reach the goal at t5. As can be deduced from the experiment, using the link quality instead of the distance produces a different behavior, so using distance alone to create robot connections does not work.
Figure B.8: Linear velocity and evolution of γ(s) among robots during the quality-based real experiment. Link B.1: http://robots.unizar.es/data/videos/urus/ijrr-lineal.mp4

Another conclusion is that the SDS model is well suited to controlling the dynamic system motion in real situations, providing in this respect results similar to those obtained in simulation.

Effect of obstacles on the link quality

In order to verify the effect caused on our system by the presence of obstacles and the absence of line of sight, we performed an experiment in an indoor environment, shown in Fig. B.9. We used two mobile robots and a fixed base station, and recorded the link quality among the robots and the base station. The goal of the experiment was to move robot R2 away from R1 along a corridor and observe the behavior of the link quality and of the robots when the mobile robot turned around a corner and entered a room (see Fig. B.9). As can be seen in Fig. B.9.e, the function of the link quality, γ(s), between R1 and R2 remained more or less stable up to the moment at which R2 started to turn to enter the room (t1). At that moment the line of sight between the antennas of R2 and R1 was partially cut off by the laser head of the former. The function γ grew, and a SDS was created between both robots at time t2 (Fig. B.9.b). At this moment R1 started to move to reestablish a good link quality with the leader robot. Despite the movement, γ(s) between R2 and R1 continued growing because R2 entered the room (t3). Nevertheless, γ(s) between R1 and the base station grew very slowly (there still being line of sight between them) up to the moment (t4) at which the threshold was reached and a second SDS was created by the system (Fig. B.9.c). After that, both robots started to slow down, because the pulling forces imposed by the SDSs and the respective γ functions oscillated a little around the equilibrium point. At this moment the robots could not move forward without losing communication with the base station.

Figure B.9: Evolution of the link quality in an indoor environment.

B.2. Real-Time Wireless Multi-hop Protocol

Introduction and related work

In the environments under discussion, humans and robots self-organize in a mobile ad-hoc network (MANET). These networks are usually multi-hop, that is, nodes also act as routers, to increase coverage and to deal with the changing network topology. Additionally, a MANET for robotic team applications must provide Hard Real-Time (HRT) traffic and connectivity guarantees. As is well known, all control applications have strict real-time requirements. In distributed control applications, such as those of robotic teams, these requirements extend to communications: in order to cooperate, robots must exchange information concerning both the environment and their own state. The real-time constraints of multi-robot applications have been recognized in the literature, e.g., in [Facchinetti 05, Akyildiz 04, Franchino 05]. Real-time support is therefore a key feature. In [Basu 04] movement control algorithms are proposed from the communication fault-tolerance perspective. The idea is to keep the network biconnected, that is, to ensure there are always at least two alternative communication paths between each pair of robots. However, the problem is seen only from the network connectivity point of view, and the fulfillment of mission objectives is not taken into account. Protocols for ad-hoc networks typically focus on issues such as maximizing throughput or minimizing average message delay, neglecting the indeterminism introduced. Moreover, most commercial low-level network protocols (e.g. 802.11, 802.15.4, etc.) do not provide timeliness guarantees on network transmissions, due to packet collisions, exponential back-offs, and the false blocking problem [Ray 03]. Consequently, specific protocols aimed at eliminating indeterminism and supporting real-time traffic have been developed in recent years. In some solutions, a node that needs to transmit occupies the medium with pulses of energy [Sobrinho 96, Sheu 04].
The duration of the pulses is proportional to the priority of the node. In general, such solutions do not address the hidden terminal problem; moreover, they require hardware modification. In [Lee 02] and [Donatiello 03] two token-passing based solutions are proposed. In these works, the network must have a ring topology and each node knows its successor and predecessor. However, the need to maintain the ring topology causes serious problems with mobility. In [Facchinetti 05] another solution was proposed at the Medium Access Control (MAC) level, based on a time-division scheme to avoid collisions and using an implicit Earliest Deadline First (EDF) scheduling to provide real-time guarantees. As mentioned in §5, our solution needs information about the link quality among nodes and the dynamic state of each of the robots. Consequently, the communication module must be capable of providing the link quality between each pair of robots and transporting the kinematic information in a multi-hop network with real-time guarantees. Furthermore, message priority support is needed if multiple flows of information (e.g. laser scans, camera images, etc.) are required in the network, in order to guarantee real-time requirements. Link quality measurement or prediction is still a challenging question. Several estimators have been proposed, such as Received Signal Strength Indication (RSSI), Signal to Noise Ratio (SNR), Packet Delivery Ratio (PDR), or Bit Error Rate (BER) [Vlavianos 08]. All of them have limitations. RSSI does not capture the amount of destructive interference on links. It is extremely hard to accurately compute the SNR, because commercial hardware does not provide noise information while receiving packets, or simply provides no noise level information at all. The use of PDR involves a large latency in estimating the link quality [Souryal 06], and the BER computation introduces significant overheads and is sensitive to bit sequences [Vlavianos 08]. We use RSSI to estimate the link quality because commercial cards provide this measure directly (no overhead), and the use of a token-passing protocol avoids interference between the nodes of the network; this assumes that no nodes outside the network interfere with the communication. However, instantaneous RSSI or SNR estimates can have temporary peaks due to small-scale variations in the multipath propagation. Thus, filters such as moving averages or Kalman filters (e.g. [Farkas 08]) must be used to smooth the link quality estimation. For this work we have used RSSI filtered with a moving average; future improvements in hardware and estimation techniques can be adopted in the proposed system. We have used the Real-Time Wireless Multi-hop Protocol (RT-WMP) [Tardioli 07] because it provides the real-time multi-hop communication needed and a measure of the link qualities. However, any other protocol supporting multi-hop traffic and an estimation of the link qualities could be used in the system proposed in §5.
An extension was made to the protocol to allow the broadcast communication needed to disseminate kinematic information of the robots to all the nodes of the network.
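The moving-average smoothing of the instantaneous RSSI mentioned above can be sketched as follows; the window size and the sample stream are illustrative assumptions.

```python
# Moving-average filter over the last `window` instantaneous RSSI samples,
# attenuating transient multipath dips. Window size is an assumption.
from collections import deque

class RssiFilter:
    def __init__(self, window=3):
        self.samples = deque(maxlen=window)

    def update(self, rssi):
        """Add one instantaneous RSSI reading and return the smoothed value."""
        self.samples.append(rssi)
        return sum(self.samples) / len(self.samples)

f = RssiFilter(window=3)
readings = [70, 72, 40, 71]          # one transient multipath dip at 40
smoothed = [f.update(r) for r in readings]
print(smoothed)  # the dip is attenuated (≈60.7 instead of 40)
```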

B.2.1. The Real-Time Wireless Multi-hop Protocol

RT-WMP is a novel protocol for MANETs that supports hard real-time traffic. To that end, in RT-WMP the end-to-end message delay has a bounded and known duration, and the protocol manages global static message priorities. RT-WMP supports multi-hop communication to increase network coverage. The protocol has been designed to connect a relatively small group of mobile nodes (about 15 units). It is based on a token-passing scheme and is completely decentralized. The protocol manages topology changes through the exchange of a matrix containing the link quality among nodes. It can run on commercial hardware without modifications, and is currently implemented using commercial low-cost 802.11 devices. In RT-WMP, each node has a transmission and a reception priority queue. Each message is identified by a priority level, and messages with the same priority are stored in FIFO order. The protocol works in three phases (Fig. B.10): the Priority Arbitration Phase, in which the nodes reach a consensus about which one holds the highest-priority message in the network at a given time; the Authorization Transmission Phase, in which an authorization to transmit is sent to the node that holds that message; and the Message Transmission Phase, in which the message is sent to the destination node. The routing of authorization and data messages is based on the Link Quality Matrix, which keeps an estimation of the quality of the links between the nodes of the network. This matrix is shared among the members of the network.

Figure B.10: A hypothetical situation described by the network graph and the corresponding LQM. The hop sequence of the protocol is also shown.

B.2.2. The Link Quality Matrix

To describe the topology of the network, RT-WMP defines a weighted network connectivity graph whose weights are functions of the radio signal between pairs of nodes, and are indicators of the link quality between them. These weights are calculated using a moving average of the instantaneous value of the Received Signal Strength Indication (RSSI), the received signal strength in a wireless environment, expressed in an arbitrary unit. Each column of the Link Quality Matrix (LQM) describes the link quality of one node with its neighbors. The nodes use this information to find a reliable path over which to send messages or authorizations, and to choose the node to send the token to. The information contained in the LQM is shared among all nodes, since it travels with the token and reaches all members of the network in every Priority Arbitration Phase. Each node is responsible for updating the corresponding column of the LQM to inform the other nodes of local topology changes. The information about the link quality with the neighbors is obtained by promiscuously monitoring the medium, in order to obtain the radio signal information provided by the network adapter and update the LQM.

B.2.3. Adding multi-cast capabilities

RT-WMP as initially defined has no multi-cast capability. This means that, to implement our system, the information on the position of each node (which must reach all the other nodes periodically; see §B.2.4 for details) would have to travel in the network as unicast messages. In an n-node network, this means that n² messages would be needed to disseminate the data throughout the whole network. Since this type of information must be exchanged frequently, the corresponding network load can be guaranteed by RT-WMP only for a small number of nodes. Moreover, all the traffic would be used up in this task and no bandwidth would be available for other flows. To solve this, we extended the protocol to allow transportation of small quantities of data in the tail of the frames (Fig. B.11). In other words, at the end of the protocol frames we added a space that nodes can use to publish information that all the other nodes can read. In this way, the information reaches all the nodes in each Priority Arbitration Phase, implementing a sort of broadcast communication. In our system in particular, we divided the tail of the frames into n parts (n being the number of members of the network). In each part, each node publishes its kinematic variables, which all the other nodes can read when they receive a frame.

Figure B.11: Example of a modified frame. All but the last field are used in the basic RT-WMP protocol. In the tail, kinematic information of the robots travels with the frame to reach all the nodes.

B.2.4. Timing issues

To compute the cooperative movement based on the spring-damper analogy, the nodes have to share their kinematic information (absolute pose and velocities) so as to compute the forces acting on the system from the same set of data. However, due to the distributed nature of the system and the communication needed to share these data, perfect synchronization is not possible, and thus the robots have to work with slightly different information. The time displacement among the kinematic data on different robots must be sufficiently short to guarantee a correct system dynamic. The task of propagating the shared data is carried out by the broadcast service of the COM. In each iteration of the control loop, the CNM publishes its kinematic information, which is propagated by the network. This introduces a (known) propagation delay tp that can be calculated as [Tardioli 07]:

tp = (4n − 8) tt′ + (n − 2) ta′ + (n − 2) tm′    (B.6)

where n is the number of nodes, tt′ the time needed to send a token of the RT-WMP protocol considering the extra data in the tail, and ta′ and tm′ the times needed to send an authorization and a message, respectively, under the same conditions. However, in the worst case a node could have to wait the same amount of time before being able to send the data (if it has just sent its previous update), so the update time must be considered as tact = 2 · tp. Moreover, when a node receives fresh data from the network, the data is stored in the RT-WMP broadcast queue until the control loop pops it to compute a new system status. Thus, the information can wait in the queue up to a whole control loop period Tcl before being used. Taking all these terms into account, the time displacement of the shared data with which any control loop (except the local one) works in any iteration is, in the worst case, tdisp = Tcl + tact. On the other hand, the time displacement between the most recent pose information provided by the hardware and the data with which any local control loop works in any iteration (and which it sends to the other robots) can be, in the worst case, equal to the control loop period if, as in our system, both events are not synchronized. This can happen if the control loop reads the information just before the moment at which the microcontroller updates the data. This must be taken into account when choosing the control period, to guarantee a strict concordance between reality and the control-loop view of it and to allow a correct behavior of the system. In our system, the hardware provides new pose information every Tµc = 100 ms and the control loop iterates with the same period (Tcl = 100 ms). For the network considered in the experiments (a 4-node network, 11 Mbps data rate, maximum data unit of 1500 bytes for unicast messages and 14 bytes of kinematic information per robot), the value of the propagation time is tp = 6.05 ms and thus tact = 12.1 ms. This implies that we guarantee a maximum time displacement of approximately tdisp = 112 ms. The maximum time displacement between hardware data and control loop data is, instead, at most tlocal hw disp = Tµc = Tcl = 100 ms for the local node and tremote hw disp = tdisp + Tcl = 212 ms for the remote nodes, in the worst case.
Notice that the value of tact depends on the number of nodes, on the maximum data unit for unicast messages and on the data rate. As an example, in a 10-node, 54 Mbps network its value is about 30 ms. These values are perfectly acceptable for the system, since its dominant dynamic time constant is about one second.
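As a quick numerical check, the worst-case bounds above can be reproduced with a few lines of code (a minimal sketch; the propagation time tp is simply taken as the measured value reported in the text, not computed from the protocol parameters):

```python
# Worst-case data-freshness bounds for the RT-WMP-based control loops,
# reproducing the figures given in the text.

def worst_case_displacements(t_p, t_cl):
    """t_p: network propagation time; t_cl: control-loop period (both in ms)."""
    t_act = 2 * t_p          # a node may wait t_p (just sent) plus t_p to propagate
    t_disp = t_cl + t_act    # shared-data age seen by any non-local control loop
    t_local_hw = t_cl        # unsynchronized hardware update vs. control-loop read
    t_remote_hw = t_disp + t_cl
    return t_act, t_disp, t_local_hw, t_remote_hw

# 4-node, 11 Mbps network of the experiments: t_p = 6.05 ms, T_cl = 100 ms.
t_act, t_disp, t_local, t_remote = worst_case_displacements(6.05, 100)
print(round(t_act, 2), round(t_disp, 2), t_local, round(t_remote, 2))
# → 12.1 112.1 100 212.1
```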


Appendix C

Robots used in this thesis



C.1. UniZar robots

Our university robots, familiarly known as the CODEX CONAN team¹, form a team of four Pioneer 3 units (Fig. C.1). One of them is a two-wheeled indoor 3-DX model, while the rest are four-wheeled outdoor 3-AT models. Their base computational capabilities are provided by an embedded Pentium III processor at 800 MHz with 512 MB of RAM and WiFi connectivity. They are currently fitted with a structure that allows the use of additional laptops and the mounting of an external antenna on its very top, free from obstacles in all directions, which is important for our connectivity work. All robots are equipped with a SICK LMS200 range finder, sonar arrays and GPS receivers. An external differential GPS antenna provides precise localization in outdoor settings (like the experiment described in chapter 5). They also carry an assortment of cameras and pan-tilt units, but these have not been used in our work to date. A very recent addition to this family is the robot described in §C.3, a prototype that has not yet been used in experiments.

¹COoperative DEployment eXperiments with COnstrained NAvigation Networking.

Figure C.1: University of Zaragoza robot team.


C.2. URUS robots

The URUS project made use of a very heterogeneous collection of robots for its experiments. We describe here only the ones that were used in our task allocation work. Besides our own robots, just described, the following robots were deployed in the Campus Nord of the Universitat Politècnica de Catalunya.

Tibi and Dabo: These two humanoid robots are built on Segway platforms. They carry front and rear laser range finders, mobile arms and a frontal touchscreen for human interaction (Fig. C.2).

Dala: This LAAS-based robot (Fig. C.3) is a four-wheeled outdoor unit carrying a laser range finder and a stereo camera for navigation.

Romeo: This robot from the Universidad de Sevilla is an instrumented golf cart capable of autonomous navigation with car-like kinematics (Fig. C.3).

More pictures and videos of the full URUS robotic assets are available at the link provided in Fig. C.3.

Figure C.2: Segway-based Tibi and Dabo robots.

Figure C.3: Dala (left) and Romeo (right) robots.
Link C.1: http://urus.upc.es/NuevoRobots.html


C.3. A low-cost LEGO robot for indoor research

We have devoted some time to building a prototype of a mobile, small-size indoor robot, suitable for testing the CONNECTTREE strategy in our laboratory. The following design issues were considered:

Price: a cheap robot would enable larger team sizes, which in turn can better demonstrate multi-robot research.

Features: commodity hardware (e.g. nettops) was the preferred computing device, since this allows the use of Player/Stage and our software without modification.

Maintainability: hardware issues can be a showstopper, involving high expenses and long delays while spare parts are provided or fitted.

The outcome of this side project is our first LEGO prototype (Fig. C.4). It is a complete solution for our stated criteria, which relies on the following elements:

A small yet sturdy LEGO frame, capable of carrying an on-board nettop that connects via USB to the NXT LEGO brick. This frame can be built with the pieces provided in the academic LEGO Mindstorms boxes, fulfilling the easy-maintenance purpose.

A new driver, contributed to the Player project, which enables easy control of LEGO motors (one degree of freedom).

Figure C.4: LEGO prototype.


A new driver, contributed to the Player project, which takes two one-degree-of-freedom interfaces (such as the one just described) and provides a generic differential-drive interface.

A USB-powered Hokuyo laser range finder, which enables effective localization. This is the most expensive component in the robot.

All these elements fulfill our design criteria, providing a small-scale solution that integrates everything required for our ongoing CONNECTTREE research.
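The composition performed by that differential-drive driver can be sketched as follows. This is a minimal illustration of the underlying kinematics only, not the actual Player driver source; the function name and the axle/wheel parameters are purely illustrative:

```python
import math

# Sketch: building a differential-drive command on top of two
# one-degree-of-freedom motor interfaces (left and right wheels).

def diffdrive_to_wheels(v, w, axle_length, wheel_radius):
    """Map a (v, w) body command [m/s, rad/s] to wheel angular speeds [rad/s]."""
    v_left = v - w * axle_length / 2.0   # linear speed of the left wheel
    v_right = v + w * axle_length / 2.0  # linear speed of the right wheel
    return v_left / wheel_radius, v_right / wheel_radius

# Pure rotation: the wheels must spin at equal speed in opposite directions.
wl, wr = diffdrive_to_wheels(0.0, math.pi / 4, axle_length=0.12, wheel_radius=0.028)
```

Each resulting wheel speed is then forwarded to one of the two underlying one-degree-of-freedom interfaces, which is exactly the division of labor between the two contributed drivers.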



Acronyms

AI Artificial Intelligence
BER Bit Error Rate
CNM Cooperative Navigation Module
COM COmmunication Module
DARPA Defense Advanced Research Projects Agency
DARS Distributed Autonomous Robotic Systems
DCOP Distributed Constraint Optimization Problem
DLST Depth-Limited Spanning Tree
EDF Earliest Deadline First
EURON EUropean RObotics research Network
EUROP EUropean RObotics technology Platform
EXPRES Técnicas de EXPloración avanzadas en tareas de REScate
FIFO First In, First Out
GPL GNU General Public License
GPS Global Positioning System
GUI Graphical User Interface
HRT Hard Real-Time
HTN Hierarchical Task Network
ICRA International Conference on Robotics and Automation
IEEE Institute of Electrical and Electronics Engineers
IJRR International Journal of Robotics Research
IROS International conference on RObots and Systems
LNCS Lecture Notes in Computer Science
LQM Link Quality Matrix
MAC Medium Access Control

MANET Mobile Ad-hoc NETwork
MDP Markov Decision Process
MST Minimum Spanning Tree
MTSP Multiple Traveling Salesmen Problem
NERO NEtworked RObots
NRS Networked Robotic System
OAP Optimal Assignment Problem
OLSR Optimized Link State Route Protocol
PCMCIA Personal Computer Memory Card International Association
PDR Packet Delivery Ratio
POMDP Partially Observable Markov Decision Process
POSG Partially Observable Stochastic Game
RFID Radio Frequency IDentification
RMA Rate-Monotonic Analysis
RSSI Received Signal Strength Indication
RT-WMP Real-Time Wireless Multi-Hop Protocol
SANCTA Simulated ANnealing Constrained Task Allocation
SDS Spring-Damper System
SG Stochastic Game
SNR Signal to Noise Ratio
TAM Task Allocation Module
TCP Transmission Control Protocol
TESSEO TEams of robots for Service and Security missiOns
TSP Traveling Salesman Problem
UAV Unmanned Air Vehicle
UDP User Datagram Protocol
URUS Ubiquitous networking Robotics in Urban Settings
USAR Urban Search And Rescue
USB Universal Serial Bus
WSN Wireless Sensor Network
YARP Yet Another Robotic Platform

Bibliography

[AdaCore 04]

AdaCore. Ada Libre Site: https://libre.adacore.com, 2004.

[Akyildiz 04]

I.F. Akyildiz & I.H. Kasimoglu. Wireless sensor and actor networks: research challenges. Ad Hoc Networks, vol. 2, no. 4, pages 351–367, 2004.

[Alba 04]

Enrique Alba & Bernabé Dorronsoro. Solving the Vehicle Routing Problem by Using Cellular Genetic Algorithms. Lecture Notes in Computer Science, vol. 3004, pages 11–20, 2004.

[Amigoni 05]

Francesco Amigoni & Alessandro Gallo. A Multi-Objective Exploration Strategy for Mobile Robots. In IEEE Int. Conf. in Robotics and Automation (ICRA’05), pages 3861–3866, 2005.

[Antonelli 05]

G. Antonelli, F. Arrichiello & S. Chiaverini. The null-spacebased behavioral control for mobile robots. In IEEE Computational Intelligence in Robotics and Automation (CIRA’05), pages 15–20, 2005.

[Applegate 03]

D. Applegate, R. Bixby, V. Chvatal & W. Cook. Concorde TSP Solver: http://www.tsp.gatech.edu/concorde.html, 2003.

[Avgeriou 05]

P. Avgeriou & U. Zdun. Architectural patterns revisited — a pattern language. In 10th European Conf. on Pattern Languages of Programs (EuroPlop’05), pages 1–39, July 2005.

[Baker 07]

Theodore P. Baker & Michele Cirinei. Principles of distributed systems, chapter Brute-Force Determination of Multiprocessor Schedulability for Sets of Sporadic Hard-Deadline Tasks, pages 62–75. Springer Berlin / Heidelberg, 2007.

[Balch 98]

Tucker Balch & Ronald C. Arkin. Behavior-based Formation Control for Multi-robot Teams. IEEE Transactions on Robotics and Automation, vol. 14, no. 6, December 1998.

[Basu 04]

P. Basu & J. Redi. Movement Control Algorithms for Realization of Fault Tolerant Ad Hoc Robot Networks. IEEE Network, vol. 18, no. 4, pages 36–44, July 2004.

[Batalin 02]

Maxim A. Batalin & Gaurav S. Sukhatme. Spreading Out: A Local Approach to Multi-robot Coverage. In 6th Int. Symp. on Distributed Autonomous Robotics Systems (DARS’02), 2002.


[Belker 03]

Thorsten Belker, Martin Hammel & Joachim Hertzberg. Learning to Optimize Mobile Robot Navigation Based on HTN Plans. In IEEE Int. Conf. on Robotics and Automation (ICRA’03), Taipei, Taiwan, 2003.

[Bererton 03]

Curt Bererton, Geoff Gordon, Sebastian Thrun & Pradeep Khosla. Auction Mechanism Design for Multi-Robot Coordination. In S. Thrun, L. Saul & B. Schölkopf, editors, Conf. on Neural Information Processing Systems (NIPS). MIT Press, 2003.

[Berhault 03]

M. Berhault, H. Huang, P. Keskinocak, Sven Koenig, W. Elmaghraby, P. Griffin & A. Kleywegt. Robot Exploration with Combinatorial Auctions. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’03), Las Vegas, October 2003.

[Bernstein 02]

Daniel S. Bernstein, Robert Givan, Neil Immerman & Shlomo Zilberstein. The Complexity of Decentralized Control of Markov Decision Processes. Math. Oper. Res., vol. 27, no. 4, pages 819– 840, 2002.

[Bertsekas 90]

D. P. Bertsekas. The auction algorithm for assignment and other network flow problems: a tutorial. Interfaces, vol. 20, no. 4, pages 133–149, 1990.

[Bicchi 08]

Antonio Bicchi & contributors. EURON Research Roadmap DR.1.3 V.4.1. Technical report, University of Karlsruhe, European Robotics Network, August 2008.

[Bonarini 06]

Andrea Bonarini, Wolfram Burgard, Giulio Fontana, Matteo Matteucci, Domenico Giorgio Sorrenti & Juan Domingo Tardos. RAWSEEDS: Robotics Advancement through Web-publishing of Sensorial and Elaborated Extensive Data Sets. In Workshop Benchmarks in Robotics Research at IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’06), 2006.

[Bonsignorio 09a]

Fabio P. Bonsignorio, John Hallam & Angel P. del Pobil. Defining the Requisites of a Replicable Robotics Experiment. In Robotics: Science And Systems (RSS’09): Workshop On Good Experimental Methodology In Robotics, Seattle, WA, USA, June 2009.

[Bonsignorio 09b]

Fabio P. Bonsignorio, Angel P. Del Pobil & John Hallam, editors. Good experimental methodology in robotics, Seattle, WA, USA, June 2009.

[Botelho 99]

S. Botelho & R. Alami. M+: A Scheme for Multi-Robot Cooperation Through Negotiated Task Allocation and Achievement. In IEEE Int. Conf. on Robotics and Automation (ICRA’99), pages 1234–1239, 1999.

[Bowling 02]

Michael H. Bowling & Manuela M. Veloso. Multiagent learning using a variable learning rate. Artificial Intelligence, vol. 136, no. 2, pages 215–250, 2002.


[Bowling 04]

Michael Bowling, Brett Browning & Manuela Veloso. Plays as effective multiagent plans enabling opponent-adaptive play selection. In Int. Conf. on Automated Planning and Scheduling (ICAPS’04), 2004.

[Briot 00]

Emmanuel Briot, Christophe Baillon & Martin Krischik. XmlAda: https://libre.adacore.com/xmlada/, 2000.

[Brooks 05]

Alex Brooks, Tobias Kaupp, Alexei Makarenko, Stefan Williams & Anders Orebäck. Towards Component-Based Robotics. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’05), pages 3567–3572, Edmonton, Alberta, Canada, August 2005.

[Brummit 98]

Barry Brummit & Anthony (Tony) Stentz. GRAMMPS: A Generalized Mission Planner for Multiple Mobile Robots. In IEEE Int. Conf. on Robotics and Automation (ICRA’98), May 1998.

[Bruyninckx 01]

Herman Bruyninckx. Open Robot Control Software: the OROCOS project. In IEEE Int. Conf. on Robotics and Automation (ICRA’01), 2001.

[Burdakov 10]

Oleg Burdakov, Patrick Doherty, Kaj Holmberg, Jonas Kvarnström & Per-Magnus Olsson. Relay Positioning for Unmanned Aerial Vehicle Surveillance. The International Journal of Robotics Research, vol. 29, pages 1069–1087, July 2010.

[Burgard 05]

W. Burgard, M. Moors, C. Stachniss & F. Schneider. Coordinated Multi-Robot Exploration. IEEE Transactions on Robotics, vol. 21, no. 3, pages 376–378, 2005.

[Burion 04]

Steve Burion. Human detection for robotic urban search and rescue. Master’s thesis, Carnegie Mellon University, 2004.

[Caloud 90]

P. Caloud, W. Cho, J. C. Latombe, C. L. Pape & M. Yim. Indoor automation with many mobile robots. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’90), 1990.

[Castellanos 99]

J.A. Castellanos & J.D. Tardós. Mobile robot localization and map building: A multisensor fusion approach. Kluwer Academic Publishers, Boston, 1999.

[Causse 95]

O. Causse & L. H. Pampagnin. Management of a multi-robot system in a public environment. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’95), volume 2, page 2246, 1995.

[Cerny 85]

V. Cerny. Thermodynamical approach to the Traveling Salesman Problem: An efficient simulation algorithm. Journal of Optimization Theory and Applications, vol. 45, no. 1, pages 41–51, 1985.

[Chlamtac 03]

Imrich Chlamtac, Marco Conti & Jennifer J.-N. Liu. Mobile ad hoc networking: imperatives and challenges. Ad Hoc Networks, vol. 1, no. 1, pages 13–64, 2003.


[Choset 97]

Howie Choset & Philippe Pignon. Coverage Path Planning: The Boustrophedon Decomposition. In Int. Conf. on Field and Service Robotics, 1997.

[Choset 01]

Howie Choset. Coverage for robotics – A survey of recent results. Annals of Mathematics and Artificial Intelligence, vol. 31, pages 113–126, 2001.

[Clement 99]

Bradley J. Clement & Edmund H. Durfee. Top-down search for coordinating the hierarchical plans of multiple agents. In 3rd Conf. on Autonomous Agents, pages 252–259, New York, NY, USA, 1999. ACM Press.

[Dahl 03]

Torbjorn S. Dahl, Maja J. Matarić & Gaurav S. Sukhatme. Multirobot task-allocation through vacancy chains. In IEEE Int. Conf. on Robotics and Automation (ICRA’03), pages 2293–2298, September 2003.

[Dias 02]

M Bernardine Dias & Anthony (Tony) Stentz. Opportunistic Optimization for Market-Based Multirobot Control. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’02), September 2002.

[Dias 04]

M. Bernardine Dias, Marc B. Zinck, Robert Michael Zlot & Anthony (Tony) Stentz. Robust Multirobot Coordination in Dynamic Environments. In IEEE Int. Conf. on Robotics and Automation (ICRA’04), April 2004.

[Dias 05]

M. Bernardine Dias, Robert Michael Zlot, Nidhi Kalra & Anthony (Tony) Stentz. Market-based multirobot coordination: a survey and analysis. Technical report CMU-RI-TR-05-13, Robotics Institute, Carnegie Mellon University, April 2005.

[Donatiello 03]

L. Donatiello & M. Furini. Ad Hoc Networks: A Protocol for Supporting QoS Applications. Technical report, Department of Computer Science, University of Piemonte Orientale, 2003.

[Emery-Montemerlo 05] Rosemary Emery-Montemerlo, Geoff Gordon, Jeff Schneider & Sebastian Thrun. Game Theoretic Control for Robot Teams. In IEEE Int. Conf. on Robotics and Automation (ICRA’05), pages 1175–1181, 2005.

[Erol 94]

Kutluhan Erol, James A. Hendler & Dana S. Nau. HTN Planning: Complexity and Expressivity. In Conf. on Artificial Intelligence (AAAI’94), pages 1123–1128, 1994.

[Facchinetti 05]

T. Facchinetti, G. Buttazzo & L. Almeida. Dynamic resource reservation and connectivity tracking to support real-time communication among mobile units. EURASIP J. Wirel. Commun. Netw., vol. 5, no. 5, pages 712–730, 2005.

[Farkas 08]

K. Farkas, T. Hossmann, F. Legendre, B. Plattner & S.K. Das. Link Quality Prediction in Mesh Networks. Computer Communications, vol. 31, pages 1497–1512, 2008.


[Ferreira 08]

Paulo Ferreira, Felipe Boffo & Ana Bazzan. Using swarmgap for distributed task allocation in complex scenarios, volume 5043 of Lecture Notes in Computer Science, pages 107–121. Springer Berlin / Heidelberg, 2008.

[Force 03]

Internet Engineering Task Force. RFC3626: Optimized Link State Routing Protocol (OLSR), 2003.

[Franchino 05]

G. Franchino, G. Buttazzo & T. Facchinetti. A distributed architecture for mobile robots coordination. In 10th IEEE Int. Conf. on Emerging Technologies and Factory Automation (EFTA’05), 2005.

[Fudenberg 91]

D. Fudenberg & J. Tirole. Game theory. MIT Press, Cambridge, MA, 1991.

[Gamma 95]

E. Gamma, R. Helm, R. Johnson & J. Vlissides. Design patterns: Elements of reusable object-oriented software. Addison-Wesley Professional Computing. Addison-Wesley Publishing Company, New York, NY, 1995.

[Gancet 04a]

Jérémi Gancet, Gautier Hattenberger, Rachid Alami & Simon Lacroix. Distributed decision in multi-UAV systems: architecture and algorithms. Technical report 04566, LAAS, October 2004.

[Gancet 04b]

Jérémi Gancet & Simon Lacroix. Embedding heterogeneous levels of decisional autonomy in multi-robot systems. In Int. Symp. on Distributed Autonomous Robotic Systems (DARS’04), 2004.

[Gancet 05]

J. Gancet, G. Hattenberger, R. Alami & S. Lacroix. Task Planning and Control for a multi-UAV system: architecture and algorithms. In IEEE Int. Conf. on Intelligent Robots and Systems (IROS), Edmonton, Canada, 2005.

[Garey 79]

Michael R. Garey & David S. Johnson. Computers and intractability: A guide to the theory of NP-completeness. W. H. Freeman & Co., 1979.

[Garey 97]

M. Garey, R. Graham & D. Johnson. The complexity of computing Steiner minimal trees. SIAM Journal of Applied Mathematics, vol. 32, pages 835–859, 1997.

[Gerkey 02]

Brian P. Gerkey & Maja J. Matari´c. Sold!: Auction methods for multi-robot coordination. IEEE Transactions on Robotics and Automation, vol. 18, no. 5, pages 758–768, October 2002.

[Gerkey 03a]

Brian P. Gerkey & Maja J. Matari´c. Multi-robot task allocation: analyzing the complexity and optimality of key architectures. In IEEE Int. Conf. on Robotics and Automation (ICRA’03), pages 3862–3868, 2003.

[Gerkey 03b]

Brian P. Gerkey, Richard T. Vaughan & Andrew Howard. The Player/Stage Project: Tools for Multi-Robot and Distributed Sensor Systems. In Int. Conf. on Advanced Robotics, pages 317– 323, 2003.


[Gerkey 03c]

Brian Paul Gerkey. On Multi-Robot Task Allocation. PhD thesis, University of Southern California, August 2003.

[Gerkey 04a]

Brian P. Gerkey & Maja J. Matari´c. A formal analysis and taxonomy of task allocation in multi-robot systems. Int. J. of Robotics Research, vol. 23, no. 9, pages 939–954, September 2004.

[Gerkey 04b]

Brian P. Gerkey & Maja J. Matari´c. Are (explicit) multi-robot coordination and multi-agent coordination really so different? In AAAI Spring Symposium on Bridging the Multi-Agent and Multi-Robotic Research Gap, pages 1–3, 2004.

[Gerkey 05]

Brian P. Gerkey, Sebastian Thrun & Geoff Gordon. Parallel stochastic hill-climbing with small teams. Multi-Robot systems: from swarms to intelligent automata, vol. 3, pages 65–77, 2005.

[Gerkey 06]

Brian P. Gerkey, Roger Mailler & Benoit Morisset. Commbots: Distributed control of mobile communication relays. In AAAI Workshop on Auction Mechanisms for Robot Coordination (AuctionBots), pages 51–57, Boston, Massachusetts, July 2006.

[Gmytrasiewicz 04]

P. Gmytrasiewicz & P. Doshi. A framework for sequential planning in multi-agent settings. In Int. Symp. on Artificial Intelligence and Mathematics, 2004.

[Goldman 00]

Robert P. Goldman, Karen Zita Haigh, David J. Musliner & Michael J.S. Pelican. MACBeth: A Multi-Agent ConstraintBased Planner. In AAAI’00 Workshop on Constraints and AI Planning, tech. report WS-00-02, AAAI, pages 11–17. AAAI Press, 2000.

[Goldman 04]

Robert P. Goldman. Adapting Research Planners for Applications. In Int. Conf. on Automated Planning and Scheduling (ICAPS’04), 2004.

[Guerrero 06]

Jos´e Guerrero & Gabriel Oliver. Physical interference impact in multi-robot task allocation auction methods. In IEEE Workshop on Distributed Intelligent Systems: Collective Intelligence and Its Applications (DIS’06), 2006.

[Guestrin 01]

Carlos Guestrin, Daphne Koller & Ronald Parr. Multiagent Planning with Factored MDPs. In Advances in Neural Information Processing Systems (NIPS-14), 2001.

[Gulec 05]

N. Gulec & M. Unel. A Novel Coordination Scheme Applied to Nonholonomic Mobile Robots. In Decision and Control, 2005 and 2005 European Control Conference. CDC-ECC ’05. 44th IEEE Conf. on, pages 5089–5094, 12-15 Dec. 2005.

[Hansen 04]

E. Hansen, D. Bernstein & S. Zilberstein. Dynamic programming for partially observable stochastic games. In National Conf. on Artificial Intelligence (AAAI’04), San Jos´e, 2004.


[Hardy 18]

G.H. Hardy & S. Ramanujan. Asymptotic Formulae in Combinatory Analysis. Proc. London Math. Soc., vol. 17, pages 75–115, 1918.

[Hayes 03]

Justin Hayes, Martha McJunkin & Jana Kosecka. Communication Enhanced Navigation Strategies for Teams of Mobile Agents. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’03), volume 3, pages 2285–2290, October 2003.

[Helmberg 99]

Christoph Helmberg. The m-Cost ATSP. In 7th Int. IPCO Conf. on Integer Programming and Combinatorial Optimization, pages 242–258, London, UK, 1999. Springer-Verlag.

[Howard 03]

Andrew Howard & Nicholas Roy. The Robotics Data Set Repository (Radish): http://radish.sourceforge.net, 2003.

[Hsieh 08]

M. Ani Hsieh, Anthony Cowley, R. Vijay Kumar & Camillo J. Taylor. Maintaining network connectivity and performance in robot teams. Journal of Field Robotics, vol. 25, no. 1–2, pages 111–131, January 2008.

[Hunsberger 00]

Luke Hunsberger & Barbara J. Grosz. A Combinatorial Auction for Collaborative Planning. In Int. Conf. on MultiAgent Systems (ICMAS’00), page 151, Washington, DC, USA, 2000. IEEE Computer Society.

[Hwang 92]

Frank K. Hwang, Dana S. Richards & Pawel Winter. The steiner tree problem. Elsevier, 1992.

[Iocchi 01]

Luca Iocchi, Daniele Nardi & Massimiliano Salerno. Reactivity and Deliberation: A Survey on Multi-Robot Systems. In Balancing Reactivity and Social Deliberation in Multi-Agent Systems, From RoboCup to Real-World Applications (selected papers from the ECAI 2000 Workshop and additional contributions), pages 9–34, London, UK, 2001. Springer-Verlag.

[Kalra 03]

Nidhi Kalra & Anthony (Tony) Stentz. A Market Approach to Tightly-Coupled Multi-Robot Coordination: First Results. In ARL Collaborative Technologies Alliance Symposium, May 2003.

[Kalra 05]

Nidhi Kalra, David Ferguson & Anthony (Tony) Stentz. Hoplites: a market-based framework for planned tight coordination in multirobot teams. In Int. Conf. on Robotics and Automation (ICRA’05), pages 1182–1189, April 2005.

[Kalra 07]

Nidhi Kalra, Dave Ferguson & Anthony Stentz. A Generalized Framework for Solving Tightly-coupled Multirobot Planning Problems. In Int. Conf. on Robotics and Automation (ICRA’07), pages 3359–3364, 2007.

[Kalyanasundaram 93]

Bala Kalyanasundaram & Kirk Pruhs. Online weighted matching. J. of Algorithms, vol. 14, no. 3, pages 478–488, 1993.

[Kapp 02]

S. Kapp. 802.11: Leaving the Wire Behind. IEEE Internet Computing, vol. 6, no. 1, pages 82–85, 2002.


[Kirkpatrick 83]

S. Kirkpatrick, C. D. Gelatt & M. P. Vecchi. Optimization by Simulated Annealing. Science, vol. 4598, no. 220, pages 671– 680, May 1983.

[Konolige 04]

Kurt Konolige, Dieter Fox, Charlie Ortiz, Andrew Agno, Michael Eriksen, Benson Limketkai, Jonathan Ko, Benoit Morisset, Dirk Schulz, Benjamin Stewart & Regis Vincent. Centibots: Very large scale distributed robotic teams. In Int. Symp. on Experimental Robotics, 2004.

[Korte 00]

B. Korte & J. Vygen. Combinatorial optimization: Theory and algorithms. Springer-Verlag, Berlin, 2000.

[Kuhn 55]

Harold W. Kuhn. The Hungarian Method for the assignment problem. Naval Research Logistic Quarterly, vol. 2, no. 1, pages 83–97, 1955.

[Kumar 04]

Vijay Kumar, Daniela Rus & Sanjiv Singh. Robot and Sensor Networks for First Responders. IEEE Pervasive Computing, pages 24–33, October 2004.

[Lagoudakis 04]

Michail Lagoudakis, Marc Berhault, Sven Koenig, Pinar Keskinocak & Anton J. Kleywegt. Simple Auctions with Performance Guarantees for Multi-Robot Task Allocation. In IEEE/RSJ Int. Conf. on Intelligent Robotic Systems (IROS’04), pages 698–705, 2004.

[Lagoudakis 05]

Michail G. Lagoudakis, Evangelos Markakis, David Kempe, Pinar Keskinocak, Anton Kleywegt, Sven Koenig, Craig Tovey, Adam Meyerson & Sonal Jain. Auction-Based Multi-Robot Routing. In Robotics: Science and Systems, Cambridge, USA, June 2005.

[Larsen 04]

Allan Larsen, Oli B. G. Madsen & Marius M. Solomon. The A-Priori Dynamic Traveling Salesman Problem with Time Windows. Transportation Science, vol. 38, no. 4, pages 459–472, November 2004.

[Latimer 02]

DeWitt Talmadge Latimer IV, Siddhartha Srinivasa, Vincent Lee Shue, Samuel Sonne, Howie Choset & Aaron Hurst. Towards sensor based coverage with robot teams. In ICRA’02. IEEE, May 2002.

[Lee 02]

D. Lee, R. Attias, A. Puri, R. Sengupta, S. Tripakis & P. Varaiya. A Wireless Token Ring Protocol For Ad-Hoc Networks. In Aerospace Conf. Proceedings, 2002. IEEE, 2002.

[Lemaire 04]

Thomas Lemaire, Rachid Alami & Simon Lacroix. A Distributed Tasks Allocation Scheme in Multi-UAV Context. In IEEE Int. Conf. on Robotics and Automation (ICRA’04), May 2004.

[Liepert 09]

Bernd Liepert & contributors. Robotic visions to 2020 and beyond – the strategic research agenda for robotics in europe, 07/2009. European Robotics Technology Platform, 2nd edition, July 2009.


[Lin 05]

Liu Lin & Zhiquiang Zheng. Combinatorial Bids based Multirobot Task Allocation Method. In IEEE Int. Conf. on Robotics and Automation (ICRA’05), pages 1157–1162, 2005.

[Long 03]

Matthew T. Long, Robin R. Murphy & Lynne E. Parker. Distributed Multi-Agent Diagnosis and Recovery from Sensor Failures. In IEEE Int. Conf. on Robotics and Automation (ICRA’03), volume 3, pages 2506–2513, 2003.

[Long 05]

Matt Long, Aaron Gage, Robin Murphy & Kimon Valavanis. Application of the Distributed Field Robot Architecture to a Simulated Demining Task. In IEEE Int. Conf. on Robotics and Automation (ICRA’05), 2005.

[Mandelbaum 07]

Robert Mandelbaum. DARPA’s Landroids to autonomously cover areas with Wi-Fi: http://www.darpa.mil/ipto/programs/ ld/ld.asp, 2007.

[Marques 06]

Carlos Marques, João Cristóvão, Pedro Lima, João Frazão, Isabel Ribeiro & Rodrigo Ventura. RAPOSA: Semi-Autonomous Robot for Rescue Operations. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’06), pages 3988–3993, October 2006.

[Masson 09]

Emilie Masson, Pierre Combeau, Marion Berbineau, Rodolphe Vauzelle & Yannis Pousset. Radio Wave Propagation in Arched Cross Section Tunnels — Simulations and Measurements. Journal of Communications, vol. 4, no. 4, pages 276–283, May 2009.

[McLurkin 09]

James McLurkin. Experiment Design for Large Multi-Robot Systems. In Robotics: Science And Systems (RSS’09): Workshop On Good Experimental Methodology In Robotics, Seattle, WA, USA, June 2009.

[Metta 06]

G. Metta, P. Fitzpatrick & L. Natale. YARP: Yet Another Robot Platform. International Journal of Advanced Robotics Systems, special issue on Software Development and Integration in Robotics, vol. 3, no. 1, 2006.

[Minguez 05]

J. Minguez, F. Lamiraux & L. Montesano. Metric-Based Scan Matching Algorithms for Mobile Robot Displacement Estimation. In IEEE Int. Conf. on Robotics and Automation (ICRA’05), Barcelona, Spain, 2005.

[Mitsunaga 03]

Noriaki Mitsunaga, Taku Izumi & Minoru Asada. Cooperative Behavior based on a Subjective Map with Shared Information in a Dynamic Environment. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’03), volume 1, pages 291–296, October 2003.

[Montemerlo 03]

Michael Montemerlo, Nicholas Roy & Sebastian Thrun. Perspectives on standardization in mobile robot programming: The carnegie mellon navigation (CARMEN) toolkit. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’03), pages 2436–2441, 2003.


[Mosteo 06a]

Alejandro R. Mosteo. Player-Ada: https://ada-player.sf.net, 2006.

[Mosteo 06b]

Alejandro R. Mosteo & Luis Montano. Simulated annealing for multi-robot hierarchical task allocation with flexible constraints and objective functions. In Workshop on Network Robot Systems: Toward intelligent robotic systems integrated with environments at Int. Conf. on Intelligent Robots and Systems (IROS’06), 2006.

[Mosteo 07a]

Alejandro R. Mosteo & Luis Montano. Comparative experiments on optimization criteria and algorithms for auction based multi-robot task allocation. In IEEE Int. Conf. on Robotics and Automation (ICRA’07), pages 3345–3350, Roma, April 2007.

[Mosteo 07b]

Alejandro R. Mosteo & Luis Montano. Reliable software technologies – Ada-Europe 2007, volume 4498 of Lecture Notes in Computer Science, chapter SANCTA: An Ada 2005 generalpurpose architecture for mobile robotics research, pages 221– 234. Springer-Verlag, 2007.

[Mosteo 08]

Alejandro R. Mosteo, Luis Montano & Michail G. Lagoudakis. Multi-Robot Routing under Limited Communication Range. In IEEE Int. Conf. on Robotics and Automation (ICRA’08), pages 1531–1536, May 2008.

[Mosteo 09a]

Alejandro R. Mosteo & Luis Montano. Concurrent tree traversals for improved mission performance under limited communication range. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’09), pages 2840–2845, Piscataway, NJ, USA, 2009.

[Mosteo 09b]

Alejandro R. Mosteo, Luis Montano & Michail G. Lagoudakis. Distributed autonomous robotic systems 8, chapter GuaranteedPerformance Multi-robot Routing under Limited Communication Range, pages 491–502. Springer Berlin Heidelberg, 2009.

[Mosteo 10a]

Alejandro R. Mosteo. SANCTA: http://www.mosteo.com/index.php/code-mainmenu-39, 2010.

[Mosteo 10b]

Alejandro R. Mosteo & Luis Montano. AMI-009-10-TEC: A survey of multi-robot task allocation. Technical report, Aragon Institute of Engineering Research (I3A), 2010.

[Mostofi 08]

Yasamin Mostofi. Communication-Aware Motion Planning in Fading Environments. In IEEE Int. Conf. on Robotics and Automation (ICRA’08), May 2008.

[Mouaddib 04]

Abdel-Illah Mouaddib. Multi-Objective Decision-Theoretic Path Planning. In IEEE Int. Conf. on Robotics and Automation (ICRA’04), volume 3, pages 2814–2819. IEEE, Apr 2004.

[Nanjanath 06]

Maitreyi Nanjanath & Maria Gini. Dynamic task allocation for robots via auctions. In IEEE Int. Conf. on Robotics and Automation (ICRA’06), pages 2781–2786, 2006.


[Nau 99]

Dana S. Nau, Yue Cao, Amnon Lotem & Héctor Muñoz-Avila. SHOP: Simple Hierarchical Ordered Planner. In IJCAI ’99: Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, pages 968–975, San Francisco, CA, USA, 1999. Morgan Kaufmann Publishers Inc.

[Nau 04]

Dana Nau, Tsz-Chiu Au, Okhtay Ilghami, Ugur Kuter, Héctor Muñoz-Avila, J. William Murdock, Dan Wu & Fusun Yaman. Applications of SHOP and SHOP2. IEEE Intelligent Systems, vol. 20, no. 2, pages 33–41, 2004.

[Ng 00]

Andrew Y. Ng & Michael I. Jordan. PEGASUS: A policy search method for large MDPs and POMDPs. In 16th Conf. on Uncertainty in Artificial Intelligence (UAI’00), pages 406–415, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc.

[Nguyen 04]

H.G. Nguyen, N. Pezeshkian, A. Gupta & N. Farrington. Maintaining Communication Link for a Robot Operating in a Hazardous Environment. In ANS 10th Int. Conf. on Robotics and Remote Systems for Hazardous Environments, Gainesville, FL, March 2004.

[Nyugen 04]

H. Nyugen, N. Farrington & N. Pezeshkian. Maintaining communication link for tactical ground robots. In AUVSI Unmanned Systems, August 2004.

[Parker 97]

Lynne E. Parker. L-ALLIANCE: Task-Oriented Multi-Robot Learning In Behavior-Based Systems. In Advanced Robotics, Special Issue on Selected Papers from IROS’96, pages 305–322, 1997.

[Parker 98]

L. Parker. ALLIANCE: An architecture for fault-tolerant multirobot cooperation. IEEE Transactions on Robotics and Automation, vol. 14, no. 2, pages 220–240, 1998.

[Peshkin 00]

Leonid Peshkin, Kee-Eung Kim, Nicolas Meuleau & Leslie Pack Kaelbling. Learning to cooperate via policy search. In 16th Conf. on Uncertainty in Artificial Intelligence (UAI’00), pages 489–496, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc.

[Pezeshkian 07]

Narek Pezeshkian, Hoa G. Nguyen & Aaron Burmeister. Unmanned Ground Vehicle Radio Relay Deployment System for Non-line-of-sight Operations. In IASTED Int. Conf. on Robotics and Applications, Germany, August 2007. DARPA/ITO.

[Porta 05]

Josep M. Porta, Matthijs T. J. Spaan & Nikos Vlassis. Robot Planning in Partially Observable Continuous Domains. In Robotics: Science and Systems, pages 217–224. MIT Press, 2005.

[Porta 06]

Josep M. Porta, Nikos Vlassis, Matthijs T.J. Spaan & Pascal Poupart. Point-Based Value Iteration for Continuous POMDPs. J. Mach. Learn. Res., vol. 7, pages 2329–2367, 2006.

[Prim 57]

R. C. Prim. Shortest connection networks and some generalizations. Bell System Technical Journal, vol. 36, pages 1389–1401, 1957.

[Pynadath 02a]

David V. Pynadath & Milind Tambe. The Communicative Multiagent Team Decision Problem: Analyzing Teamwork Theories and Models. Journal of Artificial Intelligence Research, vol. 16, pages 389–423, June 2002.

[Pynadath 02b]

David V. Pynadath & Milind Tambe. Multiagent teamwork: analyzing the optimality and complexity of key theories and models. In 1st Int. Joint Conf. on Autonomous Agents and Multiagent Systems (AAMAS’02), pages 873–880, New York, NY, USA, 2002. ACM Press.

[Quigley 09]

Morgan Quigley, Brian Gerkey, Ken Conley, Josh Faust, Tully Foote, Jeremy Leibs, Eric Berger, Rob Wheeler & Andrew Ng. ROS: an open-source Robot Operating System. In IEEE Int. Conf. on Robotics and Automation (ICRA’09): Workshop on Open Source Software, 2009.

[Quin 03]

L. Quin & T. Kunz. On-demand Routing in MANETs: The Impact of a Realistic Physical Layer Model. Lecture Notes in Computer Science, 2003.

[Ray 03]

S. Ray & J.B. Carruthers. RTS/CTS-Induced Congestion in Ad Hoc Wireless LANs. In Wireless Communications and Networking Conf. (WCNC), March 2003.

[Reif 95]

J.H. Reif & H. Wang. Social potential fields: a distributed behavioral control for autonomous robots. In WAFR: Proceedings of the workshop on Algorithmic foundations of robotics, pages 331–345. A. K. Peters, Ltd., 1995.

[Reinelt 94]

Gerhard Reinelt. The traveling salesman: computational solutions for TSP applications, volume 840 of Lecture Notes in Computer Science. Springer, 1994.

[Riehle 97]

D. Riehle. A role-based design pattern catalog of atomic and composite patterns structured by pattern purpose. Technical report 97-1-1, Union Bank of Switzerland, Switzerland, 1997.

[Rooker 04]

M.N. Rooker & A. Birk. RoboCup 2004: Robot Soccer World Cup VIII, chapter Combining Exploration and Ad-Hoc Networking in RoboCup Rescue, no. 3276 in Lecture Notes in Artificial Intelligence. Springer Verlag, 2004.

[Rooker 06]

Martijn N. Rooker & Andreas Birk. Communicative Exploration with Robot Packs. Lecture Notes in Artificial Intelligence, pages 267–278, 2006.

[Rooker 07]

Martijn N. Rooker & Andreas Birk. Multi-robot exploration under the constraints of wireless networking. Control Engineering Practice, vol. 15, no. 4, pages 435–445, 2007.

[Rosencrantz 03]

Matthew Rosencrantz, Geoffrey Gordon & Sebastian Thrun. Locating moving entities in indoor environments with teams of mobile robots. In Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS’03), 2003.

[Sachs 04]

Shai Sachs, Steven LaValle & Stjepan Rajko. Visibility-Based Pursuit-Evasion in an Unknown Planar Environment. Int. J. of Robotics Research, vol. 23, no. 1, pages 3–26, January 2004.

[Sanfeliu 07]

Alberto Sanfeliu. URUS: Ubiquitous Networking Robotics in Urban Settings. In Workshop on Network Robot Systems at IEEE Int. Conf. on Robotics and Automation (ICRA’07), Pasadena, 2007.

[Sanfeliu 08]

A. Sanfeliu. Cooperative robotics in urban sites. In International Symposium on Network Robot Systems, Osaka, December 2008.

[Saridis 79]

G. N. Saridis. Toward the realization of intelligent controls. Proceedings of the IEEE, vol. 67, no. 8, pages 1115–1133, August 1979.

[Scerri 05]

Paul Scerri, Alessandro Farinelli, Steven Okamoto & Milind Tambe. Allocating tasks in extreme teams. In 4th Int. Joint Conf. on Autonomous Agents and Multiagent Systems (AAMAS’05), pages 727–734, New York, NY, USA, 2005. ACM.

[Schaller 97]

R.R. Schaller. Moore’s law: past, present and future. IEEE Spectrum, vol. 34, no. 6, pages 52–59, June 1997.

[Schesvold 03]

Doug Schesvold, Jingpeng Tang, Benzir Md Ahmed, Karl Altenburg & Kendall E. Nygard. POMDP Planning for High Level UAV Decisions: Search vs. Strike. In 16th Int. Conf. on Computer Applications in Industry and Engineering (CAINE’03), 2003.

[Schneider 05]

Jeff Schneider, David Apfelbaum, Drew Bagnell & Reid Simmons. Learning Opportunity Costs in Multi-Robot Market Based Planners. In IEEE Int. Conf. on Robotics and Automation (ICRA’05), pages 1163–1168, 2005.

[Sheu 04]

J.-P. Sheu, C.-H. Liu, S.-L. Wu & Y.-C. Tseng. A priority MAC protocol to support real-time traffic in ad hoc networks. Wirel. Netw., vol. 10, no. 1, pages 61–69, 2004.

[Simmons 00]

Reid Simmons, David Apfelbaum, Wolfram Burgard, Dieter Fox, Mark Moors, Sebastian Thrun & Hakan Younes. Coordination for Multi-Robot Exploration and Mapping. In National Conf. on Artificial Intelligence (AAAI’00), 2000.

[Smith 80]

R. G. Smith. The Contract Net Protocol. IEEE Transactions on Computers, vol. 29, no. 12, pages 1104–1113, 1980.

[Sobrinho 96]

J.L. Sobrinho & A.S. Krishnakumar. Real-time traffic over the IEEE 802.11 medium access control layer. Technical report, Bell Labs, 1996.

[Souryal 06]

M.R. Souryal, L. Klein-Berndt, L.E. Miller & N. Moayeri. Link Assessment in an Indoor 802.11 Network. In IEEE Wireless Communications & Networking Conf. (WCNC 2006), 2006.

[Spaan 03]

M. T. J. Spaan & F. C. A. Groen. Team coordination among robotic soccer players. In G. Kaminka, P. U. Lima & R. Rojas, editors, RoboCup 2002, pages 409–416. Springer-Verlag, 2003.

[Spaan 04]

Matthijs T.J. Spaan & Nikos Vlassis. A point-based POMDP algorithm for robot planning. In IEEE Int. Conf. on Robotics and Automation (ICRA’04), pages 2399–2404, New Orleans, Louisiana, April 2004.

[Spaan 05]

Matthijs T. J. Spaan & Nikos Vlassis. Perseus: Randomized Point-based Value Iteration for POMDPs. Journal of Artificial Intelligence Research, vol. 24, pages 195–220, 2005.

[Spaan 06]

Matthijs T. J. Spaan, Geoffrey J. Gordon & Nikos Vlassis. Decentralized planning under uncertainty for teams of communicating agents. In Int. Joint Conf. on Autonomous Agents and Multiagent Systems (AAMAS’06), pages 249–256, 2006.

[Sridhara 07]

Vinay Sridhara & Stephan Bohacek. Realistic propagation simulation of urban mesh networks. Computer Networks, vol. 51, no. 12, pages 3392–3412, August 2007.

[Stentz 99]

Anthony Stentz & M Bernardine Dias. A Free Market Architecture for Coordinating Multiple Robots. Technical report CMU-RI-TR-99-42, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, December 1999.

[Stojmenovic 04]

Ivan Stojmenovic. Geocasting with guaranteed delivery in sensor networks. IEEE Wireless Communications Magazine, vol. 11, no. 6, pages 29–37, December 2004.

[Stone 97]

Peter Stone & Manuela Veloso. Multiagent Systems: A Survey from a Machine Learning Perspective. Autonomous Robots, vol. 8, pages 345–383, 1997.

[Stone 98]

Peter Stone & Manuela M. Veloso. Task Decomposition and Dynamic Role Assignment for Real-Time Strategic Teamwork. In Agent Theories, Architectures, and Languages, pages 293–308, 1998.

[Stump 08]

Ethan Stump, Ali Jadbabaie & Vijay Kumar. Connectivity management in mobile robot teams. In IEEE Int. Conf. on Robotics and Automation (ICRA’08), pages 1525–1530. IEEE, 2008.

[Takayama 09]

Leila Takayama. Toward a Science of Robotics: Goals and Standards for Experimental Research. In Robotics: Science and Systems (RSS’09), Seattle, WA, 2009.

[Tardioli 07]

Danilo Tardioli & José Luis Villarroel. Real Time Communications over 802.11: RT-WMP. In Mobile Ad-Hoc and Sensor Systems, 2007.

[Tardioli 10]

Danilo Tardioli, Alejandro R. Mosteo, Luis Riazuelo, José Luis Villarroel & Luis Montano. Enforcing network connectivity in robot team missions. Int. J. of Robotics Research, vol. 29, no. 4, pages 460–480, April 2010.

[Thayer 00]

Scott Thayer, Bruce Digney, M Bernardine Dias, Anthony (Tony) Stentz, Bart Nabbe & Martial Hebert. Distributed Robotic Mapping of Extreme Environments. In SPIE: Mobile Robots XV and Telemanipulator and Telepresence Technologies VII, volume 4195, November 2000.

[Tovey 05]

Craig Tovey, Michail G. Lagoudakis, Sonal Jain & Sven Koenig. Generation of Bidding Rules for Auction-Based Robot Coordination. In 3rd International Multi-Robot Systems Workshop, Washington DC, March 2005. Naval Research Laboratory.

[Urcola 08]

Pablo Urcola, Luis Riazuelo, Mayte T. Lázaro & Luis Montano. Cooperative Navigation using environment compliant robot formations. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS’08), pages 2789–2794, Nice, France, September 2008.

[Vail 03]

Douglas Vail & Manuela Veloso. Dynamic multi-robot coordination. In Alan C. Schultz, Lynne E. Parker & Frank E. Schneider, editors, MultiRobot Systems: From Swarms to Intelligent Automata, volume 2, pages 87–100. Springer, 2003.

[Vazquez 04]

Jose Vazquez & Chris Malcolm. Distributed Multirobot Exploration Maintaining a Mobile Network. In IEEE Int. Conf. on Intelligent Systems, pages 113–118, 2004.

[Viguria 07]

A. Viguria, I. Maza & A. Ollero. SET: An algorithm for distributed multirobot task allocation with dynamic negotiation based on task subsets. In IEEE Int. Conf. on Robotics and Automation (ICRA’07), pages 3339–3344, Rome, Italy, 2007.

[Viguria 08]

A. Viguria, I. Maza & A. Ollero. S+T: An algorithm for distributed multirobot task allocation based on services for improving robot cooperation. In IEEE Int. Conf. on Robotics and Automation (ICRA’08), Pasadena, USA, May 2008.

[Vlavianos 08]

A. Vlavianos, L.K. Law, I. Broustis, S.V. Krishnamurthy & M. Faloutsos. Assessing Link Quality in IEEE 802.11 Wireless Networks: Which is the Right Metric? In IEEE 19th Int. Symp. on Personal, Indoor and Mobile Radio Communications, September 2008.

[Vu 03]

Thuc Vu, Jared Go, Gal A. Kaminka, Manuela M. Veloso & Brett Browning. MONAD: a flexible architecture for multi-agent control. In Int. Joint Conf. on Autonomous Agents and Multiagent Systems (AAMAS’03), 2003.

[Wagner 04]

Alan Wagner & Roland Arkin. Multi-Robot Communication-Sensitive Reconnaissance. In IEEE Int. Conf. on Robotics and Automation (ICRA’04), pages 4674–4681, 2004.

[Wawerla 09]

Jens Wawerla & Richard T. Vaughan. Publishing Identifiable Experiment Code And Configuration Is Important, Good and Easy. In Workshop On Good Experimental Methodology In Robotics at Robotics: Science and Systems, Seattle, WA, USA, June 2009.

[Wellman 97]

Michael P. Wellman. Market-Aware Agents for a Multiagent World. In 8th European Workshop on Modelling Autonomous Agents in a Multi-Agent World, page 1, London, UK, 1997. Springer-Verlag.

[Werger 00]

Barry Brian Werger & Maja J. Matarić. Broadcast of local eligibility: behavior-based control for strongly cooperative robot teams. In 4th Int. Conf. on Autonomous agents (AGENTS’00), pages 21–22, New York, NY, USA, 2000. ACM Press.

[Yamauchi 98]

Brian Yamauchi. Frontier-based exploration using multiple robots. In 2nd Int. Conf. on Autonomous agents (AGENTS’98), pages 47–53, New York, NY, USA, 1998. ACM Press.

[Yokoo 01]

Makoto Yokoo. Distributed constraint satisfaction: foundations of cooperation in multi-agent systems. Springer-Verlag, London, UK, 2001.

[Zhang 05]

Fei Zhang, Weidong Chen & Yugeng Xi. Improving Collaboration through Fusion of Bid Information for Market-based Multirobot Exploration. In IEEE Int. Conf. on Robotics and Automation (ICRA’05), pages 1169–1174, 2005.

[Zlot 02]

Robert Zlot, Anthony Stentz, M. Bernardine Dias & Scott Thayer. Multi-Robot Exploration Controlled by a Market Economy. In IEEE Int. Conf. on Robotics and Automation (ICRA’02), 2002.

[Zlot 03]

Robert Michael Zlot & Anthony (Tony) Stentz. Market-based Multirobot Coordination Using Task Abstraction. In Int. Conf. on Field and Service Robotics, July 2003.

[Zlot 05]

Robert Michael Zlot & Anthony (Tony) Stentz. Complex Task Allocation For Multiple Robots. In IEEE Int. Conf. on Robotics and Automation (ICRA’05). IEEE, April 2005.

[Zuluaga 08]

Mauricio Zuluaga & Richard T. Vaughan. Modeling multirobot interaction using generalized occupancy grids. In IEEE Int. Conf. on Robotics and Automation (ICRA’08), May 2008.
