A CONVEX OPTIMIZATION FRAMEWORK FOR MULTI-AGENT MOTION PLANNING by Jason C. Derenick
A Dissertation Presented to the Graduate Committee of Lehigh University in Candidacy for the Degree of Doctor of Philosophy in Computer Science and Engineering
Lehigh University April 2009
© Copyright 2009 by Jason C. Derenick
All Rights Reserved
Approved and recommended for acceptance as a dissertation in partial fulfillment of the requirements for the degree of Doctor of Philosophy.
Date John R. Spletzer, Ph.D. Dissertation Director/Committee Chair (Lehigh University)
Accepted Date

Committee Members:
Mooi Choo Chuah, Ph.D. (Lehigh University)
Hector Muñoz–Avila, Ph.D. (Lehigh University)
Aurélie C. Thiele, Ph.D. (Lehigh University)
Camillo J. Taylor, Ph.D. (University of Pennsylvania)
Lovingly dedicated to my mother, Donna
Acknowledgements

I’d like to begin by expressing my sincere gratitude to my dissertation advisor, John Spletzer. His patience and guidance over the years have taught me more than he will likely ever know or even begin to fully appreciate. Undoubtedly, he deserves all of the credit, and I am honored to have had him as a mentor.

I would also like to express my gratitude to those that I have had the privilege of working with during my tenure at VADER Lab. Among them are Humberto Sermeño–Villalta, Chris Thorne, Chris Mansley, Chao Gao, Mike Kowalski, Chris Wojciechowski and Tom Miller. I will miss the times spent racing RC cars around the lab, playing Warfish on Friday afternoons, or just grabbing a quick bite to eat at the Goose or Déjà Brew. They made the difficult times so much easier to bear.

Additionally, I would like to thank Professor Dan Lee for his keen insight during my time as a visiting student in the GRASP Lab at the University of Pennsylvania. Working as a member of the Ben Franklin Racing Team under his guidance was second–to–none. It was truly a once–in–a–lifetime experience, and I am deeply indebted to him for having offered me such a wonderful opportunity.

I would also like to mention the folks at Boston Dynamics. Specifically, I would like to thank the MDMR team and foremost, Al Rizzi. Although the work I completed at BD did not directly contribute to this dissertation, I learned more than I could have ever anticipated during my brief stay with them. It goes without saying that I am a better roboticist and person as a result of this experience.

I could not possibly go without mentioning two of the closest people in my life. As far as best friends go, Brandon is in a league of his own. I cannot begin to express my appreciation to both him and his family for how wonderfully they have
treated me over the years. I must also express my eternal gratitude to my darling love, Tiffany. I am indebted to her for the constant outpouring of compassion and support she has shown during this time.

Last – but certainly not least – I would like to thank my dedicated and loving family. This includes my mother, Donna, my grandmother, Jule, my brother and sister–in–law, Jack and Betsy, my delightful niece, Sarah, and my pleasant nephew, Jack Jr. Their support over the years has been invaluable and without it none of this would be possible. They are remarkable people, and I hope one day to be half as understanding as they have all been towards me. I love them all so much.
Contents

Acknowledgements

Abstract

1 Introduction
  1.1 Problem Statement
  1.2 Motivation by Example
  1.3 Employing Convex Optimization
  1.4 Dissertation Contributions
  1.5 Dissertation Outline

2 Optimal Team Formations
  2.1 Background
  2.2 Related Work
  2.3 The Shape Problem
  2.4 Solving via Second–order Cone Programming
    2.4.1 Regulating Formation Parameters
    2.4.2 Optimal Solutions in R^3
  2.5 Computational Complexity
    2.5.1 Theoretical Complexity
    2.5.2 Performance in Practice
  2.6 Enforcing Motion Constraints
  2.7 Handling Workspace Constraints
  2.8 Distributing Computation
    2.8.1 Proposed Architecture
    2.8.2 Experimental Results
  2.9 A Decentralized Solution
  2.10 Discussion

3 Optimal Target Tracking
  3.1 Background
  3.2 Related Work
  3.3 The Tracking Problem
  3.4 Employing Spectral Graph Theory
    3.4.1 Guaranteeing Full Target Coverage
    3.4.2 Guaranteeing Network Connectivity
  3.5 Interactive Control Functions
  3.6 Solving via Semidefinite Programming
    3.6.1 A Discrete–time Formulation
    3.6.2 Choosing an Objective
  3.7 Guaranteeing k–Coverage of Targets
  3.8 A Relaxation for Large–scale Teams
  3.9 On Characterizing Optimality
  3.10 Discussion

4 A Convex Framework
  4.1 Related Work
  4.2 Goal–directed Behavior
  4.3 Collision Avoidance
  4.4 A Unified Approach
  4.5 Discussion

5 Future Work
  5.1 Decentralization
  5.2 Exploiting Algebraic Topology

A Hybrid FSO/RF Mobile Ad–hoc Networks for Robot Teams
  A.1 Background
  A.2 Related Work
  A.3 A Hierarchical Link Acquisition System
    A.3.1 Coarse Alignment Phase
    A.3.2 Vision-based Alignment Phase
  A.4 Network Architecture
  A.5 Experimental Results
    A.5.1 Vision-based Alignment Validation
    A.5.2 Network Deployment
  A.6 Discussion

B Robust Perception for the 2007 DARPA Urban Challenge
  B.1 Background
  B.2 Related Work
  B.3 Terrain Classification
    B.3.1 Ground Plane Extraction
    B.3.2 Point Classification
  B.4 Traffic Line Detection
  B.5 Discussion

C The Sick LIDAR Matlab/C++ Toolbox
  C.1 Background
  C.2 Compatible Devices
    C.2.1 Compatible Sick LMS 2xx Units
    C.2.2 Compatible Sick LD Units
  C.3 Toolbox Features
    C.3.1 Sick LMS 2xx Support
    C.3.2 Sick LD Support
  C.4 Architectural Overview
  C.5 Applying the Toolbox
  C.6 Groups Utilizing the Toolbox
  C.7 Discussion

References

Vita
List of Figures

1.1 Illustration of global vs. local planning
1.2 Motivating examples for optimal motion planning
2.1 Illustration of the equivalency between differing shape formations
2.2 Solving the shape problem
2.3 Constraining the feasible formation orientation via linear inequalities
2.4 Simulation results for a team obtaining an optimal formation in R^2
2.5 Simulation results for a team obtaining an optimal formation in R^3
2.6 Transforming the Newton KKT system to mono–banded form
2.7 Mean CPU time for solving the coordination problem via MOSEK
2.8 Simulation results for a team with motion constraints in R^2
2.9 Simulation results (and associated Newton KKT system) for a team operating in a convex polygonal workspace
2.10 Simulation results for a team charged with maintaining a desired shape formation while traversing a non–convex polygonal workspace
2.11 Newton KKT system corresponding to Figure 2.10
2.12 A computationally distributed architecture for solving the shape problem
2.13 Decomposition of the compact Newton KKT system
2.14 Solving via a distributed LU solver
2.15 Computationally distributed approach implemented on a team of six Sony Aibos
2.16 An observation of the relative pose of an agent that has solved the shape problem yields an additional linear constraint
2.17 Simulation results for a team of agents utilizing the proposed hierarchical decentralized framework to obtain a globally optimal {4,4} tessellation
3.1 The weighted visibility graph for a team of three agents tracking a single target in R^2
3.2 Interactive control functions for a simple target tracking application
3.3 A team of eight robots tracks five targets in R^3 while ensuring full target coverage and network connectivity across the formation
3.4 Progression of λ2(LV(X)) and λ2(LN(X)) for Figure 3.3
3.5 The combinatoric visibility graph for four agents maintaining 2–coverage of a single target
3.6 A team of four agents ensures 2–coverage for a pair of diverging targets in R^2
3.7 Progression of λ2(GV1(X)) corresponding to Figure 3.6 and relative agent–target distances showing the governing nature of ICFs
3.8 The binodal visibility graphs for a team of three agents tasked with maintaining coverage of three targets
3.9 A team of three agents employs the SOCP relaxation to ensure each target is tracked by at least a single member
3.10 A comparison of CPU utilization for the SDP and SOCP tracking problem formulations as a function of team size
3.11 Gauging convergence by considering static target placements
4.1 Illustrating the use of potential fields for a robot operating in a dynamic environment
4.2 Embedding goal–directed behavior is accomplished by iteratively minimizing the distance between a point–mass abstraction of the team and an objective waypoint
4.3 Illustrating the formulation of the inter–agent collision graph
4.4 Sigmoid function for r_ij^a = √10 illustrated for different values of γ
4.5 Illustrating the formulation of the binodal inter–agent collision graphs
4.6 Illustrating the formulation of the extended collision graph
4.7 A multi–agent system exhibiting goal–directed behavior with formation control in the face of static obstacles
4.8 Minimum agent–to–agent distance corresponding to Figure 4.7
4.9 An example in R^2 where an obstacle ultimately violates the feasibility of proximity constraints yielding an unavoidable collision
4.10 A multi–agent system exhibiting goal–directed behavior with formation control in the face of dynamic obstacles with αz = 1.8
4.11 A multi–agent system exhibiting goal–directed behavior with formation control in the face of dynamic obstacles with αz = 2.2
5.1 Static sensor network topology can be fully captured by exploiting a subfield of algebraic topology known as simplicial homology where the network is represented as a chain of oriented k–simplices
A.1 A high–level view of the proposed coarse–to–fine approach for autonomously aligning directional FSO transceivers
A.2 The pair of hybrid nodes employed in the FSO/RF network experiments
A.3 Signal–to–noise ratio gauging the effectiveness of the normalized–intensity distribution for agent tracking
A.4 Deployment strategy for hybrid FSO/RF network experiments
A.5 Mosaics generated from the images acquired during a single trial of the link acquisition system
B.1 “Little Ben” crosses the finish line becoming the fourth place finisher of the 2007 DARPA Urban Challenge
B.2 Problem geometry for obstacle detection and traffic line extraction
B.3 Example weighting function for the IRLS–based approach under different values for αg, αo, β with ρ = 0.5
B.4 Terrain classification and obstacle detection results
B.5 Traffic line classification results using LIDAR reflectivity values
C.1 Example Sick LMS 2xx and Sick LD–LRS models supported by The Sick LIDAR Matlab/C++ Toolbox
C.2 The Sick LIDAR Matlab/C++ Toolbox architecture schematics
C.3 Screenshots of the toolbox streaming/processing range and reflectivity returns
Abstract

Imagine a world where driving fatalities are drastically reduced because cars are able to safely coordinate and drive themselves. A world where a robot team is sent into a burning building to pull a child from the clutches of death, or where robots can assist an elderly couple by mowing the lawn, finding lost belongings, bringing daily medications, or calling for assistance in an emergency. A fundamental requirement of such transformative technologies will be the ability to effectively collaborate in order to reach some goal or bring about some physical state in the environment. In general, this decision process is known as motion planning, and it poses one of the most significant challenges facing the fledgling field of robotics.

The focus of this dissertation is addressing the motion planning problem by developing a rich, real–time framework that affords optimality and provides substantial performance guarantees. Towards this end, this research exploits recent advances in convex optimization with an emphasis on applications to optimal team formation changes and optimal collaborative target tracking. Regarding the former, second–order cone programming (SOCP) techniques are utilized to develop highly efficient centralized, distributed, and decentralized solutions to the problem. Regarding the latter, semidefinite programming (SDP) is melded with seminal results from spectral graph theory to develop a discrete–time framework for optimal target tracking subject to k–coverage and network connectivity constraints.

As the capstone of this research, these seemingly unrelated contributions are unified in a discrete–time optimization framework that also affords goal–directed team behavior and dynamic collision avoidance while minimizing an arbitrary convex objective. Simulation and experimental results are included to highlight the utility of this approach.
Chapter 1

Introduction

1.1 Problem Statement
According to Jean–Claude Latombe, the author of the seminal book on the subject, motion planning is the process whereby a robot decides which actions to take in order to achieve some desired arrangement of physical elements in its operating environment (Latombe, 2003). In the often–employed “roadmap” approach, a global planner provides a high–level sequence of waypoints while a local planner attempts to achieve these waypoints subject to environmental considerations (see Figure 1.1). Extending upon this basic notion, the thrust of this dissertation is in developing a rich, real–time motion planning framework for optimally coordinating multi–agent systems (e.g. a mobile robot team or sensor network). Specifically, as the unifying theme of this research, the following fundamental question is explored:

Problem 1.1.1. How can a mobile multi–agent system collaboratively generate an optimal plan that enables the team to achieve a team–based objective (perhaps with member–driven sub–goals) while providing substantial performance guarantees (e.g. formation maintenance, network connectivity, and dynamic obstacle avoidance)?

Figure 1.1: In the context of mobile robotics, motion planning is typically a multi–phased approach. A global planner produces a high–level sequence of waypoints that the robot attempts to follow as closely as possible subject to environmental perceptions generated from proprioceptive and/or exteroceptive sensing modalities.

Addressing this problem is paramount to enable the transformative technologies promised by multi–agent systems. This dissertation proffers a solution to this problem by exploiting the fact that coordinating a multi–robot system (by generating a plan) is inherently a resource–allocation problem in which the objective is to optimally allocate team members so as to minimize some performance metric subject to allocation constraints (e.g. expendable energy, etc.). In adopting this perspective, this research exploits both continuous and discrete–time optimization constructs, with initial emphasis placed on optimal team formation changes and collaborative target tracking before considering a more general framework for optimal planning in unexplored environments – all with impressive performance guarantees.

In contrast to other approaches, some of which will be discussed in §2.2 and §3.2, the proffered framework exploits the rich capabilities of convex optimization theory to yield optimal solutions. As shall be seen, in many cases such solutions can be obtained in real–time for teams comprised of tens, hundreds, or even thousands of robots. Depending upon the desired application, said framework can be adapted for either global or local planning purposes.
It should be noted that in developing the proposed framework, a top–down approach is adopted that can be considered converse to the philosophy that intelligence “emerges” from a multi–agent system as a function of team members employing simple laws or atomic, purely reactive behaviors. The notion of emergent intelligence is especially attractive due to its algorithmic austerity and its biological roots (e.g. flocks of birds and schools of fish organizing based on localized reactions to individual percepts). It has been formalized in the context of mobile robotics by the seminal work of Brooks (Brooks, 1986) and is the cornerstone of the well–known “Boids” flocking algorithm developed by Reynolds (C.W. Reynolds, 1987). Unfortunately, such bottom–up approaches can rarely make guarantees of optimality (which is oftentimes desirable), as employing them in the context of multi–agent systems often leads only to feasible configurations/solutions with respect to problem constraints.
1.2 Motivation by Example
The motivation behind pursuing an optimal framework for multi–agent motion planning can be best explained by considering recent technological thrusts that can serve as beneficiaries. Towards this end, consider recent efforts by the NASA Institute for Advanced Concepts (NIAC) to foster the development of Planetary Microbots (a.k.a. Ballbots) for extraterrestrial exploration (Trafton, 2006) (see Figure 1.2). In contrast to the larger rovers (i.e. Spirit and Opportunity), these agile scouts are individually sacrificial and are designed to be deployed in large–scale teams over wide areas of rough terrain. About the size and shape of a tennis ball, they roll, hop, and bounce between locations and feature an array of sensors that include micro–scale imagers and spectrometers (Dubowsky et al., 2005). Operating as a swarm, such a system could exploit the proffered optimization framework to achieve objectives that include maximizing coverage area or minimizing expended energy in attaining a desired network topology (e.g. a mesh). For such an application, embedding optimal behaviors is essential given the constrained resources of the team, to include a finite power supply and low–grade communication/sensing modalities.
Figure 1.2: (a) Planetary Microbots, proposed by (Dubowsky et al., 2005), could exploit an optimization framework to minimize expended energy while achieving a desired network topology. (b) A team of tactical autonomous land, sea, and air vehicles (e.g. see (Abbeel et al., 2007)) could exploit an optimization framework to maintain target coverage while minimizing uncertainty in agent/target positions.

As a second motivating example, consider the task of monitoring hostile vehicles in an active war zone. A team of tactical autonomous land, sea, or air vehicles (e.g. (Abbeel et al., 2007)) could be deployed to provide mission–critical situational awareness. The team could exploit an optimal framework to achieve objectives that include maximizing the number of active target tracks (e.g. for data–fusion purposes) or minimizing the uncertainty in believed agent and target positions while maintaining an ad–hoc network for soldier communications. In fact, the high demand for such enabling technology has led to the recent instantiation of the National Center for Defense Robotics (NCDR) (TTC, 2006).

As a final example, consider a man–made or natural disaster that levels the communication infrastructure of a metropolitan area. A team of expendable robots, featuring line–of–sight constrained, high–throughput communication mediums (e.g. free–space optics transceivers (Derenick et al., 2005)), can be deployed in such areas to apply carrier–grade patches and restore vital wide–area communication capabilities. Serving perhaps as first–responders, these teams would greatly benefit from a general motion–planning framework that affords minimizing the uncertainty
latent to the state–based estimation process. Although seemingly far–fetched, such systems have been explored in supporting research (see Appendix A).

Although vastly different, these examples highlight the utility of optimization for multi–agent systems. In each case, useful objectives can be readily defined as a direct functional of the system state vector with constraints (e.g. vehicle kinematics) dictating the feasible set. However, applying generalized interior–point methods for solving such optimization problems often leads to poor performance for teams featuring anything more than a small number of members. Moreover, such techniques cannot make any substantial guarantees regarding the resulting level of optimality. As a result, this research considers a special class of problems that yield globally optimal solutions in polynomial time. Specifically, we leverage recent advances in convex optimization theory (a.k.a. convex programming) (Boyd & Vandenberghe, 2004; Lobo et al., 1998; Nesterov & Nemirovskii, 1994). Given the centrality of convex theory to forthcoming results, an introduction is given in the next section.
1.3 Employing Convex Optimization
In convex optimization theory the goal is to minimize some convex objective (i.e. cost) function subject to constraints that define a convex feasible set (Boyd & Vandenberghe, 2004). Intuitively, a real–valued set is convex if and only if it contains all of the points along the line segment joining any two of its members. Similarly, a function is convex if and only if its epigraph (i.e. “the area lying above its curve”) comprises such a set. Convex problems are especially attractive as they have the distinguishing characteristic that all local minima are, in fact, global minima, and such minima can be readily obtained in polynomial time (Nesterov & Nemirovskii, 1994). In other words, an obtained solution is guaranteed to be globally optimal. Furthermore, recent efforts, among them the seminal results of Nesterov and Nemirovskii (1994), Boyd and Vandenberghe (2004), and others, have brought a high level of maturity to this fledgling discipline, making it attractive for more diverse applications.
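The chord–based definition above can be checked numerically. The following sketch (Python; the sample function f(x) = x², the sampling ranges, and the tolerance are illustrative choices, not drawn from this dissertation) verifies that f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y) holds at randomly sampled points:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # A simple convex function; its epigraph contains every chord.
    return x ** 2

# Verify f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y)
# over randomly sampled point pairs and interpolation weights.
for _ in range(1000):
    x, y = rng.uniform(-10.0, 10.0, size=2)
    theta = rng.uniform(0.0, 1.0)
    lhs = f(theta * x + (1.0 - theta) * y)
    rhs = theta * f(x) + (1.0 - theta) * f(y)
    assert lhs <= rhs + 1e-9  # holds for every sample when f is convex
```

Replacing f with a non–convex function (e.g. f(x) = −x²) causes the check to fail for some samples, which is precisely the property that separates the convex problem class.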
In its most general form, a convex optimization problem (or program) can be mathematically expressed as

    min_{X ∈ Φ} Ψ(X)                                          (1.1)

    Φ = {X : h_i(X) ≤ 0, ∀ i ∈ I;  a_i(X) = 0, ∀ i ∈ E}

where I and E respectively denote finite index sets corresponding to inequality and equality constraints, X ∈ R^n represents the decision variables, Ψ : R^n → R is the convex objective function, h_i : R^n → R, ∀ i ∈ I denote convex inequality constraints, and a_i : R^n → R, ∀ i ∈ E are affine. In the context of motion planning, X = (x_1, . . . , x_n)^T might represent the concatenated position variables of the robot team (i.e. its state). The inequality constraints h_i may be used to define maximal inter–agent proximity constraints, vehicle kinematics (perhaps via first–order discretization methods), or even suitable workspace boundary constraints. The equality constraints a_i might be used to enforce relative positional constraints, or to constrain the position of teammates with respect to a common frame.

In this research, Problem (1.1) will usually take the form of a second–order cone program (SOCP) or semidefinite program (SDP). Both denote problem subclasses of the convex problem space, with the class of SDPs (the most general of convex problem formulations) subsuming SOCPs. In an SOCP, the chosen objective is minimized over a feasible set defined by the intersection of an affine space with a finite set of second–order cones. In an SDP, minimization of a convex objective is done over the cone of positive semidefinite matrices. In rigorous formulations, the objective is defined as linear; however, a number of “tricks” can often be employed to yield a suitable formulation. The interested reader is referred to (Boyd & Vandenberghe, 2004) for a thorough treatment of these subfields.
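To make the shape of Problem (1.1) concrete, consider a toy instance of the motion–planning usage just described (the scenario, numbers, and solver below are illustrative only, not taken from this dissertation): m planar agents minimize the total squared displacement Σ_i ||x_i − p_i||² from their current positions p_i, subject to the affine equality constraint that the team centroid reach a waypoint g. Since the feasible set is affine, projecting onto it is a uniform translation, and even a generic projected–gradient scheme (far simpler than the interior–point methods employed in this work) recovers the closed–form optimum x_i* = p_i + (g − mean(p)):

```python
import numpy as np

def plan_centroid_shift(p, g, steps=50, eta=0.1):
    """Minimize sum_i ||x_i - p_i||^2 s.t. mean(x) = g via projected gradient.

    p : (m, d) array of current agent positions; g : (d,) goal centroid.
    The feasible set {X : mean(X) = g} is affine, so projecting onto it
    amounts to translating every agent by (g - mean(X)).
    """
    x = p.copy()
    for _ in range(steps):
        x = x - eta * 2.0 * (x - p)       # gradient step on the objective
        x = x + (g - x.mean(axis=0))      # project back onto the constraint
    return x

# Hypothetical three-agent team in the plane.
p = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0]])
g = np.array([5.0, 5.0])
x_opt = plan_centroid_shift(p, g)

# The KKT conditions give the optimum directly: every agent translates
# by the common offset g - mean(p).
assert np.allclose(x_opt, p + (g - p.mean(axis=0)))
assert np.allclose(x_opt.mean(axis=0), g)
```

Because the objective is strongly convex and the constraint affine, the single local minimum found by any descent scheme is the global one – the defining advantage of the convex problem class noted above.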
1.4 Dissertation Contributions
As previously noted, the primary contribution of this dissertation is the formulation of a convex framework for optimal motion planning with an emphasis on applications to optimal team formation changes and collaborative target tracking. In this
section, the highlights of this research are presented, beginning with impressive complexity bounds that enable optimal formation changes. Subsequently, the highlights of a discrete–time tracking paradigm that guarantees target coverage and network connectivity are discussed, before results that unify these seemingly disparate areas into a general convex optimization framework. Supplemental results are also highlighted that pertain to related research in estimation and sensing.

Regarding optimal team formations, this dissertation proffers

• A centralized approach to optimal formation changes that yields a theoretical complexity of O(m^1.5) with performance–in–practice of O(m), where m denotes the number of agents in the formation.

• Extensions that provide both computationally distributed and hierarchically decentralized solutions to the problem. The former provides expected per–node computational and storage complexities of O(1), making the approach accessible to less sophisticated processors. The latter allows agents to employ only local observations to achieve a desired formation shape. Excluding a finite subset of “leader” agents, each member solves the problem in O(1) time.

• Embedding convex and non–convex workspace constraints, where the environment is modeled via a finite set of k linear inequalities. Regarding the former, computational complexity scales as O(k^1.5 m^1.5) in theory and linearly in both team size and environment size in practice. Regarding the latter, complexity scales as O(k^1.5 m^3.5) or O(k^3.5 m^1.5) – depending upon the chosen permutation of a core linear system.

In regards to optimal target tracking, this dissertation presents results that exploit advances in spectral graph theory to yield

• A discrete–time, semidefinite approach to collaborative target tracking that ensures each target is tracked by at least a single team member and network connectivity remains intact across the robot formation during a trial.

• A generalization to embed k–coverage constraints where k members of the team are guaranteed to be tracking each target at any given time while maintaining
network connectivity. This extension makes the system suitable for deployment in a larger class of scenarios, to include systems that require multiple bearing observations (e.g. perhaps via cameras) to estimate target pose.

• An extension to large–scale tracking scenarios (via SOCP) that is tremendously more efficient to solve in practice than its SDP counterpart. Such a scenario is suited for applications where communication range far exceeds that of sensing range or when loss of network connectivity is irrelevant.
As a capstone to this research, this dissertation also presents a general convex optimization planning paradigm for multi–agent systems that unites these seemingly unrelated results into a common framework that also provides

• A discrete–time approach for embedding goal–directed behaviors, where a goal corresponds to achieving some objective waypoint in R^d from a high–level sequence.
• Dynamic collision avoidance via spectral analysis for generating plans in potentially dynamic environments.
• A discrete–time, semidefinite framework for operating in unknown environments that affords minimizing some convex objective while providing formation control, network interconnectivity, dynamic obstacle avoidance, and a desired level of target coverage. The framework is suitable for a variety of multi–agent planning tasks and can be relaxed for large–scale application. As a corollary to the primary contributions of this dissertation, results obtained from concurrent research are also presented. Specifically, three appendices are included that highlight results focused primarily upon robust estimation and perception for intelligent vehicle systems. Although the focus of this dissertation is not on such technologies, it is undeniable that robust perception is central to the planning process. As such, the inclusion of this supporting material is quite natural given the context of this research. Specifically, the following key results are also presented • A definition and exploration of Hybrid Free–space Optics/Radio Frequency (FSO/RF) Mobile Ad–hoc networks for multi–agent systems to autonomously restore wide–area communications in times of a man–made or natural disaster. 9
1.5. DISSERTATION OUTLINE Key to these results is a novel vision–based link acquisition system for aligning directional FSO transceivers affixed to a subset of peers. • A robust approach to terrain classification and traffic line detection for autonomous vehicles operating in urban environments using range and remission
values from planar laser range finders. These results highlight the author’s contributions as a member of the Ben Franklin Racing Team to help field the fourth place entry in the 2007 DARPA Urban Challenge. • The Sick LIDAR Matlab/C++ Toolbox which is an open–source software package for rapid data–acquisition and visualization in Matlab for the ubiquitous
Sick LMS/LD laser range finders. It is being utilized by both academic and industry leaders located in Europe, Australia, Canada, and the United States. Accordingly, the outline of this dissertation is presented in the following section.
1.5
Dissertation Outline
This dissertation is organized as follows: Chapter 2 discusses a proposed paradigm for optimally transitioning a mobile multi–agent system from some initial configuration in R^d, d ∈ {2, 3}, to some desired objective formation given some relative shape assignment. This chapter offers a detailed discussion regarding the complexity of said approach as well as its utility for large–scale formation management (teams comprised of thousands of agents) in an online and real–time fashion. Additionally, a collection of extensions is presented to include a computationally distributed solution, a hierarchically decentralized solution, and a receding–horizon based approach for teams operating in arbitrary polygonal workspaces. Accordingly, hard complexity bounds are established for each using a standard interior–point method. Simulation as well as experimental results are presented.

Chapter 3 discusses optimal target tracking with performance guarantees for multi–agent systems. This chapter covers a discrete–time, semidefinite framework for optimal target tracking that ensures each target is tracked by at least a single team member and network connectivity remains intact across the formation. An extension is presented to guarantee k–coverage of system targets – providing a guaranteed minimal level of coverage for each target in the system. A second–order cone relaxation is also formulated for the case where network connectivity constraints can be forgone. This formulation is suitable for use by large–scale teams. Key to all of these approaches are seminal results from spectral graph theory that relate the connectivity of a weighted graph to the second–smallest eigenvalue of its weighted graph Laplacian matrix. Accordingly, simulation results for multi–agent systems operating in both R^2 and R^3 are presented.

Chapter 4 unites the seemingly unrelated results from Chapters 2 and 3 into a common multi–objective, convex optimization framework for optimal motion planning in unknown environments. In formulating said framework, results stemming from spectral graph theory are again melded with semidefinite programming. Specifically, a discrete–time framework inspired by the results of Chapter 3 is formulated that embodies many of the key results of these chapters in addition to providing goal–directed team behavior and dynamic obstacle avoidance while minimizing an arbitrary convex objective. Simulation results are provided to illustrate the utility of the framework in the context of formation control and obstacle avoidance.

Chapter 5 discusses two substantial extensions of this research. The first aims to exploit recent advances in the decentralized control of network interconnectivity via spectral analysis to yield a fully decentralized solution. The second explores the possibility of adapting algebraic topology as a tool for dynamically controlling mobile sensor networks. Key to this extension will be the relationship between the kernel of the combinatorial Laplacian (a generalization of the graph Laplacian exploited in Chapters 3 and 4) and a simplicial complex capturing the current topology (e.g. network, sensor, etc.) of the multi–agent system.

Finally, given the significant role of perception in the low–level stages of motion planning, this dissertation also includes Appendices A, B, and C highlighting supplemental results obtained from concurrent research. Appendix A presents the Hybrid Free–space Optics/Radio Frequency (FSO/RF) Mobile Ad–hoc Network paradigm for mobile robot teams. Experimental results are presented that highlight the utility
of a vision–based link acquisition system for autonomously aligning long–range, directional FSO transceivers between respective link partners. Appendix B highlights a novel approach for lidar–based terrain classification for autonomous vehicles operating in urban environments, while Appendix C discusses an open–source framework developed for rapid data acquisition and visualization using Sick laser range finders. For the most part, each chapter is structurally arranged in a similar manner. Each begins with relevant background and related work before providing a more formal problem statement. Theoretical and simulation/experimental results are then presented along with ample discussion highlighting the contributions of the work and their utility. The final section in each chapter provides a brief discussion.
Chapter 2

Optimal Team Formations

This chapter explores the problem of reconfiguring a potentially large–scale multi–agent system to a new shape while minimizing either the total distance traveled by the team or the maximum distance traveled by any member. Three paradigms are investigated: centralized, computationally distributed, and decentralized. For the centralized case, optimal solutions are obtained in O(m) time in practice using a logarithmic–barrier method. Key to this complexity result is transforming the Karush–Kuhn–Tucker (KKT) matrix associated with the Newton step subproblem into a band–diagonal, linear system solvable in O(m) time. Accordingly, these results are extended to a distributed approach that allows the computation to be evenly partitioned across the m nodes in exchange for O(m log m) messages in the overlay network. Additionally, a hierarchically decentralized solution is presented whereby followers are able to solve for their objective positions in O(1) time from observing only the relative pose of a small number (2–4) of leader nodes. Such behavior is akin to biological systems (e.g. schools of fish, flocks of birds, etc.) capable of complex formation changes using only local sensor feedback. Extensions that embed motion constraints and convex and non–convex workspace constraints are also presented.
2.1
Background
In recent years, the robotics community has seen a tremendous increase in multi–agent systems research. This has been driven in part by the maturation of the underlying technology: advances in embedded computing, sensor and actuator technology, and (perhaps most significantly) pervasive wireless communication. However, the primary motivation is the diverse range of applications envisaged for large–scale robot teams, defined in this chapter as formations ranging from tens to thousands of robots. These include support of first responders in search and rescue operations, autonomous surveillance in support of military and homeland security operations, and environmental monitoring. Unfortunately, the effective coordination of a large–scale robot team is a non–trivial problem – one that will need to be solved in order for such systems to find practical use.

In this chapter, an optimization approach to the planning task is considered. This is motivated by the realization that the operation of such a team is inherently a constrained resource allocation problem. A (potentially large) number of mobile agents are required to perform some task (e.g. area surveillance), perhaps with performance objectives (e.g. maximize coverage), while subject to resources that are restricted by communication and sensing ranges, motion constraints, etc. While the optimization construct has many advantages, its potential for use in multi–agent systems has never been fully realized due to scalability concerns. Complete solutions to problems of interest typically scale in super–linear time with the number of robots and/or the size of the environment. In this work, the focus is on developing a convex formulation of the problem. By leveraging recent advances in convex optimization theory, we are able to develop centralized, computationally distributed, and hierarchically decentralized motion planning strategies for coordinating a multi–agent system that yield impressive performance bounds.
The result is a fairly rich framework for large–scale formation changes.
2.2
Related Work
Formations of robot teams have been extensively studied in the literature, and a complete survey is outside the scope of this dissertation. Instead, focus is placed on those efforts where shape – defined differently in different contexts – was of significant relevance to the research. Das et al (2002) described a vision–based formation control framework. This focused on achieving and maintaining a given formation shape using a leader–follower framework. Control of formations using Jacobi shape coordinates was addressed by Zhang et al (2003). This approach was applied to a formation of a small number of robots which were modeled as point masses. Belta and Kumar utilized abstraction–based control as a mechanism to coordinate a large number of fully actuated mobile robots moving in formation (2004a). In this work, the configuration space Q of the robots was mapped to a lower–dimensional manifold A to reduce the complexity of the control problem. Shape was roughly defined as the area spanned by the team.
Related studies include conflict resolution of multi–vehicle systems. In conflict resolution, trajectories are planned for vehicles operating independently in the same space but with potentially conflicting goals. The work of Ogren and Leonard used a dynamic window approach to avoid obstacles, and included a stability analysis (2002). These approaches often adopt simplified models for vehicle behavior, but present interesting applications for optimization–based methods. In cooperative control problems, vehicles move in a coordinated fashion to achieve some common goal and/or seek to maintain some geometrical relationships among themselves. Often, motion is dictated by gradient approximations from sensor measurements or an artificial potential field. Solutions defined with inter–robot distance relationships were explored by Bachmayer and Leonard (2002), where methods to measure and project gradient information were discussed. Their results were designed for use by robot teams attempting to monitor large–volume spaces. Such spaces include oceans, where the most advantageous arrangement of sensors may not be to distribute them evenly, but to have them adapt so as to concentrate more sensors in areas where the measured variable has steeper gradients.
There has also been significant interest in applying optimization–based techniques to coordinate robot teams and deploy sensor networks. Contributions in this area include the work by Cortes et al (2004). Here the focus is on autonomous vehicles performing distributed sensing tasks. Feddema et al. (2003) applied decentralized optimization–based control to solve a variety of multi–robot problems. Optimal motion planning was also considered by Belta and Kumar (2004b). In this work, the authors generate a family of optimal smooth trajectories for a set of fully actuated mobile robots. The case for which robots have independent goals but share the same space was studied by LaValle and Hutchinson (1998).

In contrast to these efforts, the primary contribution of this work is the application of convex optimization techniques to the motion planning task. Using these in conjunction with a formal definition for formation shape, this chapter first describes a centralized framework for coordinating large–scale robot teams that scales in O(m) time in practice, and can calculate optimal formation shape changes in real time for teams having thousands of members. Additionally, two extensions are presented in the form of computationally distributed and hierarchically decentralized solutions. In the former, computation is evenly partitioned across m agents in exchange for O(m log m) messages in the overlay network. In the latter, followers are able to solve for their objective positions in O(1) time from observing the relative pose of a small number of leaders (2–4) – akin to biological systems (e.g. schools of fish, flocks of birds, etc.) capable of complex formation changes using only local sensor feedback. Intuitively, the problem solved in this chapter can be thought of as the converse of the assignment problem (see (Kuhn, 1955)), where the final pose of the formation is known a priori and the goal is to find an optimal assignment of nodes to objective positions.
In contrast, our results determine the optimal pose corresponding to the objective formation shape where assignments remain fixed across shape transitions.
2.3
The Shape Problem
To efficiently coordinate the robot team, a concise mathematical representation capturing the team's formation is needed. For example, abstraction–based control approaches use low–dimensional representations such as bounding boxes or first/second order statistics of the robot positions (Belta & Kumar, 2004a). In contrast to these approaches, this work employs the notion of shape to fully capture the salient features of a robot formation. As would be expected, shape has different meanings in different contexts. In the context of this work, however, we adopt the definition from statistical shape analysis (Dryden & Mardia, 1998) and define formation shape as follows

Definition 2.3.1. The shape of a formation is the geometrical information that remains when location, scale, and rotational effects are removed.

By this definition, two mobile robot teams are equivalent in formation if the shape they form is equivalent up to some Gaussian similarity transformation. Figure 2.1 is provided to illustrate this point.
Figure 2.1: By the chosen definition of shape, two robot formations are the same shape if they are equivalent up to a Gaussian similarity transformation. By this definition, each of these squares can be considered the same shape.
Formalizing this concept, consider a formation of m ≥ 2 robots in a Euclidean space R^d, d ∈ {2, 3}. Let si ∈ R^d denote the relative position of the ith robot in the desired shape geometry relative to some frame F. Intuitively, si represents the control point in the desired shape geometry assigned to the ith robot in the formation. Without loss of generality, let s1 correspond to the origin O_F of the coordinate frame and denote the formation by the m × d matrix

    S = (s1, . . . , sm)^T        (2.1)

representing the concatenated coordinates of the m robots in F. Given this definition, the shape of S is fully characterized by the equivalence class of similarity transformations of the desired formation geometry, given as

    [S] = {αSR + 1m τ^T : α ∈ R^+, R ∈ SO(d), τ ∈ R^d}        (2.2)

where α ∈ R^+ is the scale, R ∈ SO(d) is a rotation matrix, 1m is the m–dimensional vector of 1's, and τ ∈ R^d is the translation vector. Additionally, S is said to correspond to an icon of the shape [S]. Adopting standard convention, we let Y ∼ S denote an equivalence relation between Y and S (i.e. Y ∼ S ⇔ Y ∈ [S]).
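As a quick numeric illustration of this equivalence class (a sketch assuming numpy; the square icon and transform parameters below are arbitrary), any element of [S] preserves all pairwise-distance ratios – exactly the information that remains once location, scale, and rotation are removed:

```python
import numpy as np

# Illustrative check: an element Y = alpha*S*R + 1m*tau^T of [S] has the
# same shape as the icon S, i.e. identical pairwise-distance ratios.
def pairwise_ratios(X):
    """Pairwise inter-point distances, normalized by the first one."""
    m = X.shape[0]
    d = np.array([np.linalg.norm(X[i] - X[j])
                  for i in range(m) for j in range(i + 1, m)])
    return d / d[0]

S = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])   # a unit-square icon

alpha, theta = 2.5, np.deg2rad(30.0)                     # scale alpha in R+
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])          # R in SO(2)
tau = np.array([4.0, -1.0])                              # translation tau
Y = alpha * S @ R + tau                                  # an element of [S]

assert np.allclose(pairwise_ratios(Y), pairwise_ratios(S))   # Y ~ S
```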
Shape provides a natural, high–level representation for describing formation
pose. As such, the general motion planning problem can be reposed to one of changing to (or maintaining) a given shape subject to workspace, velocity constraints, etc. The reposed problem is called the shape problem, and this dissertation will show that solving it can be achieved through straightforward SOCP techniques. Towards this end, let the m×d matrix P = (p1 , . . . , pm )T denote the concatenated coordinates of the current formation pose in some world frame W. The objective
is to obtain a new formation pose Q = (q1 , . . . , qm )T in W where Q has the same shape as S under the equivalence relation defined in Equation (2.2) that minimizes
the chosen objective function. For our purposes, we consider minimizing either the total distance traveled by the team or the maximum distance traveled by any member and pose the shape problem as follows
Problem 2.3.2. Given an initial formation pose P ⊂ W and a formation shape icon S ⊂ F, P ≁ S, obtain a new formation pose Q ⊂ W, Q ∼ S, where either the total distance traveled by the team or the maximum distance traveled by any member is minimized.

Mathematically, Problem (2.3.2) can be expressed as the following pair of non–linear, non–convex programs with decision variables Q, α, θ, and τ:

    min_Q  max_{i=1,...,m} ‖qi − pi‖2          min_Q  Σ_{i=1}^m ‖qi − pi‖2
    s.t.   Q = αSR(θ) + 1m τ^T                 s.t.   Q = αSR(θ) + 1m τ^T
           α > 0, θ ∈ [0, 2π)                         α > 0, θ ∈ [0, 2π)        (2.3)
Intuitively then, solving Problem (2.3.2) reduces to identifying the optimal similarity transformation parameters (α, θ, and τ ) that map the desired shape geometry (and robot assignments) defined in F to an equivalent representation Q in the team
configuration space W. To solidify this point, consider Figure 2.2 which illustrates the process for an arbitrary arrangement of four agents operating in R2 .
2.4
Solving via Second–order Cone Programming
In the previous section, we formulated the shape problem and provided intuition into its solution. Unfortunately, a naive formulation results in the non–linear, non– convex programs appearing in Equation (2.3). As a result, to ensure a globally optimal solution, an alternate problem formulation must be considered that yields a convex feasible set. In this section, such a formulation is established with respect to each Euclidean distance metric. In particular, a pair of second–order cone programs are derived by exploiting a novel problem formulation that allows the chosen definition of shape to be fully captured (i.e. without relaxation) as a linear homogenous system. These results are presented in the following theorem
Figure 2.2: Key to solving the shape problem is identifying the optimal similarity transformation that maps the set of control points in F (each assigned to some team member) to an equivalent representation Q in the team's configuration space W that minimizes the chosen metric with respect to initial agent positions (triangles) P.

Theorem 2.4.1. Problem (2.3.2) can be restated as the following SOCPs for a team of fully–actuated robots operating on the plane.

    min_{Q,t1}  t1                               min_{Q,T}  1m^T T
    s.t.  ‖qi − pi‖2 ≤ t1, i = 1, . . . , m      s.t.  ‖qi − pi‖2 ≤ ti, i = 1, . . . , m
          AQ = 0                                       AQ = 0        (2.4)

Proof. Begin by noting that the shape of the icon S is given by the following equivalence class

    [S] = {αSR + 1m τ^T : α ∈ R^+, R ∈ SO(2), τ ∈ R^2}        (2.5)
Letting si = (si^x, si^y) and qi = (qi^x, qi^y) respectively denote Cartesian coordinates in F and W and choosing τ ≜ q1, Equation (2.5) is then equivalent to the following non–linear equality constraints

    qi^x − q1^x = α (si^x cos θ − si^y sin θ)
    qi^y − q1^y = α (si^x sin θ + si^y cos θ),    i = 2, . . . , m        (2.6)

where θ denotes the formation orientation. Without loss of generality, the formation orientation and scale can be respectively defined as follows

    θ ≜ arctan((q2^y − q1^y)/(q2^x − q1^x)),    α ≜ ‖q2 − q1‖2 / ‖s2‖2        (2.7)

where s1 = O_F. Given these assignments and utilizing basic algebraic manipulation, Equation (2.6) can be restated as the following linear homogeneous system

    ‖s2‖2 (qi^x − q1^x) = (si^x, −si^y)^T (q2 − q1)
    ‖s2‖2 (qi^y − q1^y) = (si^y, si^x)^T (q2 − q1),    i = 3, . . . , m        (2.8)

These 2(m − 2) constraints are convex functions of the state vector Q. They are necessary and sufficient for describing the formation shape (Kendall et al., 1999).
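These constraints can be sanity-checked numerically. The sketch below (numpy assumed; the icon data is an arbitrary illustration) assembles the constraints in matrix form A Q = 0 over the stacked coordinates (q1^x, q1^y, ..., qm^x, qm^y), confirms that similarity transforms of S lie in the null space of A, and confirms that the nullity is four:

```python
import numpy as np

# Hypothetical assembly of the constraint matrix A from Equation (2.8),
# acting on the stacked coordinate vector (q1x, q1y, ..., qmx, qmy).
def shape_matrix(S):
    """Build the 2(m-2) x 2m matrix of shape constraints for a 2-D icon."""
    m = S.shape[0]
    ns2 = np.linalg.norm(S[1])                 # ||s2||_2 (s1 sits at O_F)
    A = np.zeros((2 * (m - 2), 2 * m))
    for k, i in enumerate(range(2, m)):        # robots 3..m
        sx, sy = S[i]
        rx, ry = 2 * k, 2 * k + 1
        # ||s2||_2 (qi_x - q1_x) = sx (q2x - q1x) - sy (q2y - q1y)
        A[rx, 2 * i] = ns2
        A[rx, [0, 1, 2, 3]] += [-ns2 + sx, -sy, -sx, sy]
        # ||s2||_2 (qi_y - q1_y) = sy (q2x - q1x) + sx (q2y - q1y)
        A[ry, 2 * i + 1] = ns2
        A[ry, [0, 1, 2, 3]] += [sy, -ns2 + sx, -sy, -sx]
    return A

S = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.], [0.5, 0.5]])  # m = 5 icon
A = shape_matrix(S)

# Any similarity transform of S satisfies A q = 0 ...
th = 0.7
R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
Qsim = 1.8 * S @ R.T + np.array([3.0, -2.0])
assert np.allclose(A @ Qsim.reshape(-1), 0.0)

# ... and the nullity is exactly four (translation x/y, rotation, scale).
nullity = 2 * S.shape[0] - np.linalg.matrix_rank(A)
assert nullity == 4
```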
Remark 2.4.2. Defining Equation (2.8) in matrix form, AQ = 0, also reveals a low–dimensional representation for the formation since the nullity of A is four – i.e. dim(ker(A)) = 4. In this case, the four free variables correspond to the translation, rotation, and scale as defined in Equation (2.5).

The problem of finding the formation pose Q can now be restated as either of the following linearly constrained optimization problems

    min_Q  max_{i=1,...,m} ‖qi − pi‖2          min_Q  Σ_{i=1}^m ‖qi − pi‖2
    s.t.   AQ = 0                              s.t.   AQ = 0        (2.9)
where the former corresponds to the minimax distance metric and the latter the total distance metric.

While both the constraints and objective functions of these problems are convex, their form does not lend itself to solving via standard optimization techniques. Specifically, their objectives are not twice differentiable. To remedy this problem, the state vector Q can be augmented with an auxiliary variable t1 for the minimax metric and with m auxiliary variables T = (t1, . . . , tm)^T for the total distance metric. This yields the problem formulations stated by the theorem.

In light of this result, the shape problem is a convex optimization problem, and it can be solved with respect to either metric, in polynomial time, using interior point methods (Lobo et al., 1998).
2.4.1
Regulating Formation Parameters
Through the use of additional linear and second–order cone constraints, the rotation, translation, and scale of the formation shape can also be regulated. Constraining the minimum and maximum orientation (θmin, θmax) can be accomplished through the following pair of linear constraints

    (q2 − q1)^T (−sin θmax, cos θmax) ≤ 0
    (q2 − q1)^T (sin θmin, cos θmin) ≤ 0        (2.10)

These are illustrated in Figure 2.3. Additionally, constraining the translation and the maximum scale of the formation can be respectively done using second–order cone constraints of the form

    ‖q1 − p1‖2 ≤ tmax,    ‖q2 − q1‖2 ≤ αmax        (2.11)

Note, however, that the following constraints

    ‖q2 − q1‖2 ≥ αmin,    ‖q2 − q1‖2 = α        (2.12)
are non–convex. If regulating the minimum scale is required, an efficient approximation method based upon our results for fixed orientations in R^2 and R^3 has been developed. The interested reader is referred to (Spletzer & Fierro, 2005) for details.

Example: Figure 2.4 shows a simulation demonstrating the process. In this example, 101 nodes were tasked with transitioning to a new shape while minimizing the total distance traveled. The optimal values were α = 0.04, θ = 17.31°, and q1 = (90.36, 86.61)^T. While deliberately contrived, this example demonstrates the efficacy of this approach. The formation is able to optimally transition from an arbitrary shape to a very specific shape. In this example, none of the shape parameters (i.e. translation, rotation, or scale) were regulated.

Figure 2.3: Constraining the feasible formation orientation can be realized by utilizing a pair of linear inequalities in q1 and q2.
Figure 2.4: (a) The initial formation pose for a set of 101 nodes in R^2. (b) The mapping (lines) of each agent to its optimal position (circles). (c) The formation pose that achieves the desired shape while minimizing the total distance metric.

2.4.2

Optimal Solutions in R3

Unfortunately, attempting to solve Problem (2.3.2) in SE(3) essentially introduces non–reducible, non–linear terms (and thus non–convex constraints) into the problem formulation. Accordingly, in this section, a relaxed variation is considered where the orientation of the shape formation is fixed. In deriving this result, it should be pointed out that fixing the orientation reduces to simply constraining it with respect to q1 and q2. As a result, Equation (2.6) becomes

    qi^x − q1^x = α si^x
    qi^y − q1^y = α si^y,    i = 2, . . . , m        (2.13)
which are linear in qi and α (i.e. the objective position and scale). In other words, the feasible set corresponds to the pre–shape of the formation, which is given by the equivalence class

    [S]_PS = {αS + 1m τ^T : α ∈ R^+, τ ∈ R^d}        (2.14)
Remark 2.4.3. In contrast to Equation (2.8), Equation (2.13) directly embeds scale as a decision variable. As such, this variable can be freely regulated using linear constraints (e.g. αmin ≤ α ≤ αmax , αmin , αmax ∈ R+ ). Considering these constraints directly leads to the following theorem, which enables optimal solutions for multi–agent systems operating in R3 .
Theorem 2.4.4. In the case where fixed–orientation or regulated scale is desired, Problem (2.3.2) can be posed as the following SOCPs

    min_{Q,α,t1}  t1                              min_{Q,α,T}  1m^T T
    s.t.  ‖qi − pi‖2 ≤ t1, i = 1, . . . , m       s.t.  ‖qi − pi‖2 ≤ ti, i = 1, . . . , m
          qi − q1 = α si, i = 2, . . . , m              qi − q1 = α si, i = 2, . . . , m
          −α ≤ −αmin                                    −α ≤ −αmin
          α ≤ αmax                                      α ≤ αmax        (2.15)
It should also be noted that although optimization is occurring with respect to (q1 , . . . , qm , α)T , the k(m − 1) constraints qi − q1 = αsi , i = 2, . . . , m are sufficient to
ensure the rigidity of the formation. As such, solving either formulation in Equation
(2.15) is equivalent to optimizing over only q1 and α, which correspond to the translation and scale terms of Equation (2.14).

Example: Figure 2.5 shows a representative simulation result for 91 agents operating in R^3 (e.g. unmanned aerial vehicles or unmanned underwater vehicles) charged with migrating to a desired pyramid shape. In this example, the desired orientation and scale were respectively θ = 0° and α = 10. The objective was to minimize the combined distance that the agents must travel. The resulting paths and final formation shape are also shown.
Figure 2.5: (a) The initial deployment of 91 nodes in R^3. The nodes were charged with reconfiguring to a solid pyramid formation. (b) The mapping (lines) of each agent to its optimal position (circles). (c) The fixed–orientation formation pose that achieves the desired shape while minimizing the total distance metric.

2.5

Computational Complexity

In this section, the previous results are discussed from a computational standpoint and hard complexity bounds for the shape problem are established that highlight its efficacy. In general (without exploiting problem–specific features), solving a SOCP requires a number of basic operations that scales as O(m^3.5) (Boyd & Vandenberghe, 2004; Lobo et al., 1998). However, for the shape problem, a theoretical complexity of O(m^1.5) can be attained by considering an auxiliary, but equivalent, formulation of the problem which yields a favorable sparsity structure for a linear system central to the interior–point method. For the sake of brevity, the results presented herein are specific to the minimax metric. However, similar results have also been obtained for the total distance variation (Derenick & Spletzer, 2005). This includes both regulated and unregulated formulations.
2.5.1
Theoretical Complexity
The Penalty–Barrier Approach

Although it is beyond the scope of this dissertation to provide a thorough review of the standard penalty–barrier (a.k.a. log–barrier) approach (Boyd & Vandenberghe,
2004), its centrality to the following results justifies a brief introduction. The barrier approach is an interior–point method that operates by solving a sequence of successively relaxed, linearly–constrained variations of Equation (1.1). The key to formulating these subproblems is first embedding the inequality constraints (typically non–linear) as logarithmic barrier terms in the optimization objective. At iteration k, the second–order Taylor approximation of a scaled variation of the objective is minimized subject to the linear equality constraints (in this case, the shape constraints), yielding a feasible Newton step. More precisely, each log–barrier term is rescaled by a monotonically decreasing value (updated after each iteration) that serves to reduce the influence of the inequality constraints on the solution. Intuitively, this rescaling allows the sequence of resulting solutions to approach global optimality; as a result, the algorithm can be thought of as following a central path towards global optimality. Additional details can be found in (Boyd & Vandenberghe, 2004).

Algorithm 1 The Penalty–Barrier Algorithm

BarrierMethod(f0, ζ, x0, A, b, γ0, µ, ǫB)
Require: x0 is strictly feasible, γ0 > 0, µ > 1, and ǫB > 0
  xk ← x0                                      # initialize xk to a strictly feasible point
  γk ← γ0
  while true do
    xk ← MinNewton(f0, ζ, xk, A, b, γk, ǫN)    # minimize γk f0 + ζ subject to A xk = b
    if ρ/γk ≤ ǫB then
      return xk
    end if
    γk ← µ γk
  end while

Algorithm 1 presents the general formulation as outlined in (Boyd & Vandenberghe, 2004), where f0 denotes the convex objective, ζ denotes the barrier terms, ρ denotes the number of barrier terms, µ > 1 is an algorithmic constant for scaling γk, and ǫB is the desired duality gap or solution accuracy.
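The outer loop of Algorithm 1 can be illustrated on a one-dimensional toy problem, min x s.t. x ≥ 1 (an assumption chosen so the central path is known in closed form, x = 1 + 1/γ; all numbers below are illustrative):

```python
# Toy illustration of the penalty-barrier outer loop: minimize x subject to
# x >= 1 by repeatedly minimizing f(x) = gamma*x - log(x - 1) and then
# increasing gamma to reduce the barrier's influence. True optimum: x* = 1.
def min_newton(x, gamma, tol=1e-10):
    """Damped Newton on f(x) = gamma*x - log(x - 1), kept strictly feasible."""
    for _ in range(200):
        g = gamma - 1.0 / (x - 1.0)          # f'(x)
        if abs(g) < tol:
            break
        step = g * (x - 1.0) ** 2            # Newton step g/h, with h = f''(x)
        t = 1.0
        while x - t * step <= 1.0:           # backtrack to keep x - 1 > 0
            t *= 0.5
        x -= t * step
    return x

x, gamma, mu = 2.0, 1.0, 10.0                # strictly feasible start, mu > 1
while 1.0 / gamma > 1e-8:                    # duality-gap style stopping rule
    x = min_newton(x, gamma)                 # inner solve: follow the central path
    gamma *= mu                              # tighten: reduce barrier influence
assert abs(x - 1.0) < 1e-6
```

Each inner solve lands near the central-path point 1 + 1/γ, so the iterates march monotonically toward the constrained optimum as γ grows.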
Reformulating the Shape Problem

The original minimax shape problem can be restated in a relaxed form suitable for solving via the barrier approach. The objective function given in Equation (2.4) is augmented with log–barrier terms corresponding to the problem's conic constraints as follows

    min_{Q,t1}  γk t1 − Σ_{i=1}^m log(t1^2 − (qi − pi)^T (qi − pi))
    s.t.  AQ = 0        (2.16)
where γk denotes the inverse log–barrier scaler for the kth iteration. As implied in the previous discussion, solving the shape problem reduces to solving a sequence of convex optimization problems of this form.

Banding the Newton KKT System

During each iteration of the log–barrier approach, the second–order Taylor approximation of the objective is minimized as a function of the Newton step, δX, subject to A δX = 0. As a result, obtaining δX is equivalent to analytically solving the Karush–Kuhn–Tucker (KKT) conditions associated with this equality–constrained sub–problem. In other words, during each iteration the following linear system must be solved (Boyd & Vandenberghe, 2004)

    [ H   A^T ] [ δX ]   [ −G ]
    [ A   0   ] [ W  ] = [  0 ]        (2.17)
where H and G respectively denote the Hessian and gradient of the objective function of Problem (2.16) evaluated at X, W is the corresponding dual variable for δX, and A is as previously defined. Solving Equation (2.17) is the bottleneck of the algorithm, as a naive approach requires a cubic number of operations. However, the following results will reveal that it can be solved in linear time by reposing Problem (2.16) – thereby reducing the complexity of solving this SOCP to O(m^1.5). Since the coefficient structure of Equation (2.17) is symmetric indefinite, an LU–based solver with non–symmetric partial pivoting is employed. The performance of such a solver suffers significantly when the coefficient structure
features dense rows and/or columns due to fill–in (Saad, 2003). In particular, the algorithm would yield a worst–case performance of O(m³) when solving Equation (2.17) for Problem (2.16). Such a workload is highly impractical when considering potentially large–scale configurations that feature thousands of variables. To address this issue, an auxiliary, but equivalent, formulation of Problem (2.16) can be considered that facilitates transforming the Newton KKT system into a mono–banded form (i.e. featuring a band–diagonal coefficient structure as in Figure 2.6(c)). Having such a coefficient structure enables specialized LU–based algorithms to exploit knowledge of both the constant upper and lower bandwidths to solve the system in O(m) time (Press et al., 1993). To establish this result, begin by considering the following lemma

Lemma 2.5.1. Problem (2.16) is equivalent to the following SOCP

min_{q,t}  (γk/m) Σ_{i=1}^{m} t_i − Σ_{i=1}^{m} log(t_i² − (q_i − p_i)ᵀ(q_i − p_i))
s.t.  ‖s2‖_2 (q_l^x − d_j^x) = (s_l^x, −s_l^y)ᵀ (d_{j+1} − d_j)
      ‖s2‖_2 (q_l^y − d_j^y) = (s_l^y, s_l^x)ᵀ (d_{j+1} − d_j)
      t_{i+1} = t_i,  i = 1, …, m − 1                                      (2.18)
      d_{2i+1} = d_{2i−1},  i = 1, …, m − 3
      d_{2(i+1)} = d_{2i},  i = 1, …, m − 3
      d_i = q_i,  i ∈ {1, 2}
for l = 3, …, m and where j = 2(l − 3) + 1.

Proof. In order to establish this result, it must be shown that the feasible sets and the objective functions of the two formulations are equivalent. Considering the former, begin by noting the implicit chaining of q1 and q2 in the constraint set of Problem (2.18). Specifically, q1 = d1 = d3 = d5 = ⋯ = d_{2m−5} and q2 = d2 = d4 = d6 = ⋯ = d_{2(m−2)}. Substituting these values directly into the shape constraints

‖s2‖_2 (q_l^x − d_j^x) = (s_l^x, −s_l^y)ᵀ (d_{j+1} − d_j)
‖s2‖_2 (q_l^y − d_j^y) = (s_l^y, s_l^x)ᵀ (d_{j+1} − d_j)

and noting the definitions of the indices l and j reveals the original homogeneous linear system given in Equation (2.8).
To establish the latter, introduce m variables t1, …, tm and constrain t_{i+1} = t_i for i = 1, …, m − 1, where t1 is defined as in Problem (2.16). Having done this, notice that the original objective in Problem (2.4) can be expressed as follows

γk t1 = (γk/m) m t1 = (γk/m) Σ_{i=1}^{m} t1 = (γk/m) Σ_{i=1}^{m} t_i      (2.19)

Although its form has changed, the two objectives are equivalent given the constraints on t_i. As a result, Problem (2.16) is equivalent to the formulation appearing in the lemma.

Remark 2.5.2. It should be noted that similar results are readily attained for the total distance metric as well as the regulated scale formulations in R³. See (Derenick & Spletzer, 2005) for additional details.

Given Lemma 2.5.1, the following theorem is now proffered

Theorem 2.5.3. The Newton KKT system, Equation (2.17), for Problem (2.16) is solvable in O(m) basic operations.

Proof. Noting the O(m) LU–based solvers for mono–banded systems (Press et al., 1993), it is sufficient to establish a mono–banded structure. By construction, begin by defining the initial solution vector ordering for Equation (2.17), as it corresponds to Problem (2.18), as follows
(δη_1ᵀ, δη_2ᵀ, δκ_1ᵀ, …, δκ_{(m−2)}ᵀ, µᵀ)ᵀ      (2.20)

where

δη_i = (δq_iᵀ, δt_i)ᵀ,   δκ_i = (δd_{2(i−1)+1}ᵀ, δd_{2(i−1)+2}ᵀ, δη_{(i+2)}ᵀ)ᵀ,   µ = (w_1, …, w_{7m−13})ᵀ
where the δ variables correspond to the primal Newton step components associated with each of the system variables, and w_j denotes the dual variable associated with the jth constraint. In order to yield the mono–banded form, we proceed by stating the constraint/row permutation for A that yields the tri–banded system appearing in Figure 2.6(b). To
do so, it is assumed that A is arbitrarily constructed with random row and column permutations. For the sake of clarity, constraints are grouped by associating them with the respective agents that introduce them into the system. Employing a slight abuse of notation by allowing q_i to also denote the ith robot in the configuration, define the constraints associated with q1 and q2 as follows

ϱ1: q_1^x = d_1^x,   ϱ2: q_1^y = d_1^y,   ϱ3: t_1 = t_2,   ϱ4: q_2^x = d_2^x,   ϱ5: q_2^y = d_2^y      (2.21)
Similarly, for 3 ≤ i ≤ (m − 1), we define the constraints associated with q_i as

ϕ_{i1}: ‖s2‖_2 (q_i^x − d_j^x) = (s_i^x, −s_i^y)ᵀ (d_{j+1} − d_j)
ϕ_{i2}: ‖s2‖_2 (q_i^y − d_j^y) = (s_i^y, s_i^x)ᵀ (d_{j+1} − d_j)
ϕ_{i3}: t_i = t_{i−1}
ϕ_{i4}: d_{j+2}^x = d_j^x                                              (2.22)
ϕ_{i5}: d_{j+2}^y = d_j^y
ϕ_{i6}: d_{j+3}^x = d_{j+1}^x
ϕ_{i7}: d_{j+3}^y = d_{j+1}^y
where j is as previously defined. With q_m, associate the remaining three constraints:

ϕ_{m1}: ‖s2‖_2 (q_m^x − d_j^x) = (s_m^x, −s_m^y)ᵀ (d_{j+1} − d_j)
ϕ_{m2}: ‖s2‖_2 (q_m^y − d_j^y) = (s_m^y, s_m^x)ᵀ (d_{j+1} − d_j)      (2.23)
ϕ_{m3}: t_m = t_{m−1}
where j is as previously stated with i = m. Again employing a slight abuse of notation, we let each ϕ_{ij} denote the initial row index of its associated constraint. As such, consider the following row permutation for A; this ordering yields the tri–banded form appearing in Figure 2.6(b)

(ϑᵀ, κ_1ᵀ, …, κ_{(m−3)}ᵀ, ςᵀ)ᵀ      (2.24)
where

ϑ = (ϱ1, …, ϱ5)ᵀ,   κ_i = (ϕ_{(i+2)1}, …, ϕ_{(i+2)7})ᵀ,   ς = (ϕ_{m1}, ϕ_{m2}, ϕ_{m3})ᵀ
Given this definition of A as well as Equation (2.20), the mono–banded form of Equation (2.17) can now be constructed. Symmetrically applying the deterministic permutation that yields the following solution vector ordering

(λᵀ, ξ_1ᵀ, …, ξ_{(m−3)}ᵀ, χᵀ)ᵀ      (2.25)

where

λ = (δη_1ᵀ, w_1, …, w_5, δη_2ᵀ)ᵀ
ξ_i = (δd_{2(i−1)+1}ᵀ, δd_{2(i−1)+2}ᵀ, w_{6+7(i−1)}, …, w_{12+7(i−1)}, δη_{(i+2)}ᵀ)ᵀ
χ = (δd_{2m−5}ᵀ, δd_{2m−4}ᵀ, w_{7m−15}, w_{7m−14}, w_{7m−13}, δη_mᵀ)ᵀ

produces a mono–banded coefficient structure having a total bandwidth of 17. As the upper and lower bandwidths of the system are constant, the system can now be solved in O(m) time using a band–diagonal LU solver (Press et al., 1993).

Figure 2.6 illustrates the process of transforming the Newton KKT system. The “augmented” system derived from Permutations (2.20) and (2.24) is shown in Figure 2.6(b). Taking the coefficient structure of Equation (2.17) in this form and symmetrically permuting its rows and columns according to Permutation (2.25) yields the mono–banded system appearing in Figure 2.6(c), having a constant lower/upper bandwidth of 8. Accordingly, it can now be solved in O(m) basic operations using a band–diagonal LU–based solver (Press et al., 1993).
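The practical payoff of a mono–banded system can be illustrated with a generic band–diagonal solve. The sketch below uses a toy tridiagonal system (not the bandwidth–17 shape system) and SciPy's diagonal–ordered band storage, which differs in layout from the row–compact format of (Press et al., 1993) but exercises the same O(m) banded LU idea:

```python
import numpy as np
from scipy.linalg import solve_banded

# Toy banded solve: a tridiagonal system (bl = bu = 1) standing in for the
# bandwidth-17 shape system. The banded LU solve is O(m) for fixed bandwidth.
m = 500
bl = bu = 1
main = 4.0 * np.ones(m)
off = -1.0 * np.ones(m - 1)

K = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)   # dense reference
b = np.arange(m, dtype=float)

ab = np.zeros((bl + bu + 1, m))   # ab[bu + i - j, j] holds K[i, j]
ab[0, 1:] = off                   # superdiagonal
ab[1, :] = main                   # main diagonal
ab[2, :-1] = off                  # subdiagonal

x = solve_banded((bl, bu), ab, b)
assert np.allclose(K @ x, b)      # matches the dense system
```

For fixed bandwidth, the banded factorization touches O(1) entries per row, which is the source of the O(m) total cost cited above.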
2.5.2 Performance in Practice

Modern interior–point methods (IPMs) require O(√m) iterations to converge (Lobo et al., 1998). However, it is well known that this computational bound is extremely
[Figure 2.6 sparsity plots omitted: (a) nz = 661; (b) nz = 1149; (c) nz = 1149.]
Figure 2.6: (a) Newton KKT sparsity structure for the minimax problem in SE(2). (b) Augmented Newton KKT sparsity structure. (c) The mono–banded system.

conservative and that the iteration complexity in practice is more like O(1) (Boyd & Vandenberghe, 2004). Consequently, solving the shape problems requires only O(m) basic operations in practice. To illustrate this point, consider the following simulation results.

Simulation Results: A total of 20,000 instances of the unregulated total distance problem were solved using the MOSEK industrial package (MOSEK, 2006). Values of m were considered between 10 and 2000 using a step size of 10. For each configuration size, a total of 100 random problem instances were solved. Shape control points were uniformly generated from the unit square, and robot positions were uniformly selected from the square defined by [−25, 25]. All problems were solved on a standard desktop PC having a 3.0 GHz Pentium 4 processor. From Figure 2.7, we see the time complexity scales as O(m) with r² = 0.9905, where r is the correlation
[Figure 2.7 plot omitted: mean CPU time (sec) versus m.]
Figure 2.7: The mean CPU time required to solve the unregulated total distance metric shape problem using the MOSEK industrial solver. Each point corresponds to the mean for a sample of 100 randomly generated shape SOCPs. The trend is strongly linear with r² = 0.9905.

coefficient. Furthermore, the data indicates that for a team of up to 2000 robots, the problem can be solved in less than 300 milliseconds!
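The linearity claim can be checked with an ordinary least–squares fit. The sketch below uses synthetic timings (a hypothetical linear model plus noise, standing in for measured solver times) to show how the slope and r² of such a trend are obtained:

```python
import numpy as np

# Sketch: fitting a linear trend and computing r^2 as in Figure 2.7. The
# timings below are synthetic (a hypothetical linear model plus noise); they
# stand in for measured solver times.
rng = np.random.default_rng(1)
m = np.arange(10, 2001, 10, dtype=float)
t = 1.5e-4 * m + 0.005 + rng.normal(0.0, 1e-3, m.size)

slope, intercept = np.polyfit(m, t, 1)       # least-squares line t = a*m + b
r = np.corrcoef(m, t)[0, 1]                  # correlation coefficient
assert slope > 0.0
assert r ** 2 > 0.99                         # strongly linear trend
```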
2.6 Enforcing Motion Constraints
This framework relies upon the convexity of the underlying formulation to efficiently solve the problem for large–scale formations. As a consequence, motion constraints can also be accommodated so long as they can be expressed similarly. For example, to this point, objective robot positions have been generated without consideration for vehicle constraints. However, velocity constraints can be readily accommodated as second–order conic inequalities of the form

‖q_i − p_i‖_2 ≤ v_i^max δt      (2.26)
where δt denotes the time step, and v_i^max denotes the velocity limit of the ith robot. This allows less mobile or even fixed (anchor) robots/nodes to be accommodated. Other constraints might be expressed as linear constraints. As an example, a formation wanting to maintain positive velocity in the y–direction could ensure this by specifying a minimum forward distance traveled d_i^min for each node as

q_i^y − p_i^y ≥ v_i^min δt = d_i^min,  i ∈ {1, …, m}      (2.27)

In summary, if the constraints can be expressed in terms of some combination of feasible linear, second–order cone, or semi–definite constraints, they can be directly integrated into the proffered framework. Furthermore, as with the environmental constraints discussed in §2.7, the motion constraints introduced by the ith vehicle are only a function of its own initial and final positions. As a result, the mono–banded structure and complexity results derived in §2.5 are preserved.

Example: A simulation showing the same formation shape change with (triangles) and without (circles) motion constraints is presented in Figure 2.8. In both cases, the objective was for the teams to transition from an initial in–line path configuration to a triangular formation. Optimization was over our minimax metric. In the former, the minimum forward distance traveled for each node was constrained as q_i^y − p_i^y ≥ 0, i ∈ {1, …, m}. Scale and orientation were fixed in both cases. In this case, the additional constraints increased the minimax distance by 87%.
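The geometry of the conic constraint (2.26) is simple to work with outside the solver as well. In the framework itself the constraint is enforced inside the SOCP; the hypothetical helper below merely projects a desired goal position onto the disk of radius v_max·δt reachable from p_i in one time step:

```python
import numpy as np

# Sketch: the geometry of constraint (2.26). The framework enforces it inside
# the SOCP; the hypothetical helper below merely projects a desired goal onto
# the disk of radius v_max * dt reachable from p within one time step.
def clamp_step(p, q_desired, v_max, dt):
    step = q_desired - p
    dist = np.linalg.norm(step)
    limit = v_max * dt
    if dist <= limit:
        return q_desired                     # already satisfies (2.26)
    return p + step * (limit / dist)         # closest feasible position

p = np.array([0.0, 0.0])
q = clamp_step(p, np.array([3.0, 4.0]), v_max=1.0, dt=1.0)
assert np.isclose(np.linalg.norm(q - p), 1.0)   # on the constraint boundary
assert np.allclose(clamp_step(p, np.array([0.1, 0.0]), 1.0, 1.0), [0.1, 0.0])
```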
2.7 Handling Workspace Constraints

In this section, the shape problem is reformulated to include workspace constraints. Specifically, the problem of transitioning a team, constrained to lie within a polygonal workspace, to a new shape formation while minimizing either the minimax or total distance metric is now considered. In this research, the polygonal workspace C is defined with obstacle subspace O and free space Cfree such that Cfree = C − O.
Given the definition of C, solving the given problem reduces to identifying the optimal similarity transformation that, when applied to S ⊂ F, yields an equivalent
Figure 2.8: Shape transitions for two teams of 15 robots, each respectively aligned along an in–line path. In this case, the teams were charged with obtaining a triangular formation while minimizing the minimax distance. (a–Left) No motion constraints. (a–Right) q_i^y − p_i^y ≥ 0, i = 1, …, m. (b) The final team configurations.
shape Q ⊂ Cfree ⊂ W such that the desired Euclidean distance metric is minimized.
In this formulation, S once again denotes the desired shape icon, F and W respectively denote the local and world frames, and P again denotes the initial agent positions. More formally, our problem can be stated as follows

Problem 2.7.1. Given an initial formation pose P ⊂ W and a formation shape icon S ⊂ F, P ≁ S, obtain a new formation pose Q ⊂ Cfree ⊂ W, Q ∼ S, and where either the minimax or total distance metric is minimized.
Mathematically, Problem (2.7.1) can be expressed as the following pair of non–convex programs in decision variables Q, α, θ, τ

min_Q  max_{i=1,…,m} ‖q_i − p_i‖_2          min_Q  Σ_{i=1}^{m} ‖q_i − p_i‖_2
s.t.   Q = αSR(θ) + 1_m τᵀ                  s.t.   Q = αSR(θ) + 1_m τᵀ          (2.28)
       Q ⊂ Cfree                                   Q ⊂ Cfree
       α > 0, θ ∈ [0, 2π)                          α > 0, θ ∈ [0, 2π)
Noting the results of §2.4, both the objective and constraints of these programs can be defined in convex form, with the exception of Q ⊂ Cfree, as Cfree is potentially non–convex. Accordingly, in this section, analysis and results are presented for the cases where Cfree is convex and where it is non–convex. Beginning with the former, a computational complexity of O(k^1.5 m^1.5) is established, where k denotes the number of linear inequalities used to model C and m is the number of agents. Regarding the latter, complexity is shown to scale as either O(k^3.5 m^1.5) or O(k^1.5 m^3.5), depending upon the chosen permutation of the Newton KKT system.

Convex Polygonal Environments: Begin by assuming that Cfree is both polygonal and convex. As such, it can be represented by the following set of linear inequalities

C = {x ∈ R² : Ac x ≤ bc}      (2.29)
where Ac ∈ R^{k×2}, with k denoting the finite number of linear inequalities used to model the workspace. As such, citing results from Theorem 2.4.1, Problem (2.28)
can be formulated as second–order cone programs. Accordingly, this result is now formalized in the following theorem

Theorem 2.7.2. Problem (2.7.1) can be restated as the following SOCPs for a team of fully–actuated robots in SE(2) operating in a convex polygonal workspace C = Cfree

min  f(t1) = t1                               min  f(T) = 1_mᵀ T
s.t.  ‖q_i − p_i‖_2 ≤ t1,  i = 1, …, m        s.t.  ‖q_i − p_i‖_2 ≤ t_i,  i = 1, …, m

      [ Ac  I ] [ Q ]   [ B ]                       [ Ac  I ] [ Q ]   [ B ]
      [ As  0 ] [ U ] = [ 0 ]                       [ As  0 ] [ U ] = [ 0 ]      (2.30)

      U ≥ 0                                         U ≥ 0

where As ∈ R^{2(m−1)×2m} corresponds to the coefficient structure for the linear equalities given in Equation (2.8), and Ac ∈ R^{km×2m} denotes the structure for the linear inequalities used to model C = Cfree.
Remark 2.7.3. Note that km non–negative slack variables U = (u_1, …, u_{km})ᵀ have been introduced to normalize the linear inequality constraints.

Establishing this result is a fairly straightforward exercise; as such, a formal proof is omitted. It should also be noted that defining Cfree this way introduces km additional constraints into the problem formulation. However, the core linear system remains mono–banded. This is a result of the fact that the k constraints introduced for the ith team member's position are only a function of that agent's position in the final formation (i.e. q_i). Furthermore, the coefficients of these constraints can be chained locally by introducing 2k auxiliary variables for each robot (Derenick & Spletzer, 2007). A direct consequence is the following result

Lemma 2.7.4. The Newton KKT system for Problem (2.30) is solvable in O(km) – linear time in the size of the formation and the environment.

which leads to the following theorem

Theorem 2.7.5. Given Lemma 2.7.4 and noting that the total number of Newton iterations is bounded by O(√m), Problem (2.30) is solvable in O(k^1.5 m^1.5) basic operations.
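The workspace model of Equation (2.29) is easy to construct from a polygon's vertices. The sketch below uses a hypothetical helper, `halfspaces`, to build Ac and bc for a convex polygon and check membership; the slacks U of (2.30) are then simply bc − Ac x ≥ 0 for a point x in the workspace:

```python
import numpy as np

# Sketch: building the workspace model C = {x : Ac x <= bc} of (2.29) from a
# convex polygon's vertices. The helper name is hypothetical.
def halfspaces(vertices):
    """Return (Ac, bc) for a convex polygon given CCW-ordered vertices."""
    V = np.asarray(vertices, dtype=float)
    rows, rhs = [], []
    for i in range(len(V)):
        p, q = V[i], V[(i + 1) % len(V)]
        e = q - p
        n = np.array([e[1], -e[0]])          # outward normal (CCW ordering)
        rows.append(n)
        rhs.append(n @ p)
    return np.array(rows), np.array(rhs)

Ac, bc = halfspaces([(0, 0), (4, 0), (4, 3), (0, 3)])   # a 4 x 3 rectangle
assert (Ac @ np.array([1.0, 1.0]) <= bc).all()          # interior point
assert not (Ac @ np.array([5.0, 1.0]) <= bc).all()      # exterior point
```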
[Figure 2.9 plots omitted: (a) formation instances; (b) banded coefficient structure, nz = 20559.]

Figure 2.9: (a) Two problem instances with/without boundary constraints. In this example, the workspace limits constrained the scale of the optimal solution from α = 85 to α = 59.85. (b) The mono–banded form with workspace constraints.
Example: In this example, a variation of the shape problem was considered where the objective was to maximize the scale of the formation subject to workspace boundary constraints. Such a result might prove useful for tasks such as maximizing sensor network coverage while satisfying a desired shape geometry. In this case, the orientation of the shape was fixed. Figure 2.9 illustrates the results obtained for a team of 101 nodes in SE(2). Both constrained and unconstrained problem solutions are presented to illustrate the influence of the convex workspace model. The coefficient structure of the Newton KKT system is also shown to highlight the fact that the mono–banded form of the linear system remains intact.

Arbitrary Polygonal Environments: Extending the results of §2.7, motion planning in polygonal environments with static obstacles (i.e. non–convex workspaces)
is now considered. Obstacles highlight a significant limitation of convexification approaches, as their inclusion creates voids in the feasible solution set. The result is a loss of convexity, and the corresponding optimization problem becomes significantly more difficult to solve. Although convex relaxations present one potential remedy (d'Aspremont & Boyd, 2003), this work takes inspiration instead from (Belta et al., 2005) and proffers a hierarchical discrete–continuous optimization approach.

Employing exact cell decomposition methods (e.g. triangulation, trapezoidal decomposition, etc.), Cfree can be tessellated into convex polygonal cells C1, …, Cz, where Cfree = ∪_{i=1}^{z} Ci (Choset et al., 2005). The resulting partition induces an undirected graph G = (V, E), where vertex v_i ∈ V corresponds to cell Ci, and edge e_{ij} ∈ E implies that there exists a common edge between Ci and Cj. Paths between cells can then be efficiently computed using traditional graph optimization algorithms (e.g. (Dijkstra, 1959)). The motion planning problem can then be reposed as transitioning the formation from cell to cell along the specified path. In the sequel, it is assumed that a triangulation partition of Cfree is employed. Additionally, it is assumed that the union of adjacent cells Cij = Ci ∪ Cj ∀ (Ci, Cj) ∈ E is convex. Although this may appear restrictive, it is straightforward to refine any pair of adjacent triangles into three triangles such that both of the resulting adjacent pairs meet this constraint.
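The discrete phase described above (a shortest cell path on the induced graph G) can be sketched with a standard Dijkstra implementation. The adjacency structure below is a hypothetical 5–cell decomposition; the edge costs might be, for example, distances between cell centroids:

```python
import heapq

# Sketch of the discrete phase: a shortest cell path on the induced graph G
# via Dijkstra's algorithm over a hypothetical 5-cell decomposition.
def dijkstra(adj, src, dst):
    pq, seen = [(0.0, src, [src])], set()
    while pq:
        cost, v, path = heapq.heappop(pq)
        if v == dst:
            return path
        if v in seen:
            continue
        seen.add(v)
        for w, c in adj.get(v, []):
            if w not in seen:
                heapq.heappush(pq, (cost + c, w, path + [w]))
    return None

adj = {0: [(1, 1.0)],
       1: [(0, 1.0), (2, 1.0), (3, 5.0)],
       2: [(1, 1.0), (3, 1.0)],
       3: [(1, 5.0), (2, 1.0), (4, 1.0)],
       4: [(3, 1.0)]}
print(dijkstra(adj, 0, 4))   # [0, 1, 2, 3, 4] avoids the costly edge 1-3
```

The returned cell sequence is then handed to the continuous phase, which transitions the formation from cell to cell along the path.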
Remark 2.7.6. Given two adjacent cells (Ci, Cj) ∈ E, where Cij = Ci ∪ Cj is convex, if node q_i ∈ Ci and q_j ∈ Cj, then by convexity λq_i + (1 − λ)q_j ∈ Cij, λ ∈ [0, 1]. This implies that for a formation of m nodes with initial pose Q_i = (q_{i1}, …, q_{im})ᵀ ∈ R^{2m} in triangle Ci, and final pose Q_j in triangle Cj, the paths of each node will remain entirely in Cij ⊆ Cfree. This guarantees obstacles are safely avoided.

Assume that such a path Cp = {C1, …, Ck} ⊆ Cfree has been specified by a
higher level planner. Problem (2.7.1) can now be written as

Problem 2.7.7. Given a path specification Cp = {C1, …, Ck}, a corresponding shape specification S = {S1, …, Sk}, and an initial formation pose Q0, find a motion sequence Q = {Q1, …, Qk} for the formation such that

1. Q_i ∼ S_i, i = 1, …, k
2. Q_i ∈ C_i, i = 1, …, k
3. The distance traveled by the formation is minimized in accordance with the criteria from Problem (2.3.2).
In solving Problem (2.7.7), the proffered framework employs optimization techniques rooted in model predictive control (Jadbabaie et al., 2001; Mayne et al., 2000). In this context, however, the horizon is defined not by time but by the length of the path over which the optimization problem is solved. To constrain the pose of the formation during each step of the horizon, each triangle can be modeled as a set of three linear inequality constraints on the position of each agent

c_{il}ᵀ q_{ij} ≤ 0,  i = 1, …, k,  j = 1, …, m,  l = 1, …, 3      (2.31)
Employing a slight abuse of notation, we also let C_i = (c_{11}, …, c_{m3})ᵀ ∈ R^{3m×2m} denote the set of linear constraints on the formation pose such that Q_i ∈ C_i. Combining the results obtained in §2.4 with this observation motivates the following theorem
Theorem 2.7.8. Solving Problem (2.7.7) is equivalent to solving the following SOCPs with respect to the minimax and total distance metrics

min_Q  Σ_{i=1}^{k} t_i
s.t.  ‖q_{ij} − q_{i−1,j}‖_2 ≤ t_i,  i = 1, …, k,  j = 1, …, m
      A_i Q_i = 0,  i = 1, …, k
      C_i Q_i ≤ 0,  i = 1, …, k
                                                                  (2.32)
min_Q  Σ_{i=1}^{k} Σ_{j=1}^{m} t_{ij}
s.t.  ‖q_{ij} − q_{i−1,j}‖_2 ≤ t_{ij},  i = 1, …, k,  j = 1, …, m
      A_i Q_i = 0,  i = 1, …, k
      C_i Q_i ≤ 0,  i = 1, …, k
where A_i are the constraints associated with shape S_i as defined in §2.4. Producing the results presented in Theorem 2.7.8 is trivial given the results from the previous section and the problem formulation inspired by model predictive control techniques. As such, a formal proof is omitted.

Observing Problem (2.32), it is clear that the formulation is again a second–order cone program. More significantly perhaps, the Newton KKT matrix corresponds to the chaining of k instances of the single–step problem. As a result, the associated linear system will remain mono–banded; however, in this case the bandwidth will grow as a function of either m or k, depending upon the selected permutation of the augmented KKT system. The difficulty arises from the fact that the Hessian is non–separable, as the optimization is occurring over a sequence of convex cells. As a direct result, the theoretical complexity is O(k^1.5 m^3.5) or O(k^3.5 m^1.5), once again depending upon the chosen ordering.

Example: A simulation trial for a formation of 16 robots is shown in Figure 2.10. The path of the formation is specified by a higher level planner after a discrete optimization phase on the corresponding graph G, as shown in Figure 2.10(a). The
Figure 2.10: (a) The path Cp as specified by the high–level planner. (b) The corresponding motion sequence obtained from solving Problem (2.32) over a horizon of 15.
[Figure 2.11 sparsity plot omitted: nz = 15378.]
Figure 2.11: The Newton KKT system corresponding to Figure 2.10. Observe that the system remains mono–banded.
formation then solves the continuous optimization problem specified by Problem (2.32). The resulting path of the formation is shown in Figure 2.10(b). In this example, the optimization was over the entire path length (k = 15), the shape was held constant, and the minimum scale of the formation was specified as a premise for inter–agent collision avoidance. Additionally, note that the Newton KKT system remains mono–banded as seen in Figure 2.11.
2.8 Distributing Computation
In the previous sections, it was established that the centralized solution features both a mono–banded linear system as well as a separable objective function (w.r.t. the variables each agent introduces). In this section, these characteristics are leveraged
to distribute the computational workload evenly across the network. The resulting O(1) expected per–node workload enables this approach to be employed by a significantly less sophisticated class of processors, or applied to significantly larger–scale networks. Specifically, a hierarchical cluster–based architecture for solving the shape problem is now presented.

Remark 2.8.1. Although designed and implemented for solving the shape problem, the proffered architecture can be readily applied to other problems that feature both mono–banded linear systems and separable objective functions.
2.8.1 Proposed Architecture
The proffered approach solves convex optimization problems in the context of a cluster–based network architecture under the direction of some root agent(s). The root is responsible for orchestrating the solve process; thus, it maintains a global state reflecting the status of the distributed computation. It performs such tasks as initializing the network and determining when the solve is complete. Although the root maintains a “global perspective”, its “data view” is primarily limited to that which affects the computation of its associated decision variables. The only exception is when it requests data needed to manage the IPM solve process, for instance, when it requests Newton decrement data. At the root's disposal are the remaining nodes in the network, which are called supporting peers. These nodes are considered supporting because they serve only as a distributed memory pool and a computational engine for the root during the solve process; individually, they lack a global view of the solver and only manage data relevant to their computations. Essentially, they wait until a data request originating from the root is received before transitioning into a new state of computation. A high–level view of the architecture is given in Figure 2.12. To reduce the communication overhead, the architecture is defined to have a hierarchical scheme based upon network clusters. In the context of this framework, the role of clusterheads is to ensure that each request of the root is satisfied at the lowest level. Members treat their clusterhead as a local accumulator and forward
Figure 2.12: A high–level view of the proposed computationally distributed architecture for solving the shape problem.
the requested information to that agent, where it is aggregated before being passed up the hierarchy, ultimately reaching the root.

Remark 2.8.2. It should be noted that defining the system this way is hardly restrictive, as recent advances enable the self–organization of such cluster–based multi–agent systems (e.g. (Kochhal et al., 2003; Chan & Perrig, 2004; Lehsaini et al., 2008)).

Distributing and Solving the Newton KKT System

Observing Algorithm 1, it is evident that each of the steps can be readily handled by the root(s) with the exception of executing the Newtonian minimization. As a result, distributing the barrier algorithm reduces to distributing the four steps associated with Newton's method (see Algorithm 2): solving the Newton KKT system, computing the Newton decrement, updating the state variable, and computing the desired step length. Given that the objective function (and thus the corresponding gradient and Hessian) is separable, implementing these steps reduces to having each agent compute its contribution to the greater value when requested by a root. Illustrating this point, if a_i ∈ χ_j is the ith agent in the jth cluster operating in R^d,
Algorithm 2 Newton's Method

MinNewton(f0, ζ, x0, A, b, γk, ǫN)
Require: x0 ∈ dom(γk f0 + ζ), Ax0 = b, ǫN > 0
  xk ← x0            # initialize xk to a feasible point
  while true do
    # Solve the Newton KKT system
    δxk ← SolveKKT(f0, ζ, xk, A, b, γk)
    # Compute the Newton decrement
    λ ← (δxkᵀ ∇²(γk f0 + ζ) δxk)^{1/2}
    if λ²/2 ≤ ǫN then
      return xk
    end if
    # Determine step size via a backtracking line search
    sk ← LineSearch(f0, ζ, xk, A, b, γk, α, β)
    xk ← xk + sk δxk
  end while

it only has to compute the following value for the Newton decrement at iteration k

λ_i = (δx_k^i)ᵀ ∇²(γk f_0^i + ζ^i) δx_k^i      (2.33)
where δx_k^i ∈ R^d and ∇²(γk f_0^i + ζ^i) ∈ R^{d×d}. As a result, the clusterhead for χ_j can simply compute the following summation over the parts given by its members to aggregate the results before passing it up the hierarchy

λ_{χj} = Σ_{i=1}^{|χj|} λ_i      (2.34)
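The hierarchical aggregation of Equations (2.33) and (2.34) can be sketched as follows; the per–agent values, the cluster structure, and the helper name are hypothetical toy data:

```python
import math

# Sketch of the hierarchical aggregation in (2.33)-(2.34): each agent computes
# its local decrement term, clusterheads sum over their members, and the root
# combines cluster sums. All numeric values below are hypothetical toy data.
def local_decrement(dx, H):                  # lambda_i = dx_i^T H_i dx_i
    return sum(dx[r] * sum(H[r][c] * dx[c] for c in range(len(dx)))
               for r in range(len(dx)))

clusters = [                                 # cluster -> [(dx_i, H_i), ...]
    [([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0]]),
     ([0.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])],
    [([1.0, 1.0], [[1.0, 0.0], [0.0, 3.0]])],
]
cluster_sums = [sum(local_decrement(dx, H) for dx, H in members)
                for members in clusters]     # clusterhead step, Eq. (2.34)
lam = math.sqrt(sum(cluster_sums))           # final step taken at the root
assert abs(lam - math.sqrt(10.0)) < 1e-12    # 2 + 4 + 4 = 10
```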
After receiving the complete sum, the root(s) will compute the square root and then update the state of the solver before requesting the line search to begin. Citing the ease with which the other calculations can be readily distributed, along with the fact that the per–iteration complexity is defined by solving the Newton KKT system, the focus herein will be on distributing the LU–based solver. Specifically, it shall be shown that distributing the process yields a per–iteration computational complexity of O(1), requiring each node to send only O(log m) messages in the overlay. Additionally, by construction, it shall also be established that
solving the problem requires only O(1) storage per node. However, before these results are formalized, an overview of the distributed process is given to provide further intuition into their origins.

To spatially distribute the Newton KKT system, K ∈ R^{y×y}, y ∈ Z+, among m nodes, assume that it has been made mono–banded by applying the deterministic permutation given by Theorem 2.5.3. Let bu and bl respectively denote the upper and lower bandwidths, and assume that K is represented in its equivalent compact form Kc, where Kc ∈ R^{y×(bl+bu+1)} (Press et al., 1993). In this form, sub–diagonal elements are stored in the first bl columns, super–diagonal elements are stored in the last bu columns, and the diagonal of the original system is located in column bl + 1. As a result, the total number of columns in Kc is equal to the total bandwidth (i.e. bl + bu + 1) of the system. Additionally, let the right–hand–side and permutation vectors be respectively denoted as b ∈ R^y and p ∈ Z^y_+.
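The row–compact storage described above can be sketched directly. The helper `to_compact` is hypothetical; note that column bl + 1 in the text's 1–based indexing is column index bl below:

```python
import numpy as np

# Sketch of the row-compact band storage Kc described above: row i of Kc holds
# the band of row i of K, with subdiagonals in the first bl columns, the
# diagonal at column index bl (column bl + 1 in 1-based indexing), and
# superdiagonals in the last bu columns. `to_compact` is a hypothetical helper.
def to_compact(K, bl, bu):
    y = K.shape[0]
    Kc = np.zeros((y, bl + bu + 1))
    for i in range(y):
        for j in range(max(0, i - bl), min(y, i + bu + 1)):
            Kc[i, j - i + bl] = K[i, j]
    return Kc

K = np.diag([4.0] * 5) + np.diag([-1.0] * 4, 1) + np.diag([2.0] * 4, -1)
Kc = to_compact(K, bl=1, bu=1)
assert Kc.shape == (5, 3)                    # y rows, bl + bu + 1 columns
assert np.allclose(Kc[:, 1], 4.0)            # diagonal
assert np.allclose(Kc[1:, 0], 2.0)           # subdiagonal
assert np.allclose(Kc[:-1, 2], -1.0)         # superdiagonal
```

Assigning each agent a contiguous block of rows of Kc (plus its slices of b and p) is then exactly the decomposition illustrated in Figure 2.13.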
Figure 2.13: A non-zero dot-plot illustrating the decomposition of the compact Newton KKT system (i.e. Kc ) for a configuration of 5 agents in SE(2) minimizing the total distance metric. For this problem, bl = bu = 7. Notice that the middle (m − 3) agents (i.e. n3 and n4 ) are assigned sub–blocks with identical structure.
Adopting this representation for K, consider the LU–based solver with partial pivoting outlined in (Press et al., 1993). To distribute this algorithm, begin by assigning the ith agent, n_i, a sub–block Kc_i ⊂ Kc (see Figure 2.13). Additionally, assign each n_i the corresponding sub–vectors b_i ⊂ b and p_i ⊂ p. Given the dependencies
between the equations in the linear system, devising a completely concurrent solution is not feasible. As a result, consider an implementation where the three phases of the solver (i.e. decomposition, forward substitution, and backward substitution) are done one agent at a time in a “pass–the–bucket” fashion, where agent n_i decomposes or solves before handing the process off to agent n_{i+1}. Figure 2.14 illustrates this process and, to facilitate understanding, each phase is now discussed separately.
Figure 2.14: Solving the Newton KKT system in a computationally distributed fashion can be realized using the proposed “pass–the–bucket” LU solver.

Phase I – Decomposition: During decomposition, the algorithm employs partial pivoting by searching at most bl sub–diagonal elements in order to identify one with greater magnitude. This implies that each agent performing its respective decomposition will only need information pertaining to at most bl rows, which can
be buffered at one or more peers. In the worst–case scenario, where each agent only manages a single row, agent n_i may have to query up to bl of its peers. This result is formalized in the following theorem

Theorem 2.8.3. Let i ∈ I = {1, …, m} and let Kc_j ∈ R^{uj×(bl+bu+1)}, u_j ∈ Z+, for j = 1, …, m. Define ψ(i) : I → Z+ as a mapping to the number of agents that have to be contacted by n_i during the decomposition of Kc_i. The following holds:

ψ(i) ≤ φ(bl, u_1, …, u_m) = ⌈ bl / min_{i∈{1,…,m}} u_i ⌉ = φ
Proof. By contradiction. Assume ψ(i) > φ(bl, u_1, …, u_m). Choosing u_i = 1, ∀i ∈ {1, …, m}, we see

ψ(i) > φ(bl, 1, …, 1) = bl

However, it must hold that ψ(i) ≤ bl, since n_i will only ever require data about bl
rows during the decomposition of Kci . This yields a contradiction, from which it is concluded that the initial assumption is false.
As decomposition progresses, agent ni iteratively constructs a permutation vector, pi . It should be noted that the value assigned to the j th position of pi (i.e. pi [j]) will be no more than (j + bl ) (see (Press et al., 1993)). This fact becomes important when forward substitution is started, because it implies that the solving agent will have to communicate with at most φ peers to resolve the values of bi during this phase. Using the data it acquired from its ψ(i) ≤ φ supporting peers (in particular,
ni+1 , . . . , ni+ψ(i) ), ni performs standard LU decomposition on Kci . Upon completion,
it sends an update to each of the supporting nodes. The update contains the modified row information and the adjusted permutation vectors corresponding to the changes it made with respect to the data the recipient provided. This allows each supporting agent to update its cache before the process is handed off to ni+1 .
Phase II – Forward Substitution: Similar to the decomposition phase, the forward substitution step is done in an iterative manner. In order to successfully solve its sub–block, ni requires information from each of the ψ(i) agents that provided it with data during decomposition. These agents must provide the corresponding rows that may be required by ni ’s forward substitution as well as any relevant components of their respective b sub–vectors. Upon completion, ni sends a message to each of these peers with updated values for their respective b sub–vectors. It concludes by handing off the process to ni+1 . Since nm is the last agent to perform both decomposition and forward substitution, it is responsible for signaling the start of a new phase in the LU–solver process.
Phase III – Backward Substitution: The backward substitution phase begins when nm completes the forward solve on its respective sub–block, Kcm . Unlike the forward substitution phase, this phase requires an agent to communicate with at most 2φ peers. The additional messaging is introduced because partial pivoting constrains the bandwidth of the upper triangular factor only to (bu + bl + 1) (Golub & Loan, 1996). To complete the backward substitution phase, ni simply must acquire the row information and b vector data from said peers (all of which have already completed their respective backward substitutions). The process completes when the agent managing Kc1 (i.e. n1 ) solves its corresponding sub–block.
Complexity Bounds
Given the protocol detailed in §2.8.1, a number of results can be derived. In this section, computational, storage, and message complexity bounds are established
that highlight the tremendous efficacy of the proffered (computationally) distributed framework. To begin, consider the communication complexity of the protocol, which is formally stated in the following theorem.
Theorem 2.8.4. Solving the Newton KKT system, Equation (2.17), with the protocol detailed in §2.8.1 yields a per–node message complexity of O(log m) and an aggregate message complexity of O(m log m).
Proof. Let ni denote the ith agent assigned its respective compact sub–block, Kci ∈ Ru×(bl +bu +1) in some arbitrary formation F . Let ψ(i) be defined as in Theorem 2.8.3
and assume bu = bl as the Newton KKT matrix is symmetric. Additionally, it is assumed that whenever ni requests information from any agent, the data is received in a single message. This assumption is reasonable as the information (including row data) that has to be shared between any pair of agents is a direct function of bu and bl . Noting that information is delivered upon request, the total number of messages sent by ni is given by
O(2ψ(i) + 2ψ(i) + 4ψ(i) + γ(i)) ≡ O(8φ + 3) ≡ O(1)    (2.35)
where γ(i) ≤ 3 is a mapping to the number of hand-off/signal messages sent by
ni and φ is as given in Theorem 2.8.3. As each node only sends O(1) messages to solve, the overhead is thus only a function of the hierarchical routing, which scales as O(log m). As all nodes send O(1) messages during the solve, the aggregate message complexity for the distributed LU process is thus O(m log m).
Building upon this result, the aggregate message complexity of the entire solve process can be established as follows.
Theorem 2.8.5. The total message complexity for solving Problem (2.3.2) with the computationally distributed framework is O(m^1.5 log m).
Proof. To establish this result, begin by considering the per–iteration message complexity, which is defined by four separate distributed computations: computing the Newton decrement, performing the line search, updating the state variable, and solving the Newton KKT system. Recall that each of the first three steps requires each agent to compute only its part before forwarding the result to its clusterhead. This is due to the separability of the objective, gradient and Hessian (Boyd & Vandenberghe, 2004). As a result, these steps require a per–node message complexity of O(log m) and an aggregate message complexity of O(m log m). Since solving the Newton KKT system also requires O(m log m) messages per–iteration by Theorem 2.8.4, the total per–iteration message complexity is O(m log m). As the total number of Newton iterations scales no worse than √m, the aggregate message complexity for the entire solve process is O(m^1.5 log m).
Having established the message complexity, the following bound can be placed on the storage complexity.
Theorem 2.8.6. The computationally distributed framework requires O(1) per–node storage complexity to solve Problem (2.3.2).
Proof. Given the separable objective function, computing the Newton decrement, performing the line search, and updating the state variable require each agent to store only the constant number of decision variables introduced via its inclusion. As a result, performing these computations requires O(1) storage. Additionally, solving the Newton KKT system requires each agent to manage a fixed–size Kci , bi , pi , and row data received from as many as 2φ peers. As a result, the per–node storage complexity is O(1) – i.e. independent of formation size.
Additionally, the per–node computational complexity scales as O(√m), which is formalized in the following theorem.
Theorem 2.8.7. The computationally distributed framework requires O(√m) per–node computational complexity to solve Problem (2.3.2).
Proof. To establish this result, begin by noticing that each step of Algorithm 1 can be realized in O(1) time with the exception of solving the Newton KKT system. Further, observe again that each step of Algorithm 2 can be trivially made O(1) by simply exploiting separability, again with the exception of solving the Newton KKT system. However, as each agent only has to decompose and solve for the variables and constraints (represented by Kci and bi ) introduced via its inclusion in the system, and process data that is only a direct function of the upper and lower bandwidths of the compact KKT system, the per–node computational complexity per–iteration is O(1). Given the total number of Newton iterations is O(√m), this corresponds to the upper bound on the number of basic operations.
Remark 2.8.8. It should again be emphasized that the √m bound on the total number of Newton iterations is highly conservative (Boyd & Vandenberghe, 2004). As a result, solving Problem (2.3.2) should be expected to require no more than O(m log m) messages with each robot having to perform O(1) basic operations.
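To make the data locality underlying these bounds concrete, the following sketch simulates the sequential "pass-the-bucket" elimination on a banded system whose rows are partitioned into per-agent blocks. It is a centralized illustration only: partial pivoting is omitted for brevity (the test matrix is diagonally dominant, so pivots are safe), and all sizes, names, and block splits are hypothetical.

```python
import numpy as np

def pass_the_bucket_solve(K, b, bw, blocks):
    """Simulate the sequential 'pass-the-bucket' solve of a banded system.
    Each 'agent' eliminates only its own rows; the band structure limits
    its reach to at most bw rows below each pivot. Partial pivoting is
    omitted for brevity."""
    A, y = K.astype(float).copy(), b.astype(float).copy()
    n = len(y)
    # Phases I/II: decomposition and forward substitution, one agent at a time.
    for rows in blocks:                     # agent i's sub-block K_ci
        for k in rows:
            hi = min(k + 2 * bw + 1, n)     # columns reachable from pivot k
            for r in range(k + 1, min(k + bw + 1, n)):  # rows in lower band
                f = A[r, k] / A[k, k]
                A[r, k:hi] -= f * A[k, k:hi]
                y[r] -= f * y[k]
    # Phase III: backward substitution, from the last agent back to n_1.
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):
        hi = min(k + 2 * bw + 1, n)
        x[k] = (y[k] - A[k, k + 1:hi] @ x[k + 1:hi]) / A[k, k]
    return x

# Diagonally dominant system with bandwidth bw = 2, split across three agents.
n, bw = 9, 2
K = np.zeros((n, n))
for i in range(n):
    for j in range(max(0, i - bw), min(n, i + bw + 1)):
        K[i, j] = 6.0 if i == j else -1.0
rhs = np.arange(1.0, n + 1)
x = pass_the_bucket_solve(K, rhs, bw, [range(0, 3), range(3, 6), range(6, 9)])
print(np.allclose(x, np.linalg.solve(K, rhs)))
```

Only rows within the band of an agent's pivots are read or written, mirroring the bounded number of peers contacted during the forward and backward phases.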
2.8.2 Experimental Results
The distributed framework presented in the previous section was implemented on a team of six Sony Aibos. The team was tasked with achieving an optimal delta formation, where the objective was to minimize the total distance traveled by team members. Each Aibo was outfitted with a unique butterfly pattern (Bruce & Veloso, 2003) that was tracked via an overhead camera system serving as an indoor “GPS” system. Figure 2.15(a) shows the initial configuration, along with lines mapping each robot to its computed optimal position. Figure 2.15(b) shows the Aibos after transitioning to the optimal shape formation.
Figure 2.15: (a) An initial deployment of six Sony Aibos along with overlaid lines/points mapping each to its computed optimal position. (b) The team after reconfiguring to the desired delta shape formation.
2.9 A Decentralized Solution
In the previous section, a computationally distributed solution was presented for solving Problem (2.3.2). However, this approach suffers from a key limitation in that communication is required across the robot formation during the solve process. In the sequel, the communication constraint is lifted, and the question is explored whether optimal formation changes can still be obtained using only local sensory feedback, akin to biological systems such as schools of fish or flocks of birds. In doing so, a decentralized solution is presented that affords O(1) computational overhead for all but a few robots. In formulating said solution, a hierarchical model is assumed whereby a small number of leader nodes acting as exemplars solve Problem (2.3.2). Doing so allows the remaining follower nodes to infer their objective positions through local observations. Such a model is attractive not only for hierarchical network architectures (Govindan et al., 2005) but is also highly amenable to systems where minimizing data communication is a primary objective (COMPASS, 2007). To yield our hierarchically decentralized formulation, the following assumptions are made:
A1: Each agent knows the objective shape icon S for the network.
A2: Leaders (individually or collectively) know the current network shape.
A3: Followers have no knowledge of the current network shape.
A4: Followers can identify their neighbors and measure their relative position.
A5: Followers can observe the relative heading of their immediate neighbors.
An O(1) Hierarchical Approach
Key to this approach is the realization that although Equation (2.4) includes 2m decision variables (corresponding to the m robot positions), the feasible set is constrained to the equivalence class of the full set of similarity transformations of the objective formation shape. More concisely, there are only four degrees of freedom in determining an agent’s objective position on the plane, corresponding to the translation, rotation, and scale of the objective shape. It is this observation that lays the groundwork for the following key result.
Theorem 2.9.1. Assume that four leaders have either individually or collectively solved Problem (2.3.2). A follower can solve for its respective position in the optimal formation after observing the relative pose of those leaders.
Proof. As noted, followers have limited knowledge, and, as a result, they are incapable of directly estimating their objective positions. However, notice that an observation of the relative pose (pxl , pyl , ωl )T of leader l induces an additional constraint on the objective shape of the following form (see Figure 2.16):
(ql − pl )T (sin ωl , − cos ωl ) = 0    (2.36)
where all measurements are relative to the follower’s coordinate frame F . If the headings of four leader nodes can be observed, the problem becomes fully constrained
via the equality constraints in Equation (2.8). More precisely, recall that in addition to this heading constraint, each robot imposes two additional equality constraints on the objective formation shape as seen in Equation (2.8). With four leader nodes and one follower node, this corresponds to a total of four bearing and ten shape constraints over fourteen decision variables. However, noting that the shape index (not coordinate) assignments are arbitrary, the follower can designate itself as the first index corresponding to the three–tuple {p1 , q1 , s1 } and associate one of the
observed leaders with {p2 , q2 , s2 }. This eliminates the associated shape constraints for these two agents and reduces the set to
(ql − pl )T (sin ωl , − cos ωl ) = 0,    l = 2, . . . , 5
‖s2 − s1 ‖ (qlx − q1x ) − (sxl , −syl )T (q2 − q1 ) = 0,    l = 3, . . . , 5
‖s2 − s1 ‖ (qly − q1y ) − (syl , sxl )T (q2 − q1 ) = 0,    l = 3, . . . , 5    (2.37)
where l ∈ {2 . . . 5} now corresponds to the set of observed leaders. The constraint
set is linear in q, and can be written in the form A q̂ = b, where the solution vector q̂ ⊂ q comprises the objective positions of the follower and the four observed leader nodes. It is a
linear system of ten equations in ten unknowns, and is readily solvable via Gaussian elimination techniques. As a result, the problem can now be solved by the follower in a decentralized fashion and in O(1) time regardless of formation size.
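The proof's construction can be checked numerically. The sketch below (all numerical values illustrative) builds the 10 × 10 system A q̂ = b from four simulated leader observations and a hypothetical shape icon; the shape constraints are written in the equivalent complex-ratio form z_l w_2 = w_l z_2 , with z_l = q_l − q_1 and w_l = s_l − s_1 , which avoids assuming the icon is pre-aligned with an axis.

```python
import numpy as np

# Shape icon: follower is vertex 1 (index 0), leaders are vertices 2-5.
S = np.array([[0., 0.], [4., 0.], [4., 3.], [0., 3.], [2., 5.]])

# Ground-truth similarity transform defining the objective formation.
theta, scale, trans = 0.4, 1.3, np.array([12., -3.])
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c, -s], [s, c]])
Q_true = scale * S @ R.T + trans            # objective positions q_1..q_5

# Leaders observed en route: position p_l, heading omega_l toward q_l.
P_obs = Q_true[1:] + np.array([[3., 1.], [-2., 4.], [1., -3.], [-4., -2.]])
omega = np.arctan2(Q_true[1:, 1] - P_obs[:, 1], Q_true[1:, 0] - P_obs[:, 0])

A, b = np.zeros((10, 10)), np.zeros(10)
# Bearing constraints (q_l - p_l)^T (sin w_l, -cos w_l) = 0 for l = 2..5.
for k in range(4):
    n_l = np.array([np.sin(omega[k]), -np.cos(omega[k])])
    A[k, 2*(k+1):2*(k+1)+2] = n_l
    b[k] = n_l @ P_obs[k]
# Shape constraints z_l w_2 = w_l z_2 (real/imaginary parts) for l = 3..5.
u, v = S[1] - S[0]
for k, l in enumerate(range(2, 5)):
    a_, b_ = S[l] - S[0]
    re, im = 4 + 2*k, 5 + 2*k
    A[re, 2*l:2*l+2] = [u, -v]              # Re(z_l w_2)
    A[re, 2:4] = [-a_, b_]                  # -Re(w_l z_2), q_2 part
    A[re, 0:2] = [a_ - u, v - b_]           # q_1 part
    A[im, 2*l:2*l+2] = [v, u]               # Im(z_l w_2)
    A[im, 2:4] = [-b_, -a_]                 # -Im(w_l z_2), q_2 part
    A[im, 0:2] = [b_ - v, a_ - u]           # q_1 part

Q_hat = np.linalg.solve(A, b).reshape(5, 2)
print(np.allclose(Q_hat, Q_true, atol=1e-6))
```

The follower recovers not only its own position q_1 but also those of the four observed leaders, matching the ten-equations-in-ten-unknowns count in the proof.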
Summarizing Theorem 2.9.1, each follower agent can solve for its objective position (and even its neighbors’ positions) so long as the relative positions and headings of at least four neighbors can be observed. Such behavior is akin to biological systems (e.g. schools of fish, flocks of birds, etc.) capable of complex formation changes using only local sensor feedback. Furthermore, the solution is obtained from solving an O(1)–sized (10 × 10) linear system of equations. The assumption of knowledge of the objective shape does, however, require O(m) storage for each agent.
Figure 2.16: An observation of the relative pose of a leader that has solved Problem (2.3.2) induces an additional constraint corresponding to Equation (2.36).
It should also be noted that Theorem 2.9.1 lends itself to a hierarchical protocol that allows a team of m ≥ 5 agents in R2 to solve the shape problem in a decentralized fashion. Specifically, after a follower solves Equation (2.37) for its objective
position, it can be “promoted” to leader status. Accordingly, as it migrates to this position, its relative pose can be observed by other followers to solve their own decentralized shape problem formulations. So, while in practice the actual number of leader agents will be a function of the sensor network topology, in theory only four are necessary. This process is illustrated in the following example.
Example: In this example, consider the initial deployment of a mobile wireless sensor network in R2 as shown in Figure 2.17(a). The objective team configuration was a {4,4} tessellation of the plane with a tiling size of 10 meters. Unfortunately, positional errors introduced during deployment – modeled as Gaussian noise ∼ N(0, σx = σy = 7.5) – result in a significantly different geometry. To compensate
Figure 2.17: Decentralized motion planning: (a) The initial network configuration with leader (circle) and follower (triangle) agents. (b) Evolution of the decentralized solution. (c) Agent trajectories. (d) Final network configuration.
for these errors, four leaders (circles) solve the motion planning problem and begin migrating to their objective positions. Relative sensor measurements allow the remaining followers (triangles) to solve for their objective positions in a decentralized fashion. The propagation of the decentralized solutions through the visibility network is reflected in Figure 2.17(b). The decentralized trajectories that minimize the maximum distance that any node must travel, and the optimal network configuration achieving the desired shape, are shown in Figures 2.17(c–d). It was assumed that the sensing range of each agent was 25 meters. Note that, in this example, the orientation of the shape was not constrained. If a fixed orientation were desired (e.g. orthogonal to the x−y axes), the number of degrees of freedom would be reduced to three – as would the number of observations required to solve the decentralized problem. Fixing the scale would further simplify the problem, requiring only two observations for each solution.
Remark 2.9.2. It should be stated that although in this example the decentralized solution was able to propagate through the entire network using the minimum number of leaders, this will typically not be the case in practice. In a more realistic application, small clusters of leaders should be distributed throughout the formation to ensure convergence.
2.10 Discussion
In this chapter, centralized, computationally distributed, and hierarchically decentralized solutions to the shape problem were formulated. Additionally, hard complexity bounds and performance–in–practice were both investigated, with results highlighting the tremendous efficacy of these approaches. Extensions were also considered for operating in both convex and non–convex static, polygonal workspaces, which included a hierarchical discrete–to–continuous optimization paradigm inspired by certain principles from model predictive control. In all cases, problems were readily formulated as second–order cone programs, which can be solved online and in real–time for large–scale robot formations. It is also worth noting that the results of this chapter offer a concise solution to the shape problem that fully captures shape parameters without any relaxations in the form of linearization or other simplifications.
It should be noted that a number of key extensions can be considered for this research. The thrust of this research was on identifying the optimal similarity transformation between a set of control points and an equivalent representation in the team configuration space. In order to formulate this problem, relative assignment is done a priori. An obvious extension would be to also incorporate assignment into the given program; however, it is not clear how to achieve this result, as any direct formulation results in a loss of convexity once the constraints capturing shape become non–reducible. It is believed that perhaps a multi–phased approach provides the best option. In this framework, agents would first solve the shape problem and then, using this initial solution as a seed, solve a linear program to determine a relative position in the first–pass solution. This heuristic approach would likely yield improved results; however, such a technique cannot guarantee optimality. Exploring an extension that affords optimal assignment is a topic of ongoing research.
A key drawback of this approach is the inability to optimize with respect to formation orientation in SE(3). As it turns out, such a formulation yields a non–reducible constraint set that is not amenable to the proposed convex optimization constructs. One approach that was considered was relaxing the constraints imposed on the transformation applied to shape coordinates. Specifically, what would happen if the applied transformation were only affine and not a rigid rotation?
Unfortunately, considering just an affine mapping poses additional complexities, as it permits the resulting shape to become skewed, which is hardly desirable in this context. As such, alternate extensions to SE(3) are also being considered. It should also be noted that the convexity requirement has its limitations, such as when considering non–holonomic kinematic motion models. However, the optimal paths generated for the fully actuated robots can still be followed by differential drive/tracked vehicles, which see widespread use in mobile robotics. Although the
results will no longer be optimal for such robots, they should still be quite good in practice. Gauging such performance for different models would be worth considering. In the following chapter, our attention is focused on enabling optimal target tracking subject to a number of impressive performance guarantees. Although the results of this section may seem unrelated, it shall be shown in Chapter 4 that these seemingly disparate results are, in fact, instances of a more general framework.
Chapter 3
Optimal Target Tracking
In this chapter, the first steps are taken towards a more general optimization framework. In doing so, we focus on the target tracking problem and present a discrete–time optimization framework for solving it in the context of multi-agent systems. Specifically, a generic semidefinite programming (SDP) construct is considered that, when paired with an appropriate objective, yields an optimal robot configuration over a given time step. This approach also affords a number of impressive performance guarantees, including full target coverage (i.e. each target is tracked by at least a single team member) as well as maintenance of network connectivity across the team formation. Key to this research is the seminal result from spectral graph theory stating that the second–smallest eigenvalue (i.e. λ2 ) of a weighted graph’s Laplacian (i.e. its interconnectivity matrix) is a measure of connectivity for the associated graph. Our approach allows us to articulate agent–target coverage and inter-agent communication constraints as linear matrix inequalities (LMIs). Additionally, a pair of key extensions to said framework are presented that consider alternate tracking problem formulations. The first allows k–coverage target constraints, where each target is tracked by k or more agents, to be directly embedded into our framework. In the second, a relaxed formulation is considered for the case when network connectivity constraints are superfluous. The problem is modeled as a second–order cone program (SOCP) that in practice can be solved
significantly more efficiently than its SDP counterpart, making it suitable for real–time tracking with systems comprised of hundreds of agents. Simulation results are presented highlighting the utility of the proposed constructs.
3.1 Background
In this chapter, our focus is on enabling multi–agent systems for use in surveillance and monitoring applications. The idea of using teams of small, inexpensive robotic agents to accomplish such sophisticated tasks is one that has gained increasing currency as embedded processors and sensors become smaller, more capable, and less expensive. To this point, much of the work in multi-robot coordination has focused on control and perception. It has generally been assumed that each team member has the ability to communicate with any other member, with little to no consideration for the quality of the wireless communication network. This assumption, although valid in certain situations, does not generally hold, especially when considering the deployment of robot teams within dynamic environments. Our previous work in target tracking made similar simplifying assumptions, as no constraints were placed on sensing and communication ranges (Spletzer & Taylor, 2003). This allowed target coverage and network connectivity requirements to be ignored in order to simplify the optimization process. In this chapter, however, the problem of controlling a team of mobile agents for target tracking subject to target coverage and inter–agent communication constraints is considered. Our methodology is based on the graph theoretic result that the second smallest eigenvalue of the interconnection graph Laplacian matrix is a measure of the connectivity of the graph. Recent system and control literature has shown that the maximization of the second smallest eigenvalue of a state dependent graph Laplacian matrix can be formulated as a semidefinite program (Kim & Mesbahi, 2006). In this research, these seminal results are adapted to the target tracking problem to yield a planning strategy that maintains target coverage and network connectivity while minimizing a given objective function. Specifically, robot–target assignments and
inter–agent communication constraints are respectively embedded in visibility and network graphs. The target tracking problem is then formulated as an SDP where said constraints are modeled as linear matrix inequalities. An important advantage of this formulation is that it is agnostic to the quality metric being minimized. So long as the objective function is convex and can be expressed in terms of linear, quadratic, or semidefinite constraints, the resulting problem will be an SDP. The convexity of semidefinite programs ensures any solution is globally optimal and obtainable in polynomial time (Boyd & Vandenberghe, 2004). In this case, the computational complexity will scale with the total number of agents and targets in the system.
3.2 Related Work
In recent years, increased attention has been focused on the effects of communication constraints on multi-agent motion planning and control. Earlier works generally assumed static communication ranges (Winfield, 2000) and/or relied on coordination strategies requiring direct line-of-sight (Arkin & Diaz, 2002). In (Sweeney et al., 2002) and (Pereira et al., 2003), decentralized controllers were used for concurrently moving toward goal destinations while satisfying communication constraints by maintaining line-of-sight and assuming static communication/sensor ranges, respectively. Coordination strategies based on inter-agent signal strength include (Wagner & Arkin, 2004), (Sweeney et al., 2004), and (Powers & Balch, 2004). In (Hsieh et al., 2006), low-level reactive controllers capable of responding to changes in signal strength or estimated available bandwidth are used to constrain robots’ movements in surveillance and reconnaissance tasks. Although much of the recent work has focused on the effects of communication maintenance on navigation, few efforts have addressed the issue of communication maintenance in tasks such as collaborative/collective localization and data fusion, where team connectivity is essential to the team’s ability to achieve its goals.
Previous works in collaborative target localization include (LaValle et al., 1997), (Stamos & Allen, 1998), and (Fabiani et al., 2001), where strategies such as maintaining visibility constraints and determining optimal sensor placement are considered. More recent works include (Liu et al., 2003), where artificial potential functions are used to coordinate a team of mobile agents to track multiple moving targets. (Jung, 2005) addresses the same problem by formulating it as two sub-problems: target tracking for a single robot and an on–line motion planning strategy for a team of robots. In (Krishna et al., 2005), the authors consider a motion planning strategy to enable a team of mobile sensors to detect multiple targets within a given region. (Mirzaei et al., 2007) analyzes the accuracy of cooperative localization and target tracking in a team of mobile robots using an Extended Kalman Filter (EKF) formulation and provides upper bounds for the position uncertainty obtained by the team. In (Shucker & Bennett, 2005), a distributed control strategy is used to maintain a team of mobile agents in a mesh formation to enable tracking of a discrete or diffuse target. In (Spletzer & Taylor, 2003), the authors employ particle filters to minimize the expected error in tracking mobile target(s) for a team of mobile robots without the use of explicit switching rules, while (Stroupe & Balch, 2003) employ an Extended Kalman Filter (EKF) approach to minimize the covariance representing the uncertainty of estimated target positions. Lastly, maximizing the second smallest eigenvalue (i.e. λ2 ) of a state–dependent Laplacian matrix associated with a network graph has been considered in both (Kim & Mesbahi, 2006) and (Gennaro & Jadbabaie, 2006). The former is most related to our work, where the problem of finding optimal node positions is reduced to assigning edge weights as a function of system state via SDP.
Leveraging these results, the latter formulates a fully decentralized framework for approximating λ2 and implicitly maximizing network interconnectivity.
3.3 The Tracking Problem
The objective of this dissertation is to provide a general framework that facilitates optimal target tracking with performance guarantees. More precisely, we consider minimizing some convex objective function, Ψ : Rdn → R, d ∈ {2, 3}, over a given time step while ensuring
1. Full target coverage (i.e. each target is tracked by at least a single agent)
2. Network connectivity across the robot formation
Accordingly, this problem shall be called the tracking problem. In this context, Ψ is a function of the decision variable X = (xa1 , . . . , xan )T ∈ Rdn denoting the concatenated positions of the robot team, with xai ∈ Rd representing the location of agent i with respect to some world frame W. Additionally, we assume a fully actuated motion model for each team member, i.e.
ẋ = u,  u ∈ U ⊆ Rd    (3.1)
It should also be noted that each agent is assumed to feature an omnidirectional sensing modality (e.g. an omnidirectional camera system) with which to track.
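The fully actuated model of Equation (3.1) is a single integrator; a minimal forward-Euler discretization (the time step and velocity bound below are hypothetical, standing in for the admissible set U) looks like:

```python
import numpy as np

def step(x, u, dt=0.1, u_max=1.0):
    """One forward-Euler step of x_dot = u, clamping u to the ball of
    radius u_max (a hypothetical stand-in for the admissible set U)."""
    speed = np.linalg.norm(u)
    if speed > u_max:
        u = u * (u_max / speed)
    return x + dt * u

x = step(np.zeros(2), np.array([3.0, 4.0]))  # requested u has norm 5, clamped to 1
print(np.allclose(x, [0.06, 0.08]))
```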
3.4 Employing Spectral Graph Theory
Towards an optimal tracking framework, begin by exploiting the fact that a team of robots and the targets they collectively track define a finite, weighted graph where an agent–target edge corresponds to a single point-to-point sensor track. More precisely, letting A = {a1 , a2 , . . . , an } and T = {t1 , t2 , . . . , tm } respectively denote
the full set of agents and the full set of observation targets, the definition of the visibility graph is given as follows.
Definition 3.4.1. The visibility graph is the undirected, weighted graph GV (VV , EV )
where VV = A ∪ T and EV = A × (A ∪ T ). Its edges are associated with a mapping fV : EV → R+ that indicates the level of visibility between system elements.
Although a number of choices can be considered for fV , it is assumed in this research that edge weights are a direct function of the Euclidean distance separating an agent from some other element. As shall be seen in §3.5, defining fV
this way leads to a special type of function that effectively governs inter–agent and agent–target behaviors. Deferring this point momentarily, let xtj ∈ Rd denote the position of observation target tj in world coordinate frame W, and define
fV (y) = { fVa (‖xai − xaj ‖2 ), y = (ai , aj ) ∈ A × A
        { fVt (‖xai − xtj ‖2 ), y = (ai , tj ) ∈ A × T    (3.2)
where 0 ≤ fVa (·), fVt (·) ≤ 1. Observe that by this definition GV is
implicitly a function of the positional state vector X, and as such, we accordingly denote it GV (X). Figure 3.1 illustrates the visibility graph for a team of three agents
tracking a single target in R2 .
Figure 3.1: The weighted visibility graph, GV (VV , EV ), for a team of three agents tracking a single target in R2 . In our formulation, respective edge weights are a function of the team’s positional state vector, X. It should be noted that the definition of the visibility graph GV (X) is itself very intuitive as agents can theoretically possess the ability to observe their peers as well
as specified targets. Furthermore, expressing edge weights as a function of distance is also very natural as the notion of inter–agent and agent–target proximity is often central to the target tracking task (possibly introduced via sensor range constraints).
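As a purely illustrative choice (the weighting actually used is deferred to §3.5), a smooth fall-off of Euclidean distance keeps edge weights in the required [0, 1] range; the Gaussian form and the radius ρ below are hypothetical:

```python
import numpy as np

def f_v(d, rho=25.0):
    """Hypothetical edge weight: smooth, monotone fall-off with distance,
    bounded in [0, 1]; rho is a nominal sensing/communication radius."""
    return float(np.exp(-(d / rho) ** 2))

x_a, x_t = np.array([0.0, 0.0]), np.array([15.0, 20.0])
w = f_v(np.linalg.norm(x_a - x_t))   # agent-target edge weight
print(0.0 < w <= 1.0 and f_v(0.0) == 1.0)
```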
3.4.1 Guaranteeing Full Target Coverage
Given the definition of GV (X), note that all targets in the system are tracked whenever the graph itself is connected as this implies edges between all targets and at least one member of the agent team. Accordingly, our constraint set should capture this notion of connectivity when determining the optimal state vector X. With this in mind, consider seminal results from spectral graph theory regarding the connectivity of an arbitrary graph G(V, E). In particular, note that the
constraint λ2 (L(G)) > 0 is both a necessary and sufficient condition for guaranteeing the connectivity of G (Fiedler, 1975), where λ2 (L(G)) denotes the second smallest eigenvalue of the weighted graph Laplacian L(G) given by
\[
[L(G)]_{ij} = \begin{cases} -w_{ij}, & i \neq j \\ \sum_{k \neq i} w_{ik}, & i = j \end{cases} \tag{3.3}
\]
with wij being the weight associated with the edge shared between vertices i and j. In light of these observations, the following initial formulation for the target tracking problem can be posed
\[
\begin{array}{ll} \min & \Psi(X) \\ \text{s.t.} & \lambda_2(L_V(X)) > 0 \end{array} \tag{3.4}
\]
where LV (X) denotes the state–dependent Laplacian of the visibility graph GV (X). Noting the results of (Kim & Mesbahi, 2006), the following holds
\[
\lambda_2(L_V(X)) > 0 \iff P_V^T L_V(X) P_V \succ 0 \tag{3.5}
\]
where PV ∈ R(n+m)×(n+m−1) comprises an orthonormal basis for an (n + m − 1)–dimensional subspace such that ∀x ∈ span(PV ), 1T x = 0. As such, Problem (3.4) can be reposed as follows
\[
\begin{array}{ll} \min & \Psi(X) \\ \text{s.t.} & P_V^T L_V(X) P_V \succ 0 \end{array} \tag{3.6}
\]
In this formulation, we adopt the standard Löwner ordering. Additionally, we highlight that solving this problem minimizes the chosen objective while ensuring that every specified target is tracked by at least a single team member.
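As a numerical sanity check of Fiedler's condition and the projected LMI in Problem (3.6), consider the following sketch (Python/NumPy; the four-vertex graph and its weights are illustrative and not taken from the dissertation's experiments). It assembles a weighted Laplacian in the sense of Equation (3.3), constructs an orthonormal basis P for the subspace orthogonal to the all-ones vector, and confirms that positive definiteness of P^T L P coincides with λ2 > 0.

```python
import numpy as np

def weighted_laplacian(W):
    """Weighted graph Laplacian: [L]_ij = -w_ij for i != j, [L]_ii = sum_k w_ik."""
    L = -W.copy()
    np.fill_diagonal(L, W.sum(axis=1))
    return L

def ones_complement_basis(n):
    """Orthonormal basis P (n x (n-1)) for {x : 1^T x = 0}."""
    ones = np.ones((n, 1)) / np.sqrt(n)
    # QR of a full-rank matrix whose first column is the normalized ones vector;
    # dropping that column leaves a basis for its orthogonal complement.
    Q, _ = np.linalg.qr(np.hstack([ones, np.eye(n)[:, : n - 1]]))
    return Q[:, 1:]

# Illustrative star graph: three agents, each linked to one target vertex.
W = np.zeros((4, 4))
W[0, 3] = W[3, 0] = 0.8
W[1, 3] = W[3, 1] = 0.6
W[2, 3] = W[3, 2] = 0.4
L = weighted_laplacian(W)
P = ones_complement_basis(4)

lam = np.sort(np.linalg.eigvalsh(L))
reduced = np.sort(np.linalg.eigvalsh(P.T @ L @ P))

assert np.allclose(P.T @ P, np.eye(3))     # orthonormal columns
assert np.allclose(np.ones(4) @ P, 0.0)    # span(P) is orthogonal to 1
assert lam[1] > 0                          # Fiedler: the graph is connected
assert np.all(reduced > 0)                 # so P^T L P is positive definite
assert np.allclose(reduced, lam[1:])       # its spectrum is lambda_2..lambda_n
```

The last assertion reflects why the projection is lossless: since 1 is the eigenvector of L associated with the zero eigenvalue, restricting L to span(P) retains exactly the remaining eigenvalues.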
3.4.2 Guaranteeing Network Connectivity
Although solving Problem (3.6) yields a positional configuration that will ensure all targets are tracked by at least a single agent, it makes no guarantees regarding the underlying network connectivity over the resulting formation. The ability to ensure a connected network graph while performing such a task is often desirable as it facilitates – among other things – distributed sensor fusion. To address this, our previous formulation is extended by introducing a traditional network proximity graph GN (VN , EN ) where VN = A and EN = A × A. Similar to the previous graph
formulation, we associate a weight function fN : EN → R+ that is a direct functional
of the Euclidean distance between network peers. For our purposes, fN will be modeled to reflect the quality of the communication link shared between said nodes. Given this definition, Problem (3.6) can be augmented to yield the following problem statement
\[
\begin{array}{ll} \min & \Psi(X) \\ \text{s.t.} & P_V^T L_V(X) P_V \succ 0 \\ & P_N^T L_N(X) P_N \succ 0 \end{array} \tag{3.7}
\]
Solving this problem yields an optimal positional configuration for the team that maintains both network connectivity and complete target coverage. However, it should be noted that Problem (3.7) is not a convex optimization problem. As shall be seen, this does not prevent us from formulating the target tracking problem as a discrete–time process whereby during each iteration a convex form of Problem (3.7) is solved. This is a key point that will be thoroughly explored in §3.6; however, before doing so, it is essential to discuss the state–dependent weighting functions that are central to the forthcoming results.
3.5 Interactive Control Functions
Given this formulation, it is now essential to consider appropriate definitions of fVa , fVt , and fN . The choice of these functions is critical as they inherently govern the behavior of the team. As these functions dictate the relationship between one agent and another as well as any observation targets, they can be termed interactive control functions (ICFs). As ICFs are among the contributions of this dissertation and play a significant role in forthcoming results, they are now formally presented. Begin by considering the set of agents A as defined in §3.4 with associated state
vector defined in §3.3. Additionally, let the full set of actors and/or objects (e.g. static/dynamic obstacles, observation targets, etc.) sharing the workspace W with
A be denoted by the set O = {o1 , . . . , om }. In the context of this research, O = T
where T again denotes the set of observation targets. Associate with each oj ∈ O
the positional state vector xoj . Additionally, consider the graph defined over these
elements G(V, E) where V = A ∪ O and E = A × (A ∪ O). Given these constructs, an interactive control function can be defined as follows
Definition 3.5.1. An interactive control function f(ai, aj) ≜ f(xai, xaj) : R2d → [0, 1], ai ∈ A, aj ∈ (A ∪ O) \ {ai} associated with graph G(V, E) is a real–valued, smooth function corresponding to the weight of edge (ai, aj) whose gradient, when followed, induces a desired state of ai as a function of its proximity to aj.
Although seemingly abstract, it shall be shown that ICFs are particularly powerful tools that enable rather sophisticated agent behaviors. In an effort to ground Definition 3.5.1, ICFs are now formulated and applied within the context of a simple target tracking scenario. Momentarily neglecting discussion of fVa, we focus instead on characterizing the desirable properties of the weight function fVt, which governs the interactions of system agents with respect to targets. In an ideal tracking scenario, the team will have an optimal number of tracks while
each agent maintains a safe standoff distance between itself and its targets. In other words, letting rij denote the desired distance between agent ai and target tj, we would like fVt to promote behaviors yielding
\[
|r_{ij} - \epsilon_{ij}^l| \le d_{ij}^t \le |r_{ij} + \epsilon_{ij}^u|, \quad \forall i, j \tag{3.8}
\]
where \(d_{ij}^t \triangleq \| x_i^a - x_j^t \|_2\) and ǫlij, ǫuij ∈ R+ dictate the acceptable lower/upper bound tolerances for any track between agent ai and observation target tj. When the right-hand inequality holds, the track is defined as being active. In this research, the case where ǫlij = ǫuij = ǫ is considered, and fVt is modeled using a symmetric Gaussian potential. Despite this choice, it should be noted that the forthcoming analysis can easily be adapted for an alternate (symmetric or non-symmetric) potential function formulation. That stated, consider the following definition
\[
f_V^t(x_i^a, x_j^t) = \begin{cases} e^{-\frac{\gamma}{2}(d_{ij}^t - r_{ij})^2}, & |d_{ij}^t - r_{ij}| \le \epsilon \\ 0, & \text{otherwise} \end{cases} \tag{3.9}
\]
where γ is a "strictness" parameter defined as a function of ǫ. Figure 3.2(a) illustrates this function for an instance with rij = 10 and ǫ = 4. It should also be mentioned that this choice of fVt has two primary motivations. The first is the differentiability of the Gaussian, which facilitates problem formulation. The second relates to our operational objectives for tracking. In many sensor systems, position tracking errors are proportional to standoff distance. For example, in stereo vision the propagation of errors in disparity measurements is proportional to the square of the scene depth. Additionally, when fusing bearing measurements from a pair of cameras on two different agents, the error is directly proportional to the standoff distances (Isler et al., 1988). Thus, it is highly desirable for each agent to be as close to the target as possible to enhance tracking accuracy while still maintaining a sufficient standoff distance to avoid detection and/or collision. This desired behavior is implicitly captured in our choice of fVt. Regarding fVa, which governs inter–agent tracking behaviors, its definition is not as obvious since a variety of formulations may lead to favorable results depending
Figure 3.2: (a) An instance of fVt where rij = 10 and ǫlij = ǫuij = 4, ∀i, j. (b) An instance of an exponential-decay model for RF links with qmin = 10 and qmax = 15.
upon the chosen application and mission objectives. For instance, choosing fVa = 1 has the effect of removing any inter–agent observability requirements, as it essentially says that no matter what the positional state of the team, the inter–agent links are connected or observable everywhere. Whether this is feasible given the agents' respective sensor suites is inconsequential for the task at hand, as in choosing the weights this way we are only concerned with ensuring that each target is tracked by at least a single team member. Conversely, choosing fVa = 0 has the effect of removing inter–agent edges, which implies that graph connectivity is intact only when each agent is observing at least a single target. Another reasonable choice for fVa is a potential function that behaves similarly to that used to define fVt. This definition would be useful in a scenario where team members rely upon local observations of their peers for such things as localization or formation control. In this work, the first or second definition is employed depending upon context.
In a similar manner, the issue of weighting the edges in the network graph GN (X) can be considered. In this case, edge weights are readily characterized by seeking inspiration from physics, which dictates an appropriate exponential decay model. Such models have been utilized in recent literature (Gennaro & Jadbabaie, 2006; Kim & Mesbahi, 2006). For the sake of further discussion, it is assumed that edge weights follow the exponential decay model posed by (Gennaro & Jadbabaie, 2006). Adopting this model yields the following formulation for fN
\[
f_N(x_i^a, x_j^a) = \begin{cases} 1, & d_{ij}^a \le q_{min} \\ e^{-\frac{5(d_{ij}^a - q_{min})}{q_{max} - q_{min}}}, & q_{min} < d_{ij}^a \le q_{max} \\ 0, & d_{ij}^a > q_{max} \end{cases} \tag{3.10}
\]
Figure 3.2(b) shows a single instance of fN for qmin = 10 and qmax = 15.
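The two ICF families of this section can be evaluated directly. The following sketch (Python; the Gaussian form e^{-(γ/2)(d−r)^2} follows the reconstruction adopted here, and the parameter values mirror the illustrative instances of Figure 3.2) checks the boundary behaviors claimed above: the visibility weight peaks at 1 exactly at the desired standoff and vanishes outside the tolerance, while the network weight holds at 1 up to qmin, decays to e^{-5} at qmax, and is 0 beyond.

```python
import math

def f_v_t(d, r, eps, gamma):
    """Gaussian visibility ICF: peaks at 1 when the agent-target distance d
    equals the desired standoff r; zero outside the tolerance eps."""
    if abs(d - r) > eps:
        return 0.0
    return math.exp(-0.5 * gamma * (d - r) ** 2)

def f_n(d, q_min, q_max):
    """Exponential-decay network ICF in the sense of Eq. (3.10)."""
    if d <= q_min:
        return 1.0
    if d <= q_max:
        return math.exp(-5.0 * (d - q_min) / (q_max - q_min))
    return 0.0

# Visibility weight: exactly 1 at standoff, strictly between 0 and 1 inside
# the tolerance band, and 0 once the band is exceeded.
assert f_v_t(10.0, 10.0, 4.0, 0.25) == 1.0
assert 0.0 < f_v_t(11.0, 10.0, 4.0, 0.25) < 1.0
assert f_v_t(15.0, 10.0, 4.0, 0.25) == 0.0

# Network weight: full strength up to q_min, e^-5 at q_max, 0 beyond.
assert f_n(10.0, 10.0, 15.0) == 1.0
assert abs(f_n(15.0, 10.0, 15.0) - math.exp(-5.0)) < 1e-12
assert f_n(15.1, 10.0, 15.0) == 0.0
```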
3.6 Solving via Semidefinite Programming
In this section, we consider formulating our problem as a discrete–time semidefinite program whereby the agent team collectively observes the relative positions of the observation targets and then accordingly adjusts its respective trajectories so as to minimize the given objective. As the targets are assumed dynamic and control is inherently a discrete–time process, at best the team can only optimize Ψ over the period ∆t representing the rate at which it is able to effectively sample the environment and issue control signals. Although this approach does not guarantee optimally–convergent behavior for the team, it does ensure that the solution obtained will yield a trajectory that is optimal with respect to Ψ over ∆t.
3.6.1 A Discrete–time Formulation
Given this discussion, the key result of this section is now presented in Theorem 3.6.1. It should be noted that the associated proof mimics a proof by construction and illustrates how arbitrary ICFs can be effectively embedded into the proposed graphical framework. As such, it can also be seen as a recipe for constructing such systems. Accordingly, this result is now presented in the following theorem.
Theorem 3.6.1. Let A, |A| = n and T , |T | = m be as defined with corresponding state vectors such that xai , xtj ∈ Rd. Assume ẋai = uai ∈ U a, ∀ai ∈ A and ẋtj = utj ∈ U t, ∀tj ∈ T . Let Ψ(X) : Rdn → R denote a convex objective to be minimized over some ∆t. By solving the following discrete–time SDP at time k
\[
\begin{array}{ll} \min & \Psi(X(k+1)) \\ \text{s.t.} & \| x_i^a(k+1) - x_i^a(k) \|_2 \le v_i \Delta t, \quad i = 1, \dots, n \\ & P_V^T L_V(k+1) P_V \succ 0 \\ & P_N^T L_N(k+1) P_N \succ 0 \end{array} \tag{3.11}
\]
where vi denotes the maximum velocity of ai ∈ A, and {PV , LV } and {PN , LN } are respectively defined as in Problems (3.6) and (3.7), Ψ(X) will be minimized over ∆t while ensuring full target coverage and network connectivity at time k + 1.
Proof. In establishing this result, we exploit the work of (Kim & Mesbahi, 2006), who considered a discrete–time process for maximizing network connectivity in multi–agent systems. Following their approach, first–order Euler analysis can be performed yielding a discrete–time representation for both fV and fN, which correspond to interactive control functions. More precisely, we compute the partials ∂fV/∂t and ∂fN/∂t and sample each at rate ∆t, yielding the following discrete–time representations of xai (t) and ẋai (t) at iteration k
\[
x_i^a(t) \to x_i^a(k), \qquad \dot{x}_i^a(t) \to \frac{x_i^a(k+1) - x_i^a(k)}{\Delta t} \tag{3.12}
\]
Employing a slight abuse of notation, xai (t) shall be denoted xai with the understanding that the decision variable is a function of time. Without loss of generality, fV and fN are both chosen to correspond with the formulations presented in §3.5. Given this selection, the following discrete–time form of Equation (3.9) for fVt results
\[
f_V^t(x_i^a, x_j^t)(k+1) - f_V^t(x_i^a, x_j^t)(k) = \tau_V^t(k) \left( x_i^a(k) - x_j^t \right)^T \Delta x_i^a \tag{3.13}
\]
where τVt (k) is given by
\[
\tau_V^t(k) = -\frac{\gamma \left( \| x_i^a(k) - x_j^t \|_2 - r_{ij} \right)}{\| x_i^a(k) - x_j^t \|_2} \, f_V^t(x_i^a(k), x_j^t) \tag{3.14}
\]
and ∀l, al ∈ A, ∆xal = xal (k + 1) − xal (k).
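The first-order expansion above can be checked numerically. The following sketch (Python/NumPy; the parameter values and positions are illustrative) compares the true change in the Gaussian ICF against its linearization τ(k)(x − x^t)^T Δx for a small step, using the reconstructed form e^{-(γ/2)(d−r)^2}.

```python
import numpy as np

GAMMA, R, EPS = 0.25, 10.0, 4.0  # strictness, standoff, tolerance (illustrative)

def f_v_t(xa, xt):
    """Gaussian visibility ICF of the form exp(-(gamma/2)(d - r)^2)."""
    d = np.linalg.norm(xa - xt)
    return np.exp(-0.5 * GAMMA * (d - R) ** 2) if abs(d - R) <= EPS else 0.0

def tau_v_t(xa, xt):
    """Coefficient of the first-order (Euler) expansion of f_v_t."""
    d = np.linalg.norm(xa - xt)
    return -GAMMA * (d - R) / d * f_v_t(xa, xt)

xt = np.zeros(2)                 # static target at the origin
xa_k = np.array([11.0, 0.0])     # agent position at iteration k
dx = np.array([-1e-3, 2e-3])     # small step Delta x taken over one iteration
xa_k1 = xa_k + dx                # agent position at iteration k+1

exact = f_v_t(xa_k1, xt) - f_v_t(xa_k, xt)
linear = tau_v_t(xa_k, xt) * (xa_k - xt) @ dx  # linearized change

# For a small step the linearization tracks the true change closely,
# and the change itself is non-trivial.
assert abs(exact - linear) < 1e-6
assert abs(exact) > 1e-7
```

The residual shrinks quadratically with the step size, which is why the velocity-limiting second-order cone constraints of Problem (3.11) also serve to keep the linearization accurate.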
Similarly for fN, the following discrete–time representation of Equation (3.10) is readily obtained
\[
f_N(x_i^a, x_j^a)(k+1) - f_N(x_i^a, x_j^a)(k) = \tau_N(k) \left( x_i^a(k) - x_j^a(k) \right)^T \left( \Delta x_i^a - \Delta x_j^a \right) \tag{3.15}
\]
where τN (k) is given by
\[
\tau_N(k) = -\frac{5 f_N(x_i^a(k), x_j^a(k))}{(q_{max} - q_{min}) \| x_i^a(k) - x_j^a(k) \|_2} \tag{3.16}
\]
Given Equations (3.13) and (3.15), and recalling that fVa is chosen constant (i.e. fVa (xai , xaj )(k + 1) − fVa (xai , xaj )(k) = 0), the discretized, state–dependent Laplacians for both the visibility graph GV (X) and the network proximity graph GN (X) can be defined. With respect to the former, the following is obtained
\[
[L_V(k)]_{uv} = \begin{cases} -f_V(y_u, y_v)(k), & u \neq v \\ \sum_{s \neq u} f_V(y_u, y_s)(k), & u = v \end{cases} \tag{3.17}
\]
where yu and yv are defined such that
\[
y_l = \begin{cases} x_l^a, & l \le n = |A| \\ x_{(l-n)}^t, & n < l \end{cases}
\]
Similarly for GN (X), the following is obtained
\[
[L_N(k)]_{uv} = \begin{cases} -f_N(x_u^a, x_v^a)(k), & u \neq v \\ \sum_{s \neq u} f_N(x_u^a, x_s^a)(k), & u = v \end{cases} \tag{3.18}
\]
Combining these results and introducing n second–order conic inequalities to mitigate the effects of the linearization process yields the discrete–time SDP stated by the theorem. As the given LMIs constraining X(k + 1) enforce λ2 (LV (k + 1)) > 0 and λ2 (LN (k + 1)) > 0 (i.e. the connectivity of both GV and GN at time k + 1 remains intact), it directly follows that full target coverage is achieved and network connectivity is maintained at time k + 1 while minimizing Ψ(X(k + 1)).
It should be noted that the n second–order conic inequalities are necessary as they serve to reduce the effects of the linearization process. Additionally, they can be used to model velocity constraints on the individual robots. Since Problem (3.11) is an SDP, it can also be solved efficiently using modern interior–point methods for convex optimization (Boyd & Vandenberghe, 2004).
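To make the structure of Equation (3.17) concrete, the following sketch (Python/NumPy; the positions, ranges, and parameters are illustrative and not taken from the dissertation's trials) assembles the state-dependent visibility Laplacian for two agents and two targets, with a Gaussian agent–target ICF, a constant agent–agent weight, and no target–target edges, and then verifies the basic Laplacian properties the proof relies upon.

```python
import numpy as np

def gaussian_icf(d, r=0.10, eps=0.04, gamma=500.0):
    """Agent-target ICF (a Gaussian potential; parameters are illustrative)."""
    return np.exp(-0.5 * gamma * (d - r) ** 2) if abs(d - r) <= eps else 0.0

def visibility_laplacian(agents, targets, f_va=0.0):
    """Assemble the state-dependent Laplacian in the sense of Eq. (3.17):
    vertices are the n agents followed by the m targets; agent-agent edges
    carry the constant weight f_va, agent-target edges the Gaussian ICF,
    and target-target pairs share no edge."""
    pts = np.vstack([agents, targets])
    n, N = len(agents), len(pts)
    W = np.zeros((N, N))
    for u in range(N):
        for v in range(u + 1, N):
            if u < n and v < n:
                w = f_va                                   # agent-agent
            elif u < n <= v:
                w = gaussian_icf(np.linalg.norm(pts[u] - pts[v]))  # agent-target
            else:
                w = 0.0                                    # target-target
            W[u, v] = W[v, u] = w
    return np.diag(W.sum(axis=1)) - W

agents = np.array([[0.0, 0.0], [0.3, 0.0]])
targets = np.array([[0.1, 0.0], [0.3, 0.1]])  # each within range of some agent
L = visibility_laplacian(agents, targets, f_va=1.0)

eig = np.sort(np.linalg.eigvalsh(L))
assert np.allclose(L.sum(axis=1), 0.0)  # rows of a Laplacian sum to zero
assert eig[0] > -1e-10                  # L is positive semidefinite
assert eig[1] > 1e-8                    # connected: every target is covered
```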
3.6.2 Choosing an Objective
Until this point, we have avoided any detailed discussion regarding the statement of the convex objective, Ψ. In fact, in the context of target tracking there are many useful candidate functions that fit well within this framework. One possibility is to choose Ψ as the trace of the covariance representing the uncertainty in measured target positions (Mourikis & Roumeliotis, 2005). As such, Problem (3.11) would yield a position vector that minimizes the uncertainty in the estimated target positions while ensuring full target coverage and network connectivity.
In this work, we instead choose Ψ(X) = −λ2 (LV (X)). It should be noted that our choice is convex as λ2 is a concave function on the subspace orthogonal to the vector
of all ones (Gennaro & Jadbabaie, 2006). Given this function, the second smallest eigenvalue of the state–dependent graph Laplacian associated with the proposed visibility graph can be maximized over a given time step. Intuitively, choosing Ψ this way aims to maximize the visibility level of the observation targets. Such an objective may be quite useful in surveillance applications where each member of the multi–agent system is outfitted with a low–grade sensor suite. Other appropriate objective functions (e.g. weighted least–squares) can be imagined. However, what is important to note is the generality of our approach: so long as the objective function is convex and can be expressed in terms of linear, quadratic, or semidefinite constraints, the resulting problem will be an SDP.
Example: The proposed paradigm was implemented in Matlab using SeDuMi 1.1R3 (Pólik, 2005) via YALMIP (Löfberg, 2004). Figure 3.3 illustrates the results from one such trial in which the objective was to maximize connectivity in the visibility graph (i.e. Ψ(X) = −λ2 (LV (X))) with fVa = 0. In this scenario, eight networked agents were responsible for tracking five mobile targets while maintaining a desired standoff distance rij = 0.10. The minimal desired agent–target proximity bound for active tracks was set at 0.06 with a maximum of 0.14. In this case, each agent was modeled using an omnidirectional camera system as its primary sensor, and the network was modeled to experience exponential decay between 0.08 and 0.18 units. The maximal velocity of the respective team members was 1.4 times that of the observation targets. Given the contrived trajectories of the targets, the team ultimately converges to an optimal configuration – yielding λ2 (LV (X)) ≈ 0.6972.
Figure 3.4 reveals the progression of both λ2 (LV (X)) and λ2 (LN (X)) as the team employs our discrete–time framework. In this plot, we highlight the monotonically increasing behavior of λ2 (LV (X)) while noting that λ2 (LN (X)) remains positive for the entire run. In other words, the team effectively optimizes its objective over each time step while maintaining network connectivity across the formation.
Figure 3.3: (a) Initial visibility and network graphs for a team of eight agents in R3 charged with tracking five mobile targets. In this case, the team’s objective was to maximize target visibility. (b) The trajectories of team members as they obtain an optimal configuration. (c) The resulting visibility and network graphs for the team.
3.7 Guaranteeing k–Coverage of Targets
In §3.3, the tracking problem was formulated to simply ensure that each target is being tracked by at least a single agent at any given instant in time; however, there are many scenarios in which it would be favorable to enforce a lower bound kj ∈ Z+, kj ≥ 2 on the number of agents tracking target tj. As an example, teams of low–cost robots may employ sensors incapable of directly estimating a target's position without additional constraints – e.g. a minimum of two bearing sensors such as cameras may be needed to estimate a target's pose. As it turns out, enforcing such constraints can be achieved by simply ensuring the connectivity of combinatoric visibility graphs, with a single graph being introduced for each target.
Figure 3.4: The progression of λ2 (LV (X)) and λ2 (LN (X)) corresponding to Figure 3.3. In this case, the objective was to maximize target visibility over each ∆t.
Towards establishing this seminal result, begin by considering an alternate formulation of Problem (3.11) that introduces a separate state–dependent combinatoric graph, GVj (X), for each tj ∈ T . Unlike the visibility graph presented in §3.4, the vertices of GVj (X) correspond to the full set of \(q = \binom{n}{n - k_j + 1}\) agent combinations
(denoted Cj = {A1 , . . . , Aq }) with an additional vertex corresponding to tj. At first glance, this choice seems unintuitive; however, as shall be shown, it is this definition of the vertices that facilitates our key results.
Momentarily deferring discussion of this point, we now define the corresponding edge set EVj = Cj × {tj } and associate with each edge a weight function fVAji : (Ai , tj ) → R+. For our purposes, fVAji is a direct functional of the Euclidean distance separating tj from each agent in the ith combination Ai. If some agent in Ai is able to observe the target tj (i.e. it is proximal to the target location), it must hold that fVAji > 0. If all agents in Ai are unable to observe the target (i.e. none are proximal), it is required that fVAji = 0 to indicate that the corresponding edge is disconnected. As shall be seen, fVAji can be readily defined as a normalized linear combination of ICFs; however, such a formulation is momentarily deferred until both the intuition and theory surrounding the use of the proposed combinatoric graphs can be established.
Combining the given vertex and edge sets as well as the chosen state–dependent weighting function yields the precise definition of the combinatoric visibility graph GVj (X). In an effort to ground the intuition behind this formulation, consider Figure 3.5, which illustrates GVj (X) for a team of four agents tasked with maintaining kj = 2 coverage of target tj. Observe that when fVAji > 0, i = 1, . . . , 4, the graph is connected, implying that at least two agents are observing the target, thus satisfying the k–coverage constraint.
Figure 3.5: The combinatoric visibility graph, GVj (X), for a team of four agents A = {a1 , . . . , a4 } tasked with maintaining at least kj = 2 coverage of target tj. Observe that when fVAji > 0, i = 1, . . . , 4, the graph is connected (i.e. λ2 (LVj ) > 0), implying that the k–coverage constraint is satisfied.
Given this formulation, the following lemma formally establishes the results of the preceding discussion.
Lemma 3.7.1. Let GVj (X) denote the state–dependent visibility graph associated with target tj as previously formulated. Assume \(f_{V_j}^{A_i} > 0,\ i = 1, \dots, \binom{n}{n-k_j+1}\) (i.e. assume GVj (X) is connected); then at least kj ∈ Z+ agents are tracking target tj.
Proof. Observe that if GVj (X) is connected, then at least a single agent from each of the \(\binom{n}{n-k_j+1}\) combinations in Cj is able to observe tj. Additionally, note that agent ai ∈ A appears in exactly \(\binom{n-1}{n-k_j}\) combinations in Cj. As a result, no more than this number of edges can be affected by ai's ability to observe target tj. Accordingly, establishing Lemma 3.7.1 is equivalent to showing that the ratio of the total number of agent combinations to the number of combinations containing ai is greater than or equal to kj. More precisely, the following must simply be established
\[
\frac{\binom{n}{n-k_j+1}}{\binom{n-1}{n-k_j}} \ge k_j \tag{3.19}
\]
To establish this result, begin by assuming the contrary of Equation (3.19). Doing so reveals the following chain of implications
\[
\frac{\binom{n}{n-k_j+1}}{\binom{n-1}{n-k_j}} < k_j \;\Rightarrow\; \frac{n!\,(n-k_j)!\,k_j!}{(n-k_j+1)!\,(k_j-1)!\,(n-1)!} < k_j \;\Rightarrow\; n k_j < k_j (n-k_j+1) \;\Rightarrow\; n < n-k_j+1 \;\Rightarrow\; k_j < 1 \;\Rightarrow\; k_j \notin \mathbb{Z}^+
\]
This yields a contradiction.
Given that the connectivity of the combinatoric visibility graph with respect to each tj ∈ T implies that each tj is tracked by at least kj teammates at any given time, the original target tracking problem (i.e. Problem (3.7)) can be restated to enforce k–coverage requirements as follows
\[
\begin{array}{ll} \min & \Psi(X) \\ \text{s.t.} & P_{V_j}^T L_{V_j}(X) P_{V_j} \succ 0, \quad j = 1, \dots, m \\ & P_N^T L_N(X) P_N \succ 0 \end{array} \tag{3.20}
\]
where LVj (X) is the state–dependent graph Laplacian associated with GVj (X), PVj ∈ R(n−kj +2)×(n−kj +1) comprises an orthonormal basis for an (n − kj + 1)–dimensional subspace such that ∀x ∈ span(PVj ), 1T x = 0, and PN is as defined in Problem (3.7).
Lemma 3.7.1 is significant in that it provides a mechanism by which to readily integrate k–coverage constraints directly into a discrete–time framework, provided that the weighting functions for GVj (X) satisfy the previously stated criteria. Although a number of feasible formulations for fVAji can be considered, an obvious choice is to extend our previous results regarding interactive control functions and define edge weights as a linear combination of mono–dimensional ICFs given as follows
\[
f_{V_j}^{A_i} = \frac{1}{n-k_j+1} \sum_{s=1}^{n-k_j+1} f_V^t(x_s^a, x_j^t) \tag{3.21}
\]
where xas and xtj respectively denote the positions of as ∈ Ai and target tj. Observe that when target tj is being actively tracked by all agents in Ai, with each agent observing its desired standoff distance, it holds that fVAji = 1. Similarly, when no agent in Ai is engaging the target, it is the case that fVAji = 0. In other words, Equation (3.21) provides a mechanism by which to fully capture the collective visibility of target tj with respect to agent combination Ai.
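The combinatoric mechanism behind Lemma 3.7.1 can be verified by brute force. The following sketch (Python stdlib; the scenario of four agents with kj = 2 mirrors Figure 3.5, and `k_coverage_connected` is a hypothetical helper, not a routine from the dissertation) enumerates the (n − kj + 1)-subsets of agents and checks that every subset has positive weight, i.e. contains an observing agent, exactly when at least kj agents track the target.

```python
import itertools
import math

def k_coverage_connected(n, k_j, observers):
    """Connectivity test for the combinatoric visibility graph G_Vj: one vertex
    per (n - k_j + 1)-subset of agents plus the target vertex; the graph is
    connected iff every subset contains at least one observing agent (so that
    every edge weight is positive)."""
    subsets = itertools.combinations(range(n), n - k_j + 1)
    return all(any(a in observers for a in S) for S in subsets)

n, k_j = 4, 2
# Combinatoric bookkeeping from the proof of Lemma 3.7.1:
assert math.comb(n, n - k_j + 1) == 4   # |C_j|: all 3-subsets of 4 agents
assert math.comb(n - 1, n - k_j) == 3   # 3-subsets containing a fixed agent

# With a single observer, the subset formed by the other three agents has
# weight 0, so G_Vj is disconnected; with any two observers, every 3-subset
# of the four agents is covered and the graph is connected.
assert not k_coverage_connected(n, k_j, observers={0})
for pair in itertools.combinations(range(n), 2):
    assert k_coverage_connected(n, k_j, observers=set(pair))
```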
Given this definition, it is evident that Problem (3.20) lends itself well to the first–order discretization techniques employed to yield the results of §3.6.1 (see Theorem 3.6.1). Accordingly, the same approach can be taken here to yield a discrete–time formulation that minimizes an arbitrary convex objective while enforcing the connectivity of the combinatoric visibility graph for each target at time k + 1. This result is now formalized in the following theorem.
Theorem 3.7.2. Let A, |A| = n and T , |T | = m be as defined with corresponding state vectors such that xai , xtj ∈ Rd. Let kj ∈ Z+, kj ≥ 2 denote the k–coverage requirement for target tj. Assume ẋai = uai ∈ U a, ∀ai ∈ A and ẋtj = utj ∈ U t, ∀tj ∈ T . Let Ψ(X) : Rdn → R denote a convex objective to be minimized over some ∆t. By
solving the following discrete–time SDP at time k
\[
\begin{array}{ll} \min & \Psi(X(k+1)) \\ \text{s.t.} & \| x_i^a(k+1) - x_i^a(k) \|_2 \le v_i \Delta t, \quad i = 1, \dots, n \\ & P_{V_j}^T L_{V_j}(k+1) P_{V_j} \succ 0, \quad j = 1, \dots, m \\ & P_N^T L_N(k+1) P_N \succ 0 \end{array} \tag{3.22}
\]
where vi denotes the maximum velocity of ai ∈ A, LVj (X) is the graph Laplacian for GVj (X) with edge weights given by Equation (3.21), and LN (X) is as defined in Theorem 3.6.1, Ψ(X) will be minimized over ∆t while ensuring kj–coverage ∀tj ∈ T and network connectivity at time k + 1.
Proof. In establishing this result, we mirror the proof of Theorem 3.6.1. In particular, first–order Euler analysis can be readily performed yielding a discrete–time representation for fVAji , j = 1, . . . , m; i = 1, . . . , q and fN, which are themselves defined to be ICFs (see Definition 3.5.1) or some linear combination of ICFs. Accordingly, fVAji and fN are well–defined and their respective partials exist. Sampling at rate ∆t yields a discrete–time representation for xai (t) and ẋai (t) at iteration k given by Equation (3.12).
Recalling the proof of Theorem 3.6.1, a similar approach can be applied to discretize fN. Accordingly, the remainder of this proof will focus on fVAji . Begin by choosing fVAji , j = 1, . . . , m; i = 1, . . . , q as in Equation (3.21). As these weights are simply a linear combination of mono–dimensional interactive control functions, they are well–defined and lend themselves to the first–order discretization method (see Theorem 3.6.1). Accordingly, this approach yields the following for fVAji
\[
f_{V_j}^{A_i}(k+1) - f_{V_j}^{A_i}(k) = \frac{1}{n-k_j+1} \sum_{s=1}^{n-k_j+1} \left[ f_V^t(x_s^a, x_j^t)(k+1) - f_V^t(x_s^a, x_j^t)(k) \right] \tag{3.23}
\]
where again xas and xtj respectively denote the positions of as ∈ Ai and target tj, and fVt and fN can be chosen without loss of generality as in Theorem 3.6.1.
Given Equations (3.23) and (3.15), the discrete–time, state–dependent Laplacians corresponding to the combinatoric visibility graphs GVj (X), j = 1, . . . , m and the network proximity graph GN (X) can be directly formulated to yield LVj (k), j = 1, . . . , m and LN (k). Exploiting these discrete–time forms and introducing n second–order conic inequalities to mitigate the effects of the linearization process yields the discrete–time SDP stated in the theorem. As the given LMIs constraining X(k + 1) enforce λ2 (LVj (k + 1)) > 0, j = 1, . . . , m and λ2 (LN (k + 1)) > 0 (i.e. the connectivity of each GVj and GN at time k + 1 remains intact), it follows that kj–coverage, ∀tj ∈ T , is achieved (by Lemma 3.7.1) and network connectivity is maintained at time k + 1 while minimizing Ψ(X) over ∆t.
Summarizing these results, the original problem has been transformed into an SDP that guarantees each target is observed by at least some minimal number of agents and that the communication network remains intact over a given time step. It should also be noted that the utility of this approach is hindered by the fact that it introduces m graphs – each having a combinatoric number of edges. As a result, this method is best suited for problems featuring modest k–coverage requirements or a small to moderate number of system agents and targets. However, for multi–agent systems employing cameras or range sensors where 2–coverage is required for tracking, the approach can still be run in real–time.
To illustrate the effectiveness of this approach, the proposed discrete–time formulation given by Problem (3.22) was simulated using SeDuMi 1.1R3 (Pólik, 2005) via YALMIP (Löfberg, 2004) in Matlab for a team of four agents tracking two diverging targets while ensuring kj = 2 coverage for each. Each agent was modeled as having an omnidirectional camera allowing it to track within 0.11 units while maintaining a desired standoff of 0.07 units. Figure 3.6 shows the resulting trajectories for the team. In this example, the objective was to maximize the minimum connectivity of the visibility graphs – i.e. \(\Psi(X) = \max_{j \in \{1,2\}} \{-\lambda_2(L_{V_j}(X))\}\).
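A min/max objective of this type remains within the SDP framework via a standard epigraph reformulation (a sketch of the reduction under the notation of Problem (3.22); this particular restatement is not taken verbatim from the dissertation):

```latex
\begin{array}{ll}
\max & t \\
\text{s.t.} & P_{V_j}^T L_{V_j}(X) P_{V_j} \succeq t I, \quad j = 1, \dots, m \\
            & P_N^T L_N(X) P_N \succ 0
\end{array}
```

Since \(\lambda_2(L_{V_j}) = \lambda_{\min}(P_{V_j}^T L_{V_j} P_{V_j})\), the LMI \(P_{V_j}^T L_{V_j} P_{V_j} \succeq tI\) holds exactly when \(\lambda_2(L_{V_j}) \ge t\), so maximizing the scalar epigraph variable t maximizes the minimum algebraic connectivity across the m visibility graphs.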
Example: Figure 3.7(Top) shows the progression of λ2 (LV1 (X)) with respect to time. As λ2 (LV1 (X)) > 0, t1 is guaranteed to be tracked by at least k1 = 2 agents. Given the contrived trajectories of the targets and the chosen velocities, the algebraic connectivity of GV1 (X) converges to an optimal value λ2 (LV1 (X)) ≈ 0.3333.
Figure 3.6: A team of four agents tracks two diverging targets while ensuring each tj is observed by at least kj = 2 team members. In this scenario, the objective was to maximize the minimal connectivity of each of the visibility graphs.
Figure 3.7(Bottom) shows the approximate agent–target distances (i.e. dt11 , dt21 ) with respect to time. In this example, both agents readily maintain their active tracks while converging to the requested standoff. This behavior truly highlights the governing nature of ICFs. The agent velocities were set at 1.125 times that of the observation target velocities, which were assumed equal.
3.8 A Relaxation for Large–scale Teams
In the previous two sections, formulations were considered that enforced connectivity of the network graph. However, in some cases, it may be beneficial to sacrifice connectivity of this underlying graph in order to ensure each target is tracked. In others, the range of communication links may far exceed the sensing range of the mobile robot team. In such scenarios, the constraint PNT LN (X)PN ≻ 0 is superfluous
and can be safely eliminated from the problem statement. In doing so, the original tracking problem formulation (i.e. Problem (3.6)) is obtained.
Figure 3.7: (Top) The algebraic connectivity of GV1 (X) corresponding to Figure 3.6 increases monotonically until reaching λ2 (LV1 (X)) ≈ 0.3333. (Bottom) The relative agent–target distances (i.e. dt11 , dt21 ) yielded from maximizing λ2 (LV1 (X)) – both agents obtain the desired standoff of 0.07 units.
In this section, it shall be shown that solving this problem lends itself well to large–scale tracking scenarios where inter–agent communications may be of secondary importance. As it turns out, the key to obtaining this result is observing that the constraint P T L(X)P ≻ 0 in Problem (3.6) reduces to a single non–linear inequality when the graph in question features only a single pair of nodes. Keeping this in mind, consider a relaxed formulation of the tracking problem whereby each target tj is associated with a single binodal graph GVj (Vj , Ej ) having one vertex to represent
the agent team (i.e. A) and another to represent the target (see Figure 3.8). By
enforcing the connectivity of each of these graphs in our problem formulation, each observation target is guaranteed to be tracked by at least a single member of the team.

Figure 3.8: The binodal visibility graphs, G_{V_j}(X), j = 1, 2, 3, for a team of three agents A = {a₁, …, a₃} tasked with maintaining full target coverage of T = {t₁, t₂, t₃}. Observe that when f_{V_j} > 0, j = 1, …, 3, the graphs are connected (i.e. λ₂(G_{V_j}) > 0, j = 1, 2, 3) and thus each target is tracked by at least one agent.

Implicit in this statement is that an appropriate weight function can be formulated for G_{V_j}(X) that fully captures the level of connectivity between the agent team and target t_j. Although a variety of functions can be considered, we extend upon our previous analysis using interactive control functions and define
    f_{V_j}(X, x_{t_j}) = f_{V_j}(x_{a_1}, …, x_{a_n}, x_{t_j}) = (1/n) Σ_{i=1}^{n} f_V^t(x_{a_i}, x_{t_j})        (3.24)
Again, observe that just like Equation (3.21) appearing in §3.7, this function is simply a normalized, linear combination of mono–dimensional ICFs. Additionally,
notice that by this definition, when target t_j is being actively tracked by all agents, with each agent observing its desired standoff distance, it holds that f_{V_j} = 1. Similarly, when no agent is actively engaging the target, the formulation reveals f_{V_j} = 0. Accordingly, these observations are now formalized in the following theorem.

Theorem 3.8.1. Let A, |A| = n, and T, |T| = m, be as defined with corresponding state vectors such that x_{a_i}, x_{t_j} ∈ R^d. Assume ẋ_{a_i} = u_{a_i} ∈ U^a, ∀a_i ∈ A, and ẋ_{t_j} = u_{t_j} ∈ U^t, ∀t_j ∈ T. Let Ψ(X) : R^{dn} → R denote a convex objective to be minimized over some ∆t. By solving the following discrete–time SOCP at time k

    min Ψ(X(k + 1))
    s.t. ‖x_{a_i}(k + 1) − x_{a_i}(k)‖₂ ≤ v_i ∆t, i = 1, …, n        (3.25)
         P^T L_{V_j}(k + 1) P > 0, j = 1, …, m

where v_i denotes the maximum velocity of a_i ∈ A, P = (1, −1)^T, and L_{V_j}(X) ∈ R^{2×2} is the state–dependent graph Laplacian for binodal graph G_{V_j}(X) (corresponding to t_j ∈ T) having edge weights given by Equation (3.24), Ψ(X) will be minimized over ∆t while ensuring full target coverage at time k + 1.
Proof. This result trivially follows from the preceding discussion by applying the techniques utilized in establishing Theorems 3.6.1 and 3.7.2.

The result of Theorem 3.8.1 is a standard linearly–constrained SOCP. It can be solved using modern interior–point algorithms for SOCP that are oftentimes more efficient in practice than SDP approaches, typically by exploiting certain problem features (Boyd & Vandenberghe, 2004).

Example: To illustrate the effectiveness of this novel extension, the proposed discrete–time SOCP formulation was implemented in simulation. Figure 3.9 shows the results of one trial in which a team of three robots operating in R² breaks an initial path formation in order to successfully track three evading targets. Although contrived, this example serves to highlight the governing behavior of our paradigm. By ensuring the connectivity of the m = 3 binodal visibility graphs, the team is able to ensure full target coverage. In this case, the objective was to maximize the minimal respective connectivity of these graphs with r_ij = 0.08 and ε^l_ij = ε^u_ij = 0.04, ∀i, j.

Figure 3.9: A team of three agents breaks formation from an inline path to ensure active tracks of three mobile targets. Connectivity of the state–dependent visibility graph, G_V(X), is ensured by using the proposed SOCP relaxation. Trajectories corresponding to the SDP formulation are also shown for comparison.

Simulation Results: In an effort to gauge the comparative difference in complexity between the two approaches, instances of Problems (3.11) and (3.25) were solved for team sizes of up to 150 nodes operating in R². In our SDP implementation, we considered the objective Ψ(X) = −λ₂(L_V(X)) and maintained f^a_V = 1. In the SOCP implementation, we considered maximizing the minimal connectivity among the m = 3 binodal visibility graphs. In both cases, SeDuMi was used as the underlying solver. All computations were performed on a standard desktop computer with a 2.4 GHz Core 2 Duo processor and 2 GB of RAM. Figure 3.10(Top) shows the results of these trials, where each data point corresponds to the mean CPU time obtained from solving ten random problem instances. Not surprisingly, the computational overhead associated with solving the SDP formulation scales cubically in time. In contrast, the computational load incurred by solving our SOCP relaxation exhibits highly linear growth (r² = 0.9205). Using
SeDuMi, which is a non–industrial grade solver, we see that solving a single iteration of (3.25) for a team of 150 agents requires 174 milliseconds. In practice, however, it is far more likely that an industrial grade solver will be used. As such, we also solved our SOCP relaxation on the same random problem instances using the MOSEK industrial solver package (MOSEK, 2006). The results of these trials are shown in Figure 3.10(Bottom). Again, the computational overhead exhibits an approximately linear trend (in this case, r² = 0.7435). However, what is perhaps more impressive is that solving a single iteration of Problem (3.25) for a team of 150 nodes requires only 33 milliseconds! This result highlights the efficacy of this approach and shows that large–scale tracking can be achieved in real time.

Figure 3.10: (Top) CPU time obtained from solving both the SDP formulation and the SOCP relaxation via SeDuMi for teams having up to 150 agents tracking three targets in R². (Bottom) CPU time trends for solving the relaxed problem using both a non–industrial solver (SeDuMi) and an industrial solver (MOSEK).
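The observation driving this relaxation, namely that for a two–node graph the connectivity constraint collapses to a single scalar inequality per target, can be checked numerically. The Python sketch below builds the 2×2 binodal Laplacian with the Eq. (3.24) weight; the exponential ICF and all parameters are illustrative stand–ins (the actual f_V^t is defined in §3.5), and P is taken normalized so that P^T L P equals λ₂ when the edge weight is positive.

```python
import numpy as np

def icf(agent, target, standoff=0.07, width=0.05):
    """Hypothetical mono-dimensional ICF: equals 1 when the agent sits at the
    desired standoff from the target and decays exponentially away from it.
    (A stand-in; the dissertation's f_V^t is defined in Sec. 3.5.)"""
    d = np.linalg.norm(agent - target)
    return np.exp(-((d - standoff) / width) ** 2)

def binodal_laplacian(agents, target):
    """Laplacian of the binodal graph G_Vj: one vertex for the team A and one
    for target t_j; the single edge weight is the Eq. (3.24) normalized sum."""
    w = np.mean([icf(a, target) for a in agents])
    return np.array([[w, -w], [-w, w]]), w

agents = [np.array([0.07, 0.0]), np.array([0.5, 0.5]), np.array([-0.5, 0.3])]
target = np.zeros(2)
L, w = binodal_laplacian(agents, target)

# For a two-node graph, the connectivity constraint reduces to one scalar
# inequality: with P = (1, -1)^T / sqrt(2), P^T L_Vj P = 2 * f_Vj > 0.
P = np.array([1.0, -1.0]) / np.sqrt(2)
assert np.isclose(P @ L @ P, 2 * w)
print(P @ L @ P > 0)   # True: at least one agent actively tracks the target
```

With m targets this yields m such scalar inequalities in place of one (n−1)-dimensional LMI, which is the source of the linear scaling observed above.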
3.9 On Characterizing Optimality
As mentioned earlier, the proposed framework guarantees optimal solutions for a given time step. Such a paradigm is ideal for operating in highly dynamic environments, where a team can at best optimize over the time it takes to sense its surroundings and issue corresponding control signals. However, we are also interested in characterizing the performance of the discrete–time approach in scenarios featuring static target placements. Specifically, we aim to address whether the team will exhibit convergence to global optimality in such cases. Key to this analysis is the realization that each iteration of our process provides a desired descent direction and an implicit step length toward some optimal team configuration with respect to the current target arrangement. Accordingly, we consider a contrived pair of static target arrangements that allow global solutions to be computed offline given geometric constraints. Although the results presented in this section are preliminary, they do highlight the ability of our framework to “seek” a level of optimality extending beyond the current time step. In the first scenario, a team of ten agents was charged with observing a single target in R² while maximizing connectivity of the visibility graph (i.e. Ψ(X) = −λ₂(L_V(X))) subject to network connectivity constraints. Initial agent positions
were randomly chosen with x_{a_i} ∼ N(0, σ_x = σ_y = 0.15) centered around the target's location. Such a scenario mimics a surveillance team being deployed to a region based upon the believed position of the observation target. As the desired standoff distance for each member was 0.30 units (with ε^l_ij = ε^u_ij = 0.15, ∀i, j), any globally
optimal configuration was required to lie on the circle having that radius centered
at the target. A total of ten trials were run with each ultimately converging to a globally optimal configuration – yielding Ψ(X) ≈ 1. Network links were modeled to experience exponential decay between 0.05 and 0.45 units. In our implementation,
the process terminated when ∆Ψ(X) = Ψ(X(k + 1)) − Ψ(X(k)) < 1 × 10⁻⁴. Figure 3.11(a) illustrates the convergent behavior exhibited during one of the random trials.

Figure 3.11: (a) Ten agents observing a static target converge to a globally optimal arrangement in R². (b) Four agents also converge to a globally optimal arrangement while observing five static targets. In both cases, Ψ(X) = −λ₂(L_V(X)).

In the second scenario, a team of four agents was charged with tracking five targets arranged in a static “cross” formation. Once again, the objective was to
maximize target visibility. In this configuration, each target was separated by 0.35 units along the x and y axes of the coordinate frame. Given the desired standoff distance of 0.175 units (with ε^l_ij = ε^u_ij = 0.125, ∀i, j), the globally optimal configuration corresponded to the midpoints of the adjoining line segments of the cross.
Initial agent positions were chosen according to the random model employed for the first scenario, with the mean corresponding to the center of the target arrangement. The same network model and termination criteria were utilized for these trials as well. Of the ten instances, eight resulted in convergence to global optimality – yielding Ψ(X) ≈ 0.382. In the remaining cases, locally optimal configurations were attained. This local convergence can be attributed to initial arrangements that placed a majority of the team in close proximity to a single periphery target – leaving one target well beyond sensing range. Such behavior is intuitive, as the control functions governing agent–target behaviors are likely to have reduced influence on an agent's position when the target lies outside the exponential well that models sensor visibility. Figure 3.11(b) illustrates the globally convergent behavior attained during one of these trials. Expanding these preliminary results to include more diverse scenarios, as well as exploring the effects of the interactive control functions on convergence, is the focus of continued research on this topic.
3.10 Discussion
In this chapter, an optimization framework for dynamic target tracking with performance guarantees was presented for multi–agent systems. To realize said framework, the notion of a state–dependent visibility graph was introduced to capture the level of visibility between system elements featuring omnidirectional sensing modalities. Noting that dynamic target tracking lends itself well to a discrete–time framework, standard linearization and discretization techniques were employed to define an iterative SDP approach for solving the target tracking problem subject to network connectivity constraints. An extension was also formulated that ensures a minimal
cardinality coverage for each of the observation targets. In cases where communication constraints can be relaxed, a novel SOCP relaxation of the target tracking problem was presented that ensures connectivity of the state–dependent visibility graph while providing a tremendous reduction in computational cost in practice when compared to an equivalent SDP formulation. In solving said relaxation, the resulting configuration guarantees that at least one team member is tracking each target. Although elegant, the proposed paradigm can be improved with a number of key extensions. Among them is the ability to effectively embed support for directional sensing modalities, such as traditional camera systems that are constrained by the optical field of view. Exploring this topic has revealed that a multi–dimensional generalization of the proposed interactive control functions may yield a possible solution. Specifically, such ICFs would be a function of both the distance and the bounds dictating the solid angle subtending the sensor's field of view. Exploring such capabilities is the topic of ongoing research. Another key area of exploration will be the formulation of this problem in a decentralized form. Typically, SDPs do not lend themselves well to distributed computation or decentralization; however, in this case, decentralization may be feasible (see §5.1). Illustrating this point, recent advances have shown that a fully decentralized eigenvector computation can be used to approximate the algebraic connectivity of the network graph (Gennaro & Jadbabaie, 2006). Specifically, the authors show that λ₂ is a concave function on the space orthogonal to the vector of all ones, and that supergradient techniques may be employed to perform a decentralized computation of λ₂'s eigenvector, generating a direction that increases graph connectivity. As such, adapting such a technique may be perfectly within reason.
Finally, the convergence properties of the proposed algorithm are still somewhat unknown with regard to global optimality. Assuming the targets remain static, it is clear that each of our formulations yields a descent direction. Additionally, since the value of λ₂ is bounded above, the algorithm must terminate in a finite number of iterations, as noted in (Kim & Mesbahi, 2006). However, it would be
interesting to see whether further bounds can be placed, perhaps as guarantees of global optimality. Such a characterization is non–trivial and will be explored. It is now worth stating that the discrete–time paradigm has a number of key advantages, including its flexibility. In the following chapter, it is shown that the discrete–time framework presented in this chapter has laid the foundation for a more general formulation that affords a number of other impressive performance guarantees and behaviors. Among them are formation control (obtained by melding the results of Chapter 2 with those of Chapter 3), goal–directed behaviors via multi–objective optimization, and even static and dynamic collision avoidance in potentially unexplored workspaces. Accordingly, these results are presented in the next chapter.
Chapter 4

A Convex Framework

In this chapter, it shall be shown that the seemingly unrelated results presented in Chapters 2 and 3 are, in fact, instances of a more general convex optimization framework that facilitates multi–agent motion planning (and, in some sense, control) in dynamic environments. Specifically, by employing both SOCP and SDP, motion planning and coordination are obtained once again by considering a discrete–time approach to the optimization process. The result is a toolkit for multi–agent planning that affords a number of impressive performance guarantees, including obstacle avoidance in unknown environments, full target coverage (or k–coverage, depending on the application), network interconnectivity, and even formation shape control. Additionally, the framework is general enough to allow minimizing an arbitrary convex function while also pursuing high–level goals represented as objective waypoints in the agents' configuration space.
4.1 Related Work
In the context of this research, it is assumed that the multi–agent system has been assigned goals or sub–goals in the form of sparse waypoints. Such points can be dictated by a high–level planning process or perhaps by a human technician guiding the team from a remote location. In any case, having a team successfully navigate using such a high–level plan is a problem that has necessarily garnered much attention from the robotics community.

Figure 4.1: (a) A single robot follows the minimum–potential valley of an artificial potential function modeling the environment. The function is constructed such that the final goal position of the agent (cross) is a global minimum. (b) The corresponding artificial potential visualized in R³.

Among the more popular approaches to addressing this problem are those that exploit artificial potential functions (Khatib, 1986) (see Figure 4.1). These techniques model the world as a potential field featuring attractive and repulsive forces. Desired (unoccupied) positions are modeled as attractors, while the locations of perceived obstacles are modeled as repulsors. At each time step, the gradient of the potential field is measured and used to generate desired trajectories that direct the robot(s) toward their objective position(s) while avoiding obstacles. Essentially, each robot follows the trajectory through the minimal potential valley with respect to its current location. As such, this technique is hindered by local minima, which can “trap” robots at a non–goal location. Other seminal results include navigation functions (Koditschek & Rimon, 1990; Cowan et al., 2002). Intuitively, navigation functions are dipolar cost functions constructed in worlds where obstacles can be modeled as “star–shaped” objects. In contrast to the cost function afforded by simple potential–field methods, following the gradient of a navigation function is guaranteed to converge to a global minimum denoting the objective waypoint, even in environments where local minima would
otherwise be present. Although impressive, this approach is hindered because the process of generating such functions is tedious for complex environments and requires potentially extensive parameter tuning. However, efforts have been made toward extending this notion to multi–agent systems (Loizou et al., 2004). Evaluating each of these state–of–the–art approaches, it is clear that at the core of both is an embedding of goal–directed behavior and collision avoidance. As such, the focus of this chapter is aimed at embedding such features in the discrete–time paradigm. In doing so, results are generated that mimic general potential–field approaches with the added benefit of minimizing an additional convex objective while providing many of the previously stated performance guarantees (see Chapters 2 and 3). More precisely, the notion of an artificial potential function is supplanted by a convex combination of objectives, with trajectories determined by gradient descent over a higher–dimensional space representing the intersection of the cone of positive semidefinite matrices with an affine space. Highlighting the benefit of such an optimization framework, consider team–based motion planning in the context of a search and rescue mission. One can readily envisage the problem of having a team of robots obtain a desired waypoint (or sequence of waypoints) while maximizing sensor area coverage to increase the likelihood of identifying a specific target. Key to enabling such behaviors is to drive agents as a functional of team proximity to some location. In the following section, it shall be shown that by considering a straightforward point–mass abstraction, such behavior can be readily integrated into the discrete–time framework.
4.2 Goal–directed Behavior
Key to the forthcoming results is the realization that goal–directed behaviors, where the goal is an objective waypoint, can be effectively embedded by considering a point–mass abstraction where the current agent formation is projected as a single point in Rd . More precisely, the team can be guided by minimizing the norm capturing the distance between the expected value of agent positions and the goal position
(see Figure 4.2). Such an approach is very intuitive, and it allows the team to consider goal–directed motions while still affording each agent individual discretion to satisfy local constraints. It is this idea that is central to the following formulation.
Figure 4.2: One approach to embedding goal–directed behavior in the proffered framework is to minimize the norm between a point–mass abstraction (corresponding to the weighted formation centroid) and the objective waypoint.

Following the conventions of Chapter 3, let X = (x_{a_1}, …, x_{a_n})^T ∈ R^{dn} denote the concatenated positions of the team members, with x_{a_i} ∈ R^d, d ∈ {2, 3}, representing the location of a_i with respect to some world frame in R^d. Again, assume the team is charged with obtaining a sequence of sparse waypoints serving as goal positions while minimizing an additional objective Ψ(X) : R^{dn} → R. Let X^G = (x_{g_1}, …, x_{g_s})^T ∈ R^{ds} denote the time–based sequence of these points. Assuming an obstacle–free workspace represented by some continuous space W ⊂ R^d without boundary, the following theorem is readily established.
Theorem 4.2.1. Let X and X^G be as defined, and assume the current objective waypoint is given by x_{g_i} ∈ W ⊆ R^d. Additionally, assume Ψ(X) : R^{dn} → R is a convex objective with optimum X^*. A team of n fully–actuated robots will converge to x_{g_i} ∈ W and minimize Ψ(X) by iteratively solving

    min Ψ(X(k + 1)) + α_z z(k + 1)
    s.t. ‖x_{a_i}(k + 1) − x_{a_i}(k)‖₂ ≤ v_i ∆t, i = 1, …, n        (4.1)
         ‖(1/n) W^T X(k + 1) − x_{g_i}‖₂ ≤ z(k + 1)

where α_z ∈ R⁺ and W ∈ Rⁿ is a weight vector with w_i ∈ R⁺, ∀a_i ∈ A.

Proof. Begin by noticing that minimizing Ψ(X) and converging to x_{g_i} is readily obtained by solving the following convex program

    min Ψ(X) + α_z z
    s.t. ‖(1/n) W^T X − x_{g_i}‖₂ ≤ z        (4.2)

In this formulation, (1/n) W^T X corresponds to the weighted centroid (i.e. expected value) of agent positions. Noting the second–order conic inequality on the Euclidean distance separating this value from x_{g_i}, and noting that w_i ∈ R⁺, ∀a_i ∈ A, it follows that z^* = 0. As W is assumed convex and x_{g_i} ∈ W, it must also hold that z^* = 0 is always feasible. As such, a global minimizer of f(X, z) = Ψ(X) + α_z z must correspond to the point (X^*, 0)^T. Noting this observation, a first–order discretization with sample rate ∆t can be applied to Problem (4.2), yielding Problem (4.1). As the team is assumed fully actuated (i.e. holonomic), solving Problem (4.1) corresponds to following a descent direction of the given objective with a step length governed by the assigned velocity constraints. As such, as ∆t → 0 and k → ∞, it holds that f(X(k + 1), z(k + 1)) → f(X^*, 0).
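A minimal Python sketch of the point–mass abstraction follows, with illustrative team size, weights, waypoint, and velocity bound (none of these values come from the dissertation's experiments): each agent descends on the goal–distance term of Problem (4.1) with its step clamped by v_i∆t, and the weighted centroid (1/n)W^T X converges to the waypoint.

```python
import numpy as np

# Illustrative setup: 4 agents in R^2, weight vector W, one waypoint, and a
# common velocity bound v_i * dt. Heavier agents take proportionally larger
# steps toward the goal, as in Remark 4.2.2.
n = 4
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
W = np.array([2.0, 1.0, 1.0, 1.0])
goal = np.array([3.0, 3.0])
v_dt = 0.05

def centroid(X):
    return (W[:, None] * X).sum(axis=0) / n    # (1/n) W^T X of Theorem 4.2.1

for _ in range(5000):
    err = centroid(X) - goal
    if np.linalg.norm(err) < 1e-6:
        break
    for i in range(n):
        step = -(W[i] / n) * err               # damped descent on ||err||^2
        if np.linalg.norm(step) > v_dt:        # velocity constraint of (4.1)
            step *= v_dt / np.linalg.norm(step)
        X[i] = X[i] + step

print(np.linalg.norm(centroid(X) - goal))      # near zero: centroid reaches the waypoint
```

Note that only the centroid is driven to the goal; individual agents retain the slack to satisfy local constraints, which is exactly the discretion the theorem affords.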
Remark 4.2.2. The weight vector W ∈ Rⁿ, w_i ∈ R⁺, serves as a mechanism for assigning importance to an agent's individual, goal–directed motions. Intuitively, team members having a higher weight will govern the expected value and are drawn more quickly to the goal position. Accordingly, those assigned less weight are allowed to stray more freely from the agent formation.

At first glance, the result presented in Theorem 4.2.1 may seem inconsequential. After all, the goal of this research is to establish optimal motion planning with goal–directed behavior for teams operating in dynamic environments. However, it is important to notice that Theorem 4.2.1 takes the first steps toward establishing such behaviors, as it has effectively provided a means by which to model attractors in our discrete–time optimization framework. Additionally, the importance of the α_z term should also be noted. In some sense, this parameter serves to assign priority in the inherently multi–objective optimization problem. In cases where α_z ≫ 0, the trajectory of the team will be governed by the strong desire to obtain the waypoint. Conversely, when α_z ≈ 0, the trajectory of the team will generally be governed by reducing Ψ(X). For the moment, a detailed example highlighting the utility of this approach as well as the role of α_z is deferred until §4.3, where a more illuminating
example is provided. Accordingly, the focus of this discussion now turns to the issue of collision avoidance, which is addressed in the following section.
4.3 Collision Avoidance
The results of §4.2 show that embedding goal–directed behavior is a rather straightforward and intuitive process. However, the assumptions imposed to yield Theorem
4.2.1 are highly restrictive and are not suited for use by multi–agent systems operating in potentially dynamic environments. Such environments may feature actors that have differing objectives or even exhibit competitive behaviors toward the multi–agent system. As such, in developing a motion planning framework, it is paramount to embed reactive behaviors that enable the team to effectively adapt to the potentially unpredictable motions of others. Accordingly, in this section, initial steps are taken toward embedding such intelligence in a team employing the proposed framework. Specifically, a bottom–up approach is adopted in designing a state–dependent graphical construct called the extended collision graph, which facilitates inter–agent collision avoidance as well as static/dynamic obstacle avoidance for the case when ∆t is sufficiently small. It should also be noted that in this context a “collision” is defined as the violation of proximity constraints (in this case, defined as a direct function of distance) by one or more actors operating in the environment.

Enforcing Inter–agent Proximity Constraints

Before proceeding, it should also be mentioned that (Kim & Mesbahi, 2006) consider an approach to enforce inter–agent proximity constraints via SDP through first–order linearization and discretization of a Euclidean Distance Matrix (EDM).
Although elegant, such a solution involves computations performed on an inherently dense matrix structure. In contrast, the results presented herein stay with the overall theme of graph theory and are easily amenable to a variety of applications that could benefit from potentially more efficient second–order cone solutions. In contrast to employing a dense EDM, the forthcoming binodal graphical formulation reduces the number of constraints by an order of magnitude, growing only linearly with team size. The key to obtaining said results is the realization that a team of agents coupled with inter–agent proximity constraints induces a weighted graph. This graphical construct features a set of vertices corresponding to system elements and edge weights given by an indicator function measuring whether proximity constraints are satisfied. Intuitively, a pair of connected vertices represents an agent pair that is separated by more than the maximum of their respective desired standoff distances. Formalizing this definition, consider the multi–agent system A = {a₁, …, aₙ}
operating in W ⊂ Rd and define the inter–agent collision graph as follows
Definition 4.3.1. The inter–agent collision graph is the weighted, undirected, non–reflexive graph G_C(V_C, E_C), where V_C = A and E_C = {(a_i, a_j) : ∀a_i, a_j ∈ A, i ≠ j}. The weight of each edge is given by the state–dependent indicator function I_C : E_C → {0, 1} defined such that

    I_C(a_i, a_j) = { 0,  ‖x_{a_i} − x_{a_j}‖₂ ≤ r^a_{ij}
                    { 1,  ‖x_{a_i} − x_{a_j}‖₂ > r^a_{ij}        (4.3)

where x_{a_i} again denotes the position of the ith agent and r^a_{ij} = max{r^a_i, r^a_j}, with r^a_i, r^a_j ∈ R⁺, denotes the maximal desired standoff distance between agents a_i and a_j. Figure 4.3 illustrates G_C(V_C, E_C) for four agents operating in R².
Although elegant, the definition of the inter–agent collision graph prevents it from being directly embedded into a state–dependent, discrete–time framework. The problem arises due to the non–differentiability of the indicator function, which features a discontinuity at r^a_{ij}. To mitigate this problem, inspiration can be sought from the results of Chapter 3, and an auxiliary graphical formulation can be defined
Figure 4.3: The inter–agent collision graph G_C(V_C, E_C) for a team of four agents operating in R². Edge weights are given by an indicator function I_C which enforces that two agents are adjacent if and only if they are separated by more than the maximum of their respective standoff distances.

that employs interactive control functions (see Section 3.5) to replicate the behavior of I_C. As these functions are smooth and implicitly embed desired agent behaviors, they lend themselves well to first–order discretization techniques and are highly amenable to the proffered framework. In this research, the chosen interactive control function f^a_C : E_C → [0, 1] is defined as the state–dependent sigmoid function

    f^a_C(x_{a_i}, x_{a_j}) = 1 / (1 + e^{γ((r^a_{ij})² − ‖x_{a_i} − x_{a_j}‖₂²)})        (4.4)
where γ serves as a tuning parameter (see Figure 4.4). Observe that as γ → +∞,
f^a_C → I_C, and thus both formulations can be considered equivalent for some γ ≫ 0. It is also worth noting that by supplanting I_C with f^a_C, G_C becomes a state–dependent graph; accordingly, from here on it shall be denoted G_C(X).
Figure 4.4: f^a_C with r^a_{ij} = √10 units for different values of γ (γ = 0.5, 1, 2, 8).
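The limiting behavior γ → +∞ can be checked numerically. This short Python sketch evaluates Eq. (4.4) against the indicator of Eq. (4.3) at the Figure 4.4 setting r^a_{ij} = √10 and the same γ values; the two test points are arbitrary illustrative choices.

```python
import numpy as np

def f_C(xi, xj, r_ij, gamma):
    """State-dependent sigmoid edge weight of Eq. (4.4)."""
    return 1.0 / (1.0 + np.exp(gamma * (r_ij**2 - np.linalg.norm(xi - xj)**2)))

def I_C(xi, xj, r_ij):
    """Indicator edge weight of Eq. (4.3)."""
    return 1.0 if np.linalg.norm(xi - xj) > r_ij else 0.0

r = np.sqrt(10)                       # the standoff used in Figure 4.4
xi = np.array([0.0, 0.0])
near, far = np.array([2.0, 0.0]), np.array([4.0, 0.0])   # inside / outside r

for gamma in (0.5, 1.0, 2.0, 8.0):    # the gamma values plotted in Figure 4.4
    print(gamma, f_C(xi, near, r, gamma), f_C(xi, far, r, gamma))

# As gamma grows, f_C approaches the indicator I_C on both sides of r_ij.
assert abs(f_C(xi, near, r, 8.0) - I_C(xi, near, r)) < 1e-3
assert abs(f_C(xi, far, r, 8.0) - I_C(xi, far, r)) < 1e-3
```

Already at γ = 8 the sigmoid is numerically indistinguishable from the indicator at these points, consistent with the equivalence claimed for γ ≫ 0.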
Given Equation (4.4), first–order discretization techniques can now be applied (Kim & Mesbahi, 2006). Doing so yields the following update at time step k for f^a_C

    f^a_C(k + 1) = f^a_C(k) + τ_A(k) [ (x_{a_i}(k + 1) − x_{a_j}(k + 1))^T (x_{a_i}(k) − x_{a_j}(k)) − ‖x_{a_i}(k) − x_{a_j}(k)‖₂² ]        (4.5)

where τ_A(k) is given as follows

    τ_A(k) = 2γ e^{γ((r^a_{ij})² − ‖x_{a_i}(k) − x_{a_j}(k)‖₂²)} / [1 + e^{γ((r^a_{ij})² − ‖x_{a_i}(k) − x_{a_j}(k)‖₂²)}]²

Following the approach presented in Chapter 3, the discrete–time formulation of the state–dependent graph Laplacian at time k can be directly formulated as follows

    [L_C(k)]_{ij} = { −f^a_C(x_{a_i}, x_{a_j})(k),         i ≠ j
                    { Σ_{j≠i} f^a_C(x_{a_i}, x_{a_j})(k),  i = j        (4.6)
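As a sanity check on the linearization, the following sketch compares the update of Eq. (4.5) against the exact sigmoid of Eq. (4.4) for a small single–step displacement; γ, (r^a_{ij})², and the positions are arbitrary illustrative values.

```python
import numpy as np

gamma = 1.0
r2 = 10.0                                     # (r_ij^a)^2, illustrative

def f(xi, xj):
    """Exact sigmoid weight of Eq. (4.4)."""
    return 1.0 / (1.0 + np.exp(gamma * (r2 - np.dot(xi - xj, xi - xj))))

def tau_A(xi, xj):
    """tau_A(k) of Eq. (4.5): derivative of f with respect to the squared distance."""
    e = np.exp(gamma * (r2 - np.dot(xi - xj, xi - xj)))
    return 2.0 * gamma * e / (1.0 + e) ** 2

xi0, xj0 = np.array([0.0, 0.0]), np.array([2.0, 0.0])
xi1, xj1 = xi0 + np.array([0.01, 0.0]), xj0 + np.array([0.0, 0.005])  # one small step

# Linearized weight per Eq. (4.5) versus the exact weight at the new positions.
lin = f(xi0, xj0) + tau_A(xi0, xj0) * (
    np.dot(xi1 - xj1, xi0 - xj0) - np.dot(xi0 - xj0, xi0 - xj0))
print(abs(lin - f(xi1, xj1)))   # tiny: the linearized weight tracks the exact one
assert abs(lin - f(xi1, xj1)) < 1e-4
```

The bracketed term in Eq. (4.5) is half the first–order change in the squared distance, which is why τ_A carries the factor of 2.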
Furthermore, ensuring that λ₂(L_C(k + 1)) = n implies that the graph remains complete at step k + 1. As such, inter–agent proximity constraints are satisfied and the corresponding trajectories do not intersect for sufficiently small ∆t (i.e. such that max_{i=1,…,n}{v_i} ∆t ≪ min_{i,j} r^a_{ij}). Perhaps equally important is that this constraint can be effectively modeled as a linear system of n equations. As a result, it lends itself well to SOCP techniques. This result is formalized by the following theorem.

Theorem 4.3.2. Let A, |A| = n, be as defined with state vector X ∈ R^{dn}, and
assume ẋ_{a_i} = u ∈ U, ∀a_i ∈ A. Let Ψ(X) : R^{dn} → R denote a convex objective to be minimized over a sufficiently small ∆t. By solving the following SOCP at time k

    min Ψ(X(k + 1))
    s.t. ‖x_{a_i}(k + 1) − x_{a_i}(k)‖₂ ≤ v_i ∆t, i = 1, …, n        (4.7)
         Σ_{j≠i} f^a_C(x_{a_i}, x_{a_j})(k + 1) = (n − 1), i = 1, …, n

with f^a_C(x_{a_i}, x_{a_j})(k + 1) given by Equation (4.5), Ψ(X) will be minimized over ∆t while ensuring no inter–agent collisions.

Proof. Given the previous discussion, it trivially follows that if the state–dependent graph G_C(X(k + 1)) remains complete, then the positions yielded at time step k + 1 do not result in collision and the corresponding trajectories do not intersect for sufficiently small ∆t. As such, the problem is equivalent to solving the following SDP at time k

    min Ψ(X(k + 1))
    s.t. ‖x_{a_i}(k + 1) − x_{a_i}(k)‖₂ ≤ v_i ∆t, i = 1, …, n        (4.8)
         P_C^T L_C(k + 1) P_C − nI_{n−1} ⪰ 0

where the vectors of P_C ∈ R^{(n−1)×n} span a subspace orthogonal to the vector of all ones of the same dimension (Kim & Mesbahi, 2006), and I_{n−1} denotes the (n − 1) × (n − 1) identity matrix.

Assuming an initially feasible configuration, observe that the linear matrix inequality of Equation (4.8) is satisfied only when λ₂(L_C(k + 1)) = n. In other words,
this constraint holds only when λ₂(L_C(k + 1)) achieves its upper bound of n (see (Godsil & Royle, 2001) for further details regarding this bound). Since λ₂(L_C(k + 1)) is required to be maximal (i.e. G_C(X(k + 1)) is complete), it must hold that, ∀k, L_C(k + 1) is of the form

    [L_C(k + 1)]_{ij} = { −1,       i ≠ j
                        { (n − 1),  i = j        (4.9)

Given its definition, this form holds if and only if ∀a_i, a_j ∈ A, i ≠ j, f^a_C(x_{a_i}, x_{a_j})(k + 1) = 1. As such, the following relationship is established:

    Σ_{j≠i} f^a_C(x_{a_i}, x_{a_j})(k + 1) = (n − 1), i = 1, …, n  ⟺  P_C^T L_C(k + 1) P_C − nI_{n−1} ⪰ 0
In other words, the former provides a necessary and sufficient condition for ensuring the connectivity of the associated state-dependent graph. Replacing the latter in Problem (4.8) yields the SOCP stated in Theorem 4.3.2.

It should also be noted that the linear constraints appearing in Problem (4.7) have an alternate interpretation. Specifically, they can be viewed as corresponding to $n$ binodal graphs, where the $i$th graph captures the proximity of the team to $a_i \in A$. Supporting this observation is the following Corollary to Theorem 4.3.2.

Corollary 4.3.3. Let $A$, $|A| = n$, be as defined with state vector $X \in \mathbb{R}^{dn}$, and assume $\dot{x}_i^a = u \in U$, $\forall a_i \in A$. Let $\Psi(X) : \mathbb{R}^{dn} \to \mathbb{R}$ denote a convex objective to be minimized over a sufficiently small $\Delta t$. By solving the following SOCP at time $k$

$$\begin{aligned} \min\;\; & \Psi(X(k+1)) \\ \text{s.t.}\;\; & \|x_i^a(k+1) - x_i^a(k)\|_2 \le v_i \Delta t, \quad i = 1,\dots,n \\ & P_{C_i}^T L_{C_i}(k+1) P_{C_i} = 2, \quad i = 1,\dots,n \end{aligned} \tag{4.10}$$

where $P_{C_i} = (1, -1)^T$ and $L_{C_i}$ is the Laplacian corresponding to binodal graph $G_{C_i}$ with edge weights given by

$$f_{C_i}^a(X) = \frac{1}{n-1} \sum_{j \ne i} f_C^a(x_i^a, x_j^a) \tag{4.11}$$

$\Psi(X)$ will be minimized over $\Delta t$ while ensuring no inter-agent collisions.
Proof. To obtain this result, associate with each $a_i \in A$ a graph $G_{C_i}(V_{C_i}, E_{C_i})$ where $V_{C_i} = \{a_i, \bar{a}_i\}$ and $\bar{a}_i$ represents the set $A \setminus \{a_i\}$. In this context, $E_{C_i}$ is comprised of a single edge whose weight gauges the proximity of $a_i$ to each of its $(n-1)$ teammates. Define $f_{C_i}^a : \mathbb{R}^{d(n-1)} \to [0, 1]$ as in Equation (4.11) and let it denote the edge weight associated with graph $G_{C_i}$. Notice that when $a_i$ maintains its desired standoff from every other agent, $f_{C_i}^a = 1$. When all agents violate minimal proximity to $a_i$ (i.e. are within $r_i^a$ units of $a_i$), it similarly holds that $f_{C_i}^a = 0$. As such, $G_{C_i}$ fully captures the proximity relationship between $a_i$ and each of its teammates. Applying a first-order discretization and enforcing the completeness of $G_{C_i}$ results in the following linear constraint at time $k$

$$f_{C_i}^a(X)(k+1) = \frac{1}{n-1} \sum_{j \ne i} f_C^a(x_i^a, x_j^a)(k+1) = 1 \tag{4.12}$$

As such, the constraints appearing in Problem (4.7) are exactly those appearing in Equation (4.12). Accordingly, they can be interpreted as enforcing the connectivity of $n$ binodal graphs over $\Delta t$ for step $k+1$. Observe that this is only realized when the edge has the maximal weight and yields $\lambda_2(L_{C_i}(k+1)) = 2$. The result stated in Corollary 4.3.3 follows directly.

Figure 4.5 illustrates the intuition that leads to Corollary 4.3.3. Specifically, it depicts the binodal representation of the inter-agent collision graph corresponding to the team arrangement given in Figure 4.3. The astute reader will note that enforcing the proximity of $a_i$ with respect to $a_j$ in $G_{C_i}$ where $i < j$ is sufficient to also ensure that $a_j$ is safely apart from $a_i$. In other words, Equation (4.11) can be replaced by the following to yield an equivalent binodal graph formulation

$$f_{C_i}(X) = \frac{1}{n-i} \sum_{j > i} f_C^a(x_i^a, x_j^a)(k+1) \tag{4.13}$$

It should be noted that such a formulation serves only to reduce the number of constraints by one, as the total number of variables remains the same in the resulting SOCP.
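The completeness condition of Theorem 4.3.2 and its binodal reduction can be checked numerically. The following is a minimal sketch, not the dissertation's implementation: the sigmoid gain $\gamma$, the uniform standoff $r$, and the agent positions are illustrative assumptions. It builds the weighted collision Laplacian $L_C$ and confirms that $\lambda_2$ attains its upper bound $n$ when every sigmoid weight saturates at 1, and that each aggregated binodal Laplacian then has $\lambda_2 = 2$.

```python
import numpy as np

# Illustrative check of Theorem 4.3.2 / Corollary 4.3.3 (a sketch only;
# gamma, r, and the positions below are assumptions, not trial data).
gamma, r = 200.0, 0.15
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0], [1.5, 1.0]])  # n = 4 in R^2
n = len(X)

def f_a(xi, xj):
    # sigmoid edge weight: ~0 inside the standoff r, ~1 well outside it
    return 1.0 / (1.0 + np.exp(gamma * (r**2 - np.sum((xi - xj)**2))))

W = np.array([[0.0 if i == j else f_a(X[i], X[j]) for j in range(n)]
              for i in range(n)])
L = np.diag(W.sum(axis=1)) - W             # weighted Laplacian L_C
lam2 = np.sort(np.linalg.eigvalsh(L))[1]   # equals n iff the graph is complete

# binodal reduction (Eq. 4.11): one aggregated edge per agent; connectivity
# of G_Ci requires lambda_2(L_Ci) = 2, i.e. the averaged weight equals 1
binodal_lam2 = []
for i in range(n):
    w = W[i].sum() / (n - 1)
    Li = np.array([[w, -w], [-w, w]])
    binodal_lam2.append(np.sort(np.linalg.eigvalsh(Li))[1])
```

Since all four agents sit well beyond the standoff, every weight saturates and `lam2` evaluates to $n = 4$, while each entry of `binodal_lam2` evaluates to 2.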
Figure 4.5: The binodal graphs $G_{C_i}$ that implicitly enforce the completeness of the inter-agent collision graph $G_C$ appearing in Figure 4.3.

Given Corollary 4.3.3, it is natural to combine it with Theorem 4.2.1 to yield a discrete-time framework that exhibits goal-directed behavior while enforcing inter-agent collision avoidance. However, before doing so, it is essential to consider foreign elements occupying the environment and how they can be properly integrated. Accordingly, in the following section, it shall be shown that embedding static obstacles is a straightforward extension of the results given by Theorem 4.3.2.

Embedding Static Obstacle Avoidance

Given the results of §4.3, it is a straightforward exercise to embed static obstacle avoidance in the proposed framework. However, it should be emphasized that the results presented in this section stand in contrast to those presented in §2.7, where a hierarchical discrete-to-continuous optimization approach was employed over a tessellation of the fully known, polygonal workspace. In generating the forthcoming results, it is only assumed that the environment is known to be static and that agents can either directly (via their onboard sensing modalities) or indirectly (perhaps through human feedback) detect or obtain the location of (or distance to) static obstacles in their operating environment.

In an effort to illustrate the realistic nature of said assumptions as well as provide additional context, consider the scenario where soldiers are operating in tandem with a multi-robot system in an urban war zone. In a surveillance application, the soldiers could "paint" obstacles and feed those locations to the robot team. It may also be the case that the team knows the workspace will be static (e.g. perhaps when demining a cordoned area) and the robots can utilize their onboard sensing and communication capabilities to identify hazards and safely avoid them. As shall be seen, such applications are quite feasible within the proffered framework.

Key to the forthcoming results is the realization that the set of agents and the full set of stationary obstacles collectively define an undirected, weighted graph with edge weights indicating the proximity of one element to another. Recalling the results of the previous section, it is natural to consider an extension of the inter-agent collision graph that also includes vertices corresponding to static obstacles. In such a representation, edges would again represent whether safe proximity is maintained between the corresponding pair of elements. Enforcing the completeness of this graph ensures both inter-agent and agent-obstacle collisions are avoided. This intuition leads to the formulation of the extended collision graph, which can now be formally presented.

Begin by letting $O = \{o_1, \dots, o_m\}$ denote the set of stationary obstacles occupying the agents' configuration space $W \subseteq \mathbb{R}^d$. Let the vector $X^O = (x_1^o, \dots, x_m^o)^T \in \mathbb{R}^{dm}$ be the state vector corresponding to the concatenated positions of said elements. Additionally, assume each agent $a_i$ is assigned a minimum standoff distance $r_{ij}^o \in \mathbb{R}^+$ with respect to each $o_j$. Given these assumptions, consider the following definition.

Definition 4.3.4. The extended collision graph is the undirected, weighted, non-reflexive graph $G_{\hat{C}}(V_{\hat{C}}, E_{\hat{C}})$ where $V_{\hat{C}} = A \cup O$ and $E_{\hat{C}} = ((A \cup O) \times (A \cup O)) \setminus \{(a_i, a_i) : a_i \in A\}$. Edge weights are given by the state-dependent indicator function $I_{\hat{C}} : E_{\hat{C}} \to \{0, 1\}$ defined such that

$$I_{\hat{C}}(y_i, y_j) = \begin{cases} I_{\hat{C}}^a(y_i, y_j), & (y_i, y_j) \in A \times A \\ I_{\hat{C}}^o(y_i, y_j), & (y_i, y_j) \in A \times O \\ 1, & (y_i, y_j) \in O \times O \end{cases} \tag{4.14}$$

where $I_{\hat{C}}^a$ corresponds to Equation (4.3) and $I_{\hat{C}}^o$ is given as follows

$$I_{\hat{C}}^o(a_i, o_j) = \begin{cases} 0, & \|x_i^a - x_j^o\|_2 \le r_{ij}^o \\ 1, & \|x_i^a - x_j^o\|_2 > r_{ij}^o \end{cases} \tag{4.15}$$

where $x_i^o$ denotes the position of the $i$th obstacle and $r_{ij}^o \in \mathbb{R}^+$ represents the desired standoff of $a_i$ with respect to $o_j \in O$.
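Definition 4.3.4 can be sketched directly in code. The following is an illustrative example only; the positions and the uniform standoffs below are assumptions chosen so that every proximity constraint is satisfied, making the extended collision graph complete.

```python
import numpy as np

# Illustrative sketch of Definition 4.3.4: the indicator-weighted extended
# collision graph over V = A ∪ O (positions and standoffs are assumptions).
A = np.array([[0.0, 0.0], [1.0, 0.0]])         # two agents
O = np.array([[0.5, 0.6], [0.5, -0.6]])        # two static obstacles
r_a, r_o = 0.15, 0.20                          # uniform r_ij^a, r_ij^o
Y = np.vstack([A, O])
n, m = len(A), len(O)

def I_hat(i, j):
    d = np.linalg.norm(Y[i] - Y[j])
    if i < n and j < n:                        # agent-agent edge (Eq. 4.3)
        return 1 if d > r_a else 0
    if i >= n and j >= n:                      # obstacle-obstacle: weight 1
        return 1
    return 1 if d > r_o else 0                 # agent-obstacle edge (Eq. 4.15)

W = np.array([[0 if i == j else I_hat(i, j) for j in range(n + m)]
              for i in range(n + m)])
complete = np.all(W + np.eye(n + m) == 1)      # all proximity constraints met
```

With every pairwise distance exceeding its standoff, all off-diagonal weights are 1 and `complete` evaluates to true, i.e. no collision constraint is active.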
Figure 4.6: The extended collision graph $G_{\hat{C}}(V_{\hat{C}}, E_{\hat{C}})$ for a pair of agents operating among two static obstacles in $\mathbb{R}^2$. Edge weights are given by an indicator function $I_{\hat{C}}$ that gauges the proximity of elements corresponding to adjacent vertices.

At first glance, the definition of the extended collision graph may appear somewhat off-target, as inter-obstacle edges are also included in its structure. However, their inclusion/exclusion is simply a matter of preference. In this formulation, it is assumed that these edges are included with maximal weight so that the completeness constraint of the forthcoming SOCP can be expressed more naturally as a function of both $n$ and $m$ (i.e. $\lambda_2(L_{\hat{C}}) = n + m$). However, it is also quite reasonable to assume that inter-obstacle edges are excluded (i.e. receive null weight), in which case the corresponding constraint would be $\lambda_2(L_{\hat{C}}) = n$. In either case, the same result is obtained, as only the weights associated with the sub-graph capturing agent-agent and agent-obstacle proximity need be defined as a direct functional of the state vectors $X$ and $X^O$. Figure 4.6 illustrates the extended collision graph for an agent pair operating in $\mathbb{R}^2$ among two static obstacles.

Given this definition, recall the discussion of the previous section in which an auxiliary (but ultimately equivalent) formulation was considered since $I_C$ was non-differentiable. Similarly, $I_{\hat{C}}$ is not amenable to the proposed framework. Accordingly, it is replaced by the weight function $f_{\hat{C}} : \mathbb{R}^{2d} \to [0, 1]$ that exploits the differentiable state-dependent sigmoid appearing in Equation (4.4). More precisely, edge weights are governed by

$$f_{\hat{C}}(y_i, y_j) = \begin{cases} f_{\hat{C}}^a(x_i^a, x_j^a), & (y_i, y_j) \in A \times A \\ f_{\hat{C}}^o(x_i^a, x_j^o), & (y_i, y_j) \in A \times O \\ 1, & (y_i, y_j) \in O \times O \end{cases} \tag{4.16}$$

where $f_{\hat{C}}^a$ is as previously defined and $f_{\hat{C}}^o$ corresponds to Equation (4.4) with respect to distances $r_{ij}^o$ and $\|x_i^a - x_j^o\|_2$.
Given this formulation, a first-order linearization and discretization can be realized. Doing so yields the discrete-time update in Equation (4.5) for $f_{\hat{C}}^a$, while the following update is obtained for $f_{\hat{C}}^o$

$$f_{\hat{C}}^o(k+1) = f_{\hat{C}}^o(k) + \tau_O(k) \{x_i^a(k+1) - x_i^a(k)\}^T (x_i^a(k) - x_j^o) \tag{4.17}$$

where $\tau_O(k)$ is given as follows

$$\tau_O(k) = \frac{2\gamma\, e^{\gamma (r_{ij}^{o\,2} - \|x_i^a(k) - x_j^o\|_2^2)}}{\left[1 + e^{\gamma (r_{ij}^{o\,2} - \|x_i^a(k) - x_j^o\|_2^2)}\right]^2}$$
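Since $\tau_O(k)$ is the gradient coefficient of the sigmoid with respect to the agent position, the update (4.17) is a first-order Taylor expansion around step $k$. The sketch below, with illustrative values of $\gamma$, $r^o$, and the positions (assumptions, not trial data), compares the linearized weight against the exact sigmoid for a small displacement.

```python
import numpy as np

# Sketch checking the first-order update of Eq. (4.17) against the exact
# sigmoid weight for a small agent step (gamma, r_o, x are assumptions).
gamma, r_o = 50.0, 0.2
x_a = np.array([0.5, 0.0])        # agent position at step k
x_o = np.array([0.0, 0.0])        # static obstacle position

def f_o(x):
    # sigmoid obstacle weight (Eq. 4.4 analogue for the standoff r_o)
    return 1.0 / (1.0 + np.exp(gamma * (r_o**2 - np.sum((x - x_o)**2))))

def tau_O(x):
    # gradient coefficient from Eq. (4.17)
    e = np.exp(gamma * (r_o**2 - np.sum((x - x_o)**2)))
    return 2.0 * gamma * e / (1.0 + e)**2

dx = np.array([1e-3, 0.0])        # small step, consistent with v_i * dt
f_lin = f_o(x_a) + tau_O(x_a) * (dx @ (x_a - x_o))   # Eq. (4.17)
f_exact = f_o(x_a + dx)
err = abs(f_lin - f_exact)        # second-order remainder, ~O(|dx|^2)
```

For a displacement of $10^{-3}$ units the linearization error is far below the solver tolerances at which the equality constraints of Problems (4.7) and (4.19) are enforced.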
In light of these results, the state-dependent weighted graph Laplacian corresponding to $G_{\hat{C}}(X)$ can be defined as follows for time step $k$

$$[L_{\hat{C}}(k)]_{ij} = \begin{cases} -f_{\hat{C}}(y_i, y_j)(k), & i \ne j \\ \sum_{j \ne i} f_{\hat{C}}(y_i, y_j)(k), & i = j \end{cases} \tag{4.18}$$

where $y_i$ and $y_j$ are defined such that

$$y_l = \begin{cases} x_l^a, & l \le n = |A| \\ x_{l-n}^o, & n < l \end{cases}$$
As a direct result of this analysis, the following contribution is obtained.

Theorem 4.3.5. Let $A$, $|A| = n$, be as defined with state vector $X \in \mathbb{R}^{dn}$, and assume $\dot{x}_i^a = u \in U$, $\forall a_i \in A$. Let $O$, $|O| = m$, be as defined with positions $X^O \in \mathbb{R}^{dm}$ co-occupying agent workspace $W \subset \mathbb{R}^d$. Assume $\Psi(X) : \mathbb{R}^{dn} \to \mathbb{R}$ is a convex objective to be minimized over $\Delta t$. By solving the following SOCP

$$\begin{aligned} \min\;\; & \Psi(X(k+1)) \\ \text{s.t.}\;\; & \|x_i^a(k+1) - x_i^a(k)\|_2 \le v_i \Delta t, \quad i = 1,\dots,n \\ & \frac{1}{n-1} \sum_{j=1, j \ne i}^{n} f_{\hat{C}}^a(x_i^a, x_j^a)(k+1) + \frac{1}{m} \sum_{j=1}^{m} f_{\hat{C}}^o(x_i^a, x_j^o)(k+1) = 2, \quad i = 1,\dots,n \end{aligned} \tag{4.19}$$

with $f_{\hat{C}}^a(x_i^a, x_j^a)(k+1)$ given by Equation (4.5) and $f_{\hat{C}}^o(x_i^a, x_j^o)(k+1)$ given by Equation (4.17), $\Psi(X)$ will be minimized while ensuring inter-agent and agent-obstacle collisions are avoided.

Proof. Proving this result is a trivial extension of the proof for Theorem 4.3.2. As such, the reader is referred to said result appearing in the previous section.

It is also worth noting that ensuring a complete extended collision graph is the same as maintaining the connectivity of an equivalent binodal graph representation. Recalling the discussion from the previous section, it was shown that an alternate interpretation of the resulting linear constraints can be considered where each constraint corresponds to maintaining the connectivity of a binodal graph $G_{C_i}$ for a
specific definition of $f_{C_i}$. A similar result is readily obtained here by simply defining the binodal graph $G_{\hat{C}_i}(V_{\hat{C}_i}, E_{\hat{C}_i})$ with $V_{\hat{C}_i}$ having vertices corresponding to $a_i$ and $\bar{a}_i$, where $\bar{a}_i$ captures the set $\{A \cup O\} \setminus \{a_i\}$. In this context, $E_{\hat{C}_i}$ is comprised of a single edge whose weight gauges the proximity of $a_i$ to each of the remaining $(n-1)$ teammates and $m$ static obstacles. As this result mirrors that of the previous section, the reader is referred to that section for additional details about the binodal formulation. Accordingly, the following result is presented without proof.

Corollary 4.3.6. Let $A$, $|A| = n$, be as defined with state vector $X \in \mathbb{R}^{dn}$, and assume $\dot{x}_i^a = u \in U$, $\forall a_i \in A$. Let $\Psi(X) : \mathbb{R}^{dn} \to \mathbb{R}$ denote a convex objective to be minimized over $\Delta t$. By solving the following SOCP at time $k$

$$\begin{aligned} \min\;\; & \Psi(X(k+1)) \\ \text{s.t.}\;\; & \|x_i^a(k+1) - x_i^a(k)\|_2 \le v_i \Delta t, \quad i = 1,\dots,n \\ & P_{\hat{C}_i}^T L_{\hat{C}_i}(k+1) P_{\hat{C}_i} = 2, \quad i = 1,\dots,n \end{aligned} \tag{4.20}$$

where $P_{\hat{C}_i} = (1, -1)^T$ and $L_{\hat{C}_i}$ is the Laplacian corresponding to binodal graph $G_{\hat{C}_i}$ with $f_{\hat{C}_i}(X)(k+1)$ given by
$$f_{\hat{C}_i}(X)(k+1) = \frac{1}{2} \left[ \frac{1}{n-1} \sum_{j \ne i} f_{\hat{C}}^a(x_i^a, x_j^a)(k+1) + \frac{1}{m} \sum_{j=1}^{m} f_{\hat{C}}^o(x_i^a, x_j^o)(k+1) \right] \tag{4.21}$$

$\Psi(X)$ will be minimized over $\Delta t$ while ensuring no inter-agent or agent-obstacle collisions.

Proof. Proving this result is a trivial extension of the proof for Corollary 4.3.3. As such, the reader is referred to said result appearing in the previous section.

Again, the astute reader will note that the SOCP given by Corollary 4.3.6 can be further reduced by considering $n-1$ binodal graphs with edge weights given by

$$f_{\hat{C}_i}(X) = \frac{1}{2(n-i)} \sum_{j > i} f_{\hat{C}}^a(x_i^a, x_j^a)(k+1) + \frac{1}{2m} \sum_{j=1}^{m} f_{\hat{C}}^o(x_i^a, x_j^o)(k+1) \tag{4.22}$$

Just as in the results of the previous section, this equation facilitates an auxiliary SOCP that fully captures the desired feasible set of agent positions while safely
Figure 4.7: (a) A team of five agents initially obtains the desired delta formation ($\Psi(X) \le 1 \times 10^{-3}$) while en route to a given goal $g = (3, 0.5)^T$. (b) The team deforms in the obstacle field in order to maintain the feasibility of the $n$ binodal collision graphs before converging back to the desired shape geometry upon clearance.
avoiding inter-agent and agent-obstacle collisions. Observe that when no static obstacles are embedded in the graph, Equation (4.22) reduces to Equation (4.13).

Remark 4.3.7. The extended collision graph should be considered a dynamic graphical construct, as vertices can be introduced/removed during each time step as agents and/or obstacles enter/leave the system. Such a view is very natural, and all of the results presented thus far hold under such a definition.

Example: To validate the results of Corollary 4.3.6, a scenario is considered where a team of five mobile agents, initially deployed on an inline path, is charged with obtaining a delta formation (i.e. $\Psi(X) = \|A_S X\|_2$ where $A_S$ captures the desired formation shape as in Theorem 2.4.1) and an objective waypoint $x^g = (3, 0.5)^T$. In this example, the workspace was littered with five static obstacles, and goal-directed behavior was embedded via Theorem 4.2.1 with each agent's position given unitary weight (i.e. $w_i = 1$, $\forall a_i \in A$). The desired standoffs were chosen as $r_i^a = 0.15$, $\forall a_i \in A$ (i.e. $r_{ij}^a = 0.15$, $\forall a_i, a_j \in A$, $i \ne j$) and $r_{ij}^o = 0.2$ units $\forall a_i \in A$, $\forall o_j \in O$.

Figure 4.7 shows the results from one trial with $\alpha_z = 2.55$. Observe that the team initially transforms to the desired delta formation ($\Psi(X) \approx 1 \times 10^{-3}$) as illustrated in Figure 4.7(a). However, upon encountering the obstacles at the desired minimum standoff $r_{ij}^o$, the team formation deforms (see Figure 4.7(b)) to maintain feasibility. Although the team cannot maintain formation given the obstacle positions, it continues en route to the specified waypoint, which in this case is the primary objective as indicated by $\alpha_z = 2.55$. Once beyond the obstacle field, the team again obtains the desired shape formation, just at a larger scale. It should be emphasized that in this implementation, there was no minimum bound on the formation scale. As such, the lower bound on feasible scale values was dictated by the inter-agent proximity constraints (i.e. $r_{ij}^a$). Highlighting this fact, consider Figure 4.8, which shows the minimum distance over each of the $\binom{5}{2} = 10$ agent combinations during each time step. Maintaining maximal connectivity for each of the $n$ binodal collision graphs ensures that the uniform inter-agent proximity bound of 0.15 units is enforced.
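The monitoring performed for Figure 4.8 amounts to tracking the minimum pairwise distance at each iteration. A minimal sketch follows; the pentagon configuration below is illustrative only, not the trial data.

```python
import numpy as np
from itertools import combinations

# Sketch of the Figure 4.8 check: the minimum distance over all C(5,2) = 10
# agent pairs at one time step, compared against the standoff r_i^a = 0.15.
theta = 2 * np.pi * np.arange(5) / 5
X = 0.5 * np.column_stack([np.cos(theta), np.sin(theta)])   # 5 agents in R^2
pairs = list(combinations(range(5), 2))
d_min = min(np.linalg.norm(X[i] - X[j]) for i, j in pairs)
feasible = d_min > 0.15        # proximity constraints satisfied at this step
```

For the regular pentagon of radius 0.5 the smallest pairwise distance is the side length, roughly 0.59 units, so `feasible` evaluates to true.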
Figure 4.8: The minimum agent-to-agent distance with respect to the full set of $\binom{5}{2} = 10$ agent combinations during each iteration of the process. In this case, $\forall a_i \in A$, $r_i^a = 0.15$ units (dashed line). Maintaining the maximal connectivity of the $n$ binodal collision graphs ensures the inter-agent proximity constraints are satisfied.

Dynamic Obstacle Avoidance

In the previous section, it was shown that static obstacle avoidance can be readily embedded into the discrete-time framework; however, real-world applications will often require teams of robots to effectively coordinate in dynamic environments occupied by other mobile actors, perhaps with differing objectives and potentially exhibiting competitive behaviors. Accordingly, in this section, the initial steps are taken to embed dynamic collision avoidance into the proffered framework. Although a number of approaches may be considered (e.g. projecting obstacle positions based on velocity estimates), the forthcoming results consider a state-dependent formulation of the extended collision graph (see Definition 4.3.4) that permits the evolution
of obstacle positions as a function of time. For appropriately small $\Delta t$, the formulation provides a straightforward handling of the dynamic collision avoidance problem.

Before proceeding, however, a few observations must be made. Among them is the realization that a collision between a robot and a dynamic obstacle is the direct result of at least one of these actors choosing a trajectory that ultimately results in the violation of the agent's proximity constraints. Naturally, this violation of feasibility can be the result of the agent's miscalculation (i.e. poor planning), the obstacle's miscalculation, or miscalculation on the part of both actors. As the state of each dynamic obstacle is assumed independent of team motion (i.e. each obstacle's state is not influenced by the team's actions), scenarios can arise where collisions are simply unavoidable.

Illustrating this point, consider the pathological case presented in Figure 4.9. At step $k$, $x_i^a$ (i.e. $a_i$) lies at the intersection of the three circles of radius $r_{i1}^o = r_{i2}^o = r_{i3}^o$, each denoting the desired proximity of $a_i \in A$ with respect to each $o_j \in O$ having non-zero velocity vector $v_j^o$. To ensure feasibility at $k+1$, the agent will be forced to maintain its current position (i.e. $x_i^a(k+1) = x_i^a(k)$); however, if one of the obstacles continues along its $v_j^o$, feasibility will be violated. As such, it is clear that the team can at best react to the behaviors of dynamic obstacles without imposing constraints on or modeling their motions. Accordingly, the focus of this research is simply to formulate a framework that ensures the team maintains feasibility provided such a solution actually exists.

Given these observations, a construct is now presented that embeds such reactive behavior into the multi-agent system. Begin by assuming that the team can capture the position of dynamic obstacles at each time step $k$. Doing so yields a set of time-dependent obstacle positions $O(k) = \{x_1^o(k), \dots, x_m^o(k)\}$. Given this information, the team can employ a number of strategies to thwart hazardous encounters, to include projecting obstacle positions as a function of some assumed kinematic model while enforcing some minimal distance $r_i^o$ that ensures agent and obstacle paths do not intersect. An alternate, and more straightforward, approach is to assume that $\Delta t$ is again sufficiently small such that $x_i^o(k+1) \approx x_i^o(k)$ and enforce the connectivity of the extended collision graph (see Definition 4.3.4) over $\Delta t$. Doing so results in a time-dependent formulation of $G_{\hat{C}}$ where obstacle positions are updated during each

Figure 4.9: A contrived example in $\mathbb{R}^2$ where some obstacles may violate the feasibility of the agent-obstacle proximity constraints. Despite obtaining a feasible position (i.e. at the intersection of the three discs of radius $r_{i1}^o = r_{i2}^o = r_{i3}^o$) for time $k+1$, feasibility will be violated by the obstacles assuming $\max_{j \in \{1,2,3\}} \{v_j^o\} > 0$.
iteration of its discrete-time formulation. In other words, Equation (4.17) becomes

$$f_{\hat{C}}^o(k+1) = f_{\hat{C}}^o(k) + \tau_O(k) \{x_i^a(k+1) - x_i^a(k)\}^T (x_i^a(k) - x_j^o(k)) \tag{4.23}$$

where $\tau_O(k)$ is adjusted to yield

$$\tau_O(k) = \frac{2\gamma\, e^{\gamma (r_{ij}^{o\,2} - \|x_i^a(k) - x_j^o(k)\|_2^2)}}{\left[1 + e^{\gamma (r_{ij}^{o\,2} - \|x_i^a(k) - x_j^o(k)\|_2^2)}\right]^2}$$

Although such an approach is straightforward, it demonstrates the ease with which sophisticated "reactive" behaviors can be embedded into the discrete-time paradigm while still providing certain performance guarantees. It should also be noted that, in order to effectively generate a corresponding $\Delta t$ (or, equivalently, maximum agent velocities), some knowledge is required regarding the maximum velocity of dynamic obstacles sharing the workspace. This is to facilitate $\Delta t$ being defined
such that $\max\{v_{\max}^a, v_{\max}^o\} \Delta t \ll \min\{r_{\min}^a, r_{\min}^o\}$, where $v_{\max}^a$ and $r_{\min}^a$ respectively denote the maximum velocity and minimum inter-agent proximity bound of any team member, with $v_{\max}^o$ and $r_{\min}^o$ similarly defined with respect to environmental obstacles. A formal presentation of the associated construct is unilluminating given its direct correspondence with Definition 4.3.4. Instead, an example is now presented highlighting the utility of this approach.

Example: For this example, consider a team of nine robots charged with obtaining some objective waypoint $x^g = (3, 0.5)^T$ while minimizing the team's deviation from a desired $\{4, 4\}$ tessellation at time $k+1$. Accordingly, the example enforces inter-agent and agent-obstacle proximity constraints ($r_{ij}^a = r_{ij}^o = 0.238$, $\forall a_i \in A, o_j \in O$) in a space shared with three obstacles. In this scenario, two obstacles ($o_1$ starting in the upper right of Figure 4.10(a) and $o_2$ beginning in the lower left of the same figure) traveled at constant, non-zero velocity, while agent velocities were uniformly chosen at 1.25 and 1.66 times those of $o_1$ and $o_2$, respectively. One obstacle was chosen to remain static. Each agent's position was also assigned unitary weight (i.e. $w_i = 1$, $\forall a_i \in A$). At time $k$, connectivity of the extended collision graph was enforced with respect to obstacle positions while the team attempted to achieve the desired waypoint.

Figures 4.10(a-d) illustrate the results of a representative trial with $\alpha_z = 1.8$. To highlight the role of $\alpha_z$ in governing team behavior, a second trial is presented in Figures 4.11(a-c) for $\alpha_z = 2.2$. In order to provide a direct comparison, Figure 4.11(a) directly corresponds with the same time instant ($k = 3200$) used to generate Figure 4.10(b). Similarly, Figures 4.11(b) and 4.10(c) correspond to another time instant, as do Figures 4.11(c) and 4.10(d). The former pair corresponds to the formation at $k = 5750$, while the last shows the results at the final iteration after convergence to the desired shape formation. At time $k = 1150$, which corresponds to Figure 4.10(a), the exact same trajectory and formation were obtained for both $\alpha_z = 1.8$ and $\alpha_z = 2.2$; as such, a separate figure has not been included. Additionally, it should be noted that in both trials, the multi-agent system did not violate inter-agent or agent-obstacle proximity bounds.
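The step-size condition stated above can be turned into a simple sizing rule. The sketch below is illustrative only; the speeds, standoffs, and safety margin are assumptions, not values from the trials.

```python
# Sketch of choosing a conservative step size from the stated inequality
# max{v^a_max, v^o_max} * dt << min{r^a_min, r^o_min}.
v_max_a, v_max_o = 0.50, 0.30    # fastest agent / obstacle speed (assumed)
r_min_a, r_min_o = 0.15, 0.20    # tightest inter-agent / obstacle standoff
margin = 0.05                    # enforces the "much less than" condition
dt = margin * min(r_min_a, r_min_o) / max(v_max_a, v_max_o)
# with these numbers, dt = 0.015: the fastest actor can close at most 5%
# of the tightest standoff within a single planning step
```

This makes explicit why knowledge of the maximum obstacle velocity is required: without a bound on $v_{\max}^o$, no finite $\Delta t$ can be certified safe.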
Figure 4.10: (a) ($k = 1150$) The team obtains the desired $\{4,4\}$ tessellation while en route to a given waypoint $x^g = (3, 0.5)^T$ ($\alpha_z = 1.8$). (b) ($k = 3200$) The team deforms to maintain feasibility of the $n$ binodal collision graphs. (c) ($k = 5750$) The team continues towards establishing the desired formation shape. (d) At the final time step, the team has obtained $x^g$ and the desired tessellation shape.

Figure 4.11: Operating with $\alpha_z = 2.2$: (a) ($k = 3200$) The team deforms to maintain feasibility of the $n$ binodal collision graphs while en route to $x^g = (3, 0.5)^T$. (b) ($k = 5750$) The team continues towards establishing the desired formation shape. (c) At the final time step, the team has obtained $x^g$ and the desired tessellation. Compare (a) to Figure 4.10(b), (b) to Figure 4.10(c), and (c) to Figure 4.10(d).
4.4 A Unified Approach

Thus far, it has been shown that both goal-directed behaviors and collision avoidance lend themselves well to the discrete-time optimization framework. Accordingly, a generalized optimal motion planning paradigm can now be formulated that exploits these fundamental capabilities while providing performance guarantees that encapsulate key results of Chapters 2 and 3. In fact, the proposed framework lends itself well to a number of rich applications, ranging from optimal surveillance with formation control to optimal sensor network deployment in unknown environments subject to network connectivity constraints.

The key to yielding the forthcoming result is the realization that generating a desired state for a multi-agent system reduces to following a descent direction corresponding to some convex objective over the intersection of an affine space with
the cone of positive semidefinite matrices. In this context, such a framework can be viewed as an extension to traditional potential field approaches (see §4.2) where the notion of an artificial potential is supplanted with a higher-dimensional functional that, when feasibly minimized over $\Delta t$, provides a number of substantial performance guarantees (e.g. network interconnectivity, full target coverage, inter-agent collision avoidance, etc.). Accordingly, it is with this observation in mind that the fundamental theorem uniting many of our key results is now presented.

Theorem 4.4.1. Let $A$, $|A| = n$, again define a multi-agent system with state vector $X \in \mathbb{R}^{dn}$, and let $O$, $|O| = m \ge 0$, denote the set of obstacles with corresponding state vector $X^O \in \mathbb{R}^{dm}$. Assume $\dot{x}_i^a = u_i^a \in U^a$, $\forall a_i \in A$, and $\dot{x}_j^o = u_j^o \in U^o$, $\forall o_j \in O$. Let $\Psi(X) : \mathbb{R}^{dn} \to \mathbb{R}$ be a convex objective to be minimized over sufficiently small $\Delta t$. If the team has objective waypoint $x^g \in X^G$ and some desired shape formation given by $S \in \mathbb{R}^{dn}$, then by solving the following SDP at time $k$

$$\begin{aligned} \min\;\; & \Psi(X(k+1)) + \alpha_y y(k+1) + \alpha_z z(k+1) \\ \text{s.t.}\;\; & \|x_i^a(k+1) - x_i^a(k)\|_2 \le v_i \Delta t, \quad i = 1,\dots,n \\ & \left\|\tfrac{1}{n} \mathbf{1}_n^T X(k+1) - x^g\right\|_2 \le z(k+1) \\ & \|A_S X(k+1)\|_2 \le y(k+1) \\ & P_{\hat{C}_i}^T L_{\hat{C}_i}(k+1) P_{\hat{C}_i} = 2, \quad i = 1,\dots,n \\ & P_N^T L_N(k+1) P_N \succ 0 \end{aligned} \tag{4.24}$$

where $A_S$ captures the desired shape geometry $S$ as in §2.4, $\{P_{\hat{C}_i}, L_{\hat{C}_i}\}$ and $\{P_N, L_N\}$ are respectively defined as in Corollary 4.3.6 and Theorem 3.6.1, and $\alpha_y, \alpha_z \in \mathbb{R}$ with $\alpha_y \ge 0$, $\alpha_z \ge 0$, then $\Psi(X(k+1)) + \alpha_y y(k+1) + \alpha_z z(k+1)$ will be minimized over $\Delta t$ while providing a desired level of formation control and goal-directed behavior and while ensuring network connectivity and no collisions, provided a feasible solution exists.

Proof. This result directly follows as the given objective is a nonnegative weighted sum of convex objectives (and hence convex) and from the results presented in Theorems 2.4.1, 3.6.1, 4.2.1, and Corollary 4.3.6. Accordingly, the reader is referred to their corresponding proofs.
In the formulation of Problem (4.24), the objective is given by a nonnegative combination of convex functions where $y(k+1)$ bounds shape deformation while $z(k+1)$ retains its previous role as discussed in §4.2 and §4.3. Each bound is again associated with a gain or tuning parameter (i.e. $\alpha_y$ and $\alpha_z$) which can be loosely employed to assign "priority" to its corresponding component. Choosing $\alpha_y \gg \alpha_z$ will intuitively force the solver to choose a descent direction that reduces $y$ more rapidly, in some sense placing higher priority on formation shape. Conversely, choosing $\alpha_y = 0$ has the effect of eliminating shape deformation from the objective and permits the team to arbitrarily deform to minimize $\Psi(X(k+1))$ and $\alpha_z z(k+1)$. Similarly, $\alpha_z$ can be loosely interpreted as specifying the desired level of attraction between the goal and the formation centroid. Illustrating this point, choosing $\alpha_z = 0$ is well-suited for applications requiring team formation control while minimizing some additional convex objective (see Chapter 2). Highlighting the utility of the framework, the reader is again referred to Figures 4.10 and 4.11. These trials represent two separate instances of Problem (4.24) with $\Psi(X) = 0$, $\alpha_y = 1$, and $\alpha_z \in \{1.8, 2.2\}$.

Theorem 4.4.1 provides a high-level SDP formulation of the multi-agent motion planning problem that combines key results from Chapters 2 and 3. To see the direct relationship between Theorem 4.4.1 and the results of Chapter 2, consider the total distance shape problem formulation (see Theorem 2.4.1). Choosing $\Psi(X) = \sum_{i=1}^{m} \|q_i - p_i\|_2$ with $\alpha_y = \alpha_z = 0$, $\Delta t = \infty$, and eliminating the network connectivity and proximity constraints reduces Theorem 4.4.1 to the appropriate SOCP given in Theorem 2.4.1. Theorem 4.4.1 also supports optimal target tracking simply by augmenting Problem (4.24) with constraints to maintain connectivity of a corresponding visibility graph, as discussed in Chapter 3. Furthermore, observe that the only LMI in Problem (4.24) enforces network connectivity. As such, eliminating the network interconnectivity constraint yields a discrete-time SOCP that provides each of the remaining performance guarantees. As a direct consequence, the proffered framework appears amenable to applications involving large-scale motion planning, as illustrated by key results presented in Chapters 2 and 3.
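The remark that the lone LMI encodes network connectivity can be illustrated numerically: $P_N^T L_N P_N \succ 0$ holds exactly when $\lambda_2(L_N) > 0$, i.e. when the graph is connected. Below is a minimal sketch (the graphs and the orthonormal-basis construction are illustrative, not the dissertation's code).

```python
import numpy as np

# Sketch: P^T L P is positive definite iff lambda_2(L) > 0 iff the graph is
# connected, where the columns of P form an orthonormal basis of the
# subspace orthogonal to the all-ones vector.
def projected_laplacian_pd(adj):
    n = len(adj)
    L = np.diag(adj.sum(axis=1)) - adj
    # orthonormal basis of 1^perp via QR of the centering matrix I - J/n
    Q, _ = np.linalg.qr(np.eye(n) - np.ones((n, n)) / n)
    P = Q[:, :n - 1]
    return np.all(np.linalg.eigvalsh(P.T @ L @ P) > 1e-9)

adj_path = np.diag(np.ones(3), 1)
adj_path = adj_path + adj_path.T               # connected path on 4 nodes
adj_split = np.zeros((4, 4))                   # two disjoint edges: 0-1, 2-3
adj_split[0, 1] = adj_split[1, 0] = 1
adj_split[2, 3] = adj_split[3, 2] = 1

connected = projected_laplacian_pd(adj_path)
disconnected = projected_laplacian_pd(adj_split)
```

The connected path graph passes the definiteness test while the two-component graph fails it, which is precisely what the LMI in Problem (4.24) enforces at each step.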
4.5 Discussion

In this chapter, the results of Chapters 2 and 3 were unified into a general motion planning framework that also affords goal-directed behaviors and dynamic collision avoidance. Key to the former is abstracting the team as a point mass and attracting this point to an objective waypoint by minimizing an upper bound on their separation. Regarding the latter, a bottom-up approach is taken where the inter-agent collision graph is generalized as a time-dependent construct that embeds team-based reactive behaviors to obstacles. It was shown that this graph can be interpreted as a state-dependent binodal graph whose weight must remain maximal in order to ensure safe proximity between an agent and other actors sharing the environment. As the capstone of this research, these results were united with those of the preceding chapters to yield a rich, convex optimization framework for motion planning with substantial performance guarantees.

It is also necessary to reiterate the place of this paradigm with respect to other research. Given its formulation, our framework is similar in some sense to artificial potential fields. In fact, as noted earlier, our solution can be readily interpreted as supplanting the potential function with a weighted combination of convex objectives, with team-based trajectories corresponding to descent directions in a higher-dimensional space given by the intersection of an affine space with the cone of positive semidefinite matrices. However, unlike purely reactive potential approaches, the proffered framework effectively embeds the notion of goal-driven behavior while providing a number of additional performance guarantees that include network connectivity, k-coverage of observation targets, etc. Accordingly, in this respect it can be viewed as superior to potential field approaches. In contrast to tunable navigation functions, however, the proffered framework is not as effective in embedding goal-directed behaviors.
After all, navigation functions ensure global convergence to a desired state even in non–convex environments. Our formulation cannot guarantee such convergence; however, it still provides a number of other performance guarantees that make it a strong candidate for many applications. An interesting extension of this research would be to consider defining a class of convex objectives that embed such global convergence properties in non–convex environments while still affording all of the aforementioned performance guarantees. Obviously, such a formulation would be tremendously beneficial, as it would essentially combine the best that each approach has to offer.

It is also essential to acknowledge the significance of the low–level planning process, which inherently relies upon environmental perception and sensing capabilities. More precisely, the key assumption made in generating the proffered framework was that the team had access to perfect environmental and state information. Clearly, a motion planning framework can only be as effective as the underlying algorithms utilized to generate these estimates. Although the nuances of such techniques are beyond the scope of this research, this dissertation includes Appendices B and C for the interested reader. They detail the results of concurrent research focused on developing robust perception techniques for intelligent vehicle systems operating in urban environments.
Chapter 5

Future Work

In the previous chapter, it was shown that the discrete–time optimization framework lends itself well to arbitrary multi–agent motion planning tasks. In this chapter, our attention is redirected towards exploring a pair of key extensions to this research. The first aims to decentralize the results of Chapter 4 by adapting recent results enabling the decentralized control of network connectivity to the more general motion planning problem. The second aims to adapt seminal results from algebraic topology to mobile sensor networks by developing online control strategies for managing/eliminating coverage holes. Key to this extension will be considering a state–dependent formulation of the combinatoric Laplacian, a generalization of the graph Laplacian considered in Chapters 3 and 4 to combinatoric spaces. Defined with respect to homological constructs known as simplicial complexes, it can be used to capture the topological invariants of the team's sensor cover.
5.1 Decentralization
In the field of robotics (and, perhaps more generally, computer science), the ability to fully decentralize a solution is highly desirable, as it often (but not always) affords some level of fault–tolerance/robustness, global convergence via localized computations, and scalability to large–scale systems. Accordingly, it is the goal of future research to demonstrate that the proffered framework lends itself well to decentralization – at least at some level or within some well–defined context (e.g. as in the hierarchically decentralized formulation of the shape problem presented in §2.9).

As it turns out, given recent advances in the decentralized control of network connectivity via spectral graph theory, decentralized solutions may be within reach. Enabling such formulations will depend upon the ability of the team to either directly or indirectly estimate the second smallest eigenvalue (i.e. λ2) of each of the state–dependent graphical constructs supporting Theorem 4.4.1. Such a decentralized computation is possible given the results of (Kempe & McSherry, 2004) and (Gennaro & Jadbabaie, 2006). The former establishes a fully decentralized algorithm for eigenvector computation, while the latter demonstrates that a supergradient direction for λ2 of the weighted graph Laplacian is a functional of its corresponding eigenvector (i.e. its Fiedler vector). Exploiting this result and applying those of (Kempe & McSherry, 2004), the latter authors develop a potential–based control scheme that drives agents to the inter–agent distances given by the supergradient algorithm in order to maximize connectivity of the communication graph. As their solution is fully decentralized, they are able to guarantee convergence by having each team member consider only information obtained from its single–hop neighbors.

Given these results, it is quite natural to consider their application to the framework proffered by this dissertation. It is believed that successfully doing so will yield a planning or low–level control framework that retains many (if not all) of the performance guarantees given by Theorem 4.4.1. Such a formulation is the topic of ongoing research.
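To make the quantities above concrete, the following sketch computes λ2 and the Fiedler vector of a weighted graph Laplacian. It is a centralized NumPy illustration under our own naming (`weighted_laplacian`, `fiedler`); in a deployed system the dense eigendecomposition would be replaced by the decentralized scheme of (Kempe & McSherry, 2004).

```python
import numpy as np

def weighted_laplacian(W):
    """Weighted graph Laplacian L = D - W for a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W

def fiedler(W):
    """Return (lambda_2, Fiedler vector): the second smallest eigenvalue of L
    and its eigenvector. lambda_2 > 0 iff the graph is connected."""
    vals, vecs = np.linalg.eigh(weighted_laplacian(W))  # ascending eigenvalues
    return vals[1], vecs[:, 1]

# Path graph on 4 nodes (connected) vs. the same graph with an edge severed.
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0
lam2, v = fiedler(W)
print(lam2 > 1e-9)        # True: connected, lambda_2 > 0

W[1, 2] = W[2, 1] = 0.0   # sever the middle edge
lam2_cut, _ = fiedler(W)
print(lam2_cut > 1e-9)    # False: two components, lambda_2 = 0
```

Maximizing λ2 by moving agents (and hence reweighting W) is precisely the connectivity–control objective discussed above.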
5.2 Exploiting Algebraic Topology
In recent years, algebraic topology has garnered much attention from the robotics community. At the highest level, it can be viewed as providing precise mathematical tools and mechanisms for identifying the topological invariants (typically the number of "holes") characterizing some deformable space. Sparking interest in this area was the seminal work of (Ghrist & Muhammad, 2005; De Silva & Ghrist, 2006), who successfully employ simplicial homology (a subfield of algebraic topology) for coverage verification and hole–detection in static sensor networks. Key to these results is bounding the topology given by the union of agent coverage discs with instances of a higher dimensional combinatoric structure known as the Rips–Vietoris complex. The utility of such an approach is highlighted by the fact that the solution does not require the localization capabilities that are necessary for solutions rooted in computational geometry, e.g. (Huang & Tseng, 2003; Li et al., 2003b).

Figure 5.1: Sensor cover topology can be fully captured by exploiting a subfield of algebraic topology known as simplicial homology, where the visibility network is represented as a chain of oriented k–simplices with each 0–simplex (i.e. vi) representing an individual agent. Figure from (Ghrist & Muhammad, 2005).

Central to simplicial homology (in some sense, a generalization of algebraic graph theory) is the notion of an oriented k–simplex (see Figure 5.1), which is a combinatoric structure that captures some level of proximity between system elements (Hatcher, 2002). In the context of this research, each agent naturally corresponds to a 0–simplex, with oriented 1–simplices capturing the proximity between agent pairs, perhaps as a function of communication and/or sensing ranges. Intuitively, a pair of 0–simplices can be combined to yield the faces of an oriented 1–simplex, which can in turn be combined with two other 1–simplices to yield the faces of an oriented 2–simplex (i.e. a triangle) and so on. Each simplex has the property that it is closed with respect to its lower–dimensional faces and, when chained together with other simplices, yields a potentially high–dimensional combinatoric structure known as a simplicial complex. Such complexes have a k–th homology group whose dimension (i.e. the number of group generators) dictates the number of k–dimensional holes (i.e. non–reducible k–cycles) in the corresponding topology. Conveniently, such combinatoric structures (and their topological invariants) can be captured in matrix form by formulating what is known as the combinatoric Laplacian – a generalization of the graph Laplacian discussed in Chapters 3 and 4. Defined with respect to the k–simplices comprising such a complex, it is given by

$$L_k = B_k^T B_k + B_{k+1} B_{k+1}^T \tag{5.1}$$
where $B_k$ denotes the boundary operator mapping oriented k–simplices to their (k − 1)–dimensional faces. The utility of this matrix construct becomes apparent when one considers the classic result establishing the isomorphism

$$\ker(L_k(Y)) \cong H_k(Y) \tag{5.2}$$
where Y denotes a simplicial complex representing some topology and $H_k(Y)$ denotes its corresponding k–th homology group. Given Equation (5.2), it follows that the nullity of $L_k(Y)$ corresponds to the number of generators of $H_k(Y)$ (a.k.a. its k–th Betti number), which gives the number of non–reducible k–cycles in the complex Y.

Given the results of this dissertation, it is quite natural to consider a state–dependent formulation of the combinatoric Laplacian to facilitate intelligent motion planning strategies that aim to bring about a desired topological structure. Such a framework is highly desirable, as it would enable multi–agent systems to plan in a space encapsulating topological invariants which is, in some sense, agnostic to pose estimates and less susceptible to related sources of error. Developing such a framework is the focus of ongoing research inspired by this dissertation.
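As a concrete illustration of Equations (5.1)–(5.2), the sketch below computes β1 (the number of 1–dimensional holes) as the nullity of $L_1$ for a hollow triangle, then shows that attaching the 2–simplex fills the hole. The boundary matrices are written out by hand for this toy complex, and the function name `betti` is ours.

```python
import numpy as np

def betti(Bk, Bk1):
    """Nullity of the combinatoric Laplacian L_k = Bk^T Bk + Bk1 Bk1^T,
    i.e. the k-th Betti number of the complex (Eqs. (5.1)-(5.2))."""
    L = Bk.T @ Bk + Bk1 @ Bk1.T
    return int(np.sum(np.abs(np.linalg.eigvalsh(L)) < 1e-9))

# Hollow triangle: vertices {v0, v1, v2}, oriented edges e01, e02, e12.
# Columns of B1 are edge boundaries, e.g. boundary(e01) = v1 - v0.
B1 = np.array([[-1., -1.,  0.],
               [ 1.,  0., -1.],
               [ 0.,  1.,  1.]])
B2_empty = np.zeros((3, 0))          # no 2-simplices: the triangle is hollow
print(betti(B1, B2_empty))           # beta_1 = 1: one non-reducible 1-cycle

# Attaching the oriented 2-simplex [v0, v1, v2] fills the hole.
B2 = np.array([[1.], [-1.], [1.]])   # boundary = e01 - e02 + e12
print(betti(B1, B2))                 # beta_1 = 0: no holes remain
```

In a state–dependent formulation, the entries of these boundary matrices would become functions of agent positions, and the nullity of $L_1$ would flag coverage holes online.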
Appendix A

Hybrid FSO/RF Mobile Ad–hoc Networks for Robot Teams

In this appendix, the Hybrid Free–space Optics/Radio Frequency (FSO/RF) Mobile Ad–hoc Network paradigm is introduced for mobile robot teams. The reason for including this appendix is two–fold: it provides the interested reader with a detailed context in which to explore the difficulties associated with deploying multi–agent systems, and it highlights the critical role of perception in ensuring successful plan execution. Accordingly, our work with these novel systems is now presented.
A.1 Background
Consider the scenario where a natural or man–made event (e.g. an earthquake) destroys critical infrastructure in an urban area, including fiber links in the Metropolitan Area Network (MAN). Patching these high-throughput channels using traditional radio frequency (RF) wireless nodes would be impractical due to bandwidth and link distance requirements. Instead, a team of robots could be automatically deployed to provide carrier-grade patches to the MAN using free-space optic (FSO) transceivers – all under conditions unsafe for human operators. This is only one application we envision for the hybrid FSO/RF paradigm introduced in (Derenick et al., 2005).

FSO is a commercial technology used primarily in static configurations for high bandwidth, wireless communication links (Willebrand & Ghuman, 2002). With FSO, an optical transceiver is placed on each side of a transmission path to establish a network link. The transmitter is typically an infrared (IR) laser or LED that emits a modulated IR signal. Link availability can be maintained under most weather conditions (heavy fog is a notable exception, but unlike RF-based technologies, heavy rain has little effect). Perhaps the greatest advantage of FSO technology is its high throughput. The need for such a channel is evident when considering bandwidth-hungry applications such as surveillance and monitoring. Commercial FSO transceivers currently provide throughputs of several Gbps with link distances of a kilometer or more (Systems Support Solutions, 2005). Terabit per second throughputs have been demonstrated under laboratory conditions. In contrast, widespread RF technologies (e.g., 802.11x) are limited to link throughputs on the order of 10s of Mbps across distances of 10s of meters. Even much-anticipated ultra wideband (UWB) technology – with throughputs of 100s of Mbps – drops to levels lower than 802.11a at modest ranges (r ≥ 15 m) (Wilson, 2002).
The major shortcoming of FSO technology is the requirement for optical links to
maintain line-of-sight (LOS). As such, we do not suggest that FSO technology will replace RF wireless communications in mobile robotics applications. Instead, the two technologies can serve complementary roles in a hybrid FSO/RF mobile ad-hoc network (MANET).
A.2 Related Work
Sensor networks and mobile ad-hoc networks (MANETs) have become burgeoning research areas in recent years in both the networking and robotics communities. Significant work has been done in the areas of network coverage (Cortes et al., 2002; A. Howard, 2002; Poduri & Sukhatme, 2004), localization (Langendoen & Reijers, 2003), network deployment (Corke et al., 2003), and navigation (Li et al., 2003a). However, the primary motivation for our research is network bandwidth constraints. Traditional networks are bound by the provable limits in per-node throughput for radio frequency (RF) based communications. In (Gupta & Kumar, 2000), the authors proved that when n nodes are optimally distributed on the plane, the asymptotic bound for per-node throughput is

$$\lambda(n) = \Theta\!\left(W n^{-1/2}\right)$$

where each node is capable of transmitting W bits-per-second. However, this result is based upon a non-interference protocol. To protect against the potential for interfering with neighbors' broadcasts, individual node transmissions were explicitly scheduled. As a consequence, the throughput at each node is reduced by a factor of $\frac{1}{c+1}$, where c corresponds to the maximum number of interfering neighbors.

For FSO, no such reduction applies. Links are directed and full-duplex, and as a result no such interference region exists. This characteristic, in conjunction with the high throughput afforded by FSO, will provide an enabling technology that can expand the capabilities of wireless sensor networks.
A.3 A Hierarchical Link Acquisition System
Whether employed primarily as FSO links, as in a disaster relief application, or as a mobile ad-hoc backbone in an RF sensor network, a fundamental challenge for the hybrid nodes is establishing and maintaining optical links. FSO technology achieves exceptional ranges by employing laser transceivers with a beam divergence on the order of 0.6 milliradians. To place this in perspective, at 100 meters this corresponds to a cone diameter of approximately 6 cm. In commercial systems, transceivers typically require an initial alignment by technicians using a telescope assembly. Instead, this research proposes a hierarchical, vision–based link acquisition system (LAS) relying upon a "coarse–to–fine" approach whereby a robot pair can autonomously establish directional FSO links.
It is assumed that each robot has a priori knowledge of its link partner's objective position. Additionally, each is able to infer its relative pose (position and orientation) to a common reference frame. With our current platforms, this can be accomplished through GPS and inclinometer sensors. Under these assumptions, the link acquisition phase is decomposed into three stages (see Figure A.1):

1. Coarse alignment through local sensor measurements.
2. Refinement using vision-based robot detection.
3. Final FSO alignment for link acquisition.

In the third stage, we rely upon the internal FSO tracking system. Thus, our discussions are limited to the former two.
A.3.1 Coarse Alignment Phase
Let $x_1, x_2 \in \mathbb{R}^3$ denote the objective positions for the two robots in our navigation frame W. We assume for now that the robots are able to accurately measure their position and orientation in W. After migrating to their objective positions, each robot will adjust its relative orientation to roughly align the optical transceivers. If the coordinate frames $\mathcal{F}_i$ (for transceiver i) and W are equivalent up to a translation, then the desired link directions would simply be

$$x_{ij} = \frac{x_j - x_i}{\| x_j - x_i \|}, \qquad i, j \in \{1, 2\},\ i \neq j \tag{A.1}$$
In practice, local ground contours will only allow for a coarse transceiver alignment. Based upon our assumptions, the orientation of each robot in W, as defined by the rotation matrices $R_1, R_2 \in SO(3)$, can be inferred from local sensor measurements. With these, we can compute the objective link direction in $\mathcal{F}_i$ as $\hat{x}_{ij} = R_i^T x_{ij}$. Converting to spherical coordinates, we obtain

$$\begin{bmatrix} \theta_{ij} \\ \psi_{ij} \end{bmatrix} = \begin{bmatrix} \arctan\!\left( \hat{x}_{ij}(3) \Big/ \sqrt{\hat{x}_{ij}(1)^2 + \hat{x}_{ij}(2)^2} \right) \\[6pt] \arctan\!\left( \hat{x}_{ij}(2) \big/ \hat{x}_{ij}(1) \right) \end{bmatrix} \tag{A.2}$$
where $\theta_{ij}, \psi_{ij}$ denote the required pan/tilt angles of the optical transceiver on robot i to establish a link with robot j, and $\hat{x}_{ij}(k)$ denotes the k-th element of vector $\hat{x}_{ij}$. Under ideal conditions, Equation (A.2) would be sufficient to establish the link. However, our estimates for robot pose may have significant errors in both position and orientation. Imperfect calibration of the pan/tilt heads will introduce additional uncertainty. As a result, we rely upon a second alignment phase to refine the relative orientation estimates of the two link partners. In doing so, we exploit the fact that the ability to establish an optical link implies a clear line-of-sight (LOS). This enables the robots to employ high–zoom camera systems to assist in alignment.

Figure A.1: Autonomously aligning a pair of FSO transceivers can be realized using a coarse–to–fine approach having as its core a robust vision–based tracking system for identifying link partners. This figure provides a high–level view of the process.
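The coarse alignment computation of Equations (A.1)–(A.2) can be sketched in a few lines. One practical substitution (ours, not from the text): `arctan2` replaces the plain arctangent of (A.2) to resolve quadrant ambiguity in the pan angle.

```python
import numpy as np

def pan_tilt(x_i, x_j, R_i):
    """Coarse pan/tilt angles for robot i's transceiver to face robot j,
    following Eqs. (A.1)-(A.2)."""
    x_ij = (x_j - x_i) / np.linalg.norm(x_j - x_i)   # unit link direction, Eq. (A.1)
    x_hat = R_i.T @ x_ij                             # direction expressed in frame F_i
    theta = np.arctan2(x_hat[2], np.hypot(x_hat[0], x_hat[1]))  # tilt
    psi = np.arctan2(x_hat[1], x_hat[0])                        # pan
    return theta, psi

# Link partner ~100 m away, 45 degrees to the left, same height; identity orientation.
x1 = np.zeros(3)
x2 = np.array([100.0, 100.0, 0.0]) / np.sqrt(2)
theta, psi = pan_tilt(x1, x2, np.eye(3))
print(round(np.degrees(theta), 3), round(np.degrees(psi), 3))  # 0.0 45.0
```

In the deployed system these angles only seed the search; the vision-based phase below absorbs the pose and calibration errors.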
A.3.2 Vision-based Alignment Phase
At longer link distances, the uncertainty in robot orientation will most likely be larger than the solid angle subtended by its camera's field-of-view (FOV). As a consequence, it will often be necessary for the robot to search in azimuth and elevation with its camera's pan-tilt-zoom (PTZ) system to locate its link partner. For the vision-based alignment phase, we propose a two-dimensional pattern matching approach where the coarse estimates derived in Equation (A.2) serve as the starting point for this search.

A major limitation of two-dimensional pattern matching approaches for tracking three-dimensional targets without scale or orientation constraints is that a huge family of templates is required. Such a "brute force" approach is computationally intractable. However, our problem is not so ill-posed. First, the scale of our target is approximately known, since we have a priori knowledge of the target position (the same effect could also be achieved by fitting each node with a time-of-flight transceiver to obtain relative range measurements). Second, the relative orientations of the robots are limited by both line-of-sight requirements and physical constraints inherent to the pan/tilt head. As a result, the set of orientations that would permit the establishment of optical links can be quite constrained. For many applications, relying upon a small (≈ 20) set of templates in conjunction with a robust tracking algorithm would be sufficient.

In this approach, each robot maintains in memory a set of image templates $\mathcal{T}$,
where each template Tθφ ∈ T corresponds to a reference image of its link partner
acquired at camera azimuth and elevation (θ, φ), respectively. All templates are acquired at a fixed reference distance dT and focal length fT . The vision system then searches in azimuth and elevation to acquire a set of
images I, where each image Iθφ ∈ I is associated with a candidate link partner
location. The search limits are dictated by the estimated uncertainty in robot pose. These can be inferred directly through sensor models, or (more appropriately) using
the second order statistics of the probability density function in Kalman/particle filter based localization implementations. By setting the focal length to

$$f_{search} = \frac{\| x_j - x_i \|}{d_T} f_T \tag{A.3}$$

we ensure the robot scale in captured images is consistent with the template set $\mathcal{T}$.

To detect the robot in a given image, we use the normalized intensity distribution (NID) as a similarity metric (Fusiello et al., 1999). An advantage of this formulation is that it explicitly models both changes in scene brightness and contrast from the reference template image. Our matching approach, while static, will suffer from similar error effects in the form of inconsistencies between the reference template and the viewed target (link partner) position and orientation. We shall discuss this in further detail in the sequel.

Given an image $I \in \mathcal{I}$, an m×n template T of the robot, and a block region $B \in I$
of equivalent size corresponding to a hypothetical robot position, the similarity of T and B can be defined as

$$\epsilon(T, I, B) = \sum_{u=1}^{m} \sum_{v=1}^{n} \left( \frac{T(u, v) - \mu_T}{\sigma_T} - \frac{B(u, v) - \mu_B}{\sigma_B} \right)^2 \tag{A.4}$$
where T(u, v) denotes the grayscale value at location (u, v), μ is the mean grayscale value, and σ the standard deviation. The image region

$$B^* = \underset{B}{\arg\min}\ \epsilon \qquad \forall\, B \in I,\ \forall\, I \in \mathcal{I},\ \forall\, T \in \mathcal{T} \tag{A.5}$$
would then correspond to the target robot position in the image sequence. The corresponding estimates (θ∗, φ∗) would be used to direct the FSO transceivers for the final alignment stage.

Even for a single template/image pair, template-based pattern recognition approaches are very computationally expensive using traditional matching techniques. In our paradigm, we rely upon such an approach for a set of templates over an image sequence. To implement this in real-time, we use a multi-resolution image representation (Adelson et al., 1984). Prior to the pattern recognition phase, images are converted into Gaussian pyramid data structures. Templates are also stored in memory using a similar representation. Pattern matching is then performed at the highest level of the pyramid, with only local refinements performed at each lower level. This allows for speedups of several orders of magnitude and real-time template–based pattern recognition.

The main advantage of the vision-based approach for aligning the FSO transceivers is the reduction of the dimension of the search space. We could, in fact, pan/tilt the FSO transceivers and allow the internal alignment systems to search directly. The critical difference is that while the vision systems can search independently, the transceivers must both be aligned to a given tolerance simultaneously in order to establish the link. This increases the search space from $\mathbb{R}^2$ to $\mathbb{R}^4$, and exponential increases in search times would result.
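The NID matching of Equations (A.4)–(A.5) can be sketched as an exhaustive block search. The real system would run this over the Gaussian pyramid; the brute-force version below (helper names are ours) mainly shows why the metric is invariant to the affine brightness/contrast changes discussed above.

```python
import numpy as np

def nid(T, B, eps=1e-9):
    """NID dissimilarity of Eq. (A.4): SSD between patches normalized to
    zero mean and unit variance, so gain/offset changes cancel out."""
    Tn = (T - T.mean()) / (T.std() + eps)
    Bn = (B - B.mean()) / (B.std() + eps)
    return float(np.sum((Tn - Bn) ** 2))

def best_match(image, template):
    """Exhaustive search (Eq. (A.5)) for the block minimizing the NID score."""
    m, n = template.shape
    H, W = image.shape
    scores = {(r, c): nid(template, image[r:r + m, c:c + n])
              for r in range(H - m + 1) for c in range(W - n + 1)}
    return min(scores, key=scores.get)

rng = np.random.default_rng(0)
img = rng.uniform(0, 255, (32, 32))
tpl = 0.5 * img[10:18, 12:20] + 40.0   # same pattern, different gain and offset
print(best_match(img, tpl))            # (10, 12): found despite the photometric change
```

The pyramid speedup replaces the full `(r, c)` sweep with a search at the coarsest level followed by local refinement at each finer level.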
A.4 Network Architecture
An FSO/RF MANET can be logically decomposed into two subcomponents. The first is a collection of RF-based ad-hoc networks that will be required to communicate in real-time over ranges that render the use of RF communication channels impractical. It is assumed (though not required) that each of the networks in the collection will be deployed beyond the radio transmission range of any other network in the collection. Logically, each ad-hoc network can be considered a hierarchically defined domain space characterized by a dynamic topology.

The second component is a high-throughput backbone FSO network. This network serves as the cornerstone of the FSO/RF MANET architecture, since its presence enables the dissemination of information among otherwise disjoint ad-hoc domains. It can be constructed dynamically by the hybrid node(s) that have been assigned as members of each respective ad-hoc network. The hybrid nodes serve to bridge the gap between the RF domain and the FSO backbone domain space. As such, the routing protocol employed by each RF-based network must account for this constraint so that all ad-hoc RF nodes have knowledge of where to forward packets destined for a non-local address.

In order to solve the routing problem in RF ad-hoc domains, we propose the use of Hierarchical State Routing (HSR) (Iwata et al., 1999; Hong et al., 2002). Among the more appealing features of this protocol is its support for "logical" network partitions and differentiated services. The latter is enabled by the architecture's definition over a collection of physical network clusters. As such, its underlying theory lends itself nicely to the task of propagating time-critical information. It is also highly scalable, facilitating the support of a large number of wireless nodes.
The routing protocol used among the hybrid peers depends upon the number of hybrid nodes in the system. For an FSO backbone consisting of only a few hybrid nodes, flat link-state routing can be used. As the number of nodes increases, HSR can also be applied among these nodes, providing an additional level of hierarchy.
A.5 Experimental Results
Our experiments employed two hybrid nodes (Balrog and Gimli) based upon the Pioneer P3-AT platform. Each was equipped with a Laserwire 100 Mbps FSO transceiver mounted on a pan/tilt head, an 802.11g RF interface, and a Sony EVI-D70 18X PTZ system that allows for the automatic control of camera azimuth, elevation, and focal length settings (see Figure A.2).
Figure A.2: (a) Hybrid nodes employed in the experiments. Each is a Pioneer P3– AT equipped with a 100 Mbps pan/tilt FSO transceiver, 802.11g RF link, and 18X PTZ vision system. (b) A close–up of the FSO transceiver and pan/tilt mechanism.
A.5.1 Vision-based Alignment Validation
In an attempt to characterize the performance of our vision approach under range and orientation errors, two sets of trials were conducted. In the first, the robots were initially placed at a distance of 40 meters and the camera focal length was adjusted appropriately. The target robot was set at an orientation approximately equal to one of the templates used by the tracking robot. The camera then scanned over a 30° arc composed of 15 images, and the tracking robot attempted to detect its link partner within the scan. The distance between the robots was then decreased without adjusting the focal length, and detection was again attempted. The procedure iterated until the tracking robot consistently failed to detect its link partner. The entire process was repeated a minimum of 5 times for each test range. The test procedure for the second set of trials was identical to the first, except the target robot was placed at an orientation roughly equispaced between two of the tracking robot's template images. This corresponds to an orientation error of about 11°.

A total of 80 detection runs were conducted. In each, 16 template images were used by the vision system for detecting the link partner. These were taken from roughly equispaced robot orientations (every π/8 radians) on the plane. Although only 3 templates were actually required given the transceiver pan/tilt constraints, all 16 were used for each trial to better approximate the computational resource requirements for outdoor operations. Template size was typically 128×64 pixels, and a 3-level Gaussian pyramid representation was used. Under this configuration, a single 640×480 pixel image could be searched with all 16 templates in 180 ms using a 1.7 GHz Centrino notebook computer. This included the time for image pyramid generation. A SIMD implementation of the NID would further reduce these times. Ambient illumination of the target robot varied from 20-200 lumens.

The performance of the robot detection algorithm was very consistent. It correctly identified the target robot every time, for each orientation and for range errors up to 25%.
However, at range errors of 30%, it always failed to detect the robot in the image sequence. To provide a little more insight into detection robustness, we define a signal-to-noise ratio (SNR) metric as

$$\text{SNR} = \frac{\min\left[\text{NID}(I \setminus \text{robot})\right]}{\text{NID}(\text{robot})} \tag{A.6}$$
where min[NID(I \ robot)] was the minimum normalized intensity distribution value obtained across the image scan that was not associated with the target robot, and NID(robot) was the target robot's NID value. This is an appropriate definition, as anything else in the image sequence that is not a robot can be categorized as background noise. A plot of the SNR values vs. range for both sets of trials is shown in Figure A.3.
Figure A.3: SNR versus range error for two target robot orientations (θ = 0 and θ ≈ 11°). As expected, the SNR drops as the pose error increases. With the exception of when range errors reached 30% (where all trials failed), the target robot was detected every time.

As expected, the SNR value decays as range and orientation errors are introduced. However, the performance is still quite good at range errors up to 15%. These initial results are promising, and indicate that the approach will exhibit certain levels of robustness to robot localization errors.
Figure A.4: Deployment strategy for the hybrid FSO/RF network. The map corresponds to the fourth floor of Packard Laboratory at Lehigh University.
A.5.2 Network Deployment
Our deployment experiments of an FSO/RF MANET included two hybrid and two RF nodes (notebook computers). Once physically linked, the two hybrid nodes defined a symmetric, mono-link FSO backbone that spanned the length of a corridor in Packard Laboratory (≈ 45 meters in length) as shown in Figure A.4. Each RF node was associated with a single hybrid node in the context of an 802.11g ad-hoc network. Logically, this configuration consisted of two spatially disjoint subnetworks connected via a high-throughput channel. The hybrid nodes were assigned the responsibility of routing packets between the RF address spaces. In order to endow them with this functionality, each was configured as an IP router. Each was outfitted with a manually configured routing table in order to expedite the process.

A total of 5 deployments were conducted. In each, the hybrid nodes relied upon the hierarchical LAS, and were successful in establishing the hybrid network and routing remote video data in real-time to the RF nodes. The mean optical link acquisition time for these trials was 65 seconds, with min and max values of 60 and 73 seconds, respectively. Acquisition time was dominated by two components: the final FSO alignment stage and camera motion constraints. In our trials, the internal FSO alignment system could require 30 seconds or longer to establish the optical link – even when the transceivers were well aligned. Furthermore, at each pan angle we required the camera system to remain static for 2 seconds prior to capturing an image. This was a consequence of the large changes in scene depth, which would initiate the camera's auto-focus feature. This delay ensured the auto-focus system had stabilized. For a 15 image sequence, this added 30 seconds of acquisition time. The image processing portion was by far the smallest component, amounting to only several seconds of link acquisition time.

To put these times into perspective, consider instead relying only upon the internal FSO alignment system. In this case, link acquisition would be possible for only 1 of 225 possible transceiver orientations, for a worst case link acquisition time of 112.5 minutes (using a 30 second timeout for failure). This two orders-of-magnitude difference demonstrates the efficacy of our hierarchical link acquisition paradigm. Images from a sample network deployment are shown in Figure A.5.
Figure A.5: Partial mosaic from the images acquired during a link acquisition trial. Using an 18X optical zoom, each robot was able to accurately localize its link partner at a range of ≈ 45 meters to enable fast FSO link acquisition.
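The manually configured IP-router setup described above can be sketched with standard Linux tooling. The addresses, subnets, and the `fso0` interface name below are hypothetical placeholders (the transceivers may well enumerate as ordinary Ethernet devices); only the commands themselves are standard.

```shell
# Hybrid node 1 bridges its local 802.11g ad-hoc subnet to the FSO
# point-to-point link. All addresses/interfaces are hypothetical.
sysctl -w net.ipv4.ip_forward=1            # act as an IP router
ip addr add 10.0.1.1/24 dev wlan0          # local RF ad-hoc subnet
ip addr add 192.168.0.1/30 dev fso0        # FSO point-to-point link
ip route add 10.0.2.0/24 via 192.168.0.2   # remote RF subnet via FSO peer
```

The second hybrid node would mirror this configuration, giving each RF node a next hop for packets destined to the remote address space.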
A.6 Discussion
In this appendix, we offer link acquisition and routing protocols to realize the deployment of a hybrid FSO/RF MANET. Our primary motivation is the tremendous throughput provided by FSO over long link distances. However, the technology also offers additional advantages, such as lower per–bit power consumption and a secure/robust transmission medium. To demonstrate the feasibility of the link acquisition approach, a series of experiments was conducted whereby an FSO/RF MANET was deployed and FSO links were established dynamically using a hierarchical vision/FSO based link acquisition system (LAS).

To date, our LAS experiments have been limited to distances of ≈ 45 meters. Extrapolating from these results, we believe that extending this to 150-200 meters is within the capabilities of our current camera system. Extending it further would require an increase in image resolution. Additionally, exploring the application of scale–invariant feature transform (SIFT) based tracking (Lowe, 2004) is the topic of ongoing research.
Appendix B

Robust Perception for the 2007 DARPA Urban Challenge

In this appendix, approaches for robust terrain classification and traffic line detection are presented. The algorithms discussed herein were employed by the fourth place finisher in the 2007 DARPA Urban Challenge (DUC), "Little Ben" (see Figure B.1). Ben was the product of the Philadelphia–based Ben Franklin Racing Team, which was led by the University of Pennsylvania with Lehigh University and the Advanced Technology Laboratory (ATL) at Lockheed Martin as collaborators (Bohren et al., 2008). Given the significance of perception in the low–level stages of planning, these results are included in this dissertation for the interested reader. Accordingly, our results in enabling robust LIDAR–based perception are now presented.
B.1 Background
The ability of an agent to generate accurate perceptions of its environment is paramount to safe motion planning. More precisely, an agent should be able to loosely classify its surroundings as “passable” or “non–passable” so as to generate both a feasible and safe path to a target position. In this appendix, this problem is addressed by formulating a robust approach to terrain classification that was used
Figure B.1: “Little Ben” crosses the finish line on November 3, 2007, becoming the fourth–place finisher of the 2007 DARPA Urban Challenge. Ben was the product of a collaboration led by the University of Pennsylvania with Lehigh University and the Advanced Technologies Laboratory (ATL) at Lockheed Martin.
by our 2007 DUC entry, “Little Ben.” Key to the success of this approach is the ability to reliably track the ground plane (with respect to time) using a regularized iterative least–squares method (Guitton, 2000). In contrast to other estimators, this approach incorporates a level of temporal filtering that facilitates reliable results. Building upon the application of our robust classifier for the DUC, an approach to traffic line detection using filtered reflectivity (i.e. remission) returns from our planar LIDAR units is also presented. In this case, the results of the classifier were fed as inputs into the traffic line detection module, enabling reliable detection that was used to help correct positional errors (e.g. due to GPS bias). Accordingly, the results of the classification and traffic line detection processes were fused with results from other sensing modules, yielding a perception of the environment that Ben could then use in conjunction with a local planner to generate low–cost trajectories.
B.2 Related Work
In recent years, much research has been dedicated to generating accurate environmental perceptions using LIDAR–based systems. Given the vast body of literature on the topic, we cannot possibly present all relevant research. Instead, we opt to focus on similar work associated with ground–based vehicles navigating in unrehearsed environments. To begin, consider the body of literature amassed from top entrants in the recent DARPA Grand Challenge (DGC) and DARPA Urban Challenge (DUC) events. Stanley – the Stanford Racing Team’s entry and ultimately the winner of the 2005 DGC – fused results from a collection of planar LIDARs and a color monocular camera system to determine drivable areas (Thrun et al., 2006). In this case, the LIDARs surveyed the near–field to determine unobstructed/passable areas. These results were then projected into the image frame as a quadrilateral defining an area whose color statistics seed a probabilistic learning algorithm. Sandstorm and H1ghlander – the Red Team’s (CMU’s) second and third place entries in the 2005 DGC – employed a suite of LIDARs to detect obstacles and identify safe areas of passage (Urmson et al., 2006). Their terrain evaluation approach is an adapted 2D application of the 3D Morphin algorithm, whereby returns from a single LIDAR scan are considered as opposed to a 3D point cloud over time. In the DUC, many teams employed LIDAR–based approaches for environmental perception, including (Campbell et al., 2007; Leonard et al., 2007; Reinholtz et al., 2007; Stanford Racing Team, 2007; Urmson et al., 2007) – to name just a few. In fact, nearly all entrants in the race used some form of laser range finder to taxonomize elements of the environment.
In addition to the body of literature amassed from DGC/DUC participants, tremendous strides have also been made in related contexts. In (Manduchi et al., 2005), the authors propose a complementary sensor suite using planar LIDARs and stereo cameras to robustly classify terrain. They propose an algorithm using LIDAR measurements to discriminate grassy areas from traditional obstacles such as rocks and trees in non–urban environments. In (Montemerlo & Thrun, 2004), the authors
present a 3D mapping approach using a rotating planar LIDAR to accurately map urban structures. While simultaneously localizing and mapping (SLAM), the robot uses scan gradient features from a rigidly mounted planar LIDAR to detect hazardous areas. In (Hashimoto & Yuta, 2006), the authors propose a method for determining path traversability by considering a wheel–based heuristic to decide whether the robot can safely travel over rough terrain. Key to this approach is building a terrain map using a variety of sensors, including LIDARs.
In this work, we consider planning in an urban environment and forgo fine–grained classification, attempting to label points as only ground, obstacle, or unknown before merging them into a high–level cost map. To this end, we consider the fore/aft sensing planar LIDARs on Little Ben (see Figure B.2). Their role was to supplement the near–field returns of the Velodyne HD (3D) LIDAR by providing additional close–proximity perception. This included detecting curbs and other obstacles that were often too small for reliable detection via the algorithms employed to process the Velodyne data. Unlike other work that fuses multiple sensor returns before classification, our algorithm classifies using only a single filtered LIDAR scan – while exploiting weak temporal filtering and regularization. The result is an elegant solution to the problem of reliable terrain classification using limited data.
B.3 Terrain Classification
To accurately detect obstacles, our vehicle employs a two–phase algorithm applying the geometric configuration given in Figure B.2. During the first phase, the range returns from the Sick LMS 291–S14 LIDARs are mapped to z values by making the standard assumption of a locally–flat ground plane. Employing a robust estimator, a linear fit is performed, giving an estimate of the true ground surface. During the second phase, normal distance offsets as well as gradient features are used to classify binned points as either ground, obstacle, or unknown. The features are modeled as Gaussian random variables that are input into a Bayesian classifier.
Figure B.2: Configuration geometry for the Sick LMS 291–S14 LIDARs affixed to Little Ben. This configuration was utilized for both terrain classification and traffic line detection.
To mitigate the effects of spurious returns due to airborne particles, a temporal median filter was applied to a set of three consecutive range returns. In the sequel, whenever we refer to a scan we are referring to the results of this median filter. As the Sick LMS 291–S14 streams both range and reflectivity (echo amplitude) returns at a rate of 75 Hz, this yields an update rate of 25 Hz for the process.
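The three–scan median filter can be sketched as follows. This is an illustrative Python rendering (the actual process ran in Matlab), and the helper name `median_filter_scans` is hypothetical; the element–wise median over three consecutive scans is, however, exactly the filtering described above.

```python
import numpy as np

def median_filter_scans(scans):
    """Element-wise temporal median over three consecutive range scans;
    a single-scan spurious return (e.g. from airborne dust) is rejected."""
    assert len(scans) == 3, "the filter operates on three consecutive scans"
    return np.median(np.vstack(scans), axis=0)

# Three consecutive scans with a spurious short return in the middle scan:
s1 = np.array([10.0, 10.1, 10.2])
s2 = np.array([10.0, 2.5, 10.2])   # dust particle at beam 1
s3 = np.array([10.1, 10.1, 10.3])
filtered = median_filter_scans([s1, s2, s3])
```

Because the median needs three scans but slides by three per output here, a 75 Hz stream yields the 25 Hz classification rate noted above.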
B.3.1 Ground Plane Extraction
Letting $r_i$ and $\gamma_i$ respectively denote the $i$th range and reflectivity return corresponding to scan angle $\theta_i$, we define the $k$th scan concisely as the set $S_k = \{(r_1^k, \gamma_1^k, \theta_1^k), \ldots, (r_n^k, \gamma_n^k, \theta_n^k)\}$. From the associated Cartesian points relative to the vehicle frame, $C_k = \{(x_i^k, z_i^k) : (r_i^k, \gamma_i^k, \theta_i^k) \in S_k,\ \forall i \in [1, n]\}$, we aim to extract the line parameters $g_k = [m_k, b_k]^T$ that best fit the true ground plane.
To this end, we employ an Iterative Re–weighted Least–squares (IRLS) approach (Guitton, 2000). For our purposes, employing IRLS allows us to exploit the temporal filtering inherent in its design by seeding the solver at iteration $k$ with $g_{k-1}$. More precisely, for a set $C_k$ we solve:
$$W_k A_k g_k = W_k z_k \qquad \text{(B.1)}$$
where $W_k$ is a diagonal weighting matrix with elements $w_{ii} = f(x_i^k, z_i^k, g_{k-1}) : \mathbb{R}^4 \to [0, 1]$, a direct functional of the normal distance of $(x_i^k, z_i^k)$ to the line given by $g_{k-1}$, and $z_k = [z_1^k, \ldots, z_n^k]^T$.
Intuitively, $f$ should place more weight on points that are proximal to $g_{k-1}$. A monotonically decreasing, fat–tailed exponential is a natural choice. For our purposes, we chose
$$f(x_i^k, z_i^k, g_{k-1}) = (1 - \beta)\,\min\!\left(e^{-\rho\,\frac{d_\perp(x_i^k, z_i^k, g_{k-1}) - \alpha_g}{\alpha_o - \alpha_g}},\ 1\right) + \beta \qquad \text{(B.2)}$$
where $d_\perp : \mathbb{R}^4 \to \mathbb{R}_+$ denotes the functional mapping the point $(x_i^k, z_i^k)$ and line $g_{k-1}$ to the normal distance separating them, $\alpha_g \in \mathbb{R}_+$ denotes the maximum normal distance of a ground point from $g_{k-1}$, and $\alpha_o \in \mathbb{R}_+$, $\alpha_o > \alpha_g$, denotes the minimum normal distance for an obstacle point. In this formulation, $\beta \in \mathbb{R}_+$ denotes the fat–tail value while $\rho \in \mathbb{R}_+$ simply acts as a tuning parameter. Choosing the parameters as follows yielded the best performance: $\alpha_g \in [2, 5]$, $\alpha_o \in [10, 15]$, $\beta \in [0.01, 0.10]$, and $\rho = 0.5$. Figure B.3 illustrates a few parameterizations of Equation (B.2).
Figure B.3: Example weighting functions for the IRLS–based approach under different values of $\alpha_g$, $\alpha_o$, and $\beta$, with $\rho = 0.5$.
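The IRLS iteration described above can be sketched as follows. This is an illustrative Python reimplementation, not the Matlab code run on Ben: `weight` implements Equation (B.2) on a single normal distance, and `irls_line_fit` seeds each scan's solve with the previous estimate $g_{k-1}$, solving the weighted system of Equation (B.1) in the least–squares sense.

```python
import numpy as np

def weight(d, alpha_g=5.0, alpha_o=15.0, beta=0.05, rho=0.5):
    """Fat-tailed exponential of Equation (B.2), evaluated on the normal
    distance d (cm) of a point from the previous line estimate."""
    return (1.0 - beta) * min(np.exp(-rho * (d - alpha_g) / (alpha_o - alpha_g)), 1.0) + beta

def irls_line_fit(x, z, g_prev, iters=5):
    """Fit g = [m, b] (line z = m*x + b) by iteratively re-weighted least
    squares, seeded with the previous scan's estimate for temporal filtering."""
    g = np.asarray(g_prev, dtype=float)
    A = np.column_stack([x, np.ones_like(x)])
    for _ in range(iters):
        # Normal distance of each point to the current line estimate.
        d = np.abs(g[0] * x - z + g[1]) / np.hypot(g[0], 1.0)
        w = np.array([weight(di) for di in d])
        # Solve the weighted system W A g = W z in the least-squares sense.
        g, *_ = np.linalg.lstsq(A * w[:, None], z * w, rcond=None)
    return g
```

Points within $\alpha_g$ of the previous line receive full weight, while points beyond $\alpha_o$ decay toward the fat–tail value $\beta$, so isolated obstacle returns barely perturb the fit.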
To thwart ground plane drift, a Tikhonov–type regularization of Equation (B.1) can be performed. This yields the following closed–form solution for the parameters $g_k$:
$$g_k = (A_k^T W_k A_k + \Gamma^T \Gamma)^{-1} A_k^T W_k z_k \qquad \text{(B.3)}$$
where $\Gamma = \alpha I$, $\alpha \in \mathbb{R}$. Intuitively, this has the effect of drawing the solution toward the calibrated zero–line when a majority of the points deviate significantly from its proximity. A similar behavior can be obtained by simply augmenting $C_k$ with a small number (e.g. 5–8) of additional bias points uniformly distributed over the zero–line and then assigning maximal weight to each.
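The closed form of Equation (B.3) is a single regularized solve. The sketch below is illustrative (Python rather than the original Matlab; the function name is hypothetical) and shows both the Tikhonov term and its shrinking effect toward the zero–line.

```python
import numpy as np

def regularized_fit(A, W, z, alpha=0.1):
    """Closed-form solve of Equation (B.3):
    g = (A^T W A + Gamma^T Gamma)^{-1} A^T W z, with Gamma = alpha * I.
    The regularizer biases g toward the calibrated zero-line (g = 0)
    when the weighted data poorly constrain the fit."""
    Gamma = alpha * np.eye(A.shape[1])
    return np.linalg.solve(A.T @ W @ A + Gamma.T @ Gamma, A.T @ W @ z)
```

With a negligible $\alpha$ the solve reduces to ordinary weighted least squares; as $\alpha$ grows, the recovered slope and intercept shrink toward zero, which is precisely the drift–resisting behavior described above.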
B.3.2 Point Classification
Given an accurate ground plane extraction, the task then becomes to accurately classify the scan points. For our purposes, we considered a ternary classifier aimed at assigning points one of three labels: safe ($S$), unsafe ($U$), or unknown ($Q$). Noting that the LIDARs were configured on the vehicle to scan 7 m fore and 5 m aft (with respect to the front and rear bumpers), we observed that a total of four points could be binned in a single cell of our 0.5 m resolution cost map.
Letting $B_i$ denote the $i$th bin in the $k$th LIDAR scan, we consider the set $D_{B_i} = \{d_\perp(x, z, g_{k-1}) : (x, z) \in B_i\}$ and define the feature $d_{B_i} \triangleq \max D_{B_i}$. Similarly, we consider the set $\delta Z_{B_i} = \{|z_i - z_j| : (x_i, z_i), (x_j, z_j) \in B_i\}$ and define $\delta z_{B_i} \triangleq \max \delta Z_{B_i}$. Notice that the first feature captures the elevation of the points above the believed ground plane while the second captures their smoothness. In practice, we would expect the true ground plane to have an elevation $\approx 0$ without any severe discontinuities.
Observing that $X = [d_{B_i}\ \delta z_{B_i}]^T$ is a random variable, we aim to effectively characterize $S$, $U$, and $Q$. To this end, we model $P(X|S)$ as a zero–mean Gaussian. More precisely, $X \sim N(0, \Sigma)$ where $\Sigma = \mathrm{diag}(\sigma_d, \sigma_{\delta z})$. Given this model, we defined the
following policy:
$$\mathcal{P}(d_{B_i}, \delta z_{B_i}) \equiv \begin{cases} B_i \in S, & (d_{B_i} < \tau_d) \wedge (\delta z_{B_i} < \tau_{\delta z}) \\ B_i \in U, & (d_{B_i} \geq \tau_d) \vee (\delta z_{B_i} \geq \tau_{\delta z}) \\ B_i \in Q, & \text{otherwise} \end{cases} \qquad \text{(B.4)}$$
where $\tau_d = 3\sigma_d$ and $\tau_{\delta z} = 3\sigma_{\delta z}$. In practice, we employed $\sigma_d = 5$ cm and $\sigma_{\delta z} = 3$ cm.
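The policy of Equation (B.4) reduces to two threshold tests, sketched below in illustrative Python (the function name and defaults mirror the parameters above but are our own rendering). Note that, as written, the $S$ and $U$ conditions cover every numeric feature pair, so in this sketch the $Q$ label would in practice be reserved for bins lacking usable returns; that reading is our assumption, not stated in the text.

```python
def classify_bin(d_max, dz_max, sigma_d=5.0, sigma_dz=3.0):
    """Ternary policy of Equation (B.4) with 3-sigma thresholds (cm).

    d_max  : max normal distance of the bin's points from the ground plane
    dz_max : max pairwise elevation difference within the bin
    """
    tau_d, tau_dz = 3.0 * sigma_d, 3.0 * sigma_dz
    if d_max < tau_d and dz_max < tau_dz:
        return "S"  # safe (ground)
    if d_max >= tau_d or dz_max >= tau_dz:
        return "U"  # unsafe (obstacle)
    return "Q"      # unknown (assumed: bins without sufficient returns)
```

With the defaults, a curb–height feature such as $d_{B_i} = 20$ cm exceeds $\tau_d = 15$ cm and is labeled unsafe, which is how such small obstacles reach the cost map.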
Figure B.4: Classification results of the Sick LMS obstacle detection (Matlab) process running on our DUC/UFE entry, Little Ben. The solid line denotes the ground plane estimate using the adapted IRLS approach.
Figure B.4 depicts the classification results along with the corresponding ground plane from both front and back Sick LIDAR scans. In the topmost figure, we note that despite the vehicle’s pitch yielding a 15 cm drop in the computed z values, the algorithm is still able to accurately track the ground plane. In the bottom figure, the rear–facing Sick LMS process detects an approaching vehicle. Take note that despite the highly linear obstacle elements, the process is still able to accurately discriminate the true ground due to the regularization and temporal filtering. An implementation
employing RANSAC (Fischler & Bolles, 1981) would yield unreliable results given such a data set.
B.4 Traffic Line Detection
To mitigate positional bias due to shifts in GPS, we also implemented a Matlab process to extract points corresponding to traffic lines using the reflectivity returns of the Sick LMS 291–S14. Assuming a standard line width of ≈ 10–15 cm, any traffic line was guaranteed to incur at least two (and no more than three) direct hits by the LIDAR during each scan. Moreover, assuming a standard lane width of ≈ 4 m, we reduced the feasible set of values to include only those within ±2.25 m of vehicle center.
Given these observations, we map the feasible set to a binary representation using adaptive thresholding based upon the mean reflectivity of the road surface. Letting $G_k = \{\gamma_i^k : |x_i^k| \leq 2.25,\ (x_i^k, z_i^k) \in C_k\}$ denote said feasible set, we construct $\hat{G}_k = \{\gamma_i^k \in G_k : \tau_\gamma^{min} \leq \gamma_i^k - \mu_{G_k} \leq \tau_\gamma^{max}\}$, where $\tau_\gamma^{min}$ and $\tau_\gamma^{max}$ are set thresholds. They respectively represent the minimal/maximal difference in reflectivity from the mean that is required/allowed for a scan point to be considered a possible traffic line hit. Next, we eliminate any reflectivity values in $\hat{G}_k$ that correspond to points classified as either $Q$ or $U$ by our terrain classification module. In other words, we generate $\bar{G}_k = \hat{G}_k \setminus \{\gamma_i^k : (x_i^k, z_i^k) \in Q \vee (x_i^k, z_i^k) \in U\}$. Given $\bar{G}_k$, the points are collected into regions of interest. For our purposes,
a region of interest is defined simply as two or three consecutive scan points having corresponding reflectivity values in $\bar{G}_k$. All regions of interest are considered candidates for traffic lines.
In an effort to ensure robustness, we enforce the rule that any ambiguous result is ignored. In particular, in a typical urban setting traffic lines are set either in pairs (e.g. a double yellow line) or singly (e.g. a solid white line). As such, our classifier also enforces the rule that if more than two lines are detected on either side of the vehicle, then the results are ignored.
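The full chain (feasible set, adaptive reflectivity threshold, terrain–label pruning, and 2–3 hit region grouping) can be sketched as follows. This is an illustrative Python rendering of the Matlab process; the function name and the `labels` encoding are hypothetical.

```python
import numpy as np

def extract_line_candidates(x, gamma, labels, tau_min=10.0, tau_max=float("inf")):
    """Sketch of the traffic-line extraction chain.

    x      : lateral offsets (m) of scan points from vehicle center
    gamma  : reflectivity (remission) returns
    labels : per-point terrain labels from the classifier ('S', 'U', or 'Q')
    """
    x, gamma = np.asarray(x, float), np.asarray(gamma, float)
    feasible = np.abs(x) <= 2.25               # within the assumed lane extent
    mu = gamma[feasible].mean()                # mean road-surface reflectivity
    candidate = feasible & (gamma - mu >= tau_min) & (gamma - mu <= tau_max)
    candidate &= np.array([lab == "S" for lab in labels])  # drop U/Q points
    # Group consecutive candidate indices; keep only 2-3 hit runs.
    rois, run = [], []
    for i, c in enumerate(candidate):
        if c:
            run.append(i)
        else:
            if 2 <= len(run) <= 3:
                rois.append(run)
            run = []
    if 2 <= len(run) <= 3:
        rois.append(run)
    return rois
```

Thresholding against the mean (rather than a fixed value) is what makes the scheme adaptive: a faded line on bright concrete and fresh tape on dark asphalt both register as a positive deviation from the local road surface.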
Sick LMS 291–S14 (Back) Reflectivity w/ Line Extraction
Figure B.5: Classification results of the traffic line detection (Matlab) process running on Little Ben. The two peaks (denoted with vertical dashed lines) correspond with two separate traffic lines on either side of the vehicle. The solid vertical lines bound the feasible set of reflectivity values, with the horizontal line denoting $\mu_{G_k}$.
Figure B.5 shows the Matlab process output of our traffic line extraction algorithm run on our DUC entry. Values shown correspond to the reflectivity returns from our Sick LMS units. In this figure, the horizontal black line represents $\mu_{G_k}$. In this case, two regions of interest remain after employing our traffic line detection algorithm. Both correspond to traffic line hits and were fed to our high–level planner to help adjust positional bias. In practice, we chose $\tau_\gamma^{min} = 10$ and $\tau_\gamma^{max} = \infty$. These values were chosen to properly classify returns ranging from a faded painted yellow line (potentially on concrete) to freshly set (highly reflective) traffic line tape.
B.5 Discussion
In this appendix, we presented an elegant approach to terrain classification using the limited information returned from a planar LIDAR unit. Key to the results of our ternary classification was the ability to accurately track the true ground plane using an adaptation of the Iteratively Re–weighted Least Squares robust estimator. Modeling terrain as a function of two Gaussian variates that capture both elevation and surface gradient features, we were able to implement an algorithm that reliably identified obstacles as small as a typical curb height. The ability to perceive such small perturbations from the ground plane aided Little Ben in planning his local trajectories during the DUC. An approach to traffic marker extraction using raw reflectivity returns from planar LIDAR units was also presented. As an input to this process, we use the terrain classification results given by our algorithm. Being able to accurately perceive traffic lines with the LIDARs (as well as other on–board sensors) allowed Little Ben to adjust his trajectory to mitigate any positional drift due to time–variable GPS bias.
Although elegant, these algorithms are not without drawbacks. One drawback is the linear road surface model. Although simple, it is far from ideal in many cases. In urban environments featuring severely crowned road surfaces, the algorithm will report false obstacles, as the elevation feature used for classification is a function of the normal distance to the believed ground plane. One remedy may be to use a robust estimator to fit a non–linear model of the believed road surface.
Appendix C
The Sick LIDAR Matlab/C++ Toolbox
This appendix is included to highlight an open–source project developed in support of this dissertation. The Sick LIDAR Matlab/C++ Toolbox enables rapid data acquisition and visualization via Matlab for the ubiquitous Sick LMS/LD LIDAR families. It provides robust, object–oriented, multi–threaded C++ drivers as well as Matlab executable (MEX) interfaces. The latter is attractive as it enables the user to stream returns directly into Matlab, facilitating the use of high–level, vector–based operations for rapid prototyping. The toolbox has been well–received by the robotics community, having been downloaded approximately 1,400 times during the first year since its release. It is being utilized by both academic and industry leaders located in Europe, Australia, Canada, and the United States, including the Robotics Institute at Carnegie Mellon University, the Stanford Artificial Intelligence Laboratory (SAIL), Texas A&M, and Google, Inc. (see §C.6 for additional groups).
C.1 Background
Introduced in the early 1990s, Sick non–contact range measurement systems have become ubiquitous in the robotics community. Industrial–grade sensors, they have
been utilized in a wide variety of applications. These include, but are by no means limited to, autonomous wheelchair systems for assisted living (Gao et al., 2007), urban reconnaissance for military applications (Yamauchi, 2006), simultaneous localization and mapping (SLAM) (Thrun et al., 2000; Newman et al., 2006) including volumetric environment mapping (Wang & Thorpe, 2004; Skrzypczynski, 2007; Montemerlo et al., 2002), robotic tour–guides (Philippsen & Siegwart, 2003), and autonomous road–vehicle systems (Urmson et al., 2007; Thrun et al., 2006; Ben Franklin Racing Team, 2007). In fact, all eleven finalists in the recent DARPA Urban Challenge chose to include Sick LMS 2xx and/or Sick LD laser range finders in their sensor suites.
Despite the overwhelming popularity of the Sick sensors – attributed to their reliability, accuracy, range, and reasonable cost – until this point the robotics community has lacked an open–source software package designed specifically for their use. Many opt to use a basic implementation that is bundled in the context of an auxiliary framework such as Player/Stage (Gerkey et al., 2003), which may not be well–suited to the desired application. As a result, there are a few open–source drivers; however, they provide little more than the ability to grab data from the device and process it using C/C++, offering minimal configuration capabilities (Leon, 2005; Yuen, 2004). Additionally, no publicly available framework (nor driver) currently offers support for the long–range Sick LD units, and none supports a high–level scripting interface for either Sick family.
Relevant to our work is that of (Vale, 2005), who present a Sick LMS laser interface for Matlab. The authors employ Matlab’s serial capabilities and offer a tutorial on building a basic m–file Sick LMS driver implementation.
The described interface requires the user to employ file access commands to read/write the byte–stream and ultimately construct/parse low–level packets in the Matlab scripting environment. It supports standard baud rates up to 38,400 bps.
In contrast to these efforts, the Sick LIDAR Matlab/C++ Toolbox takes a holistic view of interfacing to the Sick LMS 2xx and Sick LD families of laser range finders. The focus is on making data processing as simple as possible while keeping the high–level interface clean and concise. In so doing, it provides Matlab MEX
interfaces and easy–to–use (but well–featured) C++ device drivers for data acquisition and configuration under Linux. Together these elements allow range and reflectivity data to be streamed directly into Matlab at the full data rate. At first glance, this may seem like a secondary contribution; however, we cannot over–emphasize the importance of this feature, as it facilitates the rapid development of algorithms by exploiting the high–level functionality afforded by Matlab’s vector operations.
Motivating the use of Matlab, we note that the toolbox is branched from the source code used by the Ben Franklin Racing Team during the DARPA Urban Challenge. Our entry – Little Ben – employed five Sick LMS 291 and LD–LRS LIDARs during the race and was one of only six vehicles to successfully complete the 55 mile Urban Challenge Final Event. Little Ben completed the UFE using fewer than 5,000 lines of Matlab script – using MEX interfaces to low–level device drivers and messaging facilities. This was one to two orders of magnitude less than some other finalists, and it highlights the potential of Matlab as a run–time environment for real–time, experimental robotics.
In the first year of its release, the toolbox has garnered much attention. It received approximately 1,400 downloads in that time and has been utilized by both academic and industry leaders. For a listing of some of the groups that have utilized the toolbox, see §C.6.
C.2 Compatible Devices
C.2.1 Compatible Sick LMS 2xx Units
The communication protocol implemented with the low–level Sick LMS 2xx C++ driver is compatible with LMS 200/220 firmware version: V02.30 Q501, LMS 211/221/291 firmware version: X01.27 Q501, and LMS 211/221–S19/–S20 firmware version S01.31 Q393 (SICK AG Waldkirch, 2006).
Figure C.1: The Sick LIDAR Matlab/C++ Toolbox supports both the LMS 2xx and LD families of Sick LIDARs. A few of the Sick LIDARs used to explicitly test the toolbox: (a) a Sick LMS 291–S05; (b) a Sick LD–LRS 1000.
C.2.2 Compatible Sick LD Units
The communication protocol implemented with the low–level Sick LD C++ driver is compatible with the LD–LRS 1000/2100/3100 models (SICK AG Waldkirch, 2006). The code has been tested explicitly with Sick LD–LRS 1000 units.
C.3 Toolbox Features
In this section, we present a more detailed outline of the features the toolbox offers for each Sick family.
C.3.1 Sick LMS 2xx Support
For the Sick LMS 2xx models, the toolbox provides/supports the following features:
1. Multi–threaded (POSIX) C++ RS–232/422 driver:
   (a) Supports 500Kbps baud via a USB–COMi–M adapter
   (b) Auto–detects Sick LMS 2xx baud rate
   (c) Supports a variety of data streams, including:
      i. High–resolution partial/interlaced scans
      ii. Mean–measured value scans
      iii. Measured value subrange scans
      iv. Mean measured–value subrange scans
      v. Range with reflectivity values concurrently (only LMS Fast, i.e. LMS 211/221/291–S14)
      vi. Reflectivity–only (non LMS Fast models)
   (d) Stream data with (or without) real–time indices
   (e) Advanced configuration via device driver:
      i. Variant {180◦/0.5◦, 100◦/0.25◦} – i.e. scan resolution and field–of–view
      ii. Measuring mode
      iii. Availability level (useful for enabling dazzle recovery)
      iv. Measuring units {cm, mm}
      v. Sensitivity {high, standard, medium, low}
2. Configure (and view) device parameters via the lms config utility (operates using said driver)
3. Matlab MEX interface (i.e. sicklms):
   (a) Stream range and/or reflectivity returns
   (b) Set device variant
   (c) Display parameters
C.3.2 Sick LD Support
Regarding the Sick LD family of laser range finders, the toolbox provides/supports the following features:
1. Multi–threaded (POSIX) C++ Ethernet driver:
   (a) Stream range with (or without) reflectivity returns
   (b) Supports multiple scan sectors
   (c) Advanced configuration via device driver:
      i. Motor speed (5–20Hz)
      ii. Resolution (up to 1◦ in multiples of 0.125◦)
      iii. Multiple active/measuring scan sectors (a maximum of four)
      iv. Temporary scan sectors (does not require a write to flash)
      v. Absolute and relative time
      vi. Device signals (including LEDs)
      vii. Nearfield suppression
      viii. Sensor ID
2. Configure (and view) basic device parameters via the ld config utility (operates using said driver)
3. Matlab MEX interface (i.e. sickld):
   (a) Stream range with (or without) reflectivity returns from multiple sectors
   (b) Display configuration parameters
Additionally, the toolbox provides a quick start manual as well as numerous examples illustrating the various ways in which the C++/Matlab interfaces can be used. A tutorial for enabling 500Kbps baud for Sick LMS 2xx models via a USB–COMi–M industrial adapter is also included. All C++ code is commented using Doxygen, and the package was written to exploit the capabilities of GNU Autotools for portability across multiple platforms.
C.4 Architectural Overview
Figure C.2 (a) shows a high–level schematic of the Sick LIDAR Matlab/C++ Toolbox. At the highest level, the toolbox is comprised of a set of Matlab MEX files that directly interact with a corresponding low–level C++ driver. All drivers are comprised of two fundamental components derived from a common base. In particular, each driver features an interface to acquire data and configure the device, as well as a buffer monitor thread responsible for ensuring that only the most recent scan information is returned via said interface. As expected, the interface is used to send commands to the device. All replies are handled by the buffer monitor, where they are encapsulated as message objects before being stored in the container from which the interface can acquire them. This process is illustrated in Figure C.2 (b).
It is important to note that although the Sick LMS 2xx driver is RS–232/422 and the Sick LD driver is Ethernet, both share a high level of commonality that allows them to exploit the base framework. In this case, the base code simply provides a primitive definition of a Sick LIDAR driver from which more specialized drivers can be built (via inheritance, etc.). It is especially useful as it promotes modularity and emphasizes code reuse where possible.
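The buffer–monitor pattern can be sketched as follows. The class and method names are hypothetical (the actual implementation is multi–threaded C++), but the overwrite–and–notify behavior mirrors the design described above: the monitor thread pushes every decoded reply, and the interface only ever sees the newest scan.

```python
import threading

class BufferMonitor:
    """Sketch of the buffer-monitor pattern: a background thread pushes
    decoded messages and the interface reads only the most recent one."""

    def __init__(self):
        self._cond = threading.Condition()
        self._latest = None

    def push(self, message):
        # Called from the monitor thread for every decoded device reply.
        with self._cond:
            self._latest = message        # overwrite: only the newest survives
            self._cond.notify_all()

    def acquire(self, timeout=1.0):
        # Called from the interface; blocks briefly until a scan is available.
        with self._cond:
            if self._latest is None:
                self._cond.wait(timeout)
            message, self._latest = self._latest, None
            return message
```

Keeping a single overwrite slot (rather than a queue) is what guarantees the interface never processes stale scans, at the cost of dropping replies the consumer was too slow to read.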
Figure C.2: (a) The high–level schematic for the Sick LIDAR Matlab/C++ Toolbox. (b) Schematic of a typical application using the interface provided by the low–level driver. (c) The quaternary state machine implemented by Matlab/C++ programs using the toolbox for data–acquisition.
C.5 Applying the Toolbox
Using the toolbox to acquire data is straightforward, requiring only three separate function calls: one to initialize the device, one to grab the measurements, and one to uninitialize the device. This approach is similar for both the C++ and Matlab interfaces. Any application utilizing the toolbox to acquire data should follow the quaternary state machine presented in Figure C.2 (c). To highlight the elegance of the toolbox, we now focus on its high–level Matlab capabilities, referring the reader to the quick start manual and examples for specifics regarding the C++ interface.
The toolbox provides two high–level interfaces: sicklms and sickld. Each provides access to the most useful functionality of its corresponding low–level C++ driver. For additional details regarding its software interface, the reader is referred to (Derenick & Miller, 2006). The toolbox can be obtained via its SourceForge site: http://sicktoolbox.sourceforge.net
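The initialize/grab/uninitialize flow can be sketched as follows. `SickDriver` and its methods are illustrative stand–ins, not the toolbox's actual sicklms/sickld API (see the quick start manual for the real calls), and the sketch collapses the quaternary machine of Figure C.2 (c) into three explicit states with a placeholder scan payload.

```python
class SickDriver:
    """Hypothetical stand-in enforcing the acquisition state machine:
    grab() is only legal between initialize() and uninitialize()."""

    def __init__(self):
        self.state = "UNINITIALIZED"

    def initialize(self):
        assert self.state == "UNINITIALIZED"
        self.state = "INITIALIZED"

    def grab(self):
        # May be called repeatedly while the device is streaming.
        assert self.state in ("INITIALIZED", "STREAMING")
        self.state = "STREAMING"
        return {"range": [], "reflectivity": []}  # placeholder scan

    def uninitialize(self):
        assert self.state in ("INITIALIZED", "STREAMING")
        self.state = "UNINITIALIZED"

# Typical usage mirrors the three calls described above:
driver = SickDriver()
driver.initialize()
for _ in range(3):
    scan = driver.grab()
driver.uninitialize()
```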
Figure C.3: (a) Range/reflectivity streamed from a Sick LMS 291–S14 via the sicklms device interface. The two peaks in reflectivity correspond to two separate retro–reflectors within the scan area of the device. (b) A Cartesian map of the VADER Laboratory generated from range values returned from a Sick LD–LRS 1000 scanning 360◦ with 0.5◦ resolution. (c) Range and reflectivity values from a Sick LD–LRS 1000 scanning with the same parameters as (b).
C.6 Groups Utilizing the Toolbox
A number of academic and industry leaders located in Europe, Australia, Canada, and the United States are utilizing the Sick LIDAR Matlab/C++ Toolbox. The following is just a brief compilation of some of the groups that have written regarding its use:
• The Robotics Institute @ Carnegie Mellon University
• The Stanford Artificial Intelligence Laboratory (SAIL)
• Texas A&M
• Istanbul Technical University, Turkey
• University of Ottawa, Canada
• University of Adelaide, Australia
• Willow Garage
• Google, Inc.
• and many others...
C.7 Discussion
In this appendix, we presented the Sick LIDAR Matlab/C++ Toolbox – branched from the source code used by one of only six vehicles to successfully complete the DARPA Urban Challenge. The toolbox is designed to emphasize easy data acquisition from both Sick LMS 2xx and Sick LD laser range finders. It provides both well–featured, multi–threaded C++ drivers and a high–level MEX interface for streaming data directly into Matlab, and it is released under a flexible open–source license. In addition to source code, the toolbox provides a step–by–step tutorial for enabling 500Kbps baud using a Sick LMS 2xx interfaced with a USB–COMi–M adapter. A quick start manual is also included.
There are obvious ways in which the toolbox can be expanded and/or improved. For instance, it is currently available only under Linux; a Windows version would be an obvious next step. Additionally, efforts are underway to expand the scope of the toolbox beyond Sick LIDAR support. In particular, we are currently looking to generalize the toolbox by supporting Hokuyo and Velodyne LIDAR systems – both of which were also featured in Little Ben’s sensor suite. Like Sick LIDARs, both are capable of providing range and reflectivity information.
References
Howard, A., Mataric, M., & Sukhatme, G. 2002. Distributed Autonomous Robotic Systems. Springer. Chap. Mobile Sensor Network Deployment using Potential Fields: A Distributed, Scalable Solution to the Area Coverage Problem.
Abbeel, P., Coates, A., Quigley, M., & Ng, A. Y. 2007. An Application of Reinforcement Learning to Aerobatic Helicopter Flight. In: Proceedings of Neural Information Processing Systems Conference (2007).
Adelson, E., Anderson, C., Bergen, J., Burt, P., & Ogden, J. 1984. Pyramid Methods in Image Processing. RCA Engineer, 29-6.
Arkin, R., & Diaz, J. 2002. Line–of–sight Constrained Exploration for Reactive Multiagent Robotic Teams. In: AMC 7th International Workshop on Advanced Motion Control.
Bachmayer, R., & Leonard, N. E. 2002 (Dec). Vehicle Networks for Gradient Descent in a Sampled Environment. In: Proc. IEEE Conf. on Decision and Control.
Belta, C., & Kumar, V. 2004a. Abstraction and Control for Groups of Robots. IEEE Trans. on Robotics and Automation.
Belta, C., Isler, V., & Pappas, G. 2005. Discrete Abstractions for Robot Motion Planning and Control in Polygonal Environments. IEEE Trans. on Robotics, 21(5), 864–874.
Belta, Calin, & Kumar, Vijay. 2004b. Optimal Motion Generation for Groups of Robots: a Geometric Approach. ASME Journal of Mechanical Design, 126.
Ben Franklin Racing Team. 2007. 2007 DARPA Urban Challenge: The Ben Franklin Racing Team B156 Technical Paper.
Bohren, J., Foote, T., Keller, J., Kushleyev, A., Lee, D. D., Stewart, A., Vernaza, P., Derenick, J., Spletzer, J., & Satterfield, B. 2008. Little Ben: The Ben Franklin Racing Team’s entry to the DARPA Urban Challenge. Journal of Field Robotics, 25(9), 598–614.
Boyd, S., & Vandenberghe, L. 2004. Convex Optimization. Cambridge University Press.
Brooks, R. 1986. A Robust Layered Control System for a Mobile Robot. IEEE Trans. on Robotics and Automation, 2(1), 14–23.
Bruce, James, & Veloso, Manuela. 2003 (May). Fast and Accurate Vision–based Pattern Detection and Identification. In: Proceedings of the 2003 IEEE International Conference on Robotics and Automation (ICRA ’03).
Campbell, Mark, Garcia, Ephrahim, Huttenlocher, Dan, Miller, Isaac, Moran, Pete, Nathan, Aaron, Schimpf, Brian, Zych, Noah, Catlin, Jason, Chelarescu, Filip, Fujishima, Hikaru, Kline, Frank-Robert, Lupashin, Sergei, Reitmann, Max, Shapiro, Adam, & Wong, Jason. 2007 (April). Team Cornell: Technical Review of the DARPA Urban Challenge Vehicle. Tech. rept. Cornell University.
Chan, Haowen, & Perrig, Adrian. 2004. ACE: An Emergent Algorithm for Highly Uniform Cluster Formation. Pages 154–171 of: Proceedings of the First European Workshop on Sensor Networks (EWSN).
Choset, H., Lynch, K., Hutchinson, S., Kantor, G., Burgard, W., Kavraki, L., & Thrun, S. 2005. Principles of Robot Motion Planning. MIT Press.
COMPASS. 2007. COMPASS: COllaborative Multiscale Processing and Architecture for SensorNetworkS. http://compass.cs.rice.edu/.
Corke, P., Peterson, R., & Rus, D. 2003. Networked Robots: Flying Robot Navigation with a Sensor Net. In: International Symposium on Robotics Research.
Cortes, J., Martinez, S., Karatas, T., & Bullo, F. 2002. Coverage Control for Mobile Sensing Networks. IEEE Transactions on Robotics and Automation.
Cortés, J., Martínez, S., Karatas, T., & Bullo, F. 2004. Coverage Control for Mobile Sensing Networks. IEEE Trans. on Robotics and Automation, 20(2), 243–255.
Cowan, Noah J., Weingarten, Joel D., & Koditschek, Daniel E. 2002. Visual Servoing via Navigation Functions. IEEE Transactions on Robotics and Automation, 18, 521–533.
Reynolds, C. W. 1987 (July). Flocks, Herds, and Schools: A Distributed Behavioral Model. Pages 25–34 of: Proceedings of ACM SIGGRAPH ’87.
Das, A. K., Fierro, R., Kumar, V., Ostrowski, J. P., Spletzer, J., & Taylor, C. J. 2002. A Vision-based Formation Control Framework. IEEE Trans. on Robotics and Automation, 18(5), 813–825.
d’Aspremont, A., & Boyd, S. 2003. Relaxations and Randomized Methods for Nonconvex QCQPs. Stanford University.
De Silva, V., & Ghrist, R. 2006. Coordinate-free Coverage in Sensor Networks with Controlled Boundaries via Homology. Int. J. Rob. Res., 25(12), 1205–1222.
Derenick, J., Thorne, C., & Spletzer, J. 2005. Multi-Robot Systems. Kluwer. Chap. Hybrid Free-space Optics/Radio Frequency (FSO/RF) Networks for Mobile Robot Teams.
Derenick, Jason, & Spletzer, John. 2005. TR LU-CSE-05-029: Optimal Shape Changes for Robot Teams. Tech. rept. Lehigh University.
Derenick, Jason, & Spletzer, John. 2007. Advances in Cooperative Control and Optimization. Springer. Chap. Second-order Cone Programming (SOCP) Techniques for Coordinating Large-scale Robot Teams in Polygonal Environments.
Derenick, Jason C., & Miller, Thomas H. 2006. The Sick LIDAR Matlab/C++ Toolbox: A Quick Start Guide. http://sicktoolbox.sourceforge.net/docs/sicktoolbox-quickstart.pdf.
Dijkstra, E. 1959. A Note on Two Problems in Connexion with Graphs. Numerische Mathematik, 1, 269–272.
Dryden, I. L., & Mardia, K. V. 1998. Statistical Shape Analysis. John Wiley and Sons.
Dubowsky, S., Iagnemma, K., Liberatore, S., Lambeth, D., Plante, J. S., & Boston, P. 2005 (February). A Concept Mission: Microbots for Large-Scale Planetary Surface and Subsurface Exploration. In: Proceedings of the 2005 Space Technology and Applications International Forum (STAIF).
Fabiani, P., Gonzalez-Banos, H., Latombe, J., & Lin, D. 2001. Tracking a Partially Predictable Object with Uncertainties and Visibility Constraints. Journal of Autonomous Robots.
Feddema, J. T., Robinett, R. D., & Byrne, R. H. 2003 (October 31). An Optimization Approach to Distributed Controls of Multiple Robot Vehicles. In: Workshop on Control and Cooperation of Intelligent Miniature Robots, IEEE/RSJ IROS.
Fiedler, M. 1975. A Property of Eigenvectors of Nonnegative Symmetric Matrices and its Application to Graph Theory. Czechoslovak Mathematical Journal, 25.
Fischler, Martin A., & Bolles, Robert C. 1981. Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Commun. ACM, 24(6), 381–395.
Fusiello, A., Trucco, E., Tommasini, T., & Roberto, V. 1999. Improving Feature Tracking with Robust Statistics. Pattern Analysis and Applications, 2, 312–320.
Gao, C., Hoffman, I., Panzarella, T., & Spletzer, J. 2007 (July). Automated Transport and Retrieval System (ATRS): A Technology Solution to Automobility for Wheelchair Users. In: Proceedings of the 2007 International Conference on Field and Service Robotics (FSR ’07).
Gennaro, M. De, & Jadbabaie, A. 2006 (Dec). Decentralized Control of Connectivity in Multi–agent Systems. In: Proc. IEEE Conf. on Decision and Control.
Gerkey, B., Vaughan, R., & Howard, A. 2003. The Player/Stage Project: Tools for Multi-Robot and Distributed Sensor Systems.
Ghrist, Robert, & Muhammad, Abubakr. 2005. Coverage and Hole-detection in Sensor Networks via Homology. Page 34 of: IPSN ’05: Proceedings of the 4th International Symposium on Information Processing in Sensor Networks. Piscataway, NJ, USA: IEEE Press.
Godsil, Chris, & Royle, Gordon. 2001. Algebraic Graph Theory. Springer-Verlag.
Golub, Gene H., & Van Loan, Charles F. 1996. Matrix Computations (3rd ed.). Baltimore, MD, USA: Johns Hopkins University Press.
Govindan, R., et al. 2005 (Nov). TENET: An Architecture for Tiered Embedded Networks. Tech. rept. Center for Embedded Networked Sensing (CENS).
Guitton, Antoine. 2000. Stanford Lecture Notes: The Iteratively Reweighted Least Squares Method.
Gupta, P., & Kumar, P.R. 2000. The Capacity of Wireless Networks. IEEE Transactions on Information Theory.
Hashimoto, Kazuma, & Yuta, Shin’ichi. 2006. Field and Service Robotics. Springer. Chap. Autonomous Detection of Untraversability of the Path on Rough Terrain for the Remote Controlled Mobile Robots.
Hatcher, Allen. 2002. Algebraic Topology. Cambridge University Press.
Hong, X., Xu, K., & Gerla, M. 2002. Scalable Routing Protocols for Mobile Ad hoc Networks. IEEE Network Magazine, 16(4).
Hsieh, M. A., Cowley, A., Kumar, V., & Taylor, C. J. 2006 (April). Towards the Deployment of a Mobile Robot Network with End–to–end Performance Guarantees. In: International Conference on Robotics and Automation (ICRA) 2006.
Huang, Chi-Fu, & Tseng, Yu-Chee. 2003. The Coverage Problem in a Wireless Sensor Network. Pages 115–121 of: WSNA ’03: Proceedings of the 2nd ACM International Conference on Wireless Sensor Networks and Applications. New York, NY, USA: ACM.
Isler, V., Khanna, S., Spletzer, J., & Taylor, C.J. 2005. Target Tracking with Distributed Sensors: The Focus of Attention Problem. Journal of Computer Vision and Image Understanding: Special Issue on Attention and Performance in Computer Vision.
Iwata, A., Chiang, C., Pei, G., Gerla, M., & Chen, T. 1999. Scalable Routing Strategies for Ad-hoc Wireless Networks. IEEE Journal on Selected Areas in Communications, 17(8), 1369–1379.
Jadbabaie, A., Yu, J., & Hauser, J. 2001. Unconstrained Receding-horizon Control of Nonlinear Systems. IEEE Trans. on Automatic Control, 46(5), 776–783.
Jung, B. 2005 (May). Cooperative Target Tracking Using Mobile Robots. Ph.D. thesis, University of Southern California, Los Angeles, CA.
Kempe, David, & McSherry, Frank. 2004. A Decentralized Algorithm for Spectral Analysis. Pages 561–568 of: STOC ’04: Proceedings of the Thirty-sixth Annual ACM Symposium on Theory of Computing. New York, NY, USA: ACM.
Kendall, D., Barden, D., Carne, T., & Le, H. 1999. Shape and Shape Theory. John Wiley and Sons.
Khatib, Oussama. 1986. Real–Time Obstacle Avoidance for Manipulators and Mobile Robots. International Journal of Robotic Research, 5(1), 60.
Kim, Y., & Mesbahi, M. 2006. On Maximizing the Second Smallest Eigenvalue of a State–dependent Graph Laplacian. IEEE Transactions on Automatic Control.
Kochhal, Manish, Schwiebert, Loren, & Gupta, Sandeep. 2003. Role–based Hierarchical Self Organization for Wireless Ad Hoc Sensor Networks. Pages 98–107 of: WSNA ’03: Proceedings of the 2nd ACM International Conference on Wireless Sensor Networks and Applications. New York, NY, USA: ACM.
Koditschek, Daniel E., & Rimon, Elon. 1990. Robot Navigation Functions on Manifolds with Boundary. Adv. Appl. Math., 11(4), 412–442.
Krishna, K. Madhava, Hexmoor, Henry, & Sogani, Shravan. 2005 (Oct.-Nov.). A T–Step Ahead Constrained Optimal Target Detection Algorithm for a Multi Sensor Surveillance System. Pages 1840–1845 of: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems.
Kuhn, H. W. 1955. The Hungarian Method for the Assignment Problem. Naval Research Logistics Quarterly, 2, 83–97.
Langendoen, K., & Reijers, N. 2003. Computer Networks. Elsevier. Chap. Distributed Localization in Wireless Sensor Networks: A Quantitative Comparison.
Latombe, Jean-Claude. 2003. Robot Motion Planning. Boston, MA, USA: Kluwer Academic Publishers.
LaValle, S., Gonzalez-Banos, H., Becker, C., & Latombe, J. 1997 (April). Motion Strategies for Maintaining Visibility of a Moving Target. Pages 731–736 of: Proceedings of the IEEE Int. Conference on Robotics and Automation.
LaValle, S.M., & Hutchinson, S.A. 1998. Optimal Motion Planning for Multiple Robots Having Independent Goals. IEEE Trans. on Robotics and Automation, 14(6), 912–925.
Lehsaini, M., Guyennet, H., & Feham, M. 2008. A Novel Cluster–based Self-organization Algorithm for Wireless Sensor Networks. Pages 19–26 of: CTS 2008: Proceedings of the 2008 IEEE International Symposium on Collaborative Technologies and Systems, vol. 19.
Leon, Francisco. 2005. LMSApi. http://lmsapi.sourceforge.net/.
Leonard, John, Barrett, David, How, Jonathan, Teller, Seth, Antone, Matt, Campbell, Stefan, Epstein, Alex, Fiore, Gaston, Fletcher, Luke, Frazzoli, Emilio, Huang, Albert, Jones, Troy, Koch, Olivier, Kuwata, Yoshiaki, Mahelona, Keoni, Moore, David, Moyer, Katy, Olson, Edwin, Peters, Steven, Sanders, Chris, Teo, Justin, & Walter, Matthew. 2007 (April). Team MIT Urban Challenge Technical Report. Tech. rept. MIT.
Li, Q., deRosa, M., & Rus, D. 2003a. Distributed Algorithms for Guiding Navigation Across a Sensor Net. In: Mobicom.
Li, Xiang-Yang, Wan, Peng-Jun, & Frieder, Ophir. 2003b. Coverage in Wireless Ad Hoc Sensor Networks. IEEE Transactions on Computers, 52(6), 753–763.
Liu, Z., Ang, M. H., Jr., & Seah, W. K. G. 2003 (March). A Potential Field based Approach for Multi-Robot Tracking of Multiple Moving Targets. In: Proc. 1st International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management.
Lobo, M., Vandenberghe, L., Boyd, S., & Lebret, H. 1998. Applications of Second-order Cone Programming. Linear Algebra and its Applications, Special Issue on Linear Algebra in Control, Signals and Image Processing.
Löfberg, J. 2004. YALMIP: A Toolbox for Modeling and Optimization in MATLAB. In: Proceedings of the CACSD Conference.
Loizou, Savvas G., Dimarogonas, Dimos V., & Kyriakopoulos, Kostas J. 2004. Decentralized Feedback Stabilization of Multiple Nonholonomic Agents. Pages 3012–3017 of: 2004 IEEE International Conference on Robotics and Automation.
Lowe, D. 2004. Distinctive Image Features from Scale-invariant Keypoints. International Journal of Computer Vision, 60(2).
Manduchi, R., Castano, A., Talukder, A., & Matthies, L. 2005. Obstacle Detection and Terrain Classification for Autonomous Off-Road Navigation. Autonomous Robots, 18, 81–102.
Mayne, D., Rawlings, J., Rao, C., & Scokaert, P. 2000. Constrained Model Predictive Control: Stability and Optimality. Automatica, 36(6), 789–814.
Mirzaei, F. M., Mourikis, A. I., & Roumeliotis, S. I. 2007 (April). On the Performance of Multi-Robot Target Tracking. In: IEEE International Conference on Robotics and Automation (ICRA) 2007.
Montemerlo, Michael, & Thrun, Sebastian. 2004. Experimental Robotics IX. Springer. Chap. Large–scale Robotic 3–D Mapping of Urban Structures.
Montemerlo, Michael, Hahnel, Dirk, Ferguson, David, Triebel, Rudolph, Burgard, Wolfram, Thayer, Scott, Whittaker, William, & Thrun, Sebastian. 2002 (October). A System for Three-Dimensional Robotic Mapping of Underground Mines. Tech. rept. CMU-CS-02-185. Robotics Institute, Carnegie Mellon University, Pittsburgh, PA.
MOSEK. 2006. The MOSEK Optimization Tools Version 3.2 (Revision 8) User’s Manual and Reference. MOSEK ApS. http://www.mosek.com.
Mourikis, Anastasios I., & Roumeliotis, Stergios I. 2005 (June). Optimal Sensing Strategies for Mobile Robot Formations: Resource-Constrained Localization. In: Proceedings of Robotics: Science and Systems.
Nesterov, Yurii, & Nemirovskii, Arkadii. 1994. Interior-point Polynomial Algorithms in Convex Programming. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics (SIAM).
Newman, P., Cole, D., & Ho, K. 2006 (May). Outdoor SLAM Using Visual Appearance and Laser Ranging. In: Proceedings of the 2006 IEEE International Conference on Robotics and Automation (ICRA ’06), 1180–1187.
Ögren, P., & Leonard, N. 2002 (October). A Tractable Convergent Dynamic Window Approach to Obstacle Avoidance. In: IEEE/RSJ IROS, vol. 1.
Pereira, G. A. S., Das, A. K., Kumar, V., & Campos, M. F. M. 2003 (March). Decentralized Motion Planning for Multiple Robots subject to Sensing and Communication Constraints. Pages 267–278 of: Proceedings of the Second Multi-Robot Systems Workshop.
Philippsen, R., & Siegwart, R. 2003 (Sept). Smooth and Efficient Obstacle Avoidance for a Tour Guide Robot. In: Proceedings of the 2003 IEEE International Conference on Robotics and Automation (ICRA ’03), vol. 1, 446–451.
Poduri, S., & Sukhatme, G. 2004 (Mar). Constrained Coverage for Mobile Sensor Networks. In: IEEE ICRA.
Pólik, Imre. 2005. Addendum to the SeDuMi User Guide Version 1.1. Advanced Optimization Laboratory, McMaster University. http://sedumi.mcmaster.ca/.
Powers, M., & Balch, T. 2004. Value–based Communication Preservation for Mobile Robots. In: 7th International Symposium on Distributed Autonomous Robotic Systems.
Press, W., et al. 1993. Numerical Recipes in C. Cambridge University Press.
Reinholtz, Charles, Alberi, Thomas, Anderson, David, Bacha, Andrew, Bauman, Cheryl, Cacciola, Stephen, Currier, Patrick, Dalton, Aaron, Farmer, Jesse, Faruque, Ruel, Fleming, Michael, Frash, Scott, Gothing, Grant, Hurdus, Jesse, Kimmel, Shawn, Sharkey, Chris, Taylor, Andrew, Terwelp, Chris, Covern, David Van, Webster, Mike, & Wicks, Al. 2007 (April). Team Victor Tango: DARPA Urban Challenge Technical Paper. Tech. rept. Virginia Tech.
Saad, Y. 2003. Iterative Methods for Sparse Linear Systems. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics.
Shucker, B., & Bennett, J. K. 2005 (October). Target Tracking with Distributed Robotic Macrosensors. In: Proceedings of MILCOM 2005.
SICK AG Waldkirch. 2006. Telegrams for Configuring and Operating the LMS 2xx Laser Measurement System.
Skrzypczynski, Piotr. 2007 (April). Spatial Uncertainty Management for Simultaneous Localization and Mapping. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA ’07).
Spletzer, J., & Fierro, R. 2005. Optimal Positioning Strategies for Shape Changes in Robot Teams. In: Proc. IEEE Int. Conf. Robot. Automat.
Spletzer, J., & Taylor, C. 2003. Dynamic Sensor Planning and Control for Optimally Tracking Targets. International Journal of Robotics Research, 22(1), 7–20.
Stamos, I., & Allen, P. 1998 (June). Interactive Sensor Planning. Pages 489–495 of: Computer Vision and Pattern Recognition Conference.
Stanford Racing Team. 2007 (April). Stanford’s Robotic Vehicle Junior: Interim Report.
Stroupe, Ashley, & Balch, Tucker. 2003 (September). Value–Based Observation with Robot Teams (VBORT) for Dynamic Targets. In: Proceedings of IROS 2003.
Sweeney, J., Brunette, T. J., Yang, Y., & Grupen, R. A. 2002. Coordinated Teams of Reactive Mobile Platforms. Pages 99–304 of: Proceedings of the International Conference on Robotics and Automation (ICRA).
Sweeney, John D., Grupen, Roderic A., & Shenoy, Prashant. 2004 (August). Active QoS Flow Maintenance in Controlled Mobile Networks. In: Proceedings of the Fourth International Symposium on Robotics and Automation (ISRA). IEEE, Queretaro, Mexico.
Systems Support Solutions. 2005. Free Space Optics. System Support Solutions. http://www.systemsupportsolutions.com/.
Thrun, S., Burgard, W., & Fox, D. 2000. A Real-time Algorithm for Mobile Robot Mapping with Applications to Multi-robot and 3D Mapping. In: Proceedings of the 2000 IEEE International Conference on Robotics and Automation (ICRA ’00), vol. 1, 321–328.
Thrun, S., Montemerlo, M., Dahlkamp, H., Stavens, D., Aron, A., Diebel, J., Fong, P., Gale, J., Halpenny, M., Hoffmann, G., Lau, K., Oakley, C., Palatucci, M., Pratt, V., Stang, P., Strohband, S., Dupont, C., Jendrossek, L.-E., Koelen, C., Markey, C., Rummel, C., van Niekerk, J., Jensen, E., Alessandrini, P., Bradski, G., Davies, B., Ettinger, S., Kaehler, A., Nefian, A., & Mahoney, P. 2006. Winning the DARPA Grand Challenge. Journal of Field Robotics.
Trafton, A. 2006 (July). Team Envisions Exploring Mars with Mini Probes (MIT News Office). http://web.mit.edu/newsoffice/2006/microbots.html.
TTC. 2006 (December). TTC Program Spotlight: The National Center for Defense Robotics (e–Newsletter). http://www.techcollaborative.org/default.aspx?id=ncdr_dec06.
Urmson, Chris, Anhalt, Joshua, Bagnell, Drew, Baker, Christopher, Bittner, Robert, Dolan, John, Duggins, Dave, Ferguson, Dave, Galatali, Tugrul, Geyer, Chris, Gittleman, Michele, Harbaugh, Sam, Hebert, Martial, Howard, Tom, Kelly, Alonzo, Kohanbash, David, Likhachev, Maxim, Miller, Nick, Peterson, Kevin, Rajkumar, Raj, Rybski, Paul, Salesky, Bryan, Scherer, Sebastian, Seo, Young-Woo, Simmons, Reid, Singh, Sanjiv, Snider, Jarrod, Stentz, Anthony, Whittaker, William, Ziglar, Jason, Bae, Hong, Litkouhi, Bakhtiar, Nickolaou, Jim, Sadekar, Varsha, Zeng, Shuqing, Struble, Joshua, Taylor, Michael, & Darms, Michael. 2007 (April). Tartan Racing: A Multi-Modal Approach to the DARPA Urban Challenge. Tech. rept. Carnegie Mellon University.
Urmson, Christopher, Anhalt, Joshua, Bartz, Daniel, Clark, Michael, Galatali, Tugrul, Gutierrez, Alexander, Harbaugh, Sam, Johnston, Joshua, Kato, Hiroki, Koon, Phillip L, Messner, William, Miller, Nick, Mosher, Aaron, Peterson, Kevin, Ragusa, Charlie, Ray, David, Smith, Bryon K, Snider, Jarrod M, Spiker, Spencer, Struble, Joshua C, Ziglar, Jason, & Whittaker, William Red L. 2006. A Robust Approach to High-Speed Navigation for Unrehearsed Desert Terrain. Journal of Field Robotics, 23(8), 467–508.
Vale, Alberto Manuel Martinho. 2005. Sick Laser Interface for Matlab. http://users.isr.ist.utl.pt/~vale/english/projects/sick/sick.html.
Wagner, A. R., & Arkin, R. C. 2004. Communication-sensitive Multi–robot Reconnaissance. Pages 2480–2487 of: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA).
SICK AG Waldkirch. 2006. User Protocol Services for Operating/Configuring the LD–OEM/LD–LRS Laser Measurement System.
Wang, Chieh-Chih, & Thorpe, Charles. 2004 (September-October). A Hierarchical Object Based Representation for Simultaneous Localization and Mapping. In: Proceedings of the IEEE/RSJ 2004 International Conference on Intelligent Robots and Systems (IROS ’04).
Willebrand, H., & Ghuman, B. 2002. Free Space Optics: Enabling Optical Connectivity in Today’s Networks. Sams Publishing.
Wilson, J. 2002. Ultra-Wideband / A Disruptive RF Technology? Tech. rept. Intel Research and Development.
Winfield, A. F. T. 2000. Distributed Sensing and Data Collection via Broken Ad hoc Wireless Connected Network of Mobile Robots. Pages 273–282 of: L. E. Parker, G. Bekey, & Barhen, J. (eds), Distributed Autonomous Robotic Systems 4. Springer-Verlag.
Yamauchi, Brian. 2006. Autonomous Urban Reconnaissance Using Man Portable UGVs. In: Proceedings of Unmanned Systems Technology VIII (SPIE Conference 6230).
Yuen, David. 2004. LMS200Demo. http://www.ele.auckland.ac.nz/~cyue001/tech/LMSintro.html.
Zhang, Fumin, Goldgeier, Michael, & Krishnaprasad, P. S. 2003 (Sep). Control of Small Formations Using Shape Coordinates. In: Proc. IEEE Int. Conf. Robot. Automat., vol. 2.
Vita Jason Charles Derenick was born on March 13, 1981 in Scranton, Pennsylvania, to Donna Lee and Stephen Derenick, Jr. Jason attended Riverside Junior Senior High School (in Taylor, Pennsylvania) from which he graduated among the top ten in his class. Soon after, he accepted the Ignatius of Loyola Scholarship to attend the University of Scranton, Pennsylvania. At the University of Scranton, Jason pursued a dual–major in computer science and mathematics. During his time as an undergraduate, he served as President of the Pennsylvania Gamma Chapter of Upsilon Pi Epsilon (2002–2003) and was a Presidential Honors Student (2003) receiving Honors in Mathematics (Pi Mu Epsilon) and Honors in Computer Science (Upsilon Pi Epsilon). He was selected by departmental faculty to receive the Class of 2003 Award for Academic Excellence in the Computing Sciences. In May 2003, Jason graduated summa cum laude earning the Bachelor of Science in Computer Science and Bachelor of Science in Mathematics degrees. His undergraduate thesis advisor was Professor John Beidler. After completing his undergraduate studies, Jason joined the Department of Computer Science and Engineering at Lehigh University as a doctoral student under the mentorship of Professor John Spletzer. Over the next six years, Jason worked as a member of the Vision, Assistive Devices, and Experimental Robotics (VADER) Laboratory where he focused on developing optimal coordination strategies for multi–agent systems. In the 2006–2007 academic year, he also served as a teaching assistant for courses in both networks and advanced robotics.
Among other notable achievements, Jason was an active member of the Ben Franklin Racing Team, a collaborative effort between the University of Pennsylvania, Lehigh University and the Advanced Technologies Laboratory (ATL) at Lockheed Martin to field an autonomous vehicle entry in the 2007 DARPA Urban Challenge. The team’s entry, “Little Ben”, took fourth place in the final competition on November 3, 2007 at the former George Air Force Base in Victorville, CA. As part of his efforts on this project, Jason served as a visiting student in the General Robotics, Automation, Sensing, and Perception (GRASP) Laboratory at the University of Pennsylvania during the Fall 2007 semester. He subsequently released an open–source software package that has gone on to be used by academic and industry leaders, including the Robotics Institute at Carnegie Mellon University, the Stanford Artificial Intelligence Laboratory (SAIL), and Google, Inc., among many others. During Summer 2008, he served as an intern with the Multi–dimensional Mobility Robot (MDMR) Group at Boston Dynamics in Waltham, MA, where he developed a manual kinematic control system that was demonstrated as part of DARPA MDMR Phase II project testing at the Southwest Research Institute. Jason was awarded the Master of Science in Computer Science in May 2005 and received the Doctor of Philosophy in Computer Science four years later in May 2009. During his tenure as a doctoral student at Lehigh University, he authored or coauthored ten papers (nine as primary author), including three journal publications. His work has appeared in highly respected forums, including the IEEE Transactions on Robotics and the Journal of Field Robotics. In January 2009, Jason unofficially departed Lehigh University after accepting an early appointment as a visiting research scholar in the Department of Mechanical Engineering and Applied Mechanics (MEAM) at the University of Pennsylvania.
He is currently a post–doctoral researcher and a member of the GRASP Laboratory at Penn under the mentorship of his research advisor, Professor Vijay Kumar.