Navigation-based Optimization of Stochastic Deployment Strategies for a Robot Swarm to Multiple Sites

Spring Berman, Ádám Halász, M. Ani Hsieh and Vijay Kumar

Abstract— We present a biologically inspired approach to the dynamic assignment and reassignment of a swarm of homogeneous robots to multiple locations, with applications in search and rescue, environmental monitoring, and task allocation. Our work is inspired by experimental studies of ant house hunting and empirical models that predict the behavior of a colony faced with a choice between multiple candidate nests. In this work, we build on the results of [1] and present a method to optimize stochastic control policies for the dynamic redistribution of a swarm of agents among multiple sites in a specified ratio, taking navigation delays into account. The stability and convergence properties of these control policies are analyzed and simulation results are presented.

I. INTRODUCTION

We are interested in the deployment of a swarm of homogeneous robots to various distinct locations for parallel task execution at each locale. As such, robots must be capable of distributing themselves among many locations/sites, as well as redistributing themselves to ensure task completion at each site, which may be affected by robot failures or changes in the environment. This is similar to the task/resource allocation problem, where the objective is to determine the optimal assignment of robots to tasks. Such combinatorial optimization problems are known to scale poorly as the numbers of agents and tasks increase. In the multi-robot domain, existing methods often reduce to market-based approaches [2]–[4], where robots must execute complex bidding schemes so as to determine the appropriate allocation based on the various perceived costs and utilities. While market-based approaches have achieved much success in various multi-robot applications [5]–[8], optimality is often sacrificed so as to reduce the computation and communication requirements [9], [10]. Another approach is to formulate the allocation problem as a Distributed Constraint Satisfaction Problem [11], which requires the explicit modeling of tasks, their requirements, and robot capabilities, making implementation for large populations difficult. Additionally, as the number of agents increases, it becomes impractical, if not unrealistic, to expect small, resource-constrained robotic agents to always have the communication or computational capabilities required for centralized control. As such, it makes sense to consider decentralized strategies

This work was not supported by any organization.
Spring Berman, Ádám Halász and Vijay Kumar are with the GRASP Laboratory, University of Pennsylvania, 3330 Walnut Street, Philadelphia, PA 19104, USA, {halasz,spring,kumar}@grasp.upenn.edu
M. Ani Hsieh is with the Engineering Department, Swarthmore College, 500 College Avenue, Swarthmore, PA 19081, USA,

[email protected]

to the allocation problem that can lead to optimal outcomes for the population in a manner that is efficient, robust, and uses little to no communication. Such decentralized consensus-building behaviors are often seen in a variety of biological organisms, including ants [12], honeybees [13], and cockroaches [14]. These systems are generally composed of large numbers of organisms with simple behaviors that respond solely to local information. Thus, when considering the deployment of a robot swarm to distinct locations, it makes sense to consider such "swarming paradigms," where agents have the capability to operate asynchronously and can effect their behaviors based solely on local sensing.

Allocation strategies that are closest in spirit to such swarming paradigms include [15], where an adaptive multi-foraging task with no explicit communication or global knowledge is modeled as a stochastic process. While the model has been verified through simulations, the only way to control robot task reallocation is to modify the task distribution in the environment. In [16], large robot populations are modeled using a partial differential equation, and a centralized optimal control strategy is used to maximize robot occupation at a desired position. Similar to [17] and [15], our proposed strategy employs a multi-level representation of swarm activity. Rather than a bottom-up analysis procedure, however, our methodology is based on a top-down design approach. We draw inspiration from the process through which an ant colony selects a new home from several sites using simple stochastic behaviors that arise from local sensing and physical contact [18], [19]. We build on [1], [20], [21] and propose a communication-less approach to the allocation of a robot swarm to multiple sites that takes into account time delays due to navigation.

This paper is organized as follows: We begin with some background information in Section II and formulate the time-delayed problem in Section III. The convergence and stability properties of our controllers are discussed in Section IV. Section V describes our controller design methodology, which accounts for travel times between sites. Section VI presents our simulations and a discussion of our results. We conclude with a discussion of directions for future work in Section VII.

II. BACKGROUND

Temnothorax albipennis ants engage in a collective process of group decision-making when they are faced with the task of choosing between two new nest sites upon the destruction of their old nest [18]. This selection process involves two

phases. Initially, undecided ants choose one of the sites quasi-independently, often after visiting both. A recruiter ant, i.e., one who has chosen one of the two candidate sites, returns to the old nest site and recruits another ant to its chosen site through a tandem run. Thus, the path to the new site is learned by the "naive" ant by following the recruiter ant. Once this occurs, the naive ant becomes a recruiter ant for the particular candidate site, in essence showing preference for that site. In such fashion, ants with preferences for different sites recruit naive ants from the old site to their preferred site. Once a critical population level at a candidate site has been attained, i.e., quorum, the convergence rate to a single choice is boosted by recruiters who, instead of recruiting naive ants via tandem runs, simply transport naive ants from the old nest to the new one. The ants exhibit a small number of distinct, simple behaviors that differ only in their allegiance to one of the nests. During the selection process ants transition spontaneously, but at constant average rates, between their allegiances to the sites. The pattern of transition rates, which ultimately determines the average propensity of individual ants to switch behaviors, ensures that the higher quality nest is chosen and that no ants are stranded in the worse nest. This is done without any explicit communication, since no pheromones are used by Temnothorax albipennis in the house hunting process; rather, the group relies heavily on physical interactions between neighbors.

A model for this "ant house hunting" process was initially presented and analyzed in [20]. The authors of [21] showed how the model in [20] can be modified to converge to a state at which both new sites are occupied at a pre-specified ratio. This was then further extended to the problem of distributing N agents across M distinct locations in [1]. Furthermore, [1] showed that transitions from one locale to another often involve non-zero transition times, since agents must navigate from one site to another; as such, a linear time-delayed model was proposed for these systems. In this work, we define metrics for the optimal (re)deployment strategy and determine a deployment strategy that accounts for the impact of these non-zero travel times. We show that the time-delayed model can in fact be approximated by an equivalent linear model that possesses a unique and stable equilibrium point corresponding to the desired allocation. A brief summary of the linear models presented in [1] and the formulation of our time-delayed model are given in the following section.

III. PROBLEM STATEMENT

A. Definitions

Consider N agents to be distributed among M sites. We denote the number of agents at site $i \in \{1, \ldots, M\}$ at time t by $n_i(t)$ and the desired number of agents at site i by $\bar{n}_i$. We define the population fraction at each site at time t as $x_i(t) = n_i(t)/\sum_i \bar{n}_i$. The system state vector is then given by $\mathbf{x} = [x_1, \ldots, x_M]^T$. For some initial distribution of the agents given by $\{n_i^0\}$, $i = 1, \ldots, M$, we denote the target

configuration as a set of population fractions to occupy each site, given by

\bar{x}_i = \frac{\bar{n}_i}{\sum_{i=1}^{M} \bar{n}_i} \quad \forall\, i = 1, \ldots, M .    (1)

A specification in terms of fractions rather than absolute agent numbers is practical for scaling, as well as for potential applications where losses of agents to attrition and breakdown are common. The task, then, is to redeploy the swarm of N robots to the desired configuration $\bar{\mathbf{x}}$, starting from the initial distribution $\mathbf{x}^0$, as fast as possible while minimizing inter-agent communication.

The interconnection topology of the sites can be modeled as a directed graph, $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where the set of vertices, $\mathcal{V}$, represents sites $\{1, \ldots, M\}$ and the set of edges, $\mathcal{E}$, represents physical routes between sites (more generally, possible transitions between tasks). Sites i and j are adjacent, denoted by $i \sim j$, if agents can travel from i to j. We represent this relation by the ordered pair $(i, j) \in \mathcal{V} \times \mathcal{V}$, with the set $\mathcal{E} = \{(i, j) \in \mathcal{V} \times \mathcal{V} \mid i \sim j\}$. It is assumed that the graph $\mathcal{G}$ is strongly connected, i.e., a directed path exists between any pair of distinct vertices.

We consider $\mathbf{x}(t)$ to represent the distribution of the state of a Markov process on $\mathcal{G}$, for which $\mathcal{V}$ is the state space and $\mathcal{E}$ is the set of possible transitions. Every edge in $\mathcal{E}$ is assigned two constant positive transition rates, $k_{ij}$ and $k_{ij}^{\max}$, each of which defines a transition probability per unit time for one robot at site i to go to site j. In general, $k_{ij} \neq k_{ji}$. The rate $k_{ij}^{\max}$ enforces a limit on the number of robots that can travel simultaneously along edge $(i, j)$. Lastly, we define the flux from site i to site j, denoted by $\phi_{ij}$, as the fraction of agents per unit time going from i to j, and denote the time required to travel from site i to site j by $\tau_{ij}$.

We assume that each robot has complete knowledge of $\mathcal{G}$ and the transition rates $k_{ij}$ and $k_{ij}^{\max}$ for each edge. We also assume that the robots have localization capabilities.

B. Linear Model

Our baseline strategy [1] endows each agent with a small set of instructions based on the transition rates $k_{ij}$ and achieves (re)deployment of the swarm of N robots to M sites using no communication. Furthermore, rather than model the system as a collection of individual agents, this strategy models the swarm as a continuum. Thus, in the limit of large N, the time evolution of the population fraction at site i is given by the linear law:

\frac{dx_i(t)}{dt} = \sum_{\forall j \mid (j,i) \in \mathcal{E}} k_{ji} x_j(t) \;-\; \sum_{\forall j \mid (i,j) \in \mathcal{E}} k_{ij} x_i(t) .    (2)

The system of equations for the M sites is then given by

\frac{d\mathbf{x}}{dt} = \mathbf{K}\mathbf{x}    (3)

with $K_{ij} = k_{ji}$ for $i \neq j$ and $K_{ii} = -\sum_{j=1, (i,j) \in \mathcal{E}}^{M} k_{ij}$. We note that the columns of $\mathbf{K}$ sum to 0, and since the number of agents is conserved, the system is subject to the following constraint:

\sum_{i=1}^{M} x_i(t) = 1 .    (4)
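As a concrete illustration, the following minimal sketch (ours, not from the original paper) integrates the continuum model (3) numerically; the site graph, rate constants, and initial distribution are hypothetical placeholders. The closed-form solution $\mathbf{x}(t) = e^{\mathbf{K}t}\mathbf{x}(0)$ converges to the desired fractions while the conservation constraint (4) holds at all times.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical example: M = 3 sites on a strongly connected graph.
# k[i, j] is the transition rate k_ij from site i to site j (0 if no edge).
k = np.array([[0.0, 0.2, 0.1],
              [0.3, 0.0, 0.2],
              [0.1, 0.4, 0.0]])

# Build K as in (3): K_ij = k_ji for i != j, K_ii = -sum_j k_ij.
K = k.T - np.diag(k.sum(axis=1))
assert np.allclose(K.sum(axis=0), 0.0)   # columns sum to zero

x0 = np.array([0.7, 0.2, 0.1])           # initial population fractions

for t in [0.0, 5.0, 50.0]:
    x_t = expm(K * t) @ x0               # closed-form solution x(t) = e^{Kt} x0
    print(t, x_t, x_t.sum())             # x_t.sum() stays 1, per constraint (4)
```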

Equation (2) can be equivalently written in terms of fluxes as

\frac{dx_i(t)}{dt} = \sum_{\forall j \mid (j,i) \in \mathcal{E}} \phi_{ji}(t) \;-\; \sum_{\forall j \mid (i,j) \in \mathcal{E}} \phi_{ij}(t) .

Fig. 1. A labeled edge (i, j) = (1, 2) that consists of (a) the physical sites, corresponding to model (2), and (b) both physical and dummy sites (for $D_{12} = 2$), corresponding to (9).
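At the level of individual robots, the baseline strategy amounts to each agent independently sampling its transitions from the rates $k_{ij}$. A minimal sketch of that per-agent policy follows, with hypothetical rates and step size; the continuum law (2) is the large-N limit of this process.

```python
import numpy as np

rng = np.random.default_rng(2)

def step_agents(sites, k, dt=0.01):
    """One update of the per-robot stochastic policy.

    Each robot at site i moves to site j with probability k_ij * dt,
    independently of all other robots (no communication).
    """
    M = k.shape[0]
    for idx, i in enumerate(sites):
        u, cum = rng.random(), 0.0
        for j in range(M):
            cum += k[i, j] * dt
            if u < cum:              # transition to site j fires
                sites[idx] = j
                break
    return sites

# 200 hypothetical robots, all starting at site 0.
k = np.array([[0.0, 0.2, 0.1], [0.3, 0.0, 0.2], [0.1, 0.4, 0.0]])
sites = np.zeros(200, dtype=int)
for _ in range(10000):
    sites = step_agents(sites, k)
print(np.bincount(sites, minlength=3) / 200)   # ~ equilibrium fractions
```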

The linear system (2)-(4) was shown in [1] to possess a unique and stable equilibrium point given by (1).

C. Time-Delayed Model

While the loss of robots at a site due to transfers to other sites is immediate, the gain due to incoming robots from other sites is delayed, since it takes a finite amount of time to navigate from one site to another. The effect of finite travel times can be included by rewriting (2) as a delay differential equation (DDE):

\dot{x}_i(t) = \sum_{j \sim i} k_{ji} x_j(t - \tau_{ji}) \;-\; \sum_{i \sim j} k_{ij} x_i(t) .    (5)

In this model there is always a finite number of robots en route between sites, even at equilibrium; as such, (5) is subject to the following constraint:

\sum_{i=1}^{M} x_i(t) \;+\; \sum_{i=1}^{M} \sum_{i \sim j} y_{ij}(t) = 1    (6)

where $n_{ij}(t)$ denotes the number of robots traveling from site i to site j at time t and $y_{ij}(t) = n_{ij}(t)/N$.

IV. ANALYSIS

In this section, we show that the time-delayed model given by (5) can be viewed as an abstraction of a more realistic model in which the delays, $\tau_{ij}$, are random variables from some distribution. If we assume these delays follow a gamma distribution, then the DDE model given by (5) can be transformed into an ODE model of the form (2) by representing each edge in (5) with a directed path composed of a finite set of "dummy sites." This approach is similar to the linear chain trick used in [22] to transform a system of integro-differential equations with gamma-distributed delays into an equivalent system of ODEs.

To achieve this, we replace each edge $(i, j)$ with a directed path composed of a sequence of dummy sites, $k = 1, \ldots, D_{ij}$. Assume that these dummy sites are equally spaced. Then $\tau_{ij}/D_{ij}$ is the deterministic time to travel from dummy site $k \in \{1, \ldots, D_{ij}\}$ to its adjacent site, such that when $k = D_{ij}$, the adjacent site is site j. The rate at which an agent transitions from one dummy site to the next is then defined as the inverse of this time, $\lambda_{ij} = D_{ij}/\tau_{ij}$. The number of transitions between two adjacent dummy sites in a time interval of length t has a Poisson distribution with parameter $\lambda_{ij} t$. If we assume that the numbers of transitions in non-overlapping intervals are independent, then the probability density function of the travel time between dummy sites k and k+1 is given by

f(t) = \lambda_{ij} e^{-\lambda_{ij} t} .    (7)

Let $T_{ij} = \sum_{k=1}^{D_{ij}} T_k$ be the total time to travel from site i to site j. Since $T_1, \ldots, T_{D_{ij}}$ are independent random variables drawn from the common density function (7), their sum $T_{ij}$ has a gamma density function [23] given by:

g(t) = \frac{\lambda_{ij}^{D_{ij}} \, t^{D_{ij}-1}}{(D_{ij}-1)!} \, e^{-\lambda_{ij} t}    (8)

with expected value and variance of $T_{ij}$ given by $E(T_{ij}) = \tau_{ij}$ and $Var(T_{ij}) = \tau_{ij}^2/D_{ij}$, respectively. Thus, in an equivalent ODE model, the average travel time from site i to site j is always the delay $\tau_{ij}$ from the DDE model, and the number of dummy sites $D_{ij}$ can be chosen to reflect the travel time variance. Note that $Var(T_{ij}) \to 0$ as $D_{ij} \to \infty$. As such, the equivalent ODE model for the system given by (5) with gamma-distributed time delays results from replacing each $\dot{x}_i$ with the following set of equations:

\dot{x}_i(t) = \sum_{j \sim i} \lambda_{ji} \, y_{ji}^{(D_{ji})}(t) \;-\; \sum_{i \sim j} k_{ij} x_i(t) ,
\dot{y}_{ij}^{(1)}(t) = k_{ij} x_i(t) - \lambda_{ij} \, y_{ij}^{(1)}(t) ,
\dot{y}_{ij}^{(m)}(t) = \lambda_{ij} \left( y_{ij}^{(m-1)}(t) - y_{ij}^{(m)}(t) \right) ,    (9)

for $m = 2, \ldots, D_{ij}$, where $y_{ij}^{(l)}(t)$ denotes the fraction of the population that is at dummy site $l \in \{1, \ldots, D_{ij}\}$, $\lambda_{ij}$ denotes the transition rate between the dummy sites, and $(i, j) \in \mathcal{E}$. As such, the equivalent ODE system has the form given by (2) with added dummy states given by the $y_{ij}^{(m)}$.
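To make the construction concrete, the following is a minimal sketch (ours, not the paper's) of how the rate matrix of the expanded model (9) can be assembled from given $k_{ij}$, $\tau_{ij}$, and $D_{ij}$ values; the state ordering is an assumption of this sketch.

```python
import numpy as np

def expand_with_dummy_sites(k, tau, D):
    """Build the rate matrix A of the expanded ODE model (9).

    k[i, j]  : transition rate k_ij (0 if (i, j) is not an edge)
    tau[i, j]: mean travel time tau_ij along edge (i, j)
    D[i, j]  : number of dummy sites D_ij on edge (i, j)
    Returns A such that dz/dt = A z, where z stacks the site
    fractions x_1..x_M followed by the dummy states y_ij^(m).
    """
    M = k.shape[0]
    edges = [(i, j) for i in range(M) for j in range(M) if k[i, j] > 0]
    idx, n = {}, M                          # dummy states appended after sites
    for (i, j) in edges:
        idx[(i, j)] = n
        n += int(D[i, j])
    A = np.zeros((n, n))
    for (i, j) in edges:
        lam = D[i, j] / tau[i, j]           # lambda_ij = D_ij / tau_ij
        first = idx[(i, j)]
        last = first + int(D[i, j]) - 1
        A[i, i] -= k[i, j]                  # outflow from site i
        A[first, i] += k[i, j]              # into first dummy site
        for m in range(first, last + 1):
            A[m, m] -= lam                  # leave dummy site m
            target = m + 1 if m < last else j
            A[target, m] += lam             # next dummy site, or site j itself
    return A
```

Under these assumptions, the matrix-exponential solution from the earlier sketch applies unchanged to the expanded state vector z.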

At equilibrium, the incoming and outgoing fluxes at each dummy site along the path from site i to site j are equal to $k_{ij}\bar{x}_i$, which yields $\mathbf{K}_{equiv} [\bar{\mathbf{x}}^T \; \bar{\mathbf{y}}^T]^T = 0$ with equilibrium values of $y_{ij}^{(m)}(t)$, $m = 1, \ldots, D_{ij}$, given by:

\bar{y}_{ij}^{(m)} = \frac{k_{ij}\bar{x}_i}{\lambda_{ij}} .    (10)

Furthermore, since $y_{ij}(t) = \sum_{m=1}^{D_{ij}} y_{ij}^{(m)}(t)$, the constraint equivalent to (6) for this system is given by

\sum_{i=1}^{M} \bar{x}_i \left( 1 + \sum_{i \sim j} k_{ij}\tau_{ij} \right) = 1    (11)

with the equilibrium values $\bar{x}_m$ given by [24]:

\bar{x}_m = K_{mm} \Big/ \sum_{p=1}^{M} \Big( 1 + \sum_{p \sim j} k_{pj}\tau_{pj} \Big) K_{pp} , \quad m = 1, \ldots, M ,    (12)

where $K_{mm}$ denotes the (m, m) cofactor of the matrix $\mathbf{K}$ in (3). Note that since $k_{pj} \geq 0$ and $\tau_{pj} > 0$, the equilibrium values will always be lower than those for the corresponding linear model, since a fraction of the population will always be en route. However, the ratio of equilibrium values between any two sites is the same as in the linear model.

Theorem 1: If the graph $\mathcal{G}$ is strongly connected, then the system (9) subject to (6) has a unique, stable equilibrium given by (10), (12).

Proof: Let $\mathbf{y}$ be the vector of fractions $y_{ij}^{(m)}$, $m = 1, \ldots, D_{ij}$, $(i, j) \in \mathcal{E}$. The system state vector is then $\mathbf{z} = [\mathbf{x}^T \; \mathbf{y}^T]^T$. By considering the dummy sites to be equivalent to the original M sites, we can say that each component of $\mathbf{z}$ is the population fraction at a site $i \in \{1, \ldots, M'\}$, where $M' = M + \sum_{i \sim j} D_{ij}$. The interconnection topology of these sites can be modeled as a directed graph, $\mathcal{G}' = (\mathcal{V}', \mathcal{E}')$, where $\mathcal{V}' = \{1, \ldots, M'\}$ and $\mathcal{E}' = \{(i, j) \in \mathcal{V}' \times \mathcal{V}' \mid i \sim j\}$. Since $\mathcal{G}$ is strongly connected, so is $\mathcal{G}'$. Each component of $\mathbf{z}$ evolves according to the linear law:

\dot{z}_i(t) = \sum_{j \sim i} \hat{k}_{ji} z_j(t) \;-\; \sum_{i \sim j} \hat{k}_{ij} z_i(t) ,    (13)

where $\hat{k}_{ij}$, $(i, j) \in \mathcal{E}'$, is defined by the corresponding coefficient in the model (9). As for the linear model, the system of equations for the $M'$ sites is:

\dot{\mathbf{z}} = -\hat{\mathbf{K}}\mathbf{z} ,    (14)

where $\hat{K}_{ij} = -\hat{k}_{ji}$ if $(j, i) \in \mathcal{E}'$ and 0 otherwise, and $\hat{K}_{ii} = \sum_{i \sim j} \hat{k}_{ij}$. The columns of $\hat{\mathbf{K}}$ sum to zero. The system is subject to the conservation constraint:

\sum_{i=1}^{M'} z_i(t) = 1 .    (15)

Since the system can be represented in the same form as (3) subject to (4), the uniqueness and stability proof in [1] can be applied to show that there is a unique, stable equilibrium.

It was shown in [1] that systems of the form (2) result in agents transitioning between sites even at equilibrium, i.e., when the net flux for each site is zero. As such, they force a trade-off between maximizing the transition rates for fast equilibration and achieving long-term efficiency at equilibrium, since the rate of convergence of these systems depends on the magnitude of the transition rates $k_{ij}$. In actual robotic systems, the extraneous traffic resulting from movement between sites at equilibrium comes at a significant cost. As such, it makes sense to design the transition rates, i.e., $\mathbf{K}$, to maximize performance of the system in the presence of gamma-distributed travel times; a numerical check of the equilibrium properties derived above is sketched below.
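As a sanity check (our illustration, with hypothetical rates and delays), one can verify numerically that the expanded model settles to site fractions that are scaled-down versions of the linear model's equilibrium, with pairwise ratios preserved. This reuses the `expand_with_dummy_sites` helper sketched after (9).

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical data reusing the earlier 3-site example.
k = np.array([[0.0, 0.2, 0.1],
              [0.3, 0.0, 0.2],
              [0.1, 0.4, 0.0]])
tau = 5.0 * np.ones((3, 3))               # mean travel times tau_ij
D = 4 * np.ones((3, 3), dtype=int)        # dummy sites per edge

A = expand_with_dummy_sites(k, tau, D)    # expanded model (9)
z0 = np.zeros(A.shape[0])
z0[:3] = [0.7, 0.2, 0.1]                  # all agents start at the sites

z_inf = expm(A * 1e4) @ z0                # long-horizon state ~ equilibrium
K = k.T - np.diag(k.sum(axis=1))
x_inf = expm(K * 1e4) @ z0[:3]            # linear-model equilibrium

# Per (10)-(12): entries below 1 (some agents en route) but ~constant,
# i.e., the ratios between any two sites agree with the linear model.
print(z_inf[:3] / x_inf)
```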

Thus, an optimal deployment strategy is defined as the choice of $\mathbf{K}$ that simultaneously maximizes convergence towards the desired distribution while minimizing the number of idle trips at equilibrium. In other words, an optimal transition matrix, denoted by $\mathbf{K}^*(\mathbf{x}^0)$ for some initial distribution $\mathbf{x}^0$, is one that balances short-term gains, i.e., fast convergence, against long-term losses, i.e., idle trips at equilibrium. We outline our methodology for determining $\mathbf{K}^*(\mathbf{x}^0)$ in the following section.

V. METHODOLOGY

In general, determining an optimal transition matrix $\mathbf{K}$ that satisfies both the short- and long-term restrictions is not trivial. Given a set of short- and long-term requirements, we outline our methodology for determining an optimal transition matrix for a given initial distribution $\mathbf{x}^0$, denoted by $\mathbf{K}^*(\mathbf{x}^0)$.

From some initial $\mathbf{x}^0$, $\mathbf{K}^*(\mathbf{x}^0)$ is determined by using the linear model (2) to calculate, in closed form, the convergence time to a set number of misplaced agents, i.e., agents in transit. This is achieved by decomposing $\mathbf{K}$ into its normalized eigenvectors and eigenvalues and mapping (2) into the space spanned by the normalized eigenvectors. Thus, given an initial state $\mathbf{x}^0$, we can apply the appropriate transformation and compute the new state $\mathbf{x}(t)$ using the matrix exponential of the matrix of eigenvalues multiplied by time. After applying the appropriate transformations, we can then calculate the difference in the number of misplaced agents between this new configuration and the desired one, $\bar{\mathbf{x}}$. Since (2) is stable [1], the number of misplaced agents decreases monotonically with time; as such, a Newton scheme is used to calculate the exact time at which the difference is reduced to 10% of its initial value.

To find the optimal $\mathbf{K}$ for the given initial configuration $\mathbf{x}^0$, Metropolis optimization [25] is used with the entries of $\mathbf{K}$ as the optimization variables. The objective is to minimize the convergence time subject to upper bounds on the number of idle trips at equilibrium and on the individual transition rates, i.e., $k_{ij} \leq k_{ij}^{\max}$. A sketch of this optimization loop is given below; Figure 2 shows the convergence times for the stochastic optimization of the transition rates.
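The following is a minimal sketch of the Metropolis loop described above, under stated assumptions: `convergence_time(K)` (computing the 10% time via the eigendecomposition and Newton scheme) and `idle_trips(K)` are hypothetical helpers, and the proposal and acceptance choices are ours, not necessarily the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis_optimize(K0, convergence_time, idle_trips,
                        k_max, idle_max, beta=10.0, steps=5000):
    """Biased random walk over feasible K matrices (Metropolis [25]).

    Perturbs one off-diagonal rate at a time, rejects infeasible
    proposals (rate bound k_max, idle-trip bound idle_max), and
    accepts cost increases with probability exp(-beta * increase).
    """
    K, cost = K0.copy(), convergence_time(K0)
    for _ in range(steps):
        Kp = K.copy()
        i, j = rng.integers(0, K.shape[0], size=2)
        if i == j:
            continue
        # Perturb one rate; k_max is assumed indexed like K here.
        Kp[i, j] = np.clip(Kp[i, j] + rng.normal(0, 0.01), 0.0, k_max[i, j])
        np.fill_diagonal(Kp, 0.0)
        np.fill_diagonal(Kp, -Kp.sum(axis=0))   # keep columns summing to zero
        if idle_trips(Kp) > idle_max:           # long-term constraint
            continue
        new_cost = convergence_time(Kp)
        if new_cost < cost or rng.random() < np.exp(-beta * (new_cost - cost)):
            K, cost = Kp, new_cost
    return K, cost
```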

Fig. 2. Convergence times for the stochastic optimization of the transition rates. The random walk in the space of all K configurations is biased with the convergence time, so lower times are eventually found. The horizontal axis is the index of the respective configuration; the vertical axis is its convergence time.

VI. RESULTS & DISCUSSION

We consider the deployment of 200 agents, initially located at two sites, to four sites. To make the simulation more realistic, the workspace was chosen to be a four-block region of the University of Pennsylvania's campus, and the task is to perform perimeter surveillance of four distinct campus buildings/collections of buildings selected a priori. We assume the agents are initially located at two sites, and the goal is for them to distribute themselves equally across the four sites of interest. We outline our methodology and results in this section.

A. Methodology

Since the objective is to determine the optimal $\mathbf{K}$ given a collection of M sites with expected site-to-site travel times $\{\tau_{ij}\}$, we assume a convex decomposition of the free space, as shown in Fig. 3. This results in a discrete roadmap where shortest-path computations from one cell to any other cell can be easily obtained using any standard graph search algorithm. At each site, the agents are tasked to monitor the perimeter of the region of interest; as such, once an agent arrives at a site, it is tasked to circulate along the perimeter such that all agents at a given site are uniformly distributed along it. This is easily achieved with feedback controllers of the form given in [26]. The perimeters of interest are shown in Fig. 3(b), denoted by the light-colored dashed lines.

To enable navigation from one site to another, each site is assigned an entry/exit point such that each entry/exit point is associated with a convex cell. A* is then used to determine a priori the shortest path between any pair of sites i, j such that $(i, j) \in \mathcal{E}$, which consists of a sequence of convex cells, $\{c_k\}_{ij}$, to be traversed by each agent. Navigation between convex cells is then achieved by composing local potential functions such that the resulting control policy ensures arrival at the desired goal position [27]. These navigation controllers were then composed with repulsive potential functions to achieve inter-agent collision avoidance within each convex cell. As such, navigation is achieved by providing each agent with the sequences of convex cells for every pair of sites, $\{c_k\}_{ij}$, a priori, while the feedback controllers to go from one cell to another are computed by each agent based on its current position, as well as the set of agents within its sensing range, at every time step. The time required to travel from site i to site j is then given by the sum of the time needed to travel to a site's exit point from an agent's current position on the perimeter and the time spent on the road between one site's exit point and another site's entry point.

To determine the $\tau_{ij}$'s, NUM simulations were run for every pair of sites to estimate the expected value of each $\tau_{ij}$. This was then used to fit an appropriate gamma distribution to the given data set; the number of dummy sites to be added on edge $(i, j)$ and the transition rates between the dummy sites are then given by $D_{ij}$ and $\lambda_{ij}$ from (8), respectively (a sketch of this fitting step is given below). A sample data set and the fitted gamma distribution are shown in Fig. 4.
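A minimal sketch of this fitting step (our illustration, not the paper's code): matching the mean and variance of (8) by the method of moments recovers $D_{ij}$ and $\lambda_{ij}$ from simulated travel-time samples. The sample array below is a placeholder generated to resemble the data in Fig. 4.

```python
import numpy as np

def fit_dummy_chain(samples):
    """Method-of-moments fit of the gamma density (8) to travel times.

    For a gamma with integer shape D and rate lam, E(T) = D/lam and
    Var(T) = D/lam**2, so D = E(T)**2 / Var(T) and lam = D / E(T).
    """
    mean, var = np.mean(samples), np.var(samples)
    D = max(1, int(round(mean**2 / var)))   # number of dummy sites D_ij
    lam = D / mean                          # per-hop rate lambda_ij
    return D, lam

# Placeholder travel-time samples (seconds) standing in for navigation runs.
samples = np.random.default_rng(1).gamma(shape=14, scale=1 / 0.019, size=500)
print(fit_dummy_chain(samples))             # ~ (14, 0.019), cf. Fig. 4
```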

Fig. 3. (a) Region of interest selected from the University of Pennsylvania's campus map. The region shown is approximately 100 m × 200 m in area. (b) Cell decomposition of the free space used for navigation.

This was then used to determine $\hat{\mathbf{K}}_{opt}$.

B. Results

C. Discussion

This work is devoted to a closer analysis of inter-site travel and to modalities for including realistic models of agent movement in optimization schemes for the stochastic, decentralized deployment of large swarms to multiple sites. In addition to the fact that travel takes finite time and thus must be included in the design process, we emphasize that travel times are potentially highly variable due to many factors. Interactions with other agents result in modified navigation patterns; collision avoidance in simulations is in fact one approach to estimating the physical capacities (throughputs) of paths that cannot be modeled as pipelines with quantifiable properties.

The idea of emulating travel time delays with Poisson processes is straightforward; the more sophisticated version, which relies on a linear chain of such processes, is less well known. We apply these ideas here to a concrete example which is supported by simulations of agent movements. The main theoretical benefit of this work is the fact that we are

Fig. 4. Histogram of the travel times between the northeast and southwest sites in Fig. 3(b), shown in conjunction with the approximate gamma distribution for this data set. Based on these data, $D_{ij} = 14$ and $\lambda_{ij} = 0.0190$.

able to treat travel within the framework of linear ODE-driven deployment models, and can thus take advantage of the powerful formal mathematical methods developed for such systems. The distributions of joint waiting times for chains of Poisson processes are given by the so-called gamma distributions. We approximate the travel time distributions obtained from navigation simulations with gamma distributions, which provide the lengths and individual transition rates of the linear chains used to emulate traveling agents. This stands in stark contrast to the equations that result from idealized models of switching: linear ordinary differential equations are well studied and have quasi-closed-form solutions, and the linear chain "trick" allows for variability in travel times while remaining easy to work with.

VII. CONCLUSION

Directions for future work include: more realistic navigation, using simulations to quantify path/road capacities and the effect of agent crowding; accommodating higher travel-time variability, suggesting shorter chains, but perhaps parallel ones; less brute-force optimization methods, which have been a major hurdle on the path towards experimental implementation; and, perhaps, a software package some day.

VIII. ACKNOWLEDGMENTS

The authors gratefully acknowledge the contribution of the National Research Organization and the reviewers' comments.

REFERENCES

[1] A. Halasz, M. A. Hsieh, S. Berman, and V. Kumar, "Dynamic redistribution of a swarm of robots among multiple sites," in Proceedings of the Conference on Intelligent Robot Systems (IROS'07), 2007.
[2] L. Lin and Z. Zheng, "Combinatorial bids based multi-robot task allocation method," in Proc. of the 2005 IEEE Int. Conference on Robotics and Automation (ICRA'05), 2005.
[3] J. Guerrero and G. Oliver, "Multi-robot task allocation strategies using auction-like mechanisms," Artificial Research and Development in Frontiers in Artificial Intelligence and Applications, 2003.
[4] E. G. Jones, B. Browning, M. B. Dias, B. Argall, M. Veloso, and A. T. Stentz, "Dynamically formed heterogeneous robot teams performing tightly-coordinated tasks," in International Conference on Robotics and Automation, May 2006, pp. 570–575.

[5] M. B. Dias, R. M. Zlot, N. Kalra, and A. T. Stentz, "Market-based multirobot coordination: a survey and analysis," Proceedings of the IEEE, vol. 94, no. 7, pp. 1257–1270, July 2006.
[6] D. Vail and M. Veloso, "Multi-robot dynamic role assignment and coordination through shared potential fields," in Multi-Robot Systems, A. Schultz, L. Parker, and F. Schneider, Eds. Kluwer, 2003.
[7] B. P. Gerkey and M. J. Mataric, "Sold!: Auction methods for multirobot control," IEEE Trans. on Robotics & Automation: Special Issue on Multi-robot Systems, vol. 18, no. 5, pp. 758–768, Oct 2002.
[8] E. G. Jones, M. B. Dias, and A. Stentz, "Learning-enhanced market-based task allocation for oversubscribed domains," in Proceedings of the Conference on Intelligent Robot Systems (IROS'07), 2007.
[9] M. B. Dias, "Traderbots: A new paradigm for robust and efficient multirobot coordination in dynamic environments," Ph.D. dissertation, Robotics Institute, Carnegie Mellon Univ., Pittsburgh, PA, Jan 2004.
[10] M. Golfarelli, D. Maio, and S. Rizzi, "Multi-agent path planning based on task-swap negotiation," in Proc. 16th UK Planning and Scheduling SIG Workshop, 1997.
[11] W.-M. Shen and B. Salemi, "Towards distributed and dynamic task reallocation," in Intelligent Autonomous Systems 7, Marina del Rey, CA, 2002, pp. 570–575.
[12] S. Pratt, E. B. Mallon, D. J. T. Sumpter, and N. R. Franks, "Quorum sensing, recruitment, and collective decision-making during colony emigration by the ant Leptothorax albipennis," Behavioral Ecology and Sociobiology, vol. 52, pp. 117–127, 2002.
[13] N. F. Britton, N. R. Franks, S. C. Pratt, and T. D. Seeley, "Deciding on a new home: how do honeybees agree?" Proc R Soc Lond B, vol. 269, 2002.
[14] J. Ame, J. Halloy, C. Rivault, C. Detrain, and J. Deneubourg, "Collegial decision making based on social amplification leads to optimal group formation," in Proc. of the Nat'l Academy of Sciences of the USA, vol. 103:15, 2006, pp. 5835–5840.
[15] K. Lerman, C. V. Jones, A. Galstyan, and M. J. Mataric, "Analysis of dynamic task allocation in multi-robot systems," Int. J. of Robotics Research, vol. 25, no. 4, pp. 225–242, 2006.
[16] J. Guerrero and G. Oliver, "Modeling and optimal centralized control of a large-size robotic population," IEEE Transactions on Robotics, vol. 22, no. 6, pp. 1280–1285, 2006.
[17] A. Martinoli, K. Easton, and W. Agassounon, "Modeling of swarm robotic systems: a case study in collaborative distributed manipulation," Int. J. of Robotics Research: Special Issue on Experimental Robotics, vol. 23, no. 4-5, pp. 415–436, 2004.
[18] N. Franks, S. C. Pratt, N. F. Britton, E. B. Mallon, and D. T. Sumpter, "Information flow, opinion-polling and collective intelligence in house-hunting social insects," Phil. Trans.: Biological Sciences, 2002.
[19] S. C. Pratt, "Quorum sensing by encounter rates in the ant Temnothorax albipennis," Behavioral Ecology, vol. 16, no. 2, 2005.
[20] S. Berman, A. Halasz, V. Kumar, and S. Pratt, "Algorithms for the analysis and synthesis of a bio-inspired swarm robotic system," in 9th Int. Conf. on the Simulation of Adaptive Behavior (SAB'06), Swarm Robotics Workshop, Rome, Italy, Sept-Oct 2006.
[21] ——, "Bio-inspired group behaviors for the deployment of a swarm of robots to multiple destinations," in Proc. 2007 IEEE Int. Conf. on Robotics and Automation (ICRA'07), Rome, Italy, April 2007.
[22] N. MacDonald, Time-lags in biological models, ser. Lecture Notes in Biomathematics. Springer, Berlin, 1978, vol. 27.
[23] B. Harris, Theory of Probability. Reading, MA: Addison-Wesley, 1966.
[24] H. Othmer, Analysis of Complex Reaction Networks, Lecture Notes. University of Minnesota, 2003.
[25] D. P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics. Cambridge University Press, 2000.
[26] M. A. Hsieh, S. Loizou, and V. Kumar, "Stabilization of multiple robots on stable orbits via local sensing," in Proceedings of the International Conference on Robotics and Automation (ICRA) 2007, Rome, Italy, April 2007.
[27] D. C. Conner, A. Rizzi, and H. Choset, "Composition of local potential functions for global robot control and navigation," in Proceedings of 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), vol. 4. IEEE, October 2003, pp. 3546–3551.
