A Swarm Based Approximated Algorithm to the Extended Generalized Assignment Problem (E-GAP)

Paulo R. Ferreira Jr.†‡, Felipe S. Boffo† and Ana L. C. Bazzan†

†Instituto de Informática, Universidade Federal do Rio Grande do Sul
Caixa Postal 15064, CEP 91501-970, Porto Alegre, RS, Brasil

‡Instituto de Ciências Exatas e Tecnológicas, Centro Universitário Feevale
RS239, 2755 - CEP 93352-000, Novo Hamburgo / RS, Brasil

{prferreiraj,fboffo,bazzan}@inf.ufrgs.br
ABSTRACT
This paper addresses distributed task allocation in complex scenarios modeled using the distributed constraint optimization problem (DCOP) formalism. We propose and evaluate a novel algorithm for distributed task allocation based on theoretical models of the division of labor in social insect colonies, called Swarm-GAP. Swarm-GAP was evaluated in an abstract centralized simulation environment and in the RoboCup Rescue Simulator. We show that Swarm-GAP achieves results similar to those of another recently proposed algorithm, with a dramatic reduction in communication and computation. Our approach is thus highly scalable in both the number of agents and the number of tasks.

Categories and Subject Descriptors
I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence—Multiagent systems

General Terms
Algorithms, Experimentation

Keywords
Emergent Behavior, Task and Resource Allocation, Distributed Constraint Processing

1. INTRODUCTION
Research on multiagent system coordination through distributed task allocation has advanced significantly in the last few years. One successful direction, from the multiagent community's perspective, has been the Distributed Constraint Optimization Problem (DCOP) framework. Task allocation in complex environments, when modeled as a DCOP, results in very hard problems that cannot be handled by the traditional optimal/complete DCOP approaches, which restricts their applicability. We regard a distributed task allocation scenario as complex when small instances, formalized as a DCOP, generate large problems with exponentially growing parameters. These complex DCOP scenarios introduce new challenges for DCOP research, and real-world scenarios are usually large-scale, dynamic, and complex. We propose Swarm-GAP, an approximate anytime algorithm for distributed task allocation based on the division of labor in social insect colonies and on the theoretical models that describe it. The method is highly scalable in both the number of agents and the number of tasks, and it solves the E-GAP [2] model for dynamic task allocation in complex DCOP scenarios. Cooperative agents running our algorithm coordinate their actions with little communication and computation. We empirically evaluated Swarm-GAP in two simulated environments: an abstract, domain-independent simulator and the RoboCup Rescue Simulator [1], which provides a more realistic scenario. Our swarm algorithm is compared mainly with LA-DCOP [2], an approximate DCOP method that seems to outperform prominent competitors.

2. SWARM-GAP
The aim of Swarm-GAP is to let agents decide individually which task to execute in a simple and efficient way, minimizing computational and communication effort. As in [2], we assume that communication does not fail; in the future we intend to relax this assumption and run tests with unreliable communication channels. Agents in Swarm-GAP decide which task to execute based on the same mechanism used by social insects. The tendency of agent i ∈ I to execute task j ∈ J is given by its internal threshold θij, the task stimulus s, and the execution coefficient xj of task j, which accounts for the tasks related to j by an AND constraint, as shown in Equation 1. The constant ω is a discount rate that decreases the weight of the execution coefficient in the computation of the tendency.
Tθij(s) = (sj^2 + ω xj) / (sj^2 + θij^2)    (1)

Every task has the same associated stimulus, so there is no priority among tasks during allocation: the stimulus sj is identical for every task j, and its value was determined empirically, through direct experimentation, so as to maximize the system reward. The execution coefficient xj associated with task j is computed from the ratio between the number of already allocated tasks |nj| and the total number of tasks |Nj| that are related to task j through an AND relationship, as given by Equation 2.
xj = (1 + |nj|) / |Nj|    if |nj| = |Nj| and |nj| > 1
xj = 0                    otherwise                        (2)
Swarm-GAP uses polymorphism to set up the agents' thresholds according to their capabilities, as shown in Equation 3:

θij = 1 − capabilityi(j)    (3)

where capabilityi(j) is the capability of individual i regarding task j.
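To make Equations 1 to 3 concrete, the following Python sketch computes the tendency of an agent for a task under our reading of the notation. The function and parameter names, the example value chosen for ω, and the simple scalar arguments are ours, for illustration only; they are not part of the original formulation.

import random

OMEGA = 0.5  # illustrative value for the discount rate omega; the paper does not fix one here

def threshold(capability_ij):
    # Equation 3: polymorphism derives the threshold from the agent's capability.
    return 1.0 - capability_ij

def execution_coefficient(n_allocated, n_related):
    # Equation 2 (as printed): the coefficient is non-zero only when all
    # AND-related tasks are already allocated and there is more than one of them.
    if n_allocated == n_related and n_allocated > 1:
        return (1 + n_allocated) / n_related
    return 0.0

def tendency(stimulus_j, capability_ij, n_allocated, n_related):
    # Equation 1: tendency of agent i to execute task j.
    theta_ij = threshold(capability_ij)
    x_j = execution_coefficient(n_allocated, n_related)
    return (stimulus_j ** 2 + OMEGA * x_j) / (stimulus_j ** 2 + theta_ij ** 2)

# An agent accepts a task when a uniform random draw falls below its tendency.
t = tendency(stimulus_j=0.2, capability_ij=0.7, n_allocated=0, n_related=3)
accept = random.random() < t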
Algorithm 1 details Swarm-GAP. Agents running Swarm-GAP communicate using a token-based protocol and react to two events: task perception and message arrival. When an agent perceives some tasks, it creates a token composed of these tasks (line 5); alternatively, it receives the token from another agent (line 10). Once this is done, the agent has the right to decide which tasks to allocate to itself (lines 14 to 21), according to the tendency given by Equation 1. This decision also depends on whether the agent has the resources required to perform the task; the amount of resources the agent holds is decreased by the amount required by the task (line 19). Afterwards, the agent marks itself as visited in the token (line 23). This prevents deadlocks, since it avoids passing the token to agents that have already received it. At the end of this process, if the token still has available tasks, a token message is sent to an agent randomly selected among those that have not yet received the token in the current allocation (lines 24 to 29). The size of the token message is proportional to the number of tasks.
3. EXPERIMENTS AND RESULTS
Empirical evaluation was conducted in two simulators. The first is an abstract, domain-independent simulator written in C++; the second is the RoboCup Rescue Simulator [1], a real-time environment in which agents representing fire brigades, police forces, and ambulances must cooperate to mitigate the effects of an earthquake affecting an urban area. The abstract simulator allows experimentation with large numbers of agents and tasks; in it, Swarm-GAP is compared with two methods, LA-DCOP and a centralized greedy algorithm, using exactly the same setup used in [2] to evaluate LA-DCOP. In the RoboCup Rescue Simulator, our experimental setup features 10 fire brigade agents, whose capabilities are computed from the normalized distance to fires. Two scenarios were used: partial maps of the cities of Kobe and Foligno. In the first there are 8 ignition points, clustered in three areas; in the second, a more difficult situation, there are 14 ignition points, clustered in four areas. Again, we compare Swarm-GAP with LA-DCOP and with the greedy approach.
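The paper does not spell out how the normalized distance is turned into a capability; purely as an assumption for illustration, a linear normalization such as the one below would give nearby fires higher capability values. The function name and the linear form are ours, not the authors' formula.

def capability_from_distance(distance, max_distance):
    # Assumed normalization (not from the paper): capability decreases linearly
    # with the distance between the fire brigade and the fire.
    if max_distance <= 0:
        return 0.0
    return max(0.0, 1.0 - distance / max_distance)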
3.1 Abstract Simulations
In the first experiment we measure the average number of messages per simulation step, according to the number of agents, for different setup parameters of Swarm-GAP (stimulus 0.2 and 0.08) and LA-DCOP (threshold 0.0 and 0.8).
Algorithm 1 Swarm-GAP(agentId)
 1: loop
 2:   ev ← waitEvent()
 3:   if ev = task perception then
 4:     J ← set of new tasks
 5:     token ← newToken()
 6:     for all j ∈ J do
 7:       token.addTask(j, -1)
 8:     end for
 9:   else
10:     token ← receiveToken()
11:   end if
12:
13:   r ← availableResources()
14:   τ ← token.availableTasks()
15:   for all t ∈ τ do
16:     θt ← 1 − capability(t)
17:     if roulette() < Tθt(s) and r ≥ ct then
18:       token.alloc(t, agentId)
19:       r ← r − ct
20:     end if
21:   end for
22:
23:   token.visited(agentId)
24:   τ ← token.availableTasks()
25:   if |τ| > 0 then
26:     A ← token.availableAgents()
27:     i ← rand(A)
28:     sendToken(i)
29:   end if
30: end loop
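The listing below is a compact Python rendering of Algorithm 1's token handling (lines 13 to 29), written by us to show one way the protocol could be implemented. The Token class and the helper names on the agent (tendency, cost, available_resources, known_agents, send_token) are our own sketch, not the authors' code.

import random

class Token:
    # Minimal token: each task maps to the id of the agent that took it (-1 = free).
    def __init__(self, tasks):
        self.owner = {t: -1 for t in tasks}
        self.visited = set()

    def available_tasks(self):
        return [t for t, owner in self.owner.items() if owner == -1]

    def alloc(self, task, agent_id):
        self.owner[task] = agent_id

def handle_token(agent, token):
    # One agent's turn with the token, mirroring lines 13-29 of Algorithm 1.
    r = agent.available_resources
    for t in token.available_tasks():
        # Probabilistic one-shot decision: accept with probability given by the tendency.
        if random.random() < agent.tendency(t) and r >= agent.cost(t):
            token.alloc(t, agent.id)
            r -= agent.cost(t)
    agent.available_resources = r
    token.visited.add(agent.id)
    if token.available_tasks():
        candidates = [a for a in agent.known_agents if a not in token.visited]
        if candidates:
            agent.send_token(random.choice(candidates), token)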
Figure 1 shows that when LA-DCOP works with threshold equal to zero, its number of exchanged messages is the smallest, i.e., the average number of messages exchanged by Swarm-GAP is greater than that of LA-DCOP. However, in this case the total reward is significantly lower for LA-DCOP than for Swarm-GAP. The total reward in E-GAP is computed as the sum of the agents' capabilities for each task they allocated. Swarm-GAP with stimulus 0.2 and LA-DCOP with threshold 0.8 achieve similar rewards (Swarm-GAP is less than 10% worse than LA-DCOP), but the number of messages exchanged by LA-DCOP is dramatically greater than that of Swarm-GAP, about 100% more on average.
In the second experiment, we compare the results of Swarm-GAP with those achieved by the centralized greedy algorithm and by LA-DCOP. Figure 2 shows the total reward for different numbers of agents, achieved by Swarm-GAP with the best stimulus for each number of agents and by LA-DCOP with the best threshold for each number of agents. As expected, the greedy approach outperforms both Swarm-GAP and LA-DCOP. Swarm-GAP nevertheless performs well, achieving rewards that are on average only 20% lower than the greedy ones and 15% lower than those of LA-DCOP. The greedy strategy, used to benchmark the other methods, allocates each appearing task to the most capable available agent, but does so in a centralized way.
To illustrate the advantages of Swarm-GAP over LA-DCOP regarding the number of exchanged messages, Figure 3 shows the average reward divided by the total number of exchanged messages, for different numbers of agents, achieved by Swarm-GAP with stimulus 0.2 and LA-DCOP with threshold 0.8.
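As a small illustration of the reward metric mentioned above, the static part of the E-GAP reward is just the summed capability of each agent for the tasks it allocated; the sketch below, with names of our choosing, ignores the delay and dynamics aspects of E-GAP.

def total_reward(allocation, capability):
    # allocation: iterable of (agent, task) pairs; capability[agent][task] is in [0, 1].
    return sum(capability[agent][task] for agent, task in allocation)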
Figure 1: Average number of messages per simulation cycle versus the number of agents, for Swarm-GAP (stimulus 0.08 and 0.20) and LA-DCOP (threshold 0.0 and 0.8).
Figure 3: Comparison of the average reward divided by the total number of exchanged messages for different numbers of agents (Swarm-GAP with stimulus 0.2 and LA-DCOP with threshold 0.8).
Figure 2: Comparison of the total reward for different numbers of agents.

This setup of the two algorithms leads to similar rewards. As can be seen, Swarm-GAP exchanges a significantly lower number of messages. Furthermore, since Swarm-GAP uses a probabilistic decision process, the computation the agents need to make their decisions is significantly lower than in LA-DCOP. An agent running LA-DCOP chooses to allocate the tasks that maximize the sum of its capabilities while respecting its resource constraints. This is a maximization problem that can be reduced to the Binary Knapsack Problem (BKP), which is proven NP-complete. The computational complexity of LA-DCOP therefore depends on the complexity of the procedure it uses to solve the BKP, and each agent solves several BKP instances during a complete allocation. An agent running Swarm-GAP, in contrast, chooses which tasks to allocate according to the probability computed by Equation 1, constrained by its available resources. This is a simple one-shot decision process.
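To make the computational difference tangible, the sketch below contrasts the two per-token decisions: a standard 0/1 knapsack dynamic program, of the kind LA-DCOP's value-maximizing choice reduces to, against Swarm-GAP's constant-time probabilistic test per task. The code is our illustration, not either system's implementation, and it assumes integer resource costs so that the dynamic program applies.

import random

def knapsack_value(values, costs, budget):
    # LA-DCOP-style decision: the best summed value of a task subset that fits the
    # resource budget, via the classic 0/1 knapsack DP in O(n * budget) time.
    best = [0] * (budget + 1)
    for value, cost in zip(values, costs):
        for b in range(budget, cost - 1, -1):
            best[b] = max(best[b], best[b - cost] + value)
    return best[budget]

def swarm_gap_selection(tendencies, costs, budget):
    # Swarm-GAP-style decision: one probabilistic test per task, O(n) time overall.
    taken = []
    for index, (tend, cost) in enumerate(zip(tendencies, costs)):
        if random.random() < tend and budget >= cost:
            taken.append(index)
            budget -= cost
    return taken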
3.2 RoboCup Rescue Simulations
In the Kobe map, Swarm-GAP performs slightly better than LA-DCOP, extinguishing all fires in 28.7 steps against the 34.3 steps needed by LA-DCOP. This can be explained by the fact that, while LA-DCOP agents do not accept distant tasks at all, some Swarm-GAP agents occasionally accept them and thus help to prevent fire from spreading in the distant clusters. In the second situation (the Foligno map), this seems to work the other way: the same fluctuation seems to prevent a more concentrated effort, and Swarm-GAP needs 149.1 steps versus the 112.2 steps needed by LA-DCOP. Both algorithms outperform the greedy one.
4. CONCLUSIONS
The approach introduced here, Swarm-GAP, deals with task allocation in complex scenarios modeled as DCOPs, building on theoretical models of the division of labor in swarms. The algorithm solves complex DCOPs in an approximate and distributed fashion and is intended to be simple and effective. The experimental results show that the probabilistic decision, based on the tendency and polymorphism models, allows the agents to take reasonably coordinated actions. In the abstract simulation Swarm-GAP performs well, achieving rewards that are, on average, only 20% worse than those achieved by a greedy centralized approach and 15% worse than those achieved by another recently proposed algorithm. However, by the nature of its mechanisms, Swarm-GAP uses significantly less communication and computation than the latter. In the RoboCup Rescue simulation both Swarm-GAP and the cited algorithm achieve similar results, outperforming the greedy strategy.
5. ACKNOWLEDGMENTS
This research is partially supported by the Air Force Office of Scientific Research (AFOSR) (grant number FA9550-06-1-0517). Ana L. C. Bazzan is partially supported by CNPq.
6. REFERENCES
[1] H. Kitano, S. Tadokoro, I. Noda, H. Matsubara, T. Takahashi, A. Shinjou, and S. Shimada. RoboCup Rescue: Search and rescue in large-scale disasters as a domain for autonomous agents research. In Proc. of the IEEE International Conference on Systems, Man and Cybernetics, pages 739–743, Los Alamitos, 1999. IEEE Computer Society.
[2] P. Scerri, A. Farinelli, S. Okamoto, and M. Tambe. Allocating tasks in extreme teams. In Proc. of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, pages 727–734, Utrecht, The Netherlands, 2005. ACM Press, New York.