to the indexing where node i has node i/2 as its parent and nodes 2i and 2i+1 as its children. We assign robot and task information to the nodes as follows: each ...
From: FLAIRS-01 Proceedings. Copyright © 2001, AAAI (www.aaai.org). All rights reserved.
A Binary Tree based Approach for the Design of Fault-Tolerant Robot Team Haihang
Sun and Robert
McCartney
Departmentof ComputerScienceand Engineering Universityof Connecticut Storrs, CT06269-3555 {haihang[ robert } @engr.uconn.edu
Abstract Taskallocation andload balancingare critical in faulttolerant robotic teamdesign. This paper deals with the problemsof allocating andreallocating tasks to multiple robotsin a fault-tolerantmanner.Usinga tree structureto store global information about active robots and unfinishedsub-tasks, eachrobot can reallocate itself to tasks independentlywhenrobot failure occurs in the team. Organizedin full distributed manner,robots in the team respondrobustly, flexibly and without negotiation with other team membersto robot failure and other environmentchanges. Simulationresults are given to showthe feasibility of this approach. Introduction One of the most important and well-known advantages in using multiple robots is that it is possible to increase the robustness and fault-tolerance of a robotic system. It can be easier, cheaper, moreflexible and morefault-tolerant to build and using several simple robots than to have a single powerful, complicated robot. In a multi-robot system, the failure of a single robot mayonly cause partial degradation of system performance, and the robots can be much less complexsince each robot in the system is only responsible for partial fulfillment of the total task. However,several challenging issues, which do not arise in single robot systems, remain to be addressed while constructing multirobot systems (Mataric 1995): ¯ Howdo we describe, decomposeand allocate problems amonga group of robots? ¯ What is the communication and interaction structure amongrobots? ¯ Howdo we achieve globally coherent and efficient solutions from the interaction of robots lacking environment information? Problemssuch as these are mademore difficult if we allow robots to fail. Tasks need to be reallocated, communicationsand interaction structure must change, the robot-robot interactions will change, and so forth. To maintain efficiency, we need to provide fault tolerance Copyright ©2001, AAAI. Allrights reserved. 536
FLAIRS-2OOl
without excessive redundancy: we want to get useful work fromall robots available. Considering the limited communicationbandwidth in real physical robot systems, and the goal of scaling up to large robot teams, information exchange among robots needs to be limited. Anyschemerequiring large amountof communicationamongmanyrobots is not acceptable. In this paper, we consider using binary trees to keep information about tasks and available robots. Usingthese, each robot can allocate himself to tasks independent of other robots’ decisions. Robots organized in this architecture can dynamically reallocate tasks without negotiation when robot failure occurs. We have demonstrated this approach in simulation, and give exampleslater in this paper. The following section gives a brief overview of the related work in this area. The next section describes in detail our approach used for homogeneousrobots. Next, we present the results of our studies, followed by a discussion of these results. Finally, we concludethe paper, discussing questions yet to be resolved. Related
Work
Various approaches for achieving fault tolerance in robot teams have been proposed in the past few years. Parker (1994, 1998) demonstrates the ALLIANCE architecture, which is used to study fault tolerant team of largely cooperation in a heterogeneous independent, loosely coupled robots. Unlike typical behavior-based approaches, in ALLIANCE,individual robots are based on a behavior-based controller with an extension for activating behavior sets that accomplish certain tasks. Each behavior set is activated by motivational behaviors whose activations are in turn determined by using the current rates of impatience and acquiescence, as well as sensory feed back and the awareness of its teammates. Whenthe environmental changes require the robot to take over tasks from other malfunctioning team members or to give up its own current tasks, the motivations of impatience and acquiescence allow robot system to handle individual robot
failure and malfunction fluidly and flexibly, leading to adaptive action selection to ensure mission completion. Experiments on physical and simulated robot systems demonstrateits qualities of robustness, fault tolerance and flexibility. Schneider-Font~n and Mataric (1998) study territorial approach to minimizeinter-robot interference, and thus to achieve high task efficiency and fault tolerance. In the distributed clean-up and collection task, the robots are assigned individual territories which can be dynamically resized corresponding to robot failure, permitting the completion of the mission. Amongthe interesting features of this approachis that each robot can independently(and deterministically) allocate its owntask by simply knowingwhich robots have failed. Dias and Stenz (2000) describe a free market architecture with whichrobots coordinate to solve a given task. Eachrobot tries to maximizeits personal profit while executing the task. The robots negotiate amongthemselves to minimizetheir costs and maximizetheir profits. In this way, tasks are assigned to the most appropriate robots, thus dynamically allocating tasks amongrobots. The approach proposed in this paper stores global information about tasks and robots in a binary tree. This information allows individual robots to make workload balancing decisions quickly and independently, while keeping the average workload in each sub-group within a tolerable range. This makesphysical cooperative multiple robot team construction feasible with limited communication: allocation information is shared, but reallocation decisions are made without negotiations amongthe robots.
combinedinto a single subtask; an exampleof this is using a tree to partition a region so that siblings combineinto a contiguous region.) In subsequent figures, we label nodes corresponding to the indexing wherenode i has node i/2 as its parent and nodes 2i and 2i+1 as its children. Weassign robot and task information to the nodes as follows: each leaf k, which corresponds to a single subtask, is given the value (Rk,Sk), where Rk denotes the number of robots assignedto its task, and Sk is 1 if the task is unfinished, 0 if the task is finished. In general, the value of node i is basedon the sumof the values in its children’s subtrees, as illustrated in (Figure 1): for value (Ri,Si) Ri denotes the numberof active robots in the sub-tree and Si denotes the numberof unfinished tasks in the sub-tree. Figure 2 gives someexampletrees with values at nodes. 1 (5,5) 2/~~3
/6
8 9 (1,1) (IA)
10 ll 12 13 14 15 (0,0) (0,0) (l,l)(l,l)(I,l) (a) 1 (5,4) 2/~~3
Approach Representation of global knowledge In this approach, robots use balanced binary trees for the purpose of accumulation and utilization of the global knowledgeabout the environment.
4 /
A given task can be divided into N sub-tasks, then assigned to the leaves of our tree (if N is not a powerof two we can pad the subtasks with the appropriate number of dummysubtasks that we consider as already finished. This ideally reflects a natural hierarchical decomposition, that is, the two children of a single node can reasonably be
/
6
\7
A(°’°) //~(2’2) (l’l) 10 I 1 12 13 14 15 (0,0) (0,0) (l,{)({,{)(I,l)
(2’1) A
8 9 0,0) (l,I)
(b) I (4,5) (3,3)
2 k (Rk,Sk)
Figure 1 : Binarytree structure used for storing global information.
(3,3)
\5
Ri = Rj + R k Si = Sj + S k
j (RiSj)
(3,3)
/4
~
(1,2)
8 9 (0,1) 0,1)
\5
6
/
\7
10 11 12 13 14 15 (o,o) (o,o) ({,{) ({,l) (o,o) (c)
Figure 2: Global information kept by robots.
ROBOTICS 537
Eachrobot keeps a copy of the tree itself, and makes control decisions based on the tree information. Whena robot changes its location or finishes a task, this information is broadcast to the other robots, as well as sufficient information for all to determinethat (and where) a robot has died. For example, in Figure 2.a, suppose the robot in Node 8 finishes the task: it changes the value of this node to (I,0), and propagates this change up the tree, resulting the tree in Figure 2.b. If a robot detects the failure of another robot (doesn’t receive "I amalive" messagefor an amountof time, e.g.), it will change the robot value part of the binary tree correspondingly. Take Figure 2.a for example: if the robot in node 8 fails, the other robots will change the tree as Figure 2.c i = node# of robot L ; finish =FALSE; while((i>l) &&(!finish)) if(i %2~0) j=i+l;//i is left child,j is its sibling else /=i-l; //i is right child. // Checkfor" Sj ==0" if( S./~ 0 ) Ratio= ( Pl + P2)/2 else Ratio= (Si / Ri ) / (S/ / Rj ); if (Ratio< Pl ) Prl=RandomO; //Pr2 ~[0,1] if (Prl P2 finish =TRUE; //keep in sameplace else i =i/2; //go to upperlevel. } Figure 3: Dynamictask allocation algorithm Dynamic task allocation Wheneverthere is any changes in the tree, each robot reallocates itself task basedon the informationof the tree. Supposerobot L is currently doing task i in leaf i. The numberof total active robots in this node is Ri, the
538
FLAIRS-2001
number of total unfinished tasks in this node is S,. (R/,S/) denotes its node value. Robot L compares its node value (Ri,Si) with its sibling’s node value (Rj,Sj). If Si < Sj -- --’Pl, R i Rj
the average workload in node /is
less than that in node j by a factor of at least 1/pl (Pl < 1, Pl is the load balancing threshold coefficient which defines the unbalance tolerance in the two sibling sub-trees), robot L maydecide to leave node i and go to node j, in order to balance the workloadbetween node i and node j. If the task balance in these two nodes is comparable (equal within a factor of Pl, robot L will compare its parent node with its parent’s sibling node, and determine in similar fashion whether it can help balance betweenits parent sub-tree and its parent’s sibling sub-tree. This procedureis to balance the workloadwith its cousin nodes. Figure 3 showsthe details of the algorithm. Conflict resolving and self adaptation It is inevitable that two or more robots maysimultaneously choose to help their sibling node. In Figure 4.a, robot in node 4 and node 5 maymoveto their parent’s sibling subtree at the sametime, and the tree becomesas Figure 4.b. 1 2 / A(2’2) 5
I (2,4) \3
(0’2) 6 7 (03) (o3) (a)
Figure 4: Robots oscillate states.
2
/
\3
A(0’2) (2’2) 4 5 6 7 (o,I)(o,l)(13) (i,t) (b) back and forth between two
In the worst case, these two robots mayoscillate back and forth betweentasks. In order to minimizelikelihood of such oscillation, we makethe decision random: the move is chosen with probability p. This p defines the probability that an individual robot makes a decision to movewhen it can. The moveprobability is determined by the number of robots that might potentially moveto the new location. Considering the Figure 4 example: the robots in node 2’s subtree each self-assign a p of 0.5; the probability of both robots choosing to moveis 0.25, while the probability of exactly one robot moving is .5. There is also the probability of .25 that neither robot moves, but the
remaining unbalance can be addressed situation evaluation.
on the next
Collapsing subtasks One potentially bad behavior occurs when there are less robots in a subtree than tasks: a robot that is the only one assigned to a task maymove to its sibling that has no robot, leaving its current task unassigned. One solution would be to prohibit such moves; if there are not enough robots, sometasks haveto wait for other subtasks to finish. In some cases this is not acceptable: in situations where subtasks never finish, or in situations where tasks cannot reasonably left unassigned. As an exampleof both of these, consider the task to patrol a region, whichcan be partitioned into disjoint regions. A possible solution is to collapse two subtasks into their parent, and to assign the robot to the parent task (that is, one robot is assigned to both subtasks). Howthe individual robot allocates its effort between these subtasks maybe determined by the individual problemconstraints.
Simulation of task reallocation The simulationts were carried out on a multiple robot simulator, which models the physical robots in our lab, various aspects of which are described in Netter, 1998, and Wurst & McCartney, 1996. The overall task is sampling a square area (see McCarmey& Sun, 2000a, 2000b) using eight robots; this is done by partitioning the area into four regions (I,1I,III, and IV, Figure 5), and
1
2-J -3 4 / I
\5 1I
6/ I11
\7
.......
IV
Figure5
Figure 6.a 8 robots simultaneously work on four areas with 2 robots at each area.
i.m..
..........
Figure 6.b Whenthe four robots working at area III &IV fail, one robot workingat area i will go to area Ill, one robot at area II will go to area IV.
Figure 6.c Whenthe robot working at area ! fails, robot workingat area II will take charge of the work of sampling both area I and area Ii.
ROBOTICS 539
bandwidth. Experimental results show that robots in the team can respond effectively to robot failures, allowing successful performanceeven if only one robot remains. Much work remains, however. For example, a more general mode[ of probabilistic movementdecisions needs to be developed and analyzed to characterize how well tasks are balanced, and howlong it takes to regain balance given randomizedbehavior. Weshould consider the costs of reallocating robots, which has multiple components. One possibility for this wouldbe to consider other tree structures that better capture the spatial characteristics of the environment. Furthermore, we need to develop metrics that enable quantitative measurementof the robustness and efficiency of this approach.
References Figure 6.d Whenonly one robot is left, this robot’s working area is the whole area, which allows the completionof the task.
assigning two robots to each region. Within a region, the robots sample the area at random and determine area characteristics; if we set no stopping conditions this is equivalent to eight robots patrolling a region. Figure 6.a. through 6.d. illustrate some observed behaviors when robots fail. The lines in the drawings indicate the individual robot paths. Figure 6.a. indicates the initial configuration: each quadrant has two robots wandering. Given a single robot failure in any area, nothing reallocates (given a Pl < .5). However,if two robots fail in the same area, then a robot will be reallocated from a sibling (if possible). Figure 6.b. indicates what happenedafter all of the robots in sectors Ill and IV failed: robots were reallocated from sectors I and II. Figure 6.c. showsthe results of a robot failing when there is only one robot per task: the unassigned sector is combinedwith its sibling into a larger sector. The ultimate failure result is shownin Figure 6.d.; whenonly one robot remains it assigns itself to the root, merging the two remaining subtasks into one.
Conclusions Wehave proposed a new approach for designing faulttolerant cooperative multiple robot team, by using a binary tree as a tool to store global information and enable balance of the workload. This paper demonstrates that workload balance can be done efficiently and independently. This approach makes it feasible in a real world multiple robot system with limited communication 540
FLAIRS-2001
Dias, M. B.; and Stenz, A., 2000. A Free Market Architecture for Distributed Control of a Multirobot System.6th International Conferenceon Intelligent Autonomous Systems (1AS-6), pp. 115-122. Font~in , M. S.; and Mataric, M. J, 1998. Territorial Multi-Robot Task Division. IEEE Transactions on Robotics and Automation, 14(5). Kanellakis, P. C. and Shvartsman,A. A, 1997. FaultTolerant Parallel Computation, Kluwer Academic Publishers. Mataric, M. J., 1995. Issues and Approachesin the Design of Collective AutonomousAgents, Robotics and AutonomousSystems, 16 (2-4), Dec. 1995. McCartney, R.; and Sun, Haihang, 2000a. Random Sampling with Mobile Robots. In Proceedings of 31"’ International Symposium on Robotics (1SR 2000), Montreal, Canada, pp. 89-95. McCartney, R; and Sun, Haihang, 2000b. Samplin.~ u’ and Estimation by Multiple robots, In Proceedings of 4 International Conference on Multiagent Systems, Boston, MA,pp. 415-416. Netter, C., 1998. A Hardware/Software Architecture for A modular multiprocessor robot control system, M.S. Thesis, Dept. of CSE,University of Connecticut. Parker, L. E., 1994. Heterogeneous multi-robot cooperation. Ph.D. Thesis, Dept. of EECS,Mass. Inst. Of Technology. Parker, L. E., 1998. ALLIANCE: An Architecture for IEEE Fault Tolerant Multi-Robot Cooperation, Transactions on Robotics and Automation, 14 (2). Wurst, K.; and McCartney, R., 1996. Physical Implementation of Performing Agents. In Proceedings of the Ninth AnnualFlorida Artificial Intelligent Symposium, pp. 132-136.