A Distributed Bidding Algorithm for Multi-robot Exploration ... - CiteSeerX

0 downloads 0 Views 252KB Size Report
3) How to design a new multi-robot coordination strat- ... robot failure will make the mission impossible. ... Section 3 describes our distributed, bidding-based.
A Distributed Bidding Algorithm for Multi-robot Exploration with Safety Concerns Weihua Sheng

Qingyan Yang

Ning Xi

ECE Dept. Kettering University Flint, MI 48504 Email: [email protected]

System Division Iteris Inc. Madison Heights, MI 48071

ECE Dept. Michigan State University East Lansing, MI, 48824

Abstract— This paper proposes a totally distributed bidding model for a typical multi-robot application, the exploration of unknown environment. Based on this bidding model, the robot safety concept is introduced. Its impact on the behavior and the efficiency of a multi-robot team is investigated. The algorithm aims to maximize the exploration efficiency and reduce the risk of the exploration. We also investigate the exploration efficiency by assigning different risk tolerances to each individual robots. The safety concept can be extended to model other constraints in multi-robot missions. Simulation results demonstrate the effectiveness of the algorithm.

I. I NTRODUCTION A. Problem Statement Exploration of unknown environment is an important topic in mobile robot research due to its wide real-world applications, such as search and rescue, hazardous material handling, military actions, etc. Recently, multi-robot exploration has received much attention from the research community [?]. The obvious advantage of multi-robot exploration is the concurrency, which can greatly reduce the time needed for the mission. Coordination among multiple robots is necessary to achieve certain efficiency in the robot exploration. Other desirable characteristics such as reliability, robustness also require the cooperation among multiple robots [?]. Increasing the safety or reducing the risk of human operators in the face of dangerous exploration missions is the main motivation of adopting robots. But this does not imply that an individual robot itself is strong or powerful enough to deal with any hostile situations which may emerge during the exploration. A single robot, without the help from other robots, may be vulnerable to hostile environment or enemies. A typical example is the robotic scout mission in some military actions. In many other applications, a single robot may get help from other nearby robots during emergencies, such as failures, malfunctions. The safety of the robot itself has received much less attention in robotic research community. Most of the exploration strategies developed so far simply ignore this issue. To increase the safety of the exploration team, new coordination strategies should be developed, which requires that the following issues be solved:

1) How to define the safety measure for a single robot? 2) What impact does the safety requirement have on the other performances of the exploration, such as the efficiency in terms of total distance travelled and the overall exploration time? 3) How to design a new multi-robot coordination strategy that takes the safety into consideration, while trying to maximize the other performances? To the best of our knowledge, no research work in autonomous robotics exists that formally investigates the relationship between the safety concern and other characteristics of multi-robot systems. This paper tries to answer the above three questions through a typical multi-robot application – the exploration of unknown environment. The problem of multi-robot exploration with safety concern is stated as follows: n identical robots are going to explore an unknown, hostile area. Each robot is equipped with sensing, localization, mapping and communication capability. Design a robust, efficient exploration strategy for each robot to achieve certain degree of safety during the exploration process. B. Previous work Although there has been considerable work in single autonomous robot, much work remains to be done in multirobot coordinated localization, mapping and exploration [?]. The coverage problem is closely related to the exploration problem. Latimer et al. [?] introduced an algorithm to cover an unknown space with a homogeneous team of circular robots. Their multi-robot coverage algorithm is based on their previous work in single robot coverage. Their goal is to guarantee the full coverage of the whole area while achieving certain efficiency by minimizing the repeated coverage. However, their algorithm can not be directly extended to exploration applications where each robot has a relatively large sensor range compared to its physical size and the environment is cluttered. Also their algorithm is semi-distributed which implies that a single robot failure will make the mission impossible. Yamauchi

[?] developed a totally distributed, asynchronous multirobot exploration algorithm which introduces the concept of frontier cells, or the bordering area between known and known environment. Their basic idea is to let each robot move to the closest frontier cell independently. This brings the fault tolerance capability. However their algorithm does not achieve sufficient coordination so that multiple robots may end up moving to the same frontier cell, which introduces inefficiency. Based on the similar frontier concept, Simmons et. al [?] developed a semi-distributed multi-robot exploration algorithm which requires a central agent to evaluate the bidding from all the other robots to obtain the most information gain while reducing the cost, or the travel distance. Fujimura and Singh [?] presented a cooperative terrain aquistion strategy for distributed mobile agents, which can handle heterogeneous robots. However, efficiency is not their main concern. Recently, Zlot et. al. [?] developed a multirobot exploration strategy based on a market economy, which inherits the basic concept from the bidding protocol used in Simmons et. al.’s work. Their algorithm can improve the reliability, robustness and efficiency of the exploration through negotiation in a market architecture. However, the goal generation algorithm they used is not as efficient as the frontier-based algorithm and the negotiation process is complicated. Gerkey and Mataric proposed an auction method for multirobot coordination in their MURDOCH system [?]. The use of an ”auctioneer” agent is similar to the central agent in Simmons et al.’s work. To summarize, we have the following observations regarding the current research work in multirobot coordination. First, there is no totally distributed negotiationbased method for multiple robots coordination. Previous work adopts either central agents or group leaders for winner selection. And due to the use of those central agents or group leaders, the negotiation protocols become complicated. It is desirable to have a simple, totally distributed coordination algorithm. Second, the existing work in robotic exploration, whether it is for a single robot or multiple robots, did not consider the safety issue of the robot team during exploration. As a result of this, many exploration algorithms deploy the robots into different directions in order to achieve efficiency, which most likely will put a single robot into a remote, isolated area working by itself. For a multi-robot team to carry out exploration successfully in hostile unknown environments, it is highly desirable to maintain certain degree of safety, or closeness so that in the case of unexpected events, a single robot can get immediate help and reinforcement from other nearby robots, or in the case of failure, a single robot’s task can be taken over by other nearby robots. The maintaining of certain degree of safety provides a preventive measure against hostile environment, enemies, or self-malfunctions. This perspective is obviously different from those employed in most of the existing reliable,

robust exploration algorithms, which provide correction measures against failures or malfunctions that already occur. The remainder of the paper is organized as follows. Section 2 describes the model of the multi-robot exploration and introduces the safety measure for a single robot. Section 3 describes our distributed, bidding-based algorithm for exploration. Simulation and testings are presented in Section 4 and Section 5 concludes the paper. II. M ULTI - ROBOT EXPLORATION MODEL A. Environment and robot model In this section, we define the environment model and the mobile robot model. The environment is modelled as a 2D occupancy grid of a specified resolution. There are arbitrary-shape stationary obstacles in the area. During the exploration, each cell of the grid has one of the three status: occupied, free or unknown. Here “occupied” means that the cell is occupied by obstacles, such as a wall; “free” means that no obstacle exists in this cell; and “unknown” means that the cell has not been detected by the range sensor of any of the robots yet. The physical shape of a robot can be modelled as a square. For simplicity, its dimension is set as the size of a grid cell. Each robot Ri has localization, mapping and communication capabilities. It can sense the neighboring cells using range sensors such as a sonar or a laser-rangerfinder. Its sensing range is denoted by a circle of radius r centered on the robot. When the robot is travelling, it does not carry out sense and mapping. Each robot is capable of localizing itself with respect to its own local map. The communication capability enables the robot to talk to any other robot. Each robot also builds up a global map by combining the local maps sent from other robots. A typical scenario of multi-robot exploration is shown in Figure 1. Unknown Area

Boundary

Explored Area

Sensing Range

rs

Fig. 1. A multi-robot exploration in unknown environment. The circles represent the current robot positions and the crosses represent the next target positions.

n robots start from initial positions which are close to each other and the relative positions are known to all. Each robot seeks to explore the area with the maximum information gain while reducing its cost, which is usually the distance from the current position to the target position. Instead of using a central-agent-based semi-distributed model for the bidding process [?], [?], we develop a totally distributed bidding algorithm. Our bidding algorithm is also much simpler than the market economy algorithm developed in [?]. The skeleton of the exploration algorithm is shown in Figure 3.

4

3

high safety

5

1 high safety 2

6

low safety

Fig. 2.

Robot 6 has smaller safety measure than other robots.

B. Safety and risk First we introduce the concept of safety and risk. Safety, λ, is the measure of the closeness of a single robot to other robots in the same mission. Risk is the reciprocal of safety. To mathematically characterize the safety or the risk of robot Ri during its mission at a specific moment, we denote the distances between robot Ri and robot Rj as d(Ri , Rj ), then λi = e −

d1 c

+ αe−

d2 c

+ ... + αn−2 e−

dn−1 c

.

(1)

Here d1 ≤ d2 ≤ d3 ≤ ... ≤ dn−1 is an increasing order of d(Ri , Rj ), j ∈ (1, 2, ..n) and j 6= i. α is a fading factor which is smaller than 1. c is a constant, which equals to the sensor range r. This definition models the safety as a function of its distances to other robots and it favors the robots that are closer to the robot concerned. This reflects the fact that the safety is not a linear function of the number of robots and their distances. In fact, a single robot nearby is better than a dozen robot far away, which is usually the situation in many hostile environment exploration. The risk of a robot, κ, is simply the reciprocal of safety. That is, κi =

1 . λi

void robot_exploration() { mode=SENSING-MAPPING; local_map=sensing_mapping(); broadcast(local_map, robots); while (!is_map_complete(map)) { mode=BIDDING; winner_declared=FALSE; bid_finish=FALSE; while ( !bid_finish ) { my_bid=calculate_bid(map, robots); broadcast(my_bid, robots); start_timer(bid_time); timer_expired=FALSE; while ( (!timer_expired || (better_bid_received()) && ! winner_declared ) { if ( new_bidder_enter() ) break; } if ( timer_expired && ! winner_declared ) { bid_finish=TRUE; broadcast(my_ID, robots); // I am the winner! best_bid_begin_time=INFINITY; mode=TRAVELLING; move_to(my_bid.destination); local_map=sensing_mapping(); update_globalmap(map, local_map); if((current_time()-best_bid_begin_time) > SEND_MAP_WINDOW) { while(!winnder_declared); } broadcast(local_map, robots); } } } }

(2)

As an example, in Figure 2, robot 6 has much less safety measure because its distance to other robots is much farther. This safety measure can be easily extended to other constraints on the robot team behavior. For example, the communication range is an important constraint on the robot movements. If a single robot is too far from other robots, the communication link may break between this robot and others. By adopting a similar concept as the safety, we can develop algorithms that keep all the robots within communication range. III. T HE DISTRIBUTED BIDDING ALGORITHM To achieve reliability and robustness, the exploration algorithm should be totally distributed, which means central-agent-based algorithm is not desirable. All the

Fig. 3.

The skeleton exploration algorithm for each robot.

A. Sensing and mapping Each robot works asynchronously. At any time instant, a robot is in one of the following three states: 1) sensing and mapping, 2) bidding 3) travelling, as shown in Figure 4. The robot uses its range sensor to detect the status of the cells within its sensor range. A local map will be generated and the robot updates its global map. Then based on this global map, frontier cells are identified. Here a frontier cell is a “free” cell that is adjacent to at least one “unknown” cell, as shown in Figure 5. Next, the robot checks if the current bidding session is almost closed. If yes, it will choose waiting for the next bidding session instead of participating in the current session, otherwise, it will send out the new local map. This checking step is

Sensing and Mapping

Bidding

Travelling

Fig. 4.

The state transition of each robot.

calculates the shortest travel distance Di to each frontier cell i. Due to the existence of obstacles and unknown areas, the shortest distance may not be the straight line distance. A quick algorithm based on Dijkstra’s shortest distance algorithm [?] is developed. First, the partially known map is converted into a connectivity graph which describes the neighboring relationship among cells and the weights on the graph represent the travelling distance between neighboring cells. Second, on this connectivity graph, the Dijkstra’s algorithm is applied to find the shortest path which avoids the obstacles and the unknown areas. In order to calculate the risk κi of the robot at each frontier cell i, the travel distance from that cell to each of the other robots’ current positions or target positions are calculated. Then the risk κ can be calculated based on the distances using Equation (II-B) and (II-B). The net gain gi for frontier cell i is the weighted combination of the three measures: gi = ω1 Ii − ω2 Di − ω3 κi .

(3)

Here ωi , i ∈ (1, 2, 3) are the positive weights. By changing the weights, the importance of each component can be varied. The bid B of a bidding robot is the maximum net gain of all the frontier cells. B = max gi i

C. Bidding

Fig. 5. As the robot moves from position 1 to 2, new frontier cells are found. The grey/green ones are frontier cells and the black ones are obstacle cells.

very important to maintain information consistency in the face of communication delay, as will be discussed later in the paper. The robot broadcasts its newly obtained local map to all the other robots, who then update their global maps in their memory. If the map is not complete, which means there are more frontier-cells, the robot will decide the next target frontier cell to move to. B. Bid calculation The robot calculates the information gain Ii for each frontier cell i based on (i) the current global map (ii) the current positions of the sensing-and-mapping robots and (iii) the target cell positions of those travelling robots. The information gain Ii for a specific frontier cell i is the number of unknown cells within the sensor range, but not within the sensor range of any other sensing-mapping robots or the target cells of those travelling robots. This information gain definition keeps different robots separated so their sensing areas will not overlap too much. The robot

The bidding robot broadcasts its bid B to all the other robots and waits for a constant bidding time tbid to hear responses from other robots. If during that bidding time, there is no other robot participating in the current bidding session, or no other robot provides better bids, this robot wins the bidding. If during that time, one or several other robots finish sensing and mapping and send in the new local maps, this bidding robot recalculates its bid based on the new global map and broadcasts the bid again. If this robot receives better bid from other robots, it will wait for the winner to declare its ID. Once the winner is identified and its assigned target is known, this robot recalculates the bid based on the updated information and the process repeats. The winning robot then moves to its target frontier cell and starts the sensing and mappingbidding-travelling cycle again. All the robots stop when no new frontier cell is available. It is worth noting that some of the shared variables are not explicitly modified in this skeleton algorithm, actually, they are updated when certain events happen. Obviously the bidding time tbid is an important parameter. If tbid is very small, it is highly possible that fewer robots will participate in the bidding. If tbid is very big, it is very likely that many robots will participate in the bidding, which implies better bid will be generated but the waiting time will be longer. In one extreme case if the bidding time approaches 0, it is very likely that

Bidding Time --- Tbid The Moment when A’s Bidding Time Expired

Send-Map-Window ---Twin

X (Just Enter Bidding)

A (Current Best Bidder)

Best Bid Beginning Time

Fig. 6.

Arrow of Time

Grace Window ---Tgrace

The time parameters in coordination algorithm.

there will be no more than one bidder in each session and actually each robot wins its own session. In the other extreme case, if the bidding time is large enough, all the robots will participate in each bidding session and the exploration will significantly slow down. D. Communication delay and information consistency As discussed in the coordination algorithm, when a robot finishes its sensing and mapping, it will check if the current bidding session is near the end. If yes, it will not participate in the current session, otherwise, it will send out the new local map. The basic idea behind this is to introduce a grace window Tgrace to overcome the problem of information inconsistency caused by communication delay. The relationship between Send-Map-Window, bidding time and the grace window is illustrated in Figure 6. The information consistency concern about the distributed coordination algorithm is: in the same bidding session, do all the bidding robots make their decisions based on the same map? The answer is yes if the time parameters in the coordination algorithm are carefully chosen. We have the following proposition regarding the constraint on the grace window time and the maximum communication delay: All the bidding robots calculate their bids based on the same map if the grace window time Tgrace is greater than the upper limit of the communication delay td . i.e.,

B

Circled are the Bidders

Fig. 7.

A snapshot when map inconsistency ocurrs.

waiting for the expiration time to declare itself as winner. All the other bidders are just waiting for the WinnerDeclared message and they should be able to receive the new local map from X as long as A does not declare itself as winner. So the only possibility that all the bidders do not have the same map occurs when A declares itself as winner before the map from X arrives at it, as shown in Figure 7. Because in the worst case, it takes the new local map a time of td to get to A. Therefore, if the grace window time Tgrace is greater than the maximum communication delay td , we can guarantee that A will receive the local map before it declares. This concludes the proof. IV. S IMULATION R ESULTS A. Implementation and Results The distributed coordination algorithm is simulated in JavaT M . All the robots communicate with each other through TCP connections, which guarantee the ordered message delivery.

Tgrace > td . Proof: Without losing generality, considering the scenario illustrated in Figure 7. According to the distributed coordination algorithm, when Robot X finishes the sensing and mapping, it checks if the difference between the current time and the beginning time of the best bidder ( assuming it is A ) is greater than the SendMap-Window Twin . If yes, it will not participate in this bidding session and will hold its map until the winner is declared, otherwise it will send out the new local map. Supposing bidder X participates in this session, it sends out the new local maps to other robots, so all the bidding robots A, B, ... and X are required to calculate their bids based on the same map. Since bidder A is the current best bidder, according to the coordination algorithm, it is

Fig. 8. The five robots exhibit certain clustering behavior when safety is concerned. The grey cells are unknown area and the white cells are explored area. The circled dots are robots.

The risk measure by steps 140

120 Without Safety 100 With Safety

risk

80

60

40

20

0

0

5

10

15 step

20

25

30

Fig. 10. The comparison of risk measure of robot 2 between two cases: with and without safety concern. Fig. 9. The five robots tend to move apart when no safety is concerned. The risk measure by steps 140

120 Without Safety 100 With Safety

risk

80

60

40

20

0

0

5

10

15

20 step

25

30

35

40

Fig. 11. The comparison of risk measure of robot 3 between two cases: with and without safety concern.

are 0. Total Distance Travelled (by 5 Robots) vs. Bidding Time 1200

1100 Without Safety 1000 With Safety Total Distance Travelled

In the testing runs, five identical programs, each representing one robot, run on five different computers. First, we compare the movement of the robots in two different scenarios: one considers the safety and the other not. The bidding time used in the coordination algorithm is Tbid = 1.0s and the sensor range radius is r = 4. When safety is considered, the weights used in Equation (3) are: ω1 = 1.0, ω2 = 1.0, ω3 = 1.5. As shown in Figure 8, the five robots exhibit certain clustering behavior, which implies that the safety measure is high for each robot. When no safety is concerned, the weights used in Equation (3) are: ω1 = 1.0, ω2 = 1.0, ω3 = 0. As shown in Figure 9, all the robots disperse to separated locations immediately. The quantitive comparison of the risk measure between safety and non-safety scenarios are shown in Figure 10 and Figure 11 for two robots. These two figures display the recorded risk measure of Robot 2 and Robot 3 for all the steps they go through. These two figures clearly show that, by including safety concern in the coordination algorithm, the safety measure gets significantly improved for each individual robot. We also observe the relationship between the bidding time Tbid and the efficiency of the exploration mission. We vary Tbid and record the corresponding total distance travelled by all the robots and the overall exploration time. The total distance travelled by all the five robots against the different bidding time is shown in Figure 12. It is clear that as the bidding time increases, the best bid improves for each session since the number of bidders increases. That contributes to the fact that the total distance travelled decreases. We investigate the total exploration time in the following three cases. (i) With safety concern and the weights ω3 for all the robots are 1.5. (ii) With safety concern and one of the weights ω3 is 0.5 and all the others are 1.5. (iii) Without safety concern, or all the weights ω3

900

800

700

600

500 0.5

1

1.5

2

Bidding Time (s)

Fig. 12. time.

The total distance travelled by the 5 robots vs. the bidding

Exploration Time vs. Bidding Time

the other robots. When this new “safety” measure is used in the coordination algorithm, no robot will get out of the communication range of all the other robots and become “communication-isolated”.

600

550

Exploration Time (s)

500

V. C ONCLUSIONS

450

400 Without Safety With Safety (1.5-1.5-1.5-1.5-1.5) With Safety (0.5-1.5-1.5-1.5-1.5) 350

300 0.5

1

1.5

2

Bidding Time (s)

Fig. 13.

The exploration time vs. the bidding time.

From Figure 13, we have two observations: (i) the overall exploration time increases as the bidding time increases, since the waiting time grows; (ii) by increasing the risk tolerance of a very few number of robots, we can improve the total exploration time significantly to approach the total time when no safety is concerned. The second observation can be explained as follows. When there are enough unknown cells to explore, the information gain Ii for each frontier cell is quite large, the changing of risk weights ω3 does not affect much on the behavior of the robots. However, when the information gain Ii of nearby frontier cells is getting smaller and smaller, for example, when the robots almost finish one end of a corridor, the risk component will make the robots reluctant to move to the other end of the corridor to exploration. If there are a few ( one or two) robots that like to take more risks than others, they will move to the other end of the corridor first. Then other robots will come over since the safety measure at the new area is increased due to the existence of the risk-taking robots. In this way, the exploration process speeds up. B. Discussions In the coordination algorithm, the bidding time Tbid is set to a constant. However, for each robot, as the explored area grows, the time spent on the travelling will increase. To maintain enough bidders in each bidding session, the bidding time needs to be increased. This implies that the bidding time Tbid should be adaptive. We will investigate how to update the bidding time and how to guarantee that all the robots agree on the same bidding time in our future work. As mentioned already, the safety concept can be extended to other similar constraints which can improve the system behavior. For example, the communication range is usually limited for each robot. To maintain the communication links, a new “safety” measure can be defined that will be used to monitor the distances to all

This paper proposes a distributed bidding algorithm for multiple robots in exploration tasks and addresses the safety issue. By introducing the safety, or the risk concept into robot teams, reliability and robustness can be improved from a protective point of view. One intuitive definition of the safety is based on the distances to other robots. A new, totally distributed bidding algorithm is developed, which maximizes the net gain–a weighted combination of the information gain, the travel distance and the risk. This distributed bidding algorithm is efficient and simpler compared to existing multi-robot exploration algorithms. From the simulation results, we observe that certain clustering behavior occurs during the exploration mission, which is an indication of the risk minimization. Efficiency of the robot team is also investigated with regard to the bidding time and the existence of safety concern, as well as the introduction of a few risk-taking robots. Future work will improve this algorithm using more realistic communication model, such as limited communication range, unreliable message delivery etc. R EFERENCES [1] Y.Cao, A.S.Fukunaga, and A.B.Kahng. Cooperative mobile robotics: Antecedents and directions. Autonomous Robots, (4), 1997. [2] R.Zlot, A.Stentz, M.B.Dias, and S.Thayer. Multi-robot exploration controlled by a market economy. In Proceedings of the 2002 IEEE International Conference on Robotics and Automation, pages 3016– 3023, 2002. [3] Lynne E. Parker. Current state of the art in distributed autonomous mobile robotics. In George Bekey Lynne E. Parker and Jacob Barhen, editors, Distributed Autonomous Robotic System 4, pages 3–12, October 2000. [4] D.Latimer IV, S.Srinivasa, A.Hurst, H.Choset, and Jr V.Lee-Shue. Towards sensor based coverage with robot teams. In Proceedings of the 2002 IEEE International Conference on Robotics and Automation, 2002. [5] B.Yamauchi. Frontier-based exploration exploration using multiple robots. In Proceedings of Second International Conference on Autonomous Agents, pages 47–53, 1998. [6] R.Simmons, D.Apfelbaum, W.Burgard, D.Fox, S.Thrun, and H.Younes. Coordination for multi-robot exploration and mapping. In Proceedings of the National Conference on Artificial Intelligence, 2000. [7] K.Fujimura and K.Singh. Planning cooperative motion for distributed mobile agents. Journal of Robotics and Mechatronics, 1996. [8] B.P.Gerkey and M.J.Mataric. Sold!: Auction methods for multirobot coordination. IEEE Transacitions on Robotics and Automation, 18(5):758–768, 2002. [9] E.Dijkstra. A note on two problems in connection with graphs. Numerical Mathematics, 1:269–271, 1959.

Suggest Documents