
Artif Intell Rev DOI 10.1007/s10462-010-9168-8

Fuzzy logic decision making for multi-robot security systems

Mahmoud Tarokh · Matthew Cross · Malrey Lee

© Springer Science+Business Media B.V. 2010

Abstract

The paper proposes a fuzzy logic decision-making system for security robots that handles multiple tasks in a dynamically changing scene. The tasks consist of patrolling the environment, inspecting for missing items, chasing and disabling intruders, and guarding the area. The decision making considers robot limitations such as maximum floor coverage per robot and remaining robot battery energy, as well as cooperation among robots to complete the mission. Each robot agent makes its own decision based on its internal information as well as information broadcast to it by other robots about events such as intruder sighting. As a result, the multi-robot security system is distributive, without a central coordinator. The system has been implemented both in simulations and on actual robots, and its performance has been verified under different scenarios.

Keywords Security robots · Fuzzy decision making · Multi-robots · Behavior control

1 Introduction

Research into security robots covers broad areas with various approaches. Research goals may be specific, such as designing better sensors to extract information from the environment, or increasingly grand, such as designing an entire multi-robot team that attempts to address all security-related needs. A brief history of robotic security systems is provided in Everett (2010), and some real-world applications are discussed in Everett and Gage (1999). Much of the work on robotic security is based on behavioral robotics (Arkin 1998) and artificial intelligence robotics (Murphy 2000). The work reported in Castelnovi et al. (2003)

M. Tarokh · M. Cross
Department of Computer Science, San Diego State University, San Diego, CA 92128-7720, USA
e-mail: [email protected]

M. Lee (B)
The Research Center of Industrial Technology, School of Electronics & Information Engineering, ChonBuk National University, 664-14, 1Ga, DeokJin-Dong, JeonJu, ChonBuk 561-756, South Korea
e-mail: [email protected]



proposes a surveillance system intended for mobile security robots with the specific purpose of detecting changes in the environment without regard to the nature of the changes. The system compares color histograms of current images to those of images previously taken throughout the environment. The drawbacks are that the system assumes a totally static environment, and that considerable time and storage memory are required to initially collect the images throughout the environment. A related work uses a robotic system to detect abnormal and dangerous situations and to report them, possibly through the internet (Su et al. 2004). The use of cooperative autonomous sensor-based robots for the observation of multiple moving targets is investigated in Parker (2002). The work in Born et al. (2007) develops a team of cooperative security robots intended to work along the international border to aid the effort to prevent illegal immigration. The goal is to have the robots function essentially as a flexible sensor array, with each robot configured with a suite of sensors, such as thermal imaging, to detect human presence. A multi-robot system for security using Java software is proposed in Rusu (2004). However, the system is very specific and has many limitations.

Security robot systems fall under a more general class of multi-robot systems. For a team of robots to perform a set of tasks, the system must determine which robot should do which task and when (Gerkey and Mataric 2004). This problem is sometimes referred to as task allocation, and it may need to take place in a distributive manner without a central coordinator, i.e. each robot must make a decision without knowing what other robots have done. In this case, task allocation is performed through emergent coordination, where individual robots take their actions based on local sensing and local interaction (Bojinov et al. 2000; Salemi et al. 2001).
More recently, phenomenological models have been used to analyze multi-robot systems (Kazadi et al. 2002; Lerman et al. 2005). Stochastic processes are proposed in Lerman et al. (2006) for task allocation, in which robots estimate the state of the environment from repeated local observations and decide which task to choose.

In this paper, we focus on intelligent decision making for security robots. In our approach each robot maintains its autonomy by deciding for itself what to do and when to do it, yet that decision reflects both its own needs and those of the team. Each robot only receives from, or broadcasts to, other robots events such as an intruder sighting, a missing object, etc. However, the robot does not have information about the state of other robots and their actions. Fuzzy logic reasoning is employed in order to deal with a large quantity of often imprecise information originating from the internal state of a robot as well as from external events.

In our approach a basic premise of decision-making is that security is best achieved by preventing security breaches until a breach is detected, in which case security is best served by stopping that breach. Watching for hazardous situations under normal conditions also helps in the security effort. However, security is not best achieved if all robots engage in the same activity simultaneously, and it cannot be accomplished at all if they run out of battery power. In this paper, decision-making and its cooperative aspects consider not just what activity is appropriate for a particular robot but also activity diversity across the robot team.

2 General system overview

Intelligent robotic systems achieve goals by responding to information about their environment in a task-supportive manner. The goal of the robot team for this project is to cooperatively maintain security in a moderately complex environment such as that found in warehouses and laboratories after hours, when no regular human activity is expected. The environment is moderately complex and dynamic in the sense that the floor plans are known in advance


while other features of the environment, such as the locations of objects, robots, and intruders, are dynamic. A robot moves around acquiring new information, broadcasts limited and essential information to other robots, and updates its knowledge, which is then used as the basis for selecting its task or behavior.

An important consideration is to distinguish between task allocation and task selection. In the former case, a higher-level or centralized coordinator assigns tasks to individual robots based on total communication and perhaps negotiation. In the latter case, each robot chooses its own task in a distributive manner. This is more practical and appealing, and the present work is in the distributive category.

Robot agents permit intelligent robot control in a cooperative manner and are composed of five components. These are (1) decision-making, for deciding which activity/behavior to engage in over time as the mission progresses, (2) world modeling, for maintaining knowledge about the world in which the robot operates, (3) navigation, for permitting the robot to move about the environment without crashing into obstacles, (4) sensory interpretation, or perception, for updating the world model and aiding in navigation, and (5) communication, for allowing the robots to share information that is used in the decision-making and world modeling processes. Decision-making is the core focus of this paper. Through it the cooperative security maintenance effort emerges, and it is seen in a robot's selection of a particular activity or behavior over time as the mission progresses.

In this paper we consider four robot behaviors, namely patrol, inspect, chase and guard. Other behaviors can be incorporated in a straightforward manner. Patrol is a fast-paced visitation of waypoints on a waypoint graph, an example of which is shown in Fig. 2 as the numbered bullets.
Waypoints on the graph represent key locations in the environment that the robot should visit; edges connecting the waypoints indicate a preferred route that the robot is encouraged, but not required, to follow. A patrolling robot sends a message when it selects a particular waypoint, and another message when it reaches that waypoint, giving the waypoint number and the time visited. Knowledge about waypoints selected or recently visited by a robot helps prevent multiple robots from clumping together in the same region of the environment. A patrolling robot selects a waypoint that has not been recently visited and that is closest to its current location. As is true for all robot behaviors in this system, while the robot patrols the environment, it watches for indications of an intruder's presence. An indication is either the sighting of an intruder or the detection of the removal or relocation of an object in the environment. Patrol is also meant to deter would-be intruders from even attempting to make an appearance.

Inspect is a slow-paced examination of the environment. While inspecting, the robot moves slowly and is engaged in a wall-following type of behavior, avoiding objects. It also uses its cameras to detect an intruder or potential problems such as fire, water leaks, chemical spills, etc. Sighting of an intruder causes the sighting robot agent to broadcast to all robots a message that an intruder is sighted. Included in the broadcast is the coordinate at which the intruder was sighted.

Chase is a fast-paced activity during which a robot either pursues an intruder while the robot's sensors "see" it, or disables it (e.g. using a TASER) when the intruder is close enough. Chase also causes a robot to move quickly towards the intruder's last known location. The disabling robot broadcasts a message stating that the intruder is disabled, allowing other robots to move to appropriate behaviors.
Finally, the guard behavior has the robot stand its ground, watching for an intruder or signs of one. Guarding permits a robot to conserve energy by ceasing all motor activity while continuing to assist in maintaining security.

A robot's decision about which security-based behavior to execute considers not only what the robot agent believes it knows about the world but also what is best for the team overall, rather than the desires of the individual robot. This yields the cooperative aspect sought


among the robots while permitting each to autonomously make its own decisions. In the next two sections, we formulate the robots' state and their behaviors.
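The patrol waypoint rule described in this section (choose the closest waypoint that has not been recently visited, using the broadcast selection and visit messages) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name, the `recent_window` threshold, the data layout and the fallback rule are all assumptions introduced here.

```python
import math


def select_waypoint(position, waypoints, last_visit, claimed, now, recent_window=120.0):
    """Pick the closest waypoint that is neither recently visited nor claimed.

    position:      (x, y) of this robot
    waypoints:     dict of waypoint id -> (x, y)
    last_visit:    dict of waypoint id -> time of the latest broadcast visit
    claimed:       set of waypoint ids that other robots announced they selected
    recent_window: hypothetical threshold in seconds for "recently visited"
    """
    candidates = [
        w for w in waypoints
        if w not in claimed
        and now - last_visit.get(w, -math.inf) > recent_window
    ]
    if not candidates:
        # Fallback (an assumption, not stated in the paper):
        # revisit the least recently visited waypoint.
        candidates = [min(waypoints, key=lambda w: last_visit.get(w, -math.inf))]
    return min(candidates, key=lambda w: math.dist(position, waypoints[w]))
```

With three waypoints, one visited 5 s ago and one already claimed by a teammate, the function returns the remaining one even if it is farther away, which is the anti-clumping effect the text describes.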

3 Formulation

Let us denote by $x_i(t)$ the state at time $t$ of the robot $R_i$ in a team consisting of $n$ robots, $i = 1, 2, \ldots, n$. The state $x_i(t)$ consists of an activity vector $a_i(t)$ and an internal state vector $s_i(t)$, i.e.

$$x_i(t) = \begin{pmatrix} a_i(t) \\ s_i(t) \end{pmatrix}; \quad i = 1, 2, \ldots, n \tag{1}$$

The activity vector $a_i(t)$ describes the particular behavior $B_j$, $j = 1, 2, \ldots, m$, that the robot $R_i$ is engaged in, where $m$ is the number of different behaviors, and the time $t_{ij}$ in or away from the behavior $B_j$. In this paper we consider four behaviors and assign $B_1$, $B_2$, $B_3$ and $B_4$ to the patrol, inspect, chase and guard behaviors, respectively. The behavior value is $b_{ij} = 1$ if the robot $R_i$ is engaged in the behavior $B_j$, and $b_{ij} = 0$ otherwise. The time $t_{ij}$ is positive if the robot $R_i$ is currently in behavior $B_j$, and is the duration since the start of the current behavior. Otherwise, $t_{ij}$ is negative and measures the time since the robot $R_i$ stopped behavior $B_j$. In our scheme, the time away from a behavior increases the likelihood that the robot will engage in that behavior, whereas the time spent in a behavior decreases the likelihood that the robot continues the behavior, as will be seen. At any particular time, the robot can only be engaged in one behavior. Thus the activity vector of the robot $R_i$ is

$$a_i(t) = \begin{pmatrix} b_{ij} \\ t_{ij} \end{pmatrix}; \quad i = 1, 2, \ldots, n; \quad j = 1, 2, \ldots, m \tag{2}$$

The vector $s_i(t)$ in (1) describes the sensed characteristics of the robot, and consists of the estimate of the remaining battery energy $e_i(t)$ of the robot and the estimated position $p_i(t)$ of the robot in the environment. Thus

$$s_i(t) = \begin{pmatrix} e_i(t) \\ p_i(t) \end{pmatrix}; \quad i = 1, 2, \ldots, n \tag{3}$$

In addition to the state $x_i(t)$, the behavior of the robot $R_i$ depends on the external event vector $u(t)$, which describes external events that are broadcast by other robots and picked up by the current robot. The events can be, for example,

• An intruder sighting, with the time of sighting and the intruder's location
• An intruder disabled message and the time
• An object moved or missing, with the time and location of the event
• A new robot joining the team and broadcasting the behavior it is engaged in, or a robot leaving the environment

The above information will enable each robot to know the total number of robots currently active, and the number currently engaged in a particular behavior. External event broadcasting enables the vector $u_j(t)$ to be defined and found as follows:

$$u_j(t) = \begin{pmatrix} \varepsilon_k \\ \alpha_j \\ \rho_j \end{pmatrix} \tag{4}$$



The vector $\varepsilon_k$ denotes a particular event together with the time and the estimated location where the event was detected, e.g. $\varepsilon_1$ denotes intruder sighting, $\varepsilon_2$ denotes intruder disabled, $\varepsilon_3$ indicates object missing, etc. The event vector can be written as $\varepsilon_k = (c_k \; t_k \; p_k)^T$, where $c_k$ is a Boolean indicating the presence or absence of the event, and $t_k$ and $p_k$ are, respectively, the time and estimated position of the event that was detected by a robot. The quantity $\alpha_j$ is the fraction of robots currently engaged in a behavior $B_j$, i.e. $\alpha_j = n_j / n$, where $n_j$ is the number of robots in behavior $B_j$ and $n$ is the total number of robots in the team. Finally, the quantity $\rho_j$ is the power need of the behavior $B_j$, e.g. chase requires more power than guard. It is noted that the estimates of the positions of the robot and intruder allow the robot $R_i$ to determine its distance $d_i$ to the intruder. Furthermore, the time since the intruder was last sighted, $t_{intr}$, is also recorded.

The estimate of the fraction of the area covered (observed) by the robots at a particular time is also a consideration. The area covered is the one sensed by the robot sensors, such as vision and proximity sensors. This fraction is determined by the number of team robots $n$ and the area coverable by a single robot $A_{rob}$, i.e. the fraction covered is defined as

$$F_{cov} = \frac{n A_{rob}}{A_{tot}} \tag{5}$$

where $A_{tot}$ is the total area to be secured. Essentially, floor coverage is the amount of surface area that the team of robots could cover if individual robots were spread out evenly. The greater the floor coverage, either due to a higher number of robots or greater coverage offered by each individual robot, the less need there is for the robot to actually move about, and therefore the less likely it is to choose to engage in patrol or inspect, and the more likely it is to guard.

There is a minimum commit time $(t_j)_{min}$ for being engaged in each behavior. The minimum commit time prevents a robot from flip-flopping between behaviors, which would otherwise occur when one or more robots change their behaviors, causing other robots in the team to change their behaviors in a repetitive cycle. Enforcing a minimum commit time permits a steady state to be reached.

When the robot $R_i$ receives an event signal, it adjusts its in/away time $t_{ij}$ to

$$t_{ij} = t_{ij} + \Delta t_{ij} \tag{6}$$
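The in/away time update (6) can be illustrated with a small sketch in which the sign of the adjustment for each event and behavior follows the qualitative description given in the surrounding text. The magnitude `STEP`, the event and behavior names, and the choice to leave unlisted behaviors unchanged are assumptions made for illustration only.

```python
# Hypothetical adjustment magnitude in seconds; the paper does not give a value.
STEP = 30.0

# Sign of the adjustment per event and behavior, as described in the text:
# a negative sign makes a behavior more attractive, a positive sign less.
ADJUST_SIGN = {
    "object_missing": {"patrol": -1, "inspect": +1},
    "object_moved": {"patrol": -1, "inspect": +1},
    "intruder_disabled": {"patrol": -1, "inspect": +1},
    "intruder_sighted": {"patrol": +1, "inspect": +1, "chase": -1},
}


def adjust_times(times, event):
    """Apply the update t_ij <- t_ij + delta_t_ij to every behavior of one robot.

    times: dict of behavior name -> in/away time
           (positive while in the behavior, negative while away from it)
    """
    signs = ADJUST_SIGN[event]
    return {b: t + STEP * signs.get(b, 0) for b, t in times.items()}
```

For a robot that has patrolled for 40 s, an intruder sighting moves the patrol time to 70 s (discouraging further patrol) and pushes the chase time further negative (encouraging chase).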

The value of the adjustment $\Delta t_{ij}$ in (6) can be positive or negative depending on the particular event and the particular behavior. For the events object missing, object moved and intruder disabled, $\Delta t_{ij}$ is negative for the patrol behavior, whether or not the robot $R_i$ is currently patrolling. This makes the robot more likely to continue patrolling for longer if it is already in this behavior, since it effectively reduces the time the robot has been in patrol. If the robot is not patrolling, the negative value of $\Delta t_{ij}$ added to a negative $t_{ij}$ makes the robot more likely to start patrolling. On the other hand, $\Delta t_{ij}$ is positive for the inspect behavior for the above events, i.e. object missing or moved, and intruder disabled. The reasoning behind this strategy is that the increased time in inspect makes the robot less likely to inspect or continue to inspect, since it has already been determined that an object is missing or moved, or that the intruder is disabled. When an intruder sighted event is broadcast, $\Delta t_{ij}$ is positive for patrol and inspect, and negative for chase. An intruder sighting thus discourages the patrol and inspect behaviors and makes the robot more likely to engage in chase.

The robot $R_i$ determines which behavior $B_j$ is most suitable to choose at the next sample time $(t + 1)$. For this the robot agent finds a behavior strength $\beta_{ij}(t + 1)$, which is a function of the robot state and the external information vector $u_j(t)$, i.e.

$$\beta_{ij}(t + 1) = f\big(x_i(t), u_j(t)\big); \quad j = 1, 2, \ldots, m \tag{7}$$


where $f$ is a function to be described in the next section. The arbitration will decide which behavior to choose. A simple arbitration strategy that is used in this paper is based on the maximum behavior strength, i.e. the chosen behavior $B_j$ for the robot $R_i$ is the one that has $\max_j \{\beta_{ij}\}$. Ties are awarded to behaviors in the order of chase, patrol, inspect, and guard.

Exceptions overrule the winning behavior if a robot has not served a minimum commit time in its current behavior, i.e. the robot will continue its current behavior until $t_{ij} \geq (t_j)_{min}$.
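The arbitration rule above (maximum strength, ties broken in the order chase, patrol, inspect, guard, and the minimum commit time exception) can be sketched as follows. The function signature and the dictionary-based data layout are assumptions made for illustration.

```python
# Tie-break order stated in the text.
TIE_ORDER = ["chase", "patrol", "inspect", "guard"]


def arbitrate(strengths, current, time_in_current, min_commit):
    """Return the behavior to engage in at the next sample time.

    strengths:       dict of behavior name -> behavior strength
    current:         name of the behavior the robot is in now, or None
    time_in_current: time spent in the current behavior
    min_commit:      dict of behavior name -> minimum commit time
    """
    # Exception: keep the current behavior until its commit time is served.
    if current is not None and time_in_current < min_commit[current]:
        return current
    best = max(strengths.values())
    # The first behavior in the tie-break order that reaches the maximum wins.
    return next(b for b in TIE_ORDER if strengths[b] == best)
```

For example, with patrol and inspect tied at the highest strength, a robot that has guarded for longer than its commit time switches to patrol, while a robot that has guarded for less than its commit time keeps guarding.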

4 Fuzzy decision making

During their security mission, robots engage in various behaviors such as patrol, guard, inspect, chase intruder, etc. Deciding which behavior to engage in over time in support of security depends on the robot agent's decision-making process and involves several factors. These factors are the current robot behavior $b_{ij}$, the remaining battery energy $e_i(t)$ compared to the power needed by a behavior $\rho_j$, the fraction of robots already engaged in a behavior $\alpha_j$, security related events $\varepsilon_k$, the duration a robot has been engaged in or away from a behavior $t_{ij}$, the fraction of coverable area $F_{cov}$, etc., as formulated in the previous section.

A fuzzy logic based method is used here for decision making due to its ability to establish qualitative relationships between different possible input types in terms meaningful to human reasoning. In addition, fuzzy logic methods can easily integrate and handle information originating from multiple sources even when the information is incomplete or unreliable, as it usually is in realistic situations.

It may be possible to create a single fuzzy logic system that would take the various inputs described above simultaneously and produce the required behavior. However, this would lead to a high dimensional rule matrix, which would not only complicate the devising of the rules but also require intensive computation. Instead, we decompose the overall fuzzy system into $m$ systems, i.e. one system for each behavior. The fuzzy system for the behavior $B_j$ is itself decomposed into several subsystems, each of which has only two inputs and one output. The output of each subsystem is normalized and is then fed sequentially into the next fuzzy subsystem. This simplifies the writing of the rules using human-like reasoning, and the set of rules is considerably reduced.
The normalization value for each subsystem's output is set by determining the range of output values that a subsystem can generate and scaling the output to a minimum of zero and a maximum of 1.

4.1 Fuzzy energy risk subsystem

The fuzzy logic system associated with each behavior considers information relevant to that behavior when determining its behavior strength. Common to the fuzzy systems for all behaviors is determining the energy depletion risk $E_{ij}$. This risk depends on the remaining battery energy $e_i$ of the robot $R_i$ and the power demand $\rho_j$ for the behavior $B_j$. Thus we can write

$$E_{ij} = f_{rsk}(e_i, \rho_j); \quad j = 1, 2, 3, 4 \tag{8}$$

where $j = 1$ is the index for patrol, $j = 2$ for inspect, $j = 3$ for chase and $j = 4$ for guard. Each behavior demands a different amount of power $\rho_j$. It is assumed that the power demands for the behaviors considered here, from highest to lowest, are for chase, patrol, inspect and guard, respectively. The output of the fuzzy energy subsystem is the energy depletion risk $E_{ij}$. The relationship (8) is implemented using a fuzzy rule matrix, which is given in Table 1. The concept behind $E_{ij}$ is that when faced with low remaining battery energy, a robot agent is


Table 1 Fuzzy rule matrix for determining energy depletion risk ($E_{ij}$). Rows: behavior power draw ($\rho_j$); columns: estimated battery energy ($e_i$). V. = very

Power draw ($\rho_j$)    Low          Medium     High
V. low                   Safe         V. safe    V. safe
Low                      Caution      V. safe    V. safe
Medium                   Risky        Safe       V. safe
High                     Dangerous    Safe       V. safe

Fig. 1 Fuzzy logic system for patrol or inspect behaviors (block diagram: Energy Depletion Risk and No. of Robots Engaged in Behavior feed Behavior Motivation; Behavior In/Away Duration and Floor Coverage feed Behavior Enthusiasm; Behavior Motivation and Behavior Enthusiasm feed Behavior Score)

likely to engage in activities that require lower amounts of energy, so that its duration of service for the mission is prolonged as far as possible. This is an especially useful concept when other robots and cooperation among them are considered, i.e. if one or more robots have low energy reserves, they let the ones with higher energy reserves engage in the more power demanding behaviors. Initially, regular triangular/trapezoidal membership functions with equal base width can be used. These can then be tuned to obtain better performance.

4.2 Patrol and inspect decision making

Both the patrol and inspect behaviors use identical fuzzy logic systems with identical input types. However, the values of the inputs are different for these behaviors. Figure 1 shows the fuzzy logic system and its subsystems for the patrol and inspect behaviors. In addition to the energy risk subsystem, there are three subsystems that constitute the patrol or inspect behaviors, namely the motivation, enthusiasm and aggregation subsystems. The chase behavior subsystems are also similar to Fig. 1. However, the enthusiasm subsystem for chase has different inputs, as will be seen. The motivation subsystem takes the energy depletion risk $E_{ij}$ of patrol (inspect or chase) and the number of robots $n_j$ in patrol (inspect or chase), and produces the normalized motivation score, which ranges from 0 to 1, i.e. the motivation score is

$$M_{ij} = f_{mot}(E_{ij}, n_j); \quad j = 1, 2, 3 \tag{9}$$

The relationship (9) is realized through a fuzzy subsystem whose rule matrix is given in Table 2. The membership functions for the fuzzy sets given in Table 2 are standard triangular/trapezoidal functions distributed over the range of their associated variable. The human reasoning


Table 2 Motivation fuzzy subsystem rule matrix. Rows: number of robots engaged in the behavior ($n_j$); columns: energy depletion risk ($E_{ij}$)

$n_j$    V. safe      Safe         Caution      Risky       Dangerous
Lo       Driven       Driven       Motivated    Inclined    Little
Mod      Motivated    Motivated    Inclined     Little      Stint
Hi       Little       Little       Stint        Stint       Stint

Table 3 Enthusiasm fuzzy subsystem rule matrix. Rows: time in or away ($t_{ij}$); columns: floor coverage ($F_{cov}$)

Time in or away ($t_{ij}$)    Sparse      Moderate    Dense      Crowded
Long out                      Ecstatic    Upbeat      Willing    Lacking
Out                           Upbeat      Upbeat      Lacking    Scant
Recently out                  Willing     Willing     Lacking    Scant
Recently in                   Ecstatic    Upbeat      Willing    Lacking
In                            Upbeat      Upbeat      Lacking    Scant
Long in                       Willing     Willing     Lacking    Scant

behind this rule matrix is that the higher the energy risk of a behavior and the more robots already engaged in it, the less motivated a robot is to participate in that behavior. In this case, it makes sense to choose another behavior that drains less energy and permits a longer duration of the robot's mission. On the other hand, a low energy risk or a low number of other robots engaged in the patrol (inspect or chase) behavior makes the robot more motivated to engage in it.

Enthusiasm to engage in an activity is a quality humans can relate to, and it may be affected by various issues such as mood, tiredness, and repetition. The fuzzy subsystem enthusiasm is a measure of the robot's desire to engage in the behavior patrol (or inspect). From the perspective of human reasoning, the robot is more enthusiastic to engage in a behavior the longer it has been away from it. On the other hand, it is less enthusiastic if the environment is well covered by other robots, and the longer it has been engaged in the behavior. Thus it is reasonable to take as inputs to the enthusiasm subsystem the time in or away from the behavior $t_{ij}$ and the floor coverage $F_{cov}$, and to produce the output of the enthusiasm subsystem $\eta_{ij}$ as

$$\eta_{ij} = f_{enth}(t_{ij}, F_{cov}); \quad j = 1, 2 \tag{10}$$

Note that the enthusiasm for chase ($j = 3$) has different inputs, as will be seen. A possible rule matrix for the fuzzy logic implementation of the enthusiasm subsystem is given in Table 3. This is implemented with standard membership functions for the named fuzzy sets. Finally, the scores of the motivation and enthusiasm subsystems constitute the inputs to the aggregation subsystem, which produces the patrol (inspect or chase) behavior strength,

$$\beta_{ij} = f_{agg}(M_{ij}, \eta_{ij}); \quad j = 1, 2, 3 \tag{11}$$

The rule matrix for fuzzy logic implementation of the above relationship is straightforward and is given in Table 4.


Table 4 Fuzzy subsystem rule matrix for patrol (inspect) behavior strength. Rows: motivation; columns: enthusiasm

Motivation    Scant     Lacking    Willing    Upbeat    Ecstatic
Stint         Minute    Minute     Lo         Med       Med
Little        Minute    Lo         Med        Med       Med
Inclnd        Lo        Med        Med        Med       Hi
Motvd         Med       Med        Med        Hi        Great
Drvn          Med       Med        Hi         Great     Great
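To make the role of the aggregation rule matrix concrete, Table 4 can be encoded as a simple lookup from a motivation label and an enthusiasm label to a behavior strength label. This is only a crisp lookup sketch: it skips fuzzification and defuzzification, and the names `TABLE_4` and `strength_label` are introduced here for illustration.

```python
# Linguistic labels, ordered from weakest to strongest.
# "Inclnd", "Motvd" and "Drvn" in Table 4 are spelled out in full here.
MOTIVATION = ["Stint", "Little", "Inclined", "Motivated", "Driven"]
ENTHUSIASM = ["Scant", "Lacking", "Willing", "Upbeat", "Ecstatic"]

# Table 4: one row per motivation label, one column per enthusiasm label.
TABLE_4 = [
    # Scant    Lacking   Willing  Upbeat   Ecstatic
    ["Minute", "Minute", "Lo",    "Med",   "Med"],    # Stint
    ["Minute", "Lo",     "Med",   "Med",   "Med"],    # Little
    ["Lo",     "Med",    "Med",   "Med",   "Hi"],     # Inclined
    ["Med",    "Med",    "Med",   "Hi",    "Great"],  # Motivated
    ["Med",    "Med",    "Hi",    "Great", "Great"],  # Driven
]


def strength_label(motivation, enthusiasm):
    """Look up the behavior strength label for one rule of Table 4."""
    row = MOTIVATION.index(motivation)
    col = ENTHUSIASM.index(enthusiasm)
    return TABLE_4[row][col]
```

The matrix is monotone in both inputs: moving to a higher motivation or enthusiasm label never lowers the resulting strength label.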

Fig. 2 Views of the environment and the waypoint graph overlaid onto the environment

4.3 Chase decision making

The fuzzy system for determining the strength of the chase behavior is very similar to that of patrol (inspect) shown in Fig. 1 and consists of the three fuzzy subsystems of motivation, enthusiasm and aggregation. The only difference is that the inputs to the enthusiasm subsystem are now information about the intruder, namely the time since the intruder was last sighted $t_{intr}$ and the distance $d_i$ of the robot $R_i$ to the intruder, i.e.

$$\eta_{i3} = f_{enth}(t_{intr}, d_i) \tag{12}$$

where 3 is the index of the chase behavior. As the intruder roams through the environment, sighting robots periodically broadcast intruder sighted messages composed of, among other bits of information, the intruder's position. These messages are repeated as long as the intruder can be "seen" by the robot(s). Robots on the team record the time the message is received and the intruder position, and calculate the distance $d_i$ to the intruder's reported position. As $t_{intr}$ increases, it is more likely that the intruder has evaded the robot(s), causing the intruder sighted broadcasts to cease and making the robot less enthusiastic to chase. However, the robots may still move toward the last known location of the intruder even when it is no longer in sight. Similarly, the farther the intruder is from the robot, the less likely the robot is to decide to chase it. The rule matrix for the fuzzy logic implementation of (12) is given in Table 5. The other subsystems for chase, namely motivation and aggregation, have identical input and output types as before, are given by (9) and (11), and are implemented by the rule matrices of Tables 2 and 4.


Table 5 Enthusiasm to chase score fuzzy subsystem rule matrix. Rows: time since intruder sighting ($t_{intr}$); columns: distance to sighted intruder ($d_i$)

Time since sighting ($t_{intr}$)    Close by    Near       Far
V. recent                           Ecstatic    Upbeat     Willing
Recent                              Upbeat      Willing    Lacking
Long ago                            Willing     Lacking    Scant
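As an illustration of how a two-input rule matrix such as Table 5 can be evaluated, the following sketch uses triangular membership functions, the minimum operator for rule firing, and a weighted average of singleton outputs to obtain a normalized score between 0 and 1. The paper does not list its membership function parameters or its defuzzification method, so the breakpoints (seconds and meters), the output levels and the inference details below are all assumptions.

```python
def tri(x, a, b, c):
    """Triangular membership with peak at b; a == b or b == c gives a shoulder."""
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets: time since sighting in seconds, distance in meters.
TIME_SETS = {"V. recent": (0, 0, 30), "Recent": (0, 30, 60), "Long ago": (30, 60, 60)}
DIST_SETS = {"Close by": (0, 0, 5), "Near": (0, 5, 10), "Far": (5, 10, 10)}

# Hypothetical singleton output levels on the normalized 0..1 scale.
LEVEL = {"Scant": 0.0, "Lacking": 0.25, "Willing": 0.5, "Upbeat": 0.75, "Ecstatic": 1.0}

# Rule matrix of Table 5: RULES[time label][distance label] -> enthusiasm label.
RULES = {
    "V. recent": {"Close by": "Ecstatic", "Near": "Upbeat", "Far": "Willing"},
    "Recent": {"Close by": "Upbeat", "Near": "Willing", "Far": "Lacking"},
    "Long ago": {"Close by": "Willing", "Near": "Lacking", "Far": "Scant"},
}


def chase_enthusiasm(t_intr, d):
    """Evaluate all rules and return a weighted average of the output levels."""
    num = den = 0.0
    for t_label, row in RULES.items():
        for d_label, out in row.items():
            # Rule firing strength: fuzzy AND of the two inputs (minimum).
            w = min(tri(t_intr, *TIME_SETS[t_label]), tri(d, *DIST_SETS[d_label]))
            num += w * LEVEL[out]
            den += w
    return num / den if den else 0.0
```

Under these assumed sets, an intruder sighted just now at zero distance gives the maximum score, while an old sighting far away gives the minimum, matching the qualitative reasoning in the text.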

Table 6 Guard fuzzy rule matrix. Rows: floor coverage ($F_{cov}$); columns: energy depletion risk ($E_{ij}$)

Floor coverage ($F_{cov}$)    V. safe    Safe    Caution    Risky    Dangerous
Sparse                        Mnut       Lo      Lo         Lo       Hi
Moderate                      Low        Med     Med        Hi       Hi
Dense                         Med        Hi      Hi         Hi       Great
Crowded                       Hi         Hi      Great      Great    Great
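The sequential use of subsystems, where the output of one rule matrix becomes an input of the next, can be illustrated by chaining Table 1 (energy depletion risk) into Table 6 (guard strength). The sketch below works on linguistic labels only and omits fuzzification, normalization and defuzzification. Which behavior's power draw feeds the risk used by the guard subsystem is left to the caller, since that choice is an assumption here.

```python
BATTERY = ["Low", "Medium", "High"]
RISK = ["V. safe", "Safe", "Caution", "Risky", "Dangerous"]

# Table 1: rows are behavior power draw, columns are estimated battery energy.
TABLE_1 = {
    "V. low": ["Safe", "V. safe", "V. safe"],
    "Low": ["Caution", "V. safe", "V. safe"],
    "Medium": ["Risky", "Safe", "V. safe"],
    "High": ["Dangerous", "Safe", "V. safe"],
}

# Table 6: rows are floor coverage, columns are energy depletion risk.
# The table entries "Mnut" and "Low" are spelled "Minute" and "Lo" here.
TABLE_6 = {
    "Sparse": ["Minute", "Lo", "Lo", "Lo", "Hi"],
    "Moderate": ["Lo", "Med", "Med", "Hi", "Hi"],
    "Dense": ["Med", "Hi", "Hi", "Hi", "Great"],
    "Crowded": ["Hi", "Hi", "Great", "Great", "Great"],
}


def energy_risk(power_draw, battery):
    """Table 1 lookup: energy depletion risk label."""
    return TABLE_1[power_draw][BATTERY.index(battery)]


def guard_strength(risk, coverage):
    """Table 6 lookup: guard behavior strength label."""
    return TABLE_6[coverage][RISK.index(risk)]
```

For instance, a medium power draw with low battery gives the risk label Risky, and with dense floor coverage the guard strength label is Hi.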

4.4 Guard decision making

Finally, the guard behavior is a low-power activity in which the robot remains still, only using sensors to watch for intruders. Its purpose is to prolong the duration of the robot's mission by having it conserve energy when the robot's energy reserves are low. Unlike the fuzzy systems for the other behaviors mentioned before, the guard behavior has only one subsystem. It takes the energy depletion risk $E_{ij}$ and the floor coverage $F_{cov}$ as inputs to produce the behavior strength for guard, i.e.

$$\beta_{i4} = f_{grd}(E_{ij}, F_{cov}) \tag{13}$$

The human-based thinking is that the higher the energy risk, the more the robot needs to conserve its reserves. On the other hand, the lower the environment coverage, due to either fewer robots or smaller floor coverage per robot, the less likely the robot is to guard. The guard fuzzy rule matrix is given in Table 6, and follows the argument just made.

4.5 Arbitration and coordination

Each robot agent $R_i$, $i = 1, 2, \ldots, n$, collects and updates its internal information, such as the estimated remaining battery energy and the times it has been in or away from a behavior, as well as the external broadcast information about events such as intruder sighting. This data collection and updating is performed asynchronously and independently from all other robots. Because all information, either transmitted to the robot from other robots or obtained via sensors, arrives at the robot unpredictably and asynchronously, various threads are set up in the software to store the incoming information in buffers until the update time is reached. The buffered information is accessed at each sample time and provides the input quantities for the various fuzzy logic subsystems of the different behaviors $B_j$, $j = 1, 2, \ldots, m$. The strengths of the different behaviors $\beta_{ij}$ are computed using fuzzy logic inference, as discussed before. The robot $R_i$ will then engage in the behavior that has the maximum strength, provided that it has met the minimum commit time in its current behavior.



It is noted that in the proposed scheme there is no central task allocation, and that the system is distributed in the sense that each robot agent makes its own decision. Due to the particular design and provisions made, each robot agent makes intelligent decisions that satisfy both its own limitations, such as its remaining battery energy, and the need to coordinate with other team members to achieve the goals of the mission. In the next section, we present simulation and experimental results that demonstrate these aspects.
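The buffering scheme described in this section, in which listener threads store asynchronously arriving information until the next sample time, can be sketched with a thread-safe queue. The class and function names and the dictionary-based event format are assumptions introduced for illustration; the paper does not describe its software at this level.

```python
import queue


class EventBuffer:
    """Thread-safe buffer filled by listener threads and drained once per sample time."""

    def __init__(self):
        self._queue = queue.Queue()

    def put(self, event):
        # Called from communication or sensor threads at arbitrary times.
        self._queue.put(event)

    def drain(self):
        # Called by the decision loop at each sample time; returns all
        # buffered events in arrival order and leaves the buffer empty.
        events = []
        while True:
            try:
                events.append(self._queue.get_nowait())
            except queue.Empty:
                return events


def latest_per_kind(events):
    """Keep only the most recent event of each kind.

    Each event is assumed to be a dict with at least 'kind' and 'time' keys.
    """
    latest = {}
    for event in events:
        kind = event["kind"]
        if kind not in latest or event["time"] >= latest[kind]["time"]:
            latest[kind] = event
    return latest
```

At each sample time the decision loop would call `drain()`, reduce the result with `latest_per_kind`, and pass the surviving events to the fuzzy subsystems as inputs.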

5 Simulation and experimental results

The emphasis of this project is on decision-making that imparts a cooperative yet autonomous ability for the robots to achieve their mission, and the testing here is centered on that decision-making. Tests were carried out that examined the intelligent robot security system as a whole in both simulated and real-world environments. Testing the system as a whole in a simulated environment permits development and verification of the system-wide code. The code is composed not only of the decision-making process but also of other areas such as networking, communications, sensory interpretation, and navigation. Testing the system in the real world verifies that the system actually functions as expected under the conditions it was meant for.

The floor plan used in all trials was that of the Intelligent Machines and Systems Laboratory at San Diego State University, two sections of which are shown in Fig. 2a–b. The robot agents use the waypoint graph shown in Fig. 2c, overlaying the environment's floor plan map, whose waypoints must be visited while patrolling. Edges on the graph represent preferred routing from waypoint to waypoint, but the robot is not restricted by the edges. The patrol graph is prepared offline by a human, and the numbers displayed in the figure do not represent any form of visitation order. Obstacles consist of boxes and structures placed on the floor as well as tables around the lab.

The hardware consists of two Pioneer robots, each of which has a mounted laptop and runs a copy of the robot agent application. Three other computers run the other applications: one for the simulated intruder, one for the simulated objects, and one for the agent viewer application. The agent viewer permits the user to observe and record system performance and to see graphically how the robot agents are actually modeling the environment in real time.
A network router with wireless capability is required to let the robots communicate, to let the applications that simulate the objects and the intruder broadcast information to the system, and to let the agent viewer application listen in and display all such communications.

5.1 Simulation trial 1

In this trial two robots, named Mozar and Scruf, each with a floor coverage of 36%, are used. The behavior timing diagram is shown in Fig. 3. Mozar enters the environment and starts patrolling. Scruf is activated 30 s later and starts guarding, which makes the total floor coverage jump to 72%. Because of the relatively large floor coverage by the robots, and to conserve energy, the scores of Mozar’s patrol and inspect enthusiasms are both reduced, whereas the guard score is increased, as shown in Fig. 4 for the first 70 s. However, Mozar continues to patrol to meet its 60 s commitment time and then decides to guard (Fig. 3). Both robots continue to guard until about 340 s into the scenario, when Mozar begins inspecting, driven by the increased time away from this behavior. From then until around 485 s, the two robots alternate among patrol, inspect and guard, as seen in Fig. 3. The behaviors satisfy several requirements: collaboration for environment coverage, performing different tasks while saving energy, and meeting the minimum commitment times.
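The way the minimum commitment time delays Mozar’s switch from patrol to guard can be illustrated with a short Python sketch. It is not the authors’ implementation: the class name `BehaviorSelector`, the method `update`, and all score values are hypothetical, and the fuzzy scoring is replaced by plain numbers supplied by the caller. Only the 60 s commitment time is taken from the trial description.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BehaviorSelector:
    """Keeps the current behavior until its minimum commitment time has elapsed."""
    min_commit_s: float = 60.0       # commitment time quoted for Trial 1
    current: Optional[str] = None    # behavior currently being executed
    started_at: float = 0.0          # time at which the current behavior began

    def update(self, now_s: float, scores: Dict[str, float]) -> str:
        """Return the behavior to run at time now_s given the latest scores."""
        best = max(scores, key=scores.get)
        committed = (
            self.current is not None
            and now_s - self.started_at < self.min_commit_s
        )
        # Switch only at start-up, or once the commitment has expired.
        if self.current is None or (not committed and best != self.current):
            self.current = best
            self.started_at = now_s
        return self.current


selector = BehaviorSelector()
# Hypothetical scores: patrol leads while the robot is alone.
first = selector.update(0.0, {"patrol": 0.8, "inspect": 0.5, "guard": 0.3})
# Guard leads once coverage rises, but the commitment still holds.
held = selector.update(30.0, {"patrol": 0.4, "inspect": 0.3, "guard": 0.7})
# After 60 s in the behavior, the switch to guard goes through.
switched = selector.update(61.0, {"patrol": 0.4, "inspect": 0.3, "guard": 0.7})
print(first, held, switched)
```

The sketch deliberately omits any override of the commitment. The trials suggest that such overrides exist, for example when an intruder is sighted or when energy reserves run low, but their exact form is not specified here.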


Fig. 3 Behavior timing diagram for Trial 1

Fig. 4 Strength of patrol and inspect enthusiasms and guard

At 542 s an intruder enters the environment, as seen in the agent viewer screenshot in Fig. 5a. The two security robots are shown as blue disks, and the intruder as a brown disk. The blue cones indicate the area visible to the robot or the intruder. The circles around the disks show the influence area; if an intruder gets inside this area, the security robot can disable it. Because of the position and orientation of the security robots, they cannot see the intruder as it enters the scene (Fig. 5a). Around 723 s, Mozar begins inspecting because of the time away from this behavior. The intruder steals the object named Coffee at 740 s (Fig. 5b). At 751 s, while still inspecting, Mozar sights the intruder and starts chasing it, as seen in Fig. 5c, and broadcasts the event. Scruf continues to guard (Fig. 3): although its enthusiasm to chase jumps to 0.95 (on a scale of 0–1.0) as a result of the sighted intruder, its motivation to do so drops to 0.22 because 100% of the other robots (Mozar in this case) are already chasing. While pursuing the intruder, Mozar discovers an object



Fig. 5 Trial 1 event screenshots: a Intruder enters the environment at 542 s, b Intruder stealing object Scanner, c Intruder sighted by robot Mozar, d Mozar discovers missing object, e Immediately before Mozar disables the intruder, f After Mozar disables the intruder and Scruf begins patrolling

is missing from the environment (shown as a red square in Fig. 5d). At around 754 s Mozar disables the intruder (Fig. 5f). The object missing and intruder disabled events reward patrol by i j = 50 s and penalize inspect by i j = −30 s using (7). Immediately after Mozar disables the intruder, Scruf begins patrolling, which causes Mozar’s motivation to patrol to drop, so it begins guarding (Fig. 3). As Scruf patrols, it broadcasts three object-missing events at 770, 780, and 810 s as a consequence of the objects pilfered by the intruder. Around 908 s, while Mozar is patrolling and Scruf is guarding, multiple intruder sightings and a single intruder disabled event occur. The two robots then continue guarding.

Trial 1 demonstrates how floor coverage impacts decision-making: as floor coverage increases to 72%, the robots become more prone to guarding rather than patrolling, inspecting, or even chasing, in order to conserve energy. It also demonstrates how the number of robots already engaged in an activity strongly impacts the motivation, and therefore the ultimate score, of each behavior. When one robot is engaged in a behavior, the second robot’s motivation to perform the same behavior drops significantly. This is part of the cooperative effort and general


Fig. 6 Behavior timing diagram for simulation Trial 2

strategy of not having all robots engage in the same behavior unless it is necessary for energy conservation or to meet the security needs.

5.2 Simulation trial 2

Trial 2 demonstrates how battery energy and the power required by a behavior steer decision-making towards energy conservation in order to extend the robot’s mission. The four robot behaviors considered in this paper require different amounts of power, with chase requiring the most (normalized here to 1.0 W), followed by patrol, inspect and guard, normalized to 0.7, 0.5 and 0.2 W, respectively. The decision-making continuously weighs the robot’s remaining battery energy against the power drawn by each behavior to determine the current energy risk of that behavior. The normalized full battery energy is 1.0 Wh. In this trial, the initial estimated battery energy is set to a low 0.32 Wh to investigate decision-making in the face of low energy reserves early on. Only one robot is in the environment, with a very low floor coverage of only 11%.

The behavior timing diagram is shown in Fig. 6. The robot chooses patrol and inspect even in the face of relatively low energy. This is because, with such low floor coverage, guarding amounts to just looking at a small area, which does not secure the environment. Figure 7 shows the energy reserves, energy risk, and motivation for patrol. As the energy reserves are depleted, the patrol energy risk increases and the patrol motivation decreases. Following this, the robot changes its behavior to inspect and from then onward alternates between patrol and inspect at a higher frequency than during the first 600 s, until its energy reserves are completely depleted (Fig. 6). The reason for these shorter inspect and patrol periods is that, after about 600 s into the scenario, the energy reserves become low and the robot ignores the minimum commitment time.
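To make the energy trade-off concrete, the Python sketch below works through the arithmetic with the normalized figures quoted above (behavior power of 1.0, 0.7, 0.5 and 0.2 W, and an initial reserve of 0.32 Wh). It is only an illustration under stated assumptions: the function names `endurance_s` and `energy_after` are hypothetical, power draw is assumed constant within a behavior, and the normalized units are taken at face value, so the resulting times need not match the durations seen in Figs. 6 and 7. The fuzzy energy-risk computation itself is not reproduced.

```python
# Normalized power draw per behavior, as quoted in the text (watts).
POWER_W = {"chase": 1.0, "patrol": 0.7, "inspect": 0.5, "guard": 0.2}


def endurance_s(energy_wh: float, behavior: str) -> float:
    """Seconds a behavior could run on the given reserve at constant power."""
    return energy_wh / POWER_W[behavior] * 3600.0


def energy_after(energy_wh: float, behavior: str, duration_s: float) -> float:
    """Reserve left (Wh) after running a behavior for duration_s seconds."""
    used_wh = POWER_W[behavior] * duration_s / 3600.0
    return max(0.0, energy_wh - used_wh)


initial_wh = 0.32  # low initial reserve used in Trial 2

# Endurance of each behavior if it ran alone from the initial reserve.
for name in POWER_W:
    print(f"{name:8s} {endurance_s(initial_wh, name):7.0f} s")

# Reserve left after 600 s of continuous patrol from the initial level.
print(round(energy_after(initial_wh, "patrol", 600.0), 4))
```

On these assumptions guarding stretches a given reserve five times further than chasing (the ratio of 1.0 W to 0.2 W), which is the conservation incentive that the decision-making weighs against the security needs.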
5.3 Experimental trial

This trial demonstrates the operability and performance of the intelligent security robot system in a real environment. The equipment consists of the P2-AT and P2-DXE Pioneer robots, which serve as the security robot and the intruder, respectively. The P2-AT was controlled through a laptop mounted on it (Fig. 2a) with the network identification name Scruf running



Fig. 7 Energy reserves, energy risk and patrol motivation

Fig. 8 Strengths of various behaviors during the experimental trial

the security robot agent application. The P2-DXE was controlled wirelessly from a desktop computer with the network identification name Mozar running the intruder application. The trial ran for 440 s and included several security events. Both the initial battery energy and the floor coverage were set to low values of about 20%. The intruder enters the environment three times during the trial. The strengths of the various behaviors and the behavior timing diagram are shown in Figs. 8 and 9, respectively. The timing diagram generally follows the behavior strengths. The three occurrences of the chase behavior in Fig. 9 correspond to the intruder sighting and disabling events.

Let us now review some details of the decision-making process for the first 100 s. In the face of diminishing energy reserves, the security robot Scruf, which was activated at 38 s, went into patrolling and inventorying the objects, as seen in Fig. 9. The intruder robot Mozar entered at the same time as Scruf but at a distance from it. At about 86 s into the


Fig. 9 Behavior timing diagram for the experimental trials

Fig. 10 Screenshots showing first security event a Immediately before intruder discovery, b Immediately after intruder discovery, c Immediately after disabling intruder

scenario, Scruf detected the intruder, immediately switched to chasing, and pursued the intruder for about 6 s before reaching and disabling it. Figure 10a shows a screenshot of the agent viewer immediately before the intruder (shown as a brown disk) was detected. Figure 10b shows the situation immediately after the intruder detection. Note that the intruder (brown disk) is inside the cone visible to the security robot (shown as a blue disk), and that the triangle on the blue disk, which turns from green in Fig. 10a to red in Fig. 10b, indicates that the robot has sighted the intruder. Finally, Fig. 10c shows the situation immediately after the intruder is disabled, which is indicated by the intruder being within range (inside the



influence circle) of the chasing robot. After disabling the intruder, the security robot goes into the patrol behavior. The behaviors for the remaining period, from 100 s until 420 s when the robot’s energy is completely depleted, can be explained similarly and show human-like decision making.

It is noted that the above simulation and experimental trials are only a few of the many trials carried out and studied under various conditions. In almost all of these trials, the robots exhibited natural, human-like behaviors. A detailed account of these trials and experiments is reported in Cross (2009).
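The sighting and disabling conditions shown in the agent viewer screenshots (an intruder inside the robot’s visibility cone is sighted, and an intruder inside the influence circle can be disabled) amount to two geometric tests. The Python sketch below illustrates them under assumed values: the function names, the 30° half-angle of the cone, the 5 m viewing range and the 1 m influence radius are all hypothetical, and occlusion by obstacles is ignored.

```python
import math


def in_view_cone(robot_xy, heading_rad, target_xy,
                 half_angle_rad=math.radians(30.0), max_range_m=5.0):
    """True if the target lies inside the robot's visibility cone (assumed shape)."""
    dx = target_xy[0] - robot_xy[0]
    dy = target_xy[1] - robot_xy[1]
    distance = math.hypot(dx, dy)
    if distance > max_range_m:
        return False
    if distance == 0.0:
        return True
    bearing = math.atan2(dy, dx)
    # Smallest signed angle between the heading and the bearing to the target.
    offset = math.atan2(math.sin(bearing - heading_rad),
                        math.cos(bearing - heading_rad))
    return abs(offset) <= half_angle_rad


def in_influence_circle(robot_xy, target_xy, radius_m=1.0):
    """True if the target is close enough to be disabled (assumed radius)."""
    return math.dist(robot_xy, target_xy) <= radius_m


robot, heading = (0.0, 0.0), 0.0                   # robot at origin, facing +x
print(in_view_cone(robot, heading, (3.0, 0.5)))    # ahead, slightly off-axis
print(in_view_cone(robot, heading, (-3.0, 0.0)))   # directly behind the robot
print(in_influence_circle(robot, (0.6, 0.6)))      # about 0.85 m away
```

In such a scheme a sighting would trigger the chase behavior and the broadcast to the other robots, while the influence test would decide when the chase ends with the intruder disabled.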

6 Conclusions

A multi-robot security system has been proposed that addresses various security demands such as patrolling, inspecting, chasing and guarding while taking into account limitations and constraints such as remaining robot battery energy. The decision making uses a fuzzy logic paradigm. Each robot takes an appropriate course of action in response to messages received from other robots, such as item missing or intruder sighting, as well as its own internal information. This action not only meets the individual robot’s limitations and desires but is also in concert with the other robots, so that the overall system of robots performs its mission as well as possible. The proposed system can readily be extended to include other behaviors and constraints. The system has been implemented both in simulation and on actual robots. Extensive trials have been conducted with different scenarios to investigate and verify the performance of the system under various conditions and parameters. These trials, a few of which are reported in this paper, have shown the effectiveness of the proposed solution to decision making for security robots.

Acknowledgments

This paper is partially supported by the LG YON-AM Foundation.

References

Arkin R (1998) Behavior-based robotics. MIT Press, Cambridge, MA
Bojinov H, Casal A, Hogg T (2000) Emergent structures in modular self-reconfigurable robots. In: Proceedings of international conference on robotics and automation. San Francisco
Born T, Ferrer G, Wright AM, Wright AB (2007) Layered mode selection logic control for border security. Department of Physics, Hendrix College, Conway, AR, and Department of Applied Science, University of Arkansas at Little Rock, Little Rock, AR
Castelnovi M, Musso P, Sgorbissa A, Zaccaria R (2003) Surveillance robotics: analyzing scenes by colors analysis and clustering. Proc IEEE 1:229–234
Cross M (2009) Fuzzy logic decision making for an intelligent cooperative multi-robot team that maintain security. M.S. thesis, Department of Computer Science, San Diego State University
Everett HR (2010) A brief history of robotics in physical security. Space and Naval Warfare Systems Center, San Diego, Robotics Publications, available at http://www.spawar.navy.mil/robots/land/robart/history.html
Everett HR, Gage DW (1999) From laboratory to warehouse: security robots meet the real world. Int J Robot Res 18(7):760–768
Gerkey GP, Mataric MJ (2004) A formal analysis and taxonomy of task allocation in multi-robot systems. Int J Robot Res 23(9):939–954
Kazadi S, Abdul-Khaliq A, Goodman R (2002) On the convergence of puck clustering systems. Robot Auton Syst 32(2):93–117
Lerman K, Martinoli A, Galstyan A (2005) A review of probabilistic macroscopic models for swarm robotic systems. In: Sahin E, Spears W (eds) Swarm robotics workshop, Lecture Notes in Computer Science, vol 3342, Springer, Berlin, pp 143–152
Lerman K, Jones C, Galstyan A, Mataric MJ (2006) Analysis of dynamic task allocation in multi-robot systems. Int J Robot Res 25(3):225–241
Murphy R (2000) Introduction to AI robotics. MIT Press, Cambridge, MA
Parker LE (2002) Distributed algorithms for multi-robot observation of multiple moving targets. Auton Robots 12:231–255
Rusu RB (2004) Robotux, a multi-agent robot based security system. International conference on automation, quality and testing, robotics, May 2004, Cluj-Napoca, Romania
Salemi B, Shen W-M, Will P (2001) Hormone-controlled metamorphic robots. In: Proceedings of international conference on robotics and automation, pp 4194–4199
Su KL, Chien TL, Guo JH (2004) Design of low cost security robot applying in family. International conference on autonomous robots and agents. Palmerston North, New Zealand, pp 367–372

