A Framework for Defining and Learning Fuzzy Behaviours for Autonomous Mobile Robots

Humberto Martínez Barberá, Antonio Gómez Skarmeta
Dept. of Communications and Information Engineering
University of Murcia, 30100 Murcia, Spain
e-mail: [email protected], [email protected]

Abstract

In this paper we present our work on the use of fuzzy behaviours in the field of autonomous mobile robots. We address how we use learning techniques to efficiently coordinate the conflicts between the different behaviours that compete with each other for control of the robot. We use fuzzy rules to perform this fusion. These rules can be set using expert knowledge but, as this can be a complex task, we show how to define them automatically using genetic algorithms. We also describe the working environment, which includes a custom programming language (named BG) based on the multi-agent paradigm. Finally, some results on simple goods-delivery tasks in an unknown environment are presented.

Keywords: fuzzy systems, autonomous systems, behaviour blending, genetic algorithms
1 Introduction

The operation of an autonomous mobile robot in an unstructured environment, as found in the real world, needs to take many details into account. Mainly, the controller has to be able to operate under conditions of imprecision and uncertainty. For instance, the a priori knowledge of the environment is, in general, incomplete, uncertain, and approximate. Typically, the perceptual information acquired is also incomplete and affected by noise. Moreover, the execution of a control command is not completely reliable due to the complexity and unpredictability of real-world dynamics. Mobile robots are increasingly required to navigate and perform purposeful autonomous tasks in ever more complex, real-world-like domains. These requirements demand reactive capacity in their navigation systems. Over the last decade much of the work on reactive navigation has been inspired by the layered control system of the subsumption architecture [7], which tightly couples sensing and action. The emergent behaviour of the robot is the result of the cooperation of independent reactive modules, each one specialised in a particular basic behaviour. Behaviour-based control shows great potential for robot navigation, since it does not require building an exact world model or a complex reasoning process. However, much effort is still needed on aspects like the formulation of
behaviours and the efficient coordination of conflicts and competition among multiple behaviours. To overcome these deficiencies, some fuzzy-logic based behaviour control schemes have been proposed [2][24]. In recent years, fuzzy modelling, as a complement to conventional modelling techniques, has become an active research topic and has found successful applications in many areas. However, most present fuzzy models have been built based only on an operator's experience and knowledge, but when a process is complex there may not be an expert [28]. In this kind of situation the use of learning techniques is of fundamental importance. The problem can be stated as follows: given a set of data for which we presume some functional dependency, is there a suitable methodology to derive (fuzzy) rules from these data that characterise the unknown function as precisely as possible? Recently, several approaches have been proposed for automatically generating fuzzy if-then rules from numerical data without domain experts [12][17][25][26]. In our research, the final goal is to have a simple mobile robot performing delivery tasks in an indoor office-like environment, without prior knowledge of the floor plan, and without any modification or special engineering of the environment. This objective, among other things, dictates the necessity of having reliable sensors both to build maps and to navigate the robot, and methods to develop the control software based on different robot behaviours. To combine reactivity, fuzzy modelling, behaviour-based control and learning techniques, we have defined and developed a simulation environment based on a programming language named BG (Section 2), which uses a multi-agent paradigm over a blackboard model. Using this language we have defined a sample series of agents to control the robot in a test environment (Section 3). As a key point in behaviour coordination we use learning techniques (Section 4).
Finally, we test this procedure by simulating the robot in several environments (Section 5) and draw some conclusions (Section 6).
2 The BG programming language

Several robot control languages have been proposed in the literature, each with its own pros and cons. Our
main goal was to tackle the behaviour blending problem and the learning of the rules that drive the blending process, and we wanted to try some new functionalities as well. Thus we decided to define and implement a new high-level programming language, named BG, which is based on the C language syntax and semantics, includes many of the COLBERT [18] language features, and makes extensive use of fuzzy rule bases both for the behaviours and for the blender. BG is based on the multi-agent paradigm, where each agent possesses a series of behaviours. Its intended use is robotics and control applications, mainly autonomous mobile robots; the examples in this paper fall in the latter field. The main goal of the language is to aid in the development cycle of robotics applications and in behaviour modelling (as it is a robotics-oriented high-level language with built-in fuzzy logic operations), as well as to expedite it (by using automated tuning systems). BG, in its current version, incorporates many traditional elements and control structures of classical imperative languages:
• Local and global variables.
• Expressions and assignments.
• Arithmetical, logical and relational operators.
• Conditionals.
• C-like functions.
Basically, BG supports two data types: real numbers and fuzzy sets with standard shapes (triangle, trapezoid, bell, and sigmoid). In addition, BG provides advanced non-standard control structures like:
• Deterministic finite state machines.
• Fuzzy rule bases.
• Grid based map-building.
• Agents and behaviours definition and blending.
We have developed, using the Java language, an integrated development environment for BG. The result is a multi-platform application named BGen (available at http://ants.dif.um.es/~humberto/robots), which implements an interpreter for the BG language, a simulator for different custom robots, some data visualisation and recording tools, and a GA-based behaviour learning tool. Although the examples in this paper have been run only in the simulator, we are developing the interface for a real robot, similar to the one depicted in Figure 2. This robot will also be used as the platform for the different examples. It is important to note that the robot executes the BGen software, but with a modified version of the simulator that reads in sensory data and writes out the actuator commands. This way the development and exploitation environments share code. The elements of the BG language are briefly described in the following subsections.
2.1 Finite state machines
The use of finite state machines allows the programmer to define tasks that need to be run sequentially, and the conditions to change from one task to another. The BG language, as of its current implementation, has no instructions for loops (while, for, or repeat-until), but these can be modelled using finite state machines. The main reason for this lack of loops is to keep the duration of each control cycle under control (an error cannot lock it). In the future, loops will be incorporated as in COLBERT [18], where they are, in fact, internally finite state machines. Another use of finite state machines is the implementation of simple planners. For example, suppose we want our robot to move from a home location to given goal coordinates, and then return back home. It features a sensor to know its position, and a function (ingoal) that determines if it is at its current goal. Such a planning task corresponds to the following state machine (Figure 1), with its corresponding sample BG code (Listing 1).
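Outside BG, the same pattern can be sketched in ordinary code. The following Python sketch (all names are invented for illustration, this is not BG itself) mirrors the planner of Listing 1 as a state machine stepped once per control cycle, so no handler can lock the loop:

```python
def make_planner():
    # Mutable planner state, shared by successive calls to step().
    state = {"name": "HOME", "goal": None}

    def step(ingoal, pos, goal):
        # Each handler runs to completion; the only control transfer is a
        # state shift, so a single call can never block the control loop.
        if state["name"] == "HOME":
            state["home"], state["goal"] = pos, goal   # set origin and goal
            state["name"] = "LOOKING"
        elif state["name"] == "LOOKING" and ingoal:
            state["goal"] = state["home"]              # head back home
            state["name"] = "CHANGING"
        elif state["name"] == "CHANGING" and not ingoal:
            state["name"] = "RETURNING"
        elif state["name"] == "RETURNING" and ingoal:
            state["name"] = "HALT"
        return state["name"], state["goal"]

    return step
```

Calling `step` once per cycle advances HOME → LOOKING → CHANGING → RETURNING → HALT exactly as in Figure 1.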
Figure 1: Sample FSM-based planner.
fsm start HOME {
    state HOME: {
        goal = 0.0;                  // Goal not reached
        home_x = x; home_y = y;      // Set origin coordinates
        goal_x = gx; goal_y = gy;    // Set goal coordinates
        shift LOOKING;
    }
    state LOOKING: {
        if (ingoal == 1.0) {
            goal_x = home_x;         // Change goal coordinates
            goal_y = home_y;
            shift CHANGING;
        }
    }
    state CHANGING: {
        if (ingoal == 0.0)
            shift RETURNING;
    }
    state RETURNING: {
        if (ingoal == 1.0)           // Goal reached
            shift HALT;
    }
    state HALT: {
        goal = 1.0;
        halt;
    }
}

Listing 1. Example of a finite state machine.

2.2 Fuzzy rule bases

One of the strengths of the BG language is the possibility of using fuzzy rule bases [23]. These are intended both for control and for behaviour blending (the latter is discussed below). Two families of operators have been defined: t-norms and t-conorms. In the current implementation, these operators can only be changed globally for all the fuzzy rule bases. The examples in this paper assume the min and max operators. Defuzzification is carried out by means of the centre of gravity method. Typically the rules consist of an antecedent with some input variables and a consequent with some output variables (MIMO rules). The consequents can be any combination of fuzzy sets, crisp values or TSK consequents. There is a special kind of fuzzy rule, called a background rule, that is useful to cover undefined regions of the input space: a fixed alpha-cut is applied to the consequent part of the rule, so that it always fires with a given strength. For instance, if we want to maintain a given distance from the robot to a wall, we can use the following code (Listing 2). Right is the input variable that measures the distance from the robot to the wall at its right. Turn is the output variable that controls whether the robot should steer to the left (fuzzy sets TSL and TL), to the right (fuzzy set TSR) or not at all (fuzzy set TN).

// Sets for ToF sensor range [0 .. 10] (meters)
set CLOSE = trapezoid {0.0, 0.0, 0.17, 0.2};
set NEAR  = trapezoid {0.17, 0.2, 0.3, 0.4};
set MED   = trapezoid {0.3, 0.4, 1.3, 1.4};
set FAR   = trapezoid {1.3, 1.4, 10.0, 10.0};

// Sets for steering [-10 .. 10]
set TTL = trapezoid {-15.0, -15.0, -10.0, -9.0};
set TL  = trapezoid {-10.0, -9.0, -6.0, -5.0};
set TSL = trapezoid {-6.0, -5.0, -2.0, -1.0};
set TC  = trapezoid {-2.0, -1.0, 1.0, 2.0};
set TSR = trapezoid {1.0, 2.0, 5.0, 6.0};
set TR  = trapezoid {5.0, 6.0, 9.0, 10.0};
set TTR = trapezoid {9.0, 10.0, 15.0, 15.0};

rules {
    background (0.1)    turn is TN;    // Turn none
    if (right is FAR)   turn is TSR;   // Turn small right
    if (right is MED)   turn is TSL;   // Turn small left
    if (right is NEAR)  turn is TSL;   // Turn small left
    if (right is CLOSE) turn is TL;    // Turn left
}

Listing 2. Example of a fuzzy rule base.
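As an illustration only (BG performs this internally), here is a minimal Python sketch of min/max inference with centre-of-gravity defuzzification over the Listing 2 rule base. The 0.01 discretisation step and the shape taken for TN (the centred set) are assumptions of this sketch:

```python
def trapezoid(a, b, c, d):
    """Membership function of a trapezoidal fuzzy set with corners a<=b<=c<=d."""
    def mu(x):
        if b <= x <= c:
            return 1.0
        if a < x < b:
            return (x - a) / (b - a)
        if c < x < d:
            return (d - x) / (d - c)
        return 0.0
    return mu

# Input sets from Listing 2 and the subset of steering sets the rules use.
CLOSE = trapezoid(0.0, 0.0, 0.17, 0.2)
NEAR = trapezoid(0.17, 0.2, 0.3, 0.4)
MED = trapezoid(0.3, 0.4, 1.3, 1.4)
FAR = trapezoid(1.3, 1.4, 10.0, 10.0)
TL = trapezoid(-10.0, -9.0, -6.0, -5.0)
TSL = trapezoid(-6.0, -5.0, -2.0, -1.0)
TN = trapezoid(-2.0, -1.0, 1.0, 2.0)   # assumed: the 'turn none' set
TSR = trapezoid(1.0, 2.0, 5.0, 6.0)

def infer_turn(right):
    # Rule strengths (single-antecedent rules, so min is implicit); the
    # background rule fires with a fixed alpha-cut of 0.1.
    rules = [(0.1, TN), (FAR(right), TSR), (MED(right), TSL),
             (NEAR(right), TSL), (CLOSE(right), TL)]
    # Clip each consequent (min), aggregate with max, then defuzzify by
    # centre of gravity over a sampled universe [-10, 10].
    xs = [i * 0.01 - 10.0 for i in range(2001)]
    agg = [max(min(w, out(x)) for w, out in rules) for x in xs]
    area = sum(agg)
    return sum(x * m for x, m in zip(xs, agg)) / area if area else 0.0
```

For a far wall (`right = 5.0`) the output is positive (a small right turn, towards the wall); for a very close wall (`right = 0.1`) it is negative (a left turn, away from it).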
2.3 Grid based map-building
The BG language provides a basic structure and methods to manage two-dimensional spatial information for constructing maps. The choice of internal map representation depends on various characteristics, but mainly on the environment itself (obstacle density, room dimensions, etc.). Lee [20] classifies maps according to the range of geometric properties that can be derived from them. Given the properties of an indoor high-density environment, a metric map is the most appropriate choice, as it stores a more compact representation of the environment. Square-cell grid maps are the most widely used metric maps. Although they have many shortcomings, their advantage is that path planning over them is rather simple. In recent years, different tools have been introduced to build grid maps based on neural networks, possibility theory, or fuzzy logic [9][21][22]. The method we use for map building is based on the work of Oriolo [22]. The map is divided into square cells of a given side length, resulting in an N by M matrix. Each cell has two associated values: the degree of certainty that the cell is empty, eij, and the degree of certainty that the cell is occupied, wij. These values are calculated for each cell in the beam of a single sensor, taking into account the following facts:
• The possibility ga that cells in the arc of the sensor range t are obstacles.
• The possibility gb that cells in the circular sector of radius t are empty.
• The belief gc that shorter distances are more accurate.
• The belief gd that cells near the beam axis are empty.
The eij and wij values are then aggregated with the previous values of each cell, to obtain the aggregated degrees of certainty that the cells are empty, E(i,j), or occupied, O(i,j). As these correspond to fuzzy degrees of membership, we combine this information into an additional value for each cell, M(i,j), which represents, in a conservative way, whether the cell is occupied. Using these M(i,j) values, an A* algorithm searches for the shortest path from the current robot position to the goal. As a result, the absolute heading that the robot should follow to reach the goal is obtained.
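A minimal Python sketch of this per-cell bookkeeping follows. The max aggregation and the conservative definition of M are assumptions of this sketch, not the exact operators of [22]:

```python
class FuzzyGrid:
    """Square-cell grid holding per-cell empty/occupied certainties."""

    def __init__(self, n, m, side):
        self.side = side                          # cell side length, metres
        self.E = [[0.0] * m for _ in range(n)]    # certainty cell is empty
        self.O = [[0.0] * m for _ in range(n)]    # certainty cell is occupied

    def aggregate(self, i, j, e_ij, w_ij):
        # Fuzzy union (max) of the new reading with accumulated evidence.
        self.E[i][j] = max(self.E[i][j], e_ij)
        self.O[i][j] = max(self.O[i][j], w_ij)

    def M(self, i, j):
        # Conservative occupancy: a cell counts as occupied if there is
        # evidence it is occupied OR no evidence that it is empty.
        return max(self.O[i][j], 1.0 - self.E[i][j])
```

Note that an unobserved cell has M = 1, so the planner never routes the robot through unexplored space, which matches the conservative reading of M(i,j) above.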
In the BG language, when a grid is initialised, the user provides two parameters: the horizontal and vertical size of the grid map, and the side length of the cells in metres. The bottom-left cell serves as the origin of coordinates. As the robot moves, the following data is processed:
• The current absolute location (x, y) of the robot.
• The current absolute location (x, y) of the goal place.
• The absolute location (x, y) of a possible obstacle.
• A relative sensor reading (x, y, θ, sensor_value) from the given absolute location.
An example of a grid generated in a robot run is shown below (Listing 3). The Quaky-Ant robot (Figure 2) navigates in an indoor office-like floor plan. It has ten ultrasonic sonar range sensors [6] (sonar0, …, sonar9) at different angles. In each control loop cycle, the current robot location and the current goal are updated. If a sonar reading is below a given threshold (2 metres), the grid is updated with it. As a result, the variable heading is updated using the A* output and the map is updated (Figure 3).
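To illustrate how such a heading can be obtained, here is a compact A* search over the M(i,j) values on a 4-connected grid. Treating cells with M above a threshold as blocked, unit step costs, and the Manhattan heuristic are all assumptions of this sketch:

```python
import heapq
import math

def astar_heading(M, start, goal, blocked=0.5):
    """Return the absolute heading (radians) of the first step of the
    shortest free path from start to goal, or None if unreachable."""
    if start == goal:
        return 0.0
    n, m = len(M), len(M[0])
    h = lambda c: abs(c[0] - goal[0]) + abs(c[1] - goal[1])  # admissible
    frontier = [(h(start), start)]
    came, cost = {start: None}, {start: 0}
    while frontier:
        _, cur = heapq.heappop(frontier)
        if cur == goal:
            break
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + di, cur[1] + dj)
            if 0 <= nxt[0] < n and 0 <= nxt[1] < m and M[nxt[0]][nxt[1]] < blocked:
                c = cost[cur] + 1
                if c < cost.get(nxt, float("inf")):
                    cost[nxt], came[nxt] = c, cur
                    heapq.heappush(frontier, (c + h(nxt), nxt))
    if goal not in came:
        return None
    # Walk the path back to the first step out of the start cell.
    step = goal
    while came[step] != start:
        step = came[step]
    return math.atan2(step[1] - start[1], step[0] - start[0])
```

On an empty 3x3 grid the first step towards a goal straight ahead gives heading 0; placing an obstacle in the way forces a sideways first step.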
Figure 2: Quaky-Ant, custom autonomous mobile robot.
Figure 3: Sample grid based map.

2.4 Agents and behaviours definition
The organisational units in the BG language are based on the notion of agent [16]. We arrange these agents using a blackboard-based architecture [15][27] as the main paradigm for communication and control. Under this model, a series of agents share the available information about the system and the environment. Each agent can read from the blackboard the information that is relevant to it, perform its own processing, and then write each possible result onto the blackboard. These agents use KQML [19] as the protocol for information interchange. The proposed architecture has been designed according to the following principles:
• Use of encapsulation mechanisms, isolation, and local control: each agent is a semi-autonomous and independent entity.
• No assumptions about the knowledge each agent has or the resolution method it implements.
• Flexible and dynamic organisation.
• Adequacy of the architecture to the use of learning techniques.
A key idea is that each agent is decomposed into very simple behaviours. These access the blackboard by way of some input and output variables. As two different behaviours may modify the same variable, a fusion method (called behaviour blending) is provided [24]. The advantage of this method is related to scalability and learning: while in a subsumption architecture [7] the management of agents depends on the behaviours they implement [14], with the proposed architecture the management resides in the fusion method. A fuzzy rule base carries out this fusion, enabling or disabling the output of the different behaviours, both totally and partially [2][14][24]. This way, the system can be trained with only some simple behaviours, and then new behaviours can be added without modifying the previous ones: only the fusion rule base needs to be re-trained. When an agent is specified, the user may supply three kinds of blocks that are placed inside the agent's execution loop:
• A common block that is executed before any behaviour.
• A series of behaviours that are executed concurrently.
• A blender block that specifies how the outputs of the behaviours will be fused.
All these blocks must satisfy a constraint: the execution time of each one must be bounded, so there must be no infinite loops. The language itself ensures that this constraint holds for all the blocks, and it is a necessary requirement for basic real-time support, although the language does not support fixed-time scheduling. It is the user's responsibility to make the appropriate tests to obtain a certain control cycle duration. For the blending process to work, each behaviour declares which global output variables are to be fused, and only modifies local copies of those variables during its execution. The blender applies inference over fuzzy rules, defined using some input global variables as the antecedent and some behaviour names as the consequent. Finally, the following formula (eq. I) is applied to each blended output variable:
αj = prij · behj

ovi = Σj (αj · lovij) / Σj αj    [I]
where:

• ovi is the i-th global output variable.
• lovij is the local copy of the i-th output variable in the j-th behaviour.
• prij is the priority of the j-th behaviour.
• behj is the result of the fuzzy inference for the j-th behaviour.
This way the fusion depends both on a fixed priority and a variable activation degree calculated using fuzzy inference. The fixed priorities act as scaling factors, while the activation degrees vary according to environment changes. Each particular application dictates which combination of priority and inference is best (only priorities, only inference, or both). Moreover, this novel approach to behaviour blending generalises the concept of hierarchical fuzzy rules because, in fact, the hierarchy may combine fuzzy rule bases with other techniques (from both soft computing and classical control). We explain below (Section 4) how learning techniques can be used in conjunction with this blending mechanism to obtain a set of fusion rules. A simple agent definition is shown below (Listing 4). The agent has five behaviours: one which avoids obstacles (avoid), one which escapes from trapping situations (scape), one which moves towards the goal (toGoal), one which aligns to the left (alignL), and one which aligns to the right (alignR). The blender block specifies how the outputs of these behaviours (turn and speed) will be fused at each time step (Figure 4).

grid updates heading cells 71.0 by 71.0 side 0.05 {
    location (x, y);
    goalpos (gx, gy);
    if (sonar0 < 2.0) update (x, y, 0, sonar0);
    if (sonar1 < 2.0) update (x, y, 1, sonar1);
    if (sonar2 < 2.0) update (x, y, 2, sonar2);
    if (sonar3 < 2.0) update (x, y, 3, sonar3);
    if (sonar4 < 2.0) update (x, y, 4, sonar4);
    if (sonar5 < 2.0) update (x, y, 5, sonar5);
    if (sonar6 < 2.0) update (x, y, 6, sonar6);
    if (sonar7 < 2.0) update (x, y, 7, sonar7);
    if (sonar8 < 2.0) update (x, y, 8, sonar8);
    if (sonar9 < 2.0) update (x, y, 9, sonar9);
}

Listing 3. Example of a grid.

agent basicControl {
    blending left, right, front;
    common                          { /* Code goes here */ }
    behaviour avoid  priority 1.0   { /* Code goes here */ }
    behaviour scape  priority 1.25  { /* Code goes here */ }
    behaviour toGoal priority 0.3   { /* Code goes here */ }
    behaviour alignL priority 0.5   { /* Code goes here */ }
    behaviour alignR priority 0.5   { /* Code goes here */ }
    blender {
        rules {
            background (0.01) scape is LOW;
            background (0.01) avoid is LOW;
            background (0.9)  toGoal is HIGH;
            background (0.01) alignR is LOW;
            background (0.01) alignL is LOW;
            if (front is CLOSE) scape is HIGH, toGoal is LOW;
            if (left is CLOSE)  avoid is HIGH, toGoal is HALFL;
            if (front is CLOSE) avoid is HIGH, toGoal is HALFL;
            if (right is CLOSE) avoid is HIGH, toGoal is HALFL;
            if (left is NEAR)   avoid is HALFH, toGoal is HALFH;
            if (front is NEAR)  avoid is HALFH, toGoal is HALFH;
            if (right is NEAR)  avoid is HALFH, toGoal is HALFH;
            if (left is MED)    alignL is HALFH;
            if (right is MED)   alignR is HALFH;
            if (left is FAR)    alignL is HIGH;
            if (right is FAR)   alignR is HIGH;
        }
    }
}

Listing 4. Example of agent and behaviours definition.
Figure 4: Degrees of activation of some behaviours (fusion).
3 A sample agents architecture

Having defined a programming language and its inherent programming model, we now use a robotics platform and define a
simple navigational task. The platform (Figure 2) is a differentially steered and kinematically holonomic robot provided with a ring of ultrasonic sonar range sensors. The task is to navigate from a given home position to a goal location, and then return back home, in a completely unknown environment similar to a real indoor floor plan (with different rooms and corridors). The coordinates of the two places are known beforehand. In addition, the robot should neither collide nor get stuck (or enter an infinite loop). To accomplish this goal the robot must be able to perform certain tasks: navigation with obstacle avoidance, goal seeking, map building and localisation, some form of high-level planning, etc. Using the BG language and its implicit architecture we have decomposed a real system, extracting the most important elements [1]. We have identified the following agents (Figure 5), which are described in the next subsections:
• Reactive Control. This is the agent in charge of reactive navigation.
• Planning. This is the agent in charge of high-level planning and global goal completion supervision.
• Navigation. This is the agent in charge of building a more or less precise model of the environment, and then locating the robot on it.
• Sensor Processing and Actuator Control. These are the agents in charge of low-level control and signal filtering for sensors and actuators.
Figure 5: Agents architecture elements.
3.1 Reactive control
This is the most important and crucial agent, which implements some basic reactive behaviours. These make extensive use of fuzzy logic in the form of fuzzy rule bases. The rules, which have been obtained both from previous experience and from fuzzy modelling techniques [13][14], affect both the desired robot velocity (v) and the steering angle (θ), based on the current sonar readings (sonari). When working with fuzzy modelling, the user drives the robot to perform some simple reactive task (e.g. wall-robot alignment, speed control, and so on) based on sensory data. These data are recorded and then passed to one or more fuzzy modelling methods. The information is clearly imprecise and unreliable, because we use both simulated sensors and the sensors of the real robot. Basically, those methods perform cluster analysis and then produce the required fuzzy rules. For the present application, we have identified and implemented the following reactive behaviours:
• obstacle-avoider: drives the robot in the direction opposite to the nearest obstacle.
• align-right: holds the robot at a given distance from the wall at its right (Listing 2).
• align-left: holds the robot at a given distance from the wall at its left (Listing 2).
• move-to-goal: drives the robot in the direction of the current goal.
• scape: turns the robot randomly to escape from a U-shaped corner.

The output of these behaviours is fused by the blender block, which is defined using a fuzzy rule base as well (Listing 4). This set of rules serves as the basis of the learning procedure, which is addressed in Section 4.
3.2 Sensor Processing and Actuator Control
The Sensor Processing agent is in charge of processing and filtering the information provided by the sensors, and the Actuator Control agent is in charge of sending the right signals to the actuators to perform the given control commands. In our system we have the following behaviours:
• Range sensor acquisition and filtering.
• Encoder data sampling and processing.
• Collision detection.
• Robot control vector (v, θ) execution.
To execute platform control vectors (v, θ), the differential steering of the robot must be taken into account: the vector must be converted into different speeds for the two motors. Each motor has an agent in charge of maintaining a constant velocity using encoder information as feedback. This is accomplished by a typical set of five SISO fuzzy rules, derived using fuzzy modelling. For more details see [8][13].
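The conversion itself can be sketched in a few lines. Interpreting θ as the commanded turn rate ω and the 0.3 m wheel base are assumptions of this sketch, not values from the Quaky-Ant:

```python
def control_vector_to_wheels(v, omega, base=0.3):
    """Map a (v, omega) control vector to (v_left, v_right) wheel speeds
    for a differentially steered robot with the given wheel base (m)."""
    v_left = v - omega * base / 2.0
    v_right = v + omega * base / 2.0
    return v_left, v_right
```

Pure translation gives equal wheel speeds, while pure rotation gives opposite ones, which is the defining property of a differential drive.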
3.3 Planning
Due to the simplicity of the tasks the robot is going to perform, the planning agent is quite simple. At present, it is a deterministic finite state automaton (Figure 1) (Listing 1), whose states are: starting at the home position, looking for the goal destination, exiting the goal place, looking for the home position, and arrived at home. These states have associated goal coordinates that are received by the reactive control agent, which activates the corresponding behaviours.
3.4 Navigation
The Navigation agent is in charge of building a map of the environment and keeping track of the robot location. To implement the map, a behaviour uses the grid structure provided by the BG language (Listing 3). This grid is updated continuously with the different sonar readings. On the map, the robot is located using odometry (currently we assume it has no important errors). This agent should integrate both the map building and the odometry correction tasks [9][10].
4 Learning behaviour fusion

As stated previously, a key issue in behaviour-based control is how to efficiently coordinate conflicts and competition
among different types of behaviour to achieve good performance. Instead of strategies that simply inhibit and suppress according to assigned priorities, the fusion method described above (Section 2) is used: a fuzzy rule base carries out the fusion, enabling or disabling the output of the different behaviours, both totally and partially. The use of learning techniques [5][9] in the fusion rule base can improve robot navigation performance. This way, the system can be trained with only some simple behaviours, and then new behaviours can be added without modifying the previous ones: only the fusion rule base needs to be re-trained. Evolutionary Algorithms (EA) [4][11] are adaptive search and optimisation procedures, inspired by the mechanisms of natural evolution, that find solutions to problems. They imitate, on an abstract level, biological principles such as a population-based approach, the inheritance of information, the variation of information via crossover and mutation, and the selection of individuals based on fitness. EAs start with an initial set (population) of alternative solutions (individuals) for the given problem, which are evaluated in terms of solution quality (fitness). Then, the operators of selection, replication and variation are applied to obtain new individuals (offspring) that constitute a new population. The interplay of selection, replication and variation leads to solutions of increasing quality over the course of many iterations (generations). When a termination criterion is finally met, such as a maximum number of generations, the search process stops and the final solution is output. As our main concern is the learning of the fusion rule base, we make use of EAs [12]. The user selects an agent whose blending rule base will be learned. The key idea is to start with a predefined set of fuzzy rules for behaviour blending, and then apply the learning method to them.
This set can even be the empty set. In either case, the individuals of the EA are fuzzy rule bases (Pittsburgh coding), and the result of the EA is the proposed fusion fuzzy rule base.
4.1 Representation and initial population
As stated previously, the learning procedure can be started in two ways: using a predefined set of rules (rule tuning) or using the empty set (rule discovery). If the algorithm is run in rule discovery mode, the initial population is set randomly. Otherwise, the initial population consists of the user-defined rules and random modifications of them. The latter corresponds to a typical setup, because it narrows the search space and thus requires shorter learning times. The representation we use for the individuals [12] imposes some constraints on the rule bases: the fuzzy sets must be trapezoidal and the maximum number of rules must be set beforehand. These are not really hard constraints, because the conversion from triangular to trapezoidal sets is straightforward, and a sufficiently large maximum number of rules can be specified. We extract the fuzzy rule base from the blender module of a given fixed agent. The user also specifies the different input variables that will be used in the rules; the output variables are the different behaviours of that agent. Each rule consists of the concatenation of two parts: a rule part, the concatenation of the real numbers that define the fuzzy sets of the input and output variables, and a control part, the concatenation of two boolean values that represent whether the rule is active, and whether it is a background rule. The representation of the individual is then the concatenation of the data for all the rules (both active and inactive).
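A Python sketch of this representation follows; the two-variable layout and the helper names are invented for illustration:

```python
def encode(rules):
    """Pittsburgh-style coding: rules is a list of (sets, active,
    background), where sets holds one (a, b, c, d) trapezoid per
    input/output variable of the rule."""
    chrom = []
    for sets, active, background in rules:
        for corners in sets:
            chrom.extend(corners)                   # rule part: real numbers
        chrom.extend([float(active), float(background)])  # control part
    return chrom

def decode(chrom, n_vars):
    """Inverse of encode for rules over n_vars variables."""
    rule_len = 4 * n_vars + 2
    rules = []
    for k in range(0, len(chrom), rule_len):
        gene = chrom[k:k + rule_len]
        sets = [tuple(gene[4 * i:4 * i + 4]) for i in range(n_vars)]
        rules.append((sets, gene[-2] == 1.0, gene[-1] == 1.0))
    return rules
```

Each rule thus occupies a fixed-length slice of the chromosome, which is what lets standard crossover and mutation operators work on rule bases directly.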
4.2 Variation operators and parameters
The user must set up a series of parameters [11] that control how the algorithm proceeds. Basically, which kind of variation operator is going to be used, the probabilities of both mutation (Pm) and crossover (Pc), the maximum number of generations (Geners) and the population size (Popul). We have implemented several [12] operators for the rule part:
• Classical mutation
• Non-uniform mutation
• Classical crossover
• Arithmetical crossover
• MaxMin crossover

and for the control part of each rule:

• Classical mutation
• Classical crossover
Each time the algorithm needs to apply crossover or mutation, it randomly selects one of the previous methods. Additionally the user may specify different probabilities for the different operators. In our examples we have kept all the operators with the same probability.
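Two of the rule-part operators named above can be sketched as follows; the parameter choices (λ = 0.5, b = 2) are assumptions of this sketch:

```python
import random

def arithmetical_crossover(x, y, lam=0.5):
    """Offspring are convex combinations of the two parent chromosomes."""
    a = [lam * xi + (1.0 - lam) * yi for xi, yi in zip(x, y)]
    b = [(1.0 - lam) * xi + lam * yi for xi, yi in zip(x, y)]
    return a, b

def non_uniform_mutation(x, gen, max_gen, low, high, b=2.0):
    """Perturb one random gene within [low, high]; the perturbation
    shrinks to zero as gen approaches max_gen."""
    i = random.randrange(len(x))
    shrink = (1.0 - gen / max_gen) ** b
    y = list(x)
    if random.random() < 0.5:
        y[i] += (high - y[i]) * random.random() * shrink
    else:
        y[i] -= (y[i] - low) * random.random() * shrink
    return y
```

The shrinking perturbation makes the search broad early on and fine-grained near the end, which suits the tuning of trapezoid corners.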
4.3 Fitness function
While the other topics are quite standard, the definition of the fitness function is not a simple matter. This function measures the goodness of each individual. As the individuals, in fact, affect the performance of the robot, the fitness function must take into account how the robot executes a predefined task. For this reason the user must define a simulation framework with, at least, the following properties:
• Goal: where the robot starts, where the robot ends, and what the robot has to do.
• Map: a floor plan with walls, doors, corridors and obstacles.
• Iters: number of simulation runs per individual.
To avoid infinite loops in the simulation some additional parameters are needed:
• Timeout: maximum allowable time for each run, measured in epochs.
• Maxcolls: maximum allowable number of collisions.
When the fitness function is evaluated, each individual is used as the fusion fuzzy rule base for the given agent, and a simulation run is executed. At the end, each simulation run returns five values:
• Gotcha: -1.0 if the goal is accomplished, 1.0 otherwise.
• Epochs: number of simulation steps performed.
• Colls: number of robot-wall collisions.
• Stage: last portion of the full task completed by the robot.
• Way: percentage of the distance left from the starting to the goal location of the stage the robot is currently attempting.
Using the different parameters and values defined above, the fitness function is calculated using the following formulae (eq. II, III, IV, V), with k1 = 25, k2 = 100, and k3 = 5:

Total = √((start_x − goal_x)² + (start_y − goal_y)²)    [II]

ToGoal = √((robot_x − goal_x)² + (robot_y − goal_y)²)    [III]

Way = (ToGoal / Total) × 100    [IV]

Fitness = Σ_{i=1}^{Iters} [(Timeout_i − Epochs_i) − Colls_i × k1 − Gotcha_i × k2 − Way_i × k3]    [V]
The fitness function defined this way tries to take into account the different aspects relevant to good robot performance: it rewards the full completion of the task (gotcha), rewards low execution times (epochs), and punishes collisions with the walls (colls). In addition, this function must somehow measure how well the robot performed even if it did not complete the task, taking into account the number of subgoals fully achieved (stage) and the degree of completion of the last subgoal not achieved (way). Otherwise, the evolution cannot differentiate between individuals that are not good enough as final solutions but could be good candidates to continue evolving. This is very important at the beginning of the learning process.
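The evaluation described by eq. II-V can be sketched in Python as follows; the per-run result dictionary and function names are our own illustrative assumptions, not part of BGen.

```python
import math

# Weights from the paper: k1 penalises collisions, k2 rewards success
# (gotcha is -1.0 on success, so -gotcha*k2 adds k2), k3 penalises
# distance left to the goal.
K1, K2, K3 = 25.0, 100.0, 5.0

def way(start, goal, robot):
    """Percentage of the stage distance still left to the goal (eq. II-IV)."""
    total = math.hypot(start[0] - goal[0], start[1] - goal[1])
    to_goal = math.hypot(robot[0] - goal[0], robot[1] - goal[1])
    return to_goal / total * 100.0

def fitness(runs, timeout):
    """Sum of eq. V over all simulation runs for one individual."""
    return sum(
        (timeout - r["epochs"])
        - r["colls"] * K1
        - r["gotcha"] * K2
        - r["way"] * K3
        for r in runs
    )

# A successful, collision-free run that reached the goal in 120 epochs.
run = {"epochs": 120, "colls": 0, "gotcha": -1.0, "way": 0.0}
print(fitness([run], timeout=500))   # prints 480.0
```

A run that times out far from the goal yields a strongly negative score, which matches the negative fitness values seen at the start of the evolution in Figure 6.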
5 Experimental results

Using the BGen environment we have performed some tests to verify the validity of our approach. We will focus on the results of one such test. Basically, the task is to navigate on a given floor plan (Figure 7), starting from a given location (the left of the arena), reach the opposite side (the small circle on the right of the arena), and then return to the starting location. The robot has no knowledge of the floor plan other than the coordinates of the finishing point. The agents, behaviours, and control rules for the robot controller are the ones described and shown in the previous sections. We have fed the learning tool with the Reactive Control agent and the following EA parameters (Table 1):
Pc    Pm    Generations  Population  Iterations  Timeout  Maxcolls
0.9   0.25  1033         8           2           500      5

Table 1. EA learning procedure parameters.
[Figure omitted: plot of best and average fitness (Fn) versus generations (Geners) for Test #31.]

Figure 6: EA fitness function evolution.
Then we obtained the following results. Using the user-defined fusion rules, the initial robot controller neither achieves the goal nor collides. This is a good starting point for learning, because the controller is fairly reasonable and intuitive (in fact it was designed on the fly just before the tests), so the learning time should not be very high. As shown in Figure 6, the fitness function starts with negative values (goal not achieved) and ends up with high positive values (goal achieved in a small number of epochs). The best individual drives the robot to the goal despite the surrounding walls, while minimising the total distance travelled. To test the generalisation capability of the procedure, after learning with three simple floor plans (Figure 7) we used the final fuzzy rule base in more complex floor plans. In both cases the robot achieves its task and the path is fairly smooth and straight, even when passing through doors.
Figure 7: Floor plans for learning.
6 Conclusions

In this paper we show the framework we have developed for programming autonomous mobile robots. The problem we are interested in is a robot performing delivery tasks in an indoor office-like environment, without prior knowledge of the floor plan and without any modification or special engineering of the environment. To produce the control software for such a robot, we have defined a high-level robotics programming language named BG, which uses a multi-agent paradigm over a blackboard model. Each agent defines a series of behaviours, and a fuzzy rule base carries out the behaviour fusion, enabling or disabling the output of the different behaviours. Using this language we have defined a sample series of agents to control the robot in a test environment. A real challenge is the efficient coordination of the behaviours these agents implement. To overcome the difficulty of doing it "by hand", we use an EA to learn a meta-controller capable of performing such behaviour blending. The validation of these tools (language, architecture, and learning techniques) is performed using a robot simulator, and the results are quite satisfactory. One advantage of the proposed approach is that the system can be
trained with only some simple behaviours, and then new behaviours can be added without modifying the previous ones: only the fusion rule base needs to be re-trained. Future work includes:

• Testing more complex tasks and environment conditions.
• Applying techniques to learn the fixed priority of each behaviour in the fusion method.
• Studying the correlation between the simulator and the real robot performance.
Acknowledgements

The authors thank the CICyT for the support given to this work under projects TIC97-1343-C02-02 and 1FD97-0255C03-01(02).
References

[1] D.G. Armstrong, C.D. Crane, A Modular, Scalable Architecture for Intelligent, Autonomous Vehicles, 3rd IFAC Symposium on Intelligent Autonomous Vehicles, Universidad Carlos III, Madrid, Spain, pp 62-66, 1998
[2] B.C. Arrúe, F. Cuesta, B. Braunstingl, A. Ollero, Fuzzy Behaviours Combination to Control a Non Holonomic Mobile Robot using Virtual Perception Memory, Sixth IEEE Intl. Conf. on Fuzzy Systems (FuzzIEEE'97), Barcelona, Spain, pp 1239-1244, 1997
[3] H.R. Beom, H.S. Cho, A Sensor-Based Navigation for a Mobile Robot using Fuzzy Logic and Reinforcement Learning, IEEE Trans. Systems, Man and Cybernetics, 3(25):464-477, 1995
[4] J. Biethahn, V. Nissen, Evolutionary Algorithms in Management Applications, Springer-Verlag, Berlin, 1995
[5] A. Bonarini, F. Basso, Learning to Compose Fuzzy Behaviors for Autonomous Agents, Intl. Journal of Approximate Reasoning, no. 17, pp 409-432, 1997
[6] J. Borenstein, H.R. Everett, L. Feng, Navigating Mobile Robots: Systems and Techniques, A.K. Peters, Wellesley, MA, 1996
[7] R.A. Brooks, A Robust Layered Control System for a Mobile Robot, IEEE J. Robotics and Automation, 1(2):14-23, 1986
[8] M. Delgado, A.F. Gómez Skarmeta, H. Martínez Barberá, J. Gómez, Fuzzy Range Sensor Filtering for Reactive Autonomous Robots, Fifth Intl. Conference on Soft Computing and Information / Intelligent Systems (IIZUKA'98), Fukuoka, Japan, 1998
[9] M. Dorigo, M. Colombetti, Robot Shaping: an Experiment in Behavior Engineering, The MIT Press, Cambridge, MA, 1998
[10] J. Gásos, A. Martín, Mobile Robot Localization using Fuzzy Maps, in Lecture Notes in Artificial Intelligence (A. Ralescu and T. Martín Eds.), Springer-Verlag, 1996
[11] D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989
[12] A.F. Gómez Skarmeta, F. Jiménez Barrionuevo, Generating and Tuning Fuzzy Rules with Hybrid Systems, Sixth IEEE Intl. Conference on Fuzzy Systems (FuzzIEEE'97), Barcelona, Spain, pp 247-252, 1997
[13] A.F. Gómez Skarmeta, H. Martínez Barberá, Fuzzy Logic Based Intelligent Agents for Reactive Navigation in Autonomous Systems, Fifth International Conference on Fuzzy Theory and Technology, Raleigh, USA, pp 125-131, 1997
[14] A.F. Gómez Skarmeta, H. Martínez Barberá, M. Sánchez Alonso, A Fuzzy Agents Architecture for Autonomous Mobile Robots, International Fuzzy Systems World Congress (IFSA'99), Taiwan, 1999
[15] B.A. Hayes-Roth, A Blackboard Architecture for Control, Artificial Intelligence, 1(26):251-324, 1985
[16] J. Hendler, Making Sense out of Agents, IEEE Intelligent Systems, 2(14):32-37, 1999
[17] F. Hoffmann, G. Pfister, Evolutionary Learning of a Fuzzy Control Rule Base for an Autonomous Vehicle, Proc. Sixth Int. Conf. on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU'96), Granada, Spain, pp 1235-1240, 1996
[18] K. Konolige, COLBERT: a Language for Reactive Control in Saphira, German Conference on Artificial Intelligence, Freiburg, 1997
[19] Y. Labrou, T. Finin, A Semantics Approach for KQML - a General Purpose Communication Language for Software Agents, Third International Conference on Information and Knowledge Management, 1994
[20] D. Lee, The Map-Building and Exploration Strategies of a Simple Sonar-Equipped Robot, Cambridge University Press, Cambridge, 1996
[21] M. López Sánchez, R. López de Mántaras, C. Sierra, Possibility Theory-Based Environment Modelling by Means of Behaviour-Based Autonomous Robots, European Conference on Artificial Intelligence, pp 590-594, 1998
[22] G. Oriolo, G. Ulivi, M. Venditelli, Real-Time Map Building and Navigation for Autonomous Mobile Robots in Unknown Environments, IEEE Trans. on Systems, Man and Cybernetics - Part B: Cybernetics, 3(28):316-333, 1998
[23] A. Saffiotti, et al., Robust Execution of Robot Plans using Fuzzy Logic, in Fuzzy Logic in Artificial Intelligence: IJCAI'93 Workshop, Lecture Notes in Artificial Intelligence 847 (A. Ralescu Ed.), Springer-Verlag, pp 24-37, 1994
[24] A. Saffiotti, The Use of Fuzzy Logic for Autonomous Robot Navigation, Soft Computing, 1(4):180-197, 1997
[25] E. Tunstel, M. Jamshidi, On Genetic Programming of Fuzzy Rule Based Systems for Intelligent Control, Intl. Journal of Intelligent Automation and Soft Computing, vol. 3, no. 1, 1996
[26] J.R. Velasco, F.E. Ventero, Some Applications of Fuzzy Clustering to Fuzzy Control Systems, Third International Conference of Fuzzy Theory and Technology, Durham, USA, 1994
[27] S. Vranes, M. Stanojevic, Integrating Multiple Paradigms within the Blackboard Architecture, IEEE Transactions on Software Engineering, 3(21):244-262, 1995
[28] L. Wang, R. Langari, Complex Systems Modeling via Fuzzy Logic, IEEE Trans. on Systems, Man and Cybernetics, 1(26):100-106, 1996