NEURAL SYSTEMS FOR INTEGRATING ROBOT BEHAVIOURS
Brett Browning & Gordon Wyeth
University of Queensland Computer Science and Electrical Engineering Department
Email:
[email protected] &
[email protected]
Mail to: Brett Browning, Department of Computer Science and Electrical Engineering, University of Queensland, St Lucia, Brisbane, Australia, 4072.
Phone: +61 7 3365 3985
Fax: +61 7 3365 4999
Abstract This paper compares and contrasts two approaches to integrating robot behaviours: homogeneous integration and competitive integration. Robot behaviours can be produced using neural networks, as illustrated by Braitenberg’s vehicles [Braitenberg, 1984]. Experiments show that homogeneous integration suffers from a behaviour stability problem and from a scalability problem, which together make it difficult to use for large scale, complicated control problems. In contrast, it is shown that competitive integration permits incremental design and avoids the stability and scalability problems. The experiments are conducted in a maze environment using a real robot.
1. Introduction
There is a growing interest from the research community in developing intelligence systems for mobile robots that are based upon connectionist and biologically plausible models [Pfeifer, 1996]. The resulting systems, which utilise Artificial Neural Networks (ANNs), have the potential to make intelligent agents smarter, and offer insight into cognitive science issues that explore the link between brain and behaviour. In previous work, we have shown that ANN systems are readily applicable to the generation of robot behaviours [Wyeth, 1997]. The integration of behaviours in the previous work was handled by assigning different weights to different behaviours. In this system, called homogeneous integration, behaviours connected with larger weights tended to subsume behaviours connected with smaller weights. Behaviours could therefore be prioritised and integrated in a meaningful manner.
In this paper, it is shown that homogeneous integration is ill suited to robot systems with multiple behaviours that share a single sensory resource. A new scheme for behaviour integration, called competitive integration, overcomes the problems associated with homogeneous integration. The competitive integration approach produces more robust architectures and is scalable to more difficult problems. The improvements are illustrated on a small mobile robot in a maze environment.
1.1 Overview of the paper
Section 2 presents an overview of robot behaviour generation using neural components and describes the robot used for the experiments. Section 3 details the neural model that was used for the networks described in Sections 4 and 5. Section 4 details the homogeneous integration approach to the maze traversing robot. Section 5 describes the competitive integration approach and its performance in the maze. The results of each of the networks will be compared and contrasted in the discussion section (Section 6).
2. Background
The techniques proposed have much in common with behaviour based robotics [Brooks, 1990]. Behaviour based robots are based on the principle that intelligence emerges from the many competing behaviours within an agent, rather than from a single intelligence producing process. In this paper, a subset of behaviours is discussed. All behaviours described here are reactive; they do not rely on the memory of previous activity to perform their functions. For the purposes of this paper, reactive behaviours will be referred to as schemas. For a neural control system, schemas are implemented as neural networks that do not maintain an internal representation of the world. This implies that they have very limited state information, and are mainly feedforward structures.
2.1 Braitenberg Vehicles
Figure 1: The fundamental Braitenberg vehicles (Fear, Hate, Love and Curious). Each robot behaviour is produced by two connections.
[Braitenberg, 1984] describes a series of vehicles that demonstrate how simple structures that resemble neurons can create animat behaviour that appears intelligent. The first few of these vehicles became somewhat popularised [Dewdney, 1987] and came to be known as Braitenberg Vehicles. Four of the fundamental examples are shown in Figure 1. The operation of these vehicles is, at once, both simple and profound. Consider these vehicles to be operating on a plain with randomly placed lights. The arcs at the front of the vehicle represent light sensors that produce a signal based on the intensity of nearby light sources. The boxes at the back represent propulsion units that drive the vehicle at a velocity proportional to the signal that the actuator receives. In between sensors and actuators are connections that may be inhibitory or excitatory. These simple components provide each vehicle with behaviour representative of the labelled emotions, which is readily understood by thinking through the expected reactions of each type of vehicle.
The connections found in Braitenberg vehicles bear resemblance to the weights used in Artificial Neural Network (ANN) research. Similarly, the units Braitenberg proposes for combining behaviours closely resemble the units used in ANN research. This paper explores the performance of the units proposed by Braitenberg for combining behaviour in the context of a real robot.
2.3 The Robot and its Environment
The experiments with neural control systems were carried out using a real robot, CUQEE III. CUQEE III is a small, fully autonomous, mobile robot that is used in “micromouse” competitions [Otten, 1990]. The robot is shown in Figure 2.
Figure 2. A picture of CUQEE III. The robot is fully autonomous and fits in the palm of the hand.
CUQEE III has three distance sensors located on the sensor arm. The sensors detect the distance to the walls directly in front of and to either side of the robot. The robot also has two drive wheels in a wheelchair arrangement, one on either side. A velocity control loop is implemented in software for each wheel. The robot's environment consists of a maze with walls arranged on an orthogonal grid. Each grid cell is roughly twice the width of the robot. In a micromouse competition, the robot has to find its way to the centre of the maze through randomly placed walls. For the purposes of this paper, only reactive navigation processes such as corridor following are considered.
For the purposes of the neural control system, the sensor readings are converted to activations with values between 0 and 1. A wall that is closer produces a higher activation of the sensor. The sensor inputs are represented as the activations of three neurons: Sensor Left (SL), Sensor Centre (SC) and Sensor Right (SR) for the left, centre and right sensors respectively. The velocity of each drive wheel is controlled by the activation of one of two neurons: LV for the Left Velocity, and RV for the Right Velocity.
3. Neural Model
The artificial networks used in this paper are based on connectionist units that are common to ANN research. All artificial neurons have the same generic structure and perform the computation:
$$V_i = g_i(\mathbf{w}_i \cdot \boldsymbol{\xi} - \theta_i) = g_i\Big(\sum_j w_{ij}\,\xi_j - \theta_i\Big)$$
Here $V_i$ is the activation, $\mathbf{w}_i$ is the weight vector, $\theta_i$ is the threshold and $\boldsymbol{\xi}$ is the input vector for unit $i$. Note that the input vector may be the sensory input, or it may be the outputs of other units. The units can be classed based upon the transfer function $g_i(\cdot)$ used for the unit. The simplest neurons are linear units where the transfer function is a linear function; these are used for the motor units LV and RV. The speed of the motor is proportional to the activation. Negative activations cause the motors to drive in reverse to the normal direction.
The majority of units in this paper are piece-wise linear units. In this case the transfer function $g_i(\cdot)$ is 0 when its argument is negative and linear when its argument is positive. For units with self-reinforcing connections it is necessary to limit the output of the unit. This is achieved by imposing a saturation point such that above this point the output remains the same. This is shown in Figure 3.
Figure 3. The piece-wise linear transfer function (output plotted against input: a linear region followed by saturation).
The sensory input of the robot does not suffer greatly from noise. However, due to the limited resolution of the sensory input combined with the movement of the robot throughout the maze, the sensory inputs can vary dramatically. This can have a drastic effect on the performance of the network, so it becomes necessary to augment the neural units with a short-term memory. This makes the unit less sensitive to short term variation of its input. It is also helpful in certain situations (such as turning a corner) where it is necessary to continue the behaviour for a short time after the sensory input has changed.
In order to implement the short-term memory effect, a first order decay is added to the output of the unit. Effectively this makes the units “leaky integrators”. The new equation for the output is:
$$\tau_i \frac{dO_i(t)}{dt} = -O_i(t) + V_i(t)$$
Here $V_i(t)$ is given by the equation presented earlier, and $O_i(t)$ is the new output of the unit that is transmitted to the connecting neurons. It is important to realise that the time constant, $\tau_i$, is directly related to the velocity of movement for the vehicle and controls the reaction rate of the network. Thus the time constant should be chosen carefully. A time constant of 40 ms is used throughout.
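The unit model of this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the saturation level of 1.0, the 1 ms step size and the forward-Euler discretisation of the leaky integrator are all assumptions made here, and the function names are hypothetical.

```python
def piecewise_linear(x, saturation=1.0):
    """Transfer function g(.): 0 for negative input, linear above 0, flat above `saturation`.

    The saturation level of 1.0 is an assumed value.
    """
    return min(max(x, 0.0), saturation)


def linear(x):
    """Transfer function for the motor units LV and RV (negative values mean reverse)."""
    return x


def activation(weights, inputs, threshold, g=piecewise_linear):
    """V_i = g_i(sum_j w_ij * xi_j - theta_i)."""
    net = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return g(net)


def leaky_step(output, v, tau=0.040, dt=0.001):
    """One forward-Euler step of tau * dO/dt = -O + V (times in seconds).

    tau = 40 ms follows the text; dt = 1 ms is an assumed step size.
    """
    return output + (dt / tau) * (v - output)


def settle(weights, inputs, threshold, steps, tau=0.040, dt=0.001):
    """Hold the inputs constant for `steps` time steps, starting from O = 0."""
    output = 0.0
    for _ in range(steps):
        output = leaky_step(output, activation(weights, inputs, threshold), tau, dt)
    return output
```

With a constant input, `settle` moves the output a little under two thirds of the way towards the activation after one time constant (40 steps of 1 ms here), which is the smoothing effect the leaky integrator is meant to provide.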
4. Homogeneous Integration
The schema integration discussed in this section was inspired by the schema integration used by Braitenberg for vehicle 3c [Braitenberg, 1984, p. 12]. The approach consists of summing the outputs from each schema together and using the resultant sum to drive the robot. The strength or weight of each connection leads to a behaviour hierarchy, which in essence creates the “personality” of the agent. For a given scenario for the robot, changing the strength of the integration connections changes the behaviour of the robot. The connections from the schema units to the motor units are referred to as the motor association layer, and the sensory-to-schema connections as the schema layer.
4.1 Schema Selection
It is helpful to visualise the sensory space of the robot to see how the problem can be partitioned. The sensory space of CUQEE III is three dimensional, formed by the orthogonal dimensions SL, SC and SR. Sensory input is represented as the vector s which is formed with the activations of (SL, SC, SR). Similarly the weight vectors for each of the schema units can be represented in sensory space.
Figure 4. Sensory Space: The cube that forms the sensory space of the robot (axes SL, SC and SR) and the partitioning of schemas (L, S, R and U). Here s is an example sensory input, which in this case is for a straight corridor.
There are five situations that are faced by the robot: straight sections, left and right corners, dead-ends and empty areas. By covering the sensory space with the weight vectors of the schemas it is possible to generate schemas that logically represent these situations. The weight vectors for the schemas are shown in Figure 4 as: go left (L), go straight (S), go right (R) and U turn (U). Note that once weight vectors are chosen, the network is virtually complete. The fifth situation that can be faced by the robot, empty areas, is covered by ensuring that in the absence of sensory input the motor units maintain non-zero activity levels. This can be achieved with non-zero thresholds on each of the motor units. Since CUQEE’s sensors have limited range, sensory input is often lost during cornering and U turns. Lack of sensory input causes the robot to behave as if it were in an empty area.
By virtue of the neural constructs used, each schema is tuned to a particular sensory input. This means the schema becomes more active when the dot product of the sensory input vector and the schema vector increases. Since all schema vectors are normalised, the unit with the highest activation will be the one with its schema vector closest to the sensory input. Schema units, in turn, represent a particular behaviour of the motor outputs. This behaviour is generated by weights chosen between the schema units and the motor units: the motor association layer.
When the robot is in a corridor with no dead end, the sensory input will be confined to the SL-SR plane. Thus the S schema will be the most active, indicating the robot should drive straight ahead. When there are no walls to the right of the robot, but the walls on the left and in front of the robot are present, the sensory vector will be confined to the SL-SC plane. In this situation the robot must turn to the right, hence the R schema vector is located in the SL-SC plane. Similarly the L schema vector is located in the SR-SC plane. Finally, when the robot is in a dead end (walls in front and to either side) the sensory vector will be approximately equiangular to the SL, SC and SR axes; the U prototype vector is orientated in this direction.
The selective tuning of schema units places constraints upon the motor association weights. For example, when faced with the situation which causes the L schema unit to be the most active the robot can only turn left. Thus the motor association weights must reflect this response. In this case it means the connection to the RV unit must be excitatory and stronger than the connection to the LV unit.
It should be noted that the constraints are only relative to the schema under consideration. The constraints do not affect any other schema to motor connections. Thus the relative strengths of different schema to motor unit connections can still be modified.
4.3 Integration
Figure 5 shows the network developed using the homogeneous integration approach. As mentioned above, the schema layer places constraints upon the motor association weights. However, the relative strengths of each of the weights are yet to be determined.
Figure 5. The Homogeneous network. Here thick lines indicate strong connections, thin lines weak connections. Dashed lines indicate inhibitory synapses (negative weights) and non-dashed lines are excitatory synapses (positive weights). The schema weights are the normalised vectors in sensory space defined earlier. The U unit has an experimentally determined threshold of 0.58.
The schema hierarchy is enforced by the combination of the motor association layer and the schema layer. The schema unit activation will depend on the current sensory input. When this is combined with the relative strengths of the motor association weights the schema hierarchy for the current sensory input is defined. Note that the behaviour of the robot is defined by the combination of all the elements in the schema hierarchy. For the robot the motor association weights are ordered by relative strength as (from strongest to weakest): U turn, Left and Right turns followed closely by Straight. In the absence of sensory input, activity in the network is maintained by the negative thresholds on the LV and RV units.
Due to the confines of the dead end, the U schema must control the robot quite precisely. To achieve precise control, the U schema must dominate the other behaviours completely whenever it activates. As a result the U schema has much stronger motor association weights. A threshold term is added to the U schema to ensure that it does not activate unless SL, SC and SR are all sufficiently active.
There is a limit to the strength with which motor association weights can be increased. The strength of the weights not only represents the domination of a behaviour, but also the gain in the feedback loop created by the sensors, the actuators and the environment. As the gain is increased in this feedback loop, stability decreases and the performance of the robot degrades sharply. This point is highlighted by the following results.
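The two layers described in this section can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the network actually run on CUQEE III: the paper gives only the planes in which the schema vectors lie, so the specific directions below (the diagonal of each plane) are assumed, and every motor association weight and the motor threshold are hypothetical values chosen only to respect the ordering and sign constraints in the text. The U threshold of 0.58 is the one reported for the network.

```python
import math


def normalise(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)


# Schema vectors in sensory space, ordered (SL, SC, SR). Directions are assumed.
SCHEMAS = {
    "S": normalise((1, 0, 1)),  # corridor: walls left and right
    "R": normalise((1, 1, 0)),  # walls left and in front
    "L": normalise((0, 1, 1)),  # walls in front and right
    "U": normalise((1, 1, 1)),  # dead end: walls on all three sides
}
SCHEMA_THRESHOLDS = {"S": 0.0, "R": 0.0, "L": 0.0, "U": 0.58}

# Motor association weights (to LV, to RV). All values are hypothetical.
MOTOR_WEIGHTS = {
    "S": (0.3, 0.3),
    "L": (0.1, 0.5),   # RV connection stronger than LV: turn left
    "R": (0.5, 0.1),
    "U": (-1.5, 1.5),  # strongest weights: dominate when active
}
MOTOR_THRESHOLD = -0.2  # negative threshold keeps the motors active with no input


def schema_layer(s):
    """Piece-wise linear schema activations for sensory input s = (SL, SC, SR)."""
    return {
        name: max(0.0, sum(w * x for w, x in zip(vec, s)) - SCHEMA_THRESHOLDS[name])
        for name, vec in SCHEMAS.items()
    }


def motor_layer(acts):
    """Homogeneous integration: weighted sum of all schema outputs at linear motor units."""
    lv = sum(acts[n] * MOTOR_WEIGHTS[n][0] for n in acts) - MOTOR_THRESHOLD
    rv = sum(acts[n] * MOTOR_WEIGHTS[n][1] for n in acts) - MOTOR_THRESHOLD
    return lv, rv
```

For example, `motor_layer(schema_layer((0.5, 0.0, 0.5)))` gives equal wheel velocities for a symmetric corridor, while `schema_layer((0.0, 0.6, 0.6))` makes the L unit the most active and the resulting RV exceeds LV. Because every schema contributes to the sum, the U unit can only dominate by having larger weights, which is the limitation discussed in the text.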
4.4 Results
Rather than attempting to test all possible situations, we will show the results of the network in the two main scenarios faced by the robot. Figure 6 shows the robot performing a left and right turn followed by a dead-end.
Figure 6. The homogeneous network performing left and right hand turns, followed by a dead-end. The robot started from the left of the picture. Note that the robot failed to perform the U turn correctly and crashed into the wall.
Clearly, the homogeneous integration network can perform left and right turns. The speed of the robot during the turn, the sharpness of the corner and the time it takes to re-centre itself in the corridor are a result of the shifts in activation amongst the schema units. Changes in the time constants chosen for each unit and the relative strengths of the motor association weights affect the performance of the robot. The homogeneous network successfully slows the robot down as it enters each corner. This shows the effectiveness of the S schema. Similarly, the L and R schema units work effectively. This is shown by the robot successfully negotiating the corners and remaining close to the centre of the corridor on straight sections.
However, the network fails to adequately control the robot in the dead end section of the maze, as the robot crashes halfway through executing a U turn. The problem occurs when the robot is too close to the wall and the sensors can no longer detect the wall correctly. Without the stimulus of the sensors, the U-turn schema cannot dominate the other schema units sufficiently. Simply increasing the schema unit’s motor association weights would achieve the domination, but would compromise the stability of the schema’s control.
The method of homogeneous integration of neural control systems has failed to provide a solution to reactive navigation in a maze environment. No amount of “tweaking” provides a reliable solution in this situation. The following section shows a method that is both stable and reliable.
5. Competitive Integration
The key to improving the results of the last section is a change in the approach to integrating schemas. In this section a new network is tested on the same problem using a new approach that the authors have termed “competitive integration”. Competitive integration means that schemas are made to compete against each other, rather than cooperate together, to generate behaviour. Schemas inhibit one another, with strength that is proportional to dominance in the hierarchy. A schema that is higher up in the hierarchy has greater importance, and therefore its inhibitory weight to the other schemas is stronger.
The main motivation for using this approach to integrate schemas is to overcome the problems associated with cooperative (homogeneous) integration. By making schemas compete against each other, only one schema will win and will effectively gain complete control of the robot without upsetting stability through excessive gain. When combined with leaky integration within schema units, the dominance will shift smoothly from one schema to the next. Furthermore, the noise tolerance within the network is improved, as fluctuations in sensory input do not cause similar fluctuations in the control of the robot.
5.1 The New Schema Integration
The major difficulty with the homogeneous integration network was the coordination between the U schema and the general maze traversal schemas (S, L and R). Using competitive integration these will be the two superschemas that will compete with each other for control. The new network is shown in Figure 7.
Figure 7. The competitive integration network. The S, L and R units are the same as before. The ST1 unit starts U firing and U’s recurrent connection forces it into saturation. ST2 disables U at the end of the turn.
The U-turn schema has been modified so that it activates at the start of the turn, and disables itself at the end of the turn. This is achieved by using a recurrent connection on the U unit and using ST1 and ST2 to activate and deactivate the U unit, respectively. The recurrent connection in the U unit makes it behave like a flip-flop. Positive input causes the unit to saturate positively, and then stay saturated. Conversely, negative input causes the unit to saturate negatively, and then stay saturated until sufficient positive input is received. The two units ST1 and ST2 act as switches to “flip” the U unit between these two states.
ST1 is tuned to dead ends and ST2 is tuned to corridors. The resulting vectors are the same as used for the U schema unit and S schema unit in the homogeneous integration, respectively. It is important to ensure that ST1 activates only in dead ends and ST2 activates at the end of the turn and not during the turn. This is achieved by using high thresholds on the units, which ensure that the sensor vector has indeed come very close to one of the prototype schema vectors. It should be noted that in contrast to homogeneous integration, where the U unit had to fire for the duration of the turn, here ST1 only needs to fire at the start of the turn and ST2 at the end of the turn.
While the U schema is active it inhibits the other schemas by virtue of the competitive inhibitory connection. While active, the U schema has sole control of the robot. Achieving sole control allows the U schema to perform precise motor control without noise from the other schemas. Importantly, the domination of motor control is achieved without compromising the stability.
5.2 Results
Figure 8 shows the performance of the network under the same conditions as for homogeneous integration. As the U-turn schema does not become active during normal maze traversal, and the other schemas are unchanged, the left and right turn experiment is the same as shown in Figure 6. As desired, the robot is now capable of performing the U-turn successfully. Furthermore, the U turn is performed at a controllable rate.
Figure 8. The competitive integration network performing left and right turns followed by a U-turn. The robot performed the task correctly and at a controllable pace.
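The flip-flop behaviour of the U unit described in Section 5.1 can be sketched as below. This is a minimal illustration under assumptions, not the network of Figure 7: the recurrent weight, the set and reset weights, the inhibition weight, the saturation limits of -1 and +1 and the 1 ms Euler step are hypothetical values, and rectifying the U output before it inhibits the other schemas is a choice made here for simplicity. Only the 40 ms time constant comes from the paper.

```python
def clip(x, lo=-1.0, hi=1.0):
    """Saturating transfer function with assumed limits of -1 and +1."""
    return max(lo, min(hi, x))


class FlipFlopU:
    """Leaky-integrator unit with a self-reinforcing (recurrent) connection."""

    def __init__(self, w_rec=2.0, w_set=3.0, w_reset=3.0, tau=0.040, dt=0.001):
        self.w_rec = w_rec      # recurrent weight (> 1 so both saturated states persist)
        self.w_set = w_set      # excitatory weight from ST1
        self.w_reset = w_reset  # inhibitory weight from ST2
        self.tau = tau
        self.dt = dt
        self.out = -1.0         # start in the "off" (negatively saturated) state

    def step(self, st1, st2):
        """Advance one time step given the ST1 and ST2 outputs."""
        v = clip(self.w_rec * self.out + self.w_set * st1 - self.w_reset * st2)
        self.out += (self.dt / self.tau) * (v - self.out)
        return self.out


def inhibit(schema_activation, u_out, w_inh=2.0):
    """Competitive inhibition of a traversal schema (S, L or R) by an active U unit."""
    return max(0.0, schema_activation - w_inh * max(0.0, u_out))


def run(unit, st1, st2, steps):
    """Hold ST1 and ST2 constant for `steps` time steps and return the U output."""
    for _ in range(steps):
        unit.step(st1, st2)
    return unit.out
```

A brief ST1 pulse (for example `run(u, 1, 0, 60)`) is enough to latch the unit on: it stays saturated with no further input and `inhibit` silences the traversal schemas until an ST2 pulse latches it off again. This mirrors the point made in the text that ST1 and ST2 need only fire at the start and end of the turn.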
6. Discussion
There are two main problems with homogeneous integration: stability and scalability. Stability of the schema behaviour is a major issue as each schema forms a closed control loop with sensors, the schema, the motors and the real world. As a result the motor association weights form the gain control of the control loop. Too large a gain can result in instability and too small a gain may not perform the desired task effectively. With schemas such as the U-turn schema, stability becomes a significant issue.
Stability becomes more difficult to achieve as the number of schemas increases. This means the networks are not scalable and design cannot be performed in an incremental fashion. The motor association weights must be changed every time new schemas are added. This is a serious problem for using this approach to integration in more complicated systems.
Clearly the results show that competitive integration produces a more robust network than homogeneous integration, at least for this application. Stability is no longer an issue as only a small number of schemas are active at any one time. Incremental design is also possible, as it is only a matter of changing the strength of the inhibitory connections between the schemas to reflect the new hierarchy.
The improvements in behaviour are not without cost. The addition of recurrent and competitive connections precludes the use of the many training algorithms designed for feed forward networks. Without the possibility of conventional training algorithms, the neural robot designer must either hand choose weights, or perhaps rely on some evolutionary context to select appropriate weight values.
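The gain argument made at the start of this section can be illustrated with a deliberately simplified toy model, which is not taken from the experiments: a single offset from the corridor centre that is corrected once per control step in proportion to a gain. The update rule, the step count and the function name are assumptions introduced only for illustration.

```python
def simulate_offset(gain, steps=20, offset=1.0):
    """Iterate e[n+1] = e[n] - gain * e[n] and return the final offset.

    Toy closed loop: `gain` stands in for the combined strength of the
    motor association weights around the sensor-motor-environment loop.
    """
    for _ in range(steps):
        offset -= gain * offset
    return offset
```

In this toy loop a moderate gain such as 0.5 removes the offset within a few steps, a very small gain such as 0.05 corrects it only slowly, and a gain above 2 overshoots by more than the original error on every step so the offset grows without bound. The real robot loop is more complicated, but the same trade-off limits how far a schema's weights can be raised to make it dominate.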
7. Conclusions
In this paper, two possible approaches to schema integration have been described. The results have shown that of the two approaches, competitive integration provides improved autonomous behaviour. We have shown that homogeneous integration suffers from stability and scalability problems. These flaws severely undermine the usefulness of the homogeneous approach for more complicated systems. The results have shown that competitive integration offers an improved design approach for the reactive schemas required for minimalist neural control of mobile robots.
8. Bibliography
[Braitenberg, 1984] Braitenberg, V. (1984). Vehicles: Experiments in Synthetic Psychology, MIT Press, Cambridge, MA.
[Brooks, 1990] Brooks, R.A. (1990). Elephants Don’t Play Chess. Robotics and Autonomous Systems, vol. 6, pp. 3-15.
[Dewdney, 1987] Dewdney, A.K. (1987). Braitenberg memoirs: vehicles for probing behaviour roam a dark plain marked with lights. Scientific American, vol. 256, no. 3, March 1987.
[Otten, 1990] Otten, D. (1990). Building MITEE Mouse III. Circuit Cellar Ink. pp. 40-51.
[Pfeifer, 1996] Pfeifer, R. (1996). Building Fungus Eaters: Design Principles of Autonomous Agents, From Animals to Animats 4. ed. Maes, P. et al., Cambridge, MA: MIT Press.
[Wyeth, 1997] Wyeth, G.F. (1997). Neural Mechanisms for Training Autonomous Robots. Mechatronics and Machine Vision in Practice, Toowoomba, Australia, IEEE Computer Society Press, September 1997, pp. 194-199.