An Architecture for Behavioral Organization using Dynamical Systems

Thomas Bergener, Axel Steinhage
Institut für Neuroinformatik, Lehrstuhl für Theoretische Biologie
Ruhr-Universität Bochum, Germany
Thomas.Bergener, [email protected]

To increase the capacity of an autonomous mobile robot to act in initially unknown and dynamic environments, we developed an architecture for behavioral organization and action selection that is based on continuous dynamical systems. The robot's overall behavior is generated by modules, each of them defining a set of actions or simple behaviors. A system of continuous differential equations activates a subset of these simple behaviors depending on an abstract sensor context, a working memory, logical presuppositions and mutual exclusions of actions. Nonlinear phase transitions in the solution of the underlying dynamical system activate or suspend the corresponding behaviors. Mutual exclusions and logical presuppositions are coded in matrices of parameters, while the symbolic sensor context is continuously calculated from the stream of sensor data. We present the dynamical system for behavioral organization and its embedding into a software architecture to control an autonomous mobile robot.

1 Introduction

Autonomous mobile robots for human environments must cope with the main problems discussed in autonomous agents research: they operate in imperfectly sensed environments, the effect of actions is not precisely predictable, and the world itself is dynamic, especially if the robot operates close to humans. Several approaches have been proposed to control robots in such environments, many of which belong to the group of behavior-based robot architectures [5][10][8]. These approaches have in common that sensing and acting are done in a closed loop, so that the robot operates mainly reactively on the bottom layer of these architectures.

In 1995 Schöner, Dose and Engels introduced the Dynamic Approach to autonomous robot architectures [11]. They presented the idea of robot control by means of nonlinear differential equations in an abstract state space. Such dynamical systems are used to realize paths of sub-symbolic data flow from sensors to actors. The differential equations specify the control strategies, and desired behaviors are described by attractors of the underlying dynamics. A great advantage of this approach is that the concepts are expressed mathematically in closed form and can be analyzed using standard methods for dynamical systems. Moreover, the system reacts instantaneously to changes in the sensor data and is controlled continuously through abstract system parameters.

Though much work has been done on goal-seeking servo-mechanisms using continuous dynamics (e.g. for path planning [6], robot arm control [3], etc.), the upper and mid levels of the architectures mentioned above use standard methods of artificial intelligence, e.g. discrete event systems [7] or hybrid approaches [10], to switch between behaviors; that is, they all produce mappings from discrete situations or conditions to discrete actions. While the characteristics of the sensor data (noise, confidence, etc.) are considered when the data is fed into the behavioral modules (by feedback, filtering or self-stabilizing solutions), these characteristics are neglected when sensor information and internal states are used for action selection.
In contrast to existing approaches, we present a framework for action selection and sequence generation that is, like the incorporated behavioral modules, based on dynamical systems. The approach uses hysteresis and refractory terms to prevent oscillations and to stabilize the selection of behaviors. It is realized as a distributed system that controls the anthropomorphic mobile robot Arnold [2].

2 Behavioral organization by means of dynamical systems

The dynamical systems approach to behavior generation presented in [11] requires selecting a variable that describes the relevant aspects of the behavior to be generated, and setting up a differential equation in this variable that has the desired behavioral state as an attractor. The elegance of this approach has been demonstrated in many robotic applications ([14], [4], [9]), as it offers a generative language for designing behaving systems. While the original approach focused on single behaviors, for which the choice of the behavioral variable was more or less obvious, we developed a mechanism for stably integrating many elementary behaviors into a complex, highly coupled system. Because we want to carry over the advantageous stability properties, we obey all the principles formulated by the original approach.

As behavioral variable we select a state vector $\vec{n}$ whose elements $n_i$ describe the activity of every single behavior $i$. If, for instance, the behavior "approach a target" is denoted by $i = 1$, a value of $n_1 \approx 0$ means that this behavior is currently not active, while $|n_1| \approx 1$ means that it is active. Sometimes it is necessary to include instances in this scheme that one would not usually call a behavior; for example, an $n_i$ may be required to express that the robot recognizes a door. We therefore call the $n_i$ elementary behavioral states (EBS) from now on. The subset of currently active EBS is represented by all elements $n_i$ of the state vector $\vec{n}$ for which $|n_i| \approx 1$.

While the current sensor input may be sufficient for generating a single behavior, this is certainly not true for a complex system consisting of many elementary behaviors: logical and temporal requirements must also be taken into account when activating or deactivating EBS. If we want the robot to approach a specific one of many visible doors, for instance, it is not sufficient that the sensors detect the shape of a door; the right door must also have been selected by a recognition behavior that was active before. Furthermore, some EBS can be active simultaneously while others cannot. It does not make sense to activate a searching behavior simultaneously with the behavior "target acquisition", for instance, whereas the two behaviors "target acquisition" and "obstacle avoidance" must be active together if we do not want to risk a collision.

In our approach we take these logical and temporal requirements into account by defining two matrices $\gamma_{i,j} \in \{0, 1\}$ and $A_{i,j} \in \{0, 1\}$, which act as constant parameters for the dynamical system that controls the state vector $\vec{n}$. The matrix $\gamma_{i,j}$ defines the mutual exclusions between the EBS: if element $j$ should not be active simultaneously with element $i$, we set $\gamma_{i,j} = 1$. If $i$ can be active simultaneously with $j$, we set $\gamma_{i,j} = 0$. The matrix $A_{i,j}$ defines the logical presuppositions for the EBS. We set $A_{i,j} = 1$ if the previous activity of $j$ is necessary to activate $i$, and $A_{i,j} = 0$ otherwise.

For controlling the behavioral state vector $\vec{n}$, we select a competitive nonlinear dynamical system:

$$\tau \dot{n}_i = \alpha_i n_i - |\alpha_i| n_i^3 - \sum_j \gamma_{i,j} n_j^2 n_i + \xi_t \qquad (1)$$

with

$$\alpha_i = 2 \rho_i I_i - 1 \qquad (2)$$

and

$$\dot{\rho}_i = \frac{(1 - \rho_i) I_i}{\tau_{\rho,2}} + \frac{\rho_i (I_i - 1)}{\tau_{\rho,1}} \qquad (3)$$

Here, $\tau$ is the time scale of the competitive dynamics and $\alpha_i$ is the so-called competitive advantage. While $\alpha_i < 0$, the dynamics (1) is in the stable fixed point $n_i = 0$ and EBS number $i$ is deactivated. Otherwise, if $\alpha_i > 0$, the activation of $n_i$ depends on the activity of all the EBS $j$ that compete with $i$, i.e. those for which $\gamma_{i,j} = 1$: if all these $n_j$ are deactivated, the elementary behavioral state $n_i$ relaxes to one of the stable fixed points $n_i = \pm 1$, which specify the activity of EBS number $i$. However, if at least one of the competing $n_j$ is active, $n_i$ cannot be activated and remains in the stable fixed point $n_i = 0$. This mechanism prevents the simultaneous activation of two competing EBS characterized by $\gamma_{i,j} = 1$. The small stochastic noise term $\xi_t$ in (1) kicks the system out of unstable states $n_i = 0$.

The competitive advantage $\alpha_i$ in (2) depends on the input function $I_i \in [0, 1]$: for $\rho_i \approx 1$, an active input $I_i \approx 1$ makes $\alpha_i \approx 1$ and thus activates EBS number $i$. Conversely, $I_i \approx 0$ deactivates EBS $i$ through $\alpha_i \approx -1$. The so-called refractory term $\rho_i$ given by (3) acts as a filter for the input $I_i$: if an input $I_i \to 0$ switches off EBS $i$, the refractory term follows with $\rho_i \to 0$ on a fast time scale $\tau_{\rho,1} \approx 1$. When the input becomes active again ($I_i \to 1$), the dynamics (3) follows with $\rho_i \to 1$ on the slow time scale $\tau_{\rho,2} \gg 1$, inhibiting the activation of EBS $i$ for that time $\tau_{\rho,2}$. Through this mechanism, an EBS that has just been deactivated cannot be activated again instantaneously, which prevents undesired behavioral oscillations.

The input function $I_i$ is a combination of the so-called sensor context $C_i \in [0, 1]$ and a part that contains the logical presuppositions:

$$I_i = C_i \prod_j \left( 1 + \frac{1}{2} A_{i,j} \left( \tanh(c\, m_j) - 1 \right) \right), \qquad (4)$$

$$\dot{m}_i = \frac{(1 - m_i)\, n_i^2}{\tau_n} + \frac{(1 + m_i)(n_i^2 - 1)}{\tau_i} \qquad (5)$$

In the sensor context we subsume all sensor conditions required to activate EBS $i$: if $C_i \approx 0$, the required conditions are not fulfilled and the input in (4) is set to $I_i \approx 0$. $C_i \approx 1$ means that all sensor conditions are fulfilled. Under these circumstances the input depends on the product term in (4), which implements a logical "and" condition: if at least one of the factors in the product is $\approx 0$, the input is switched off too. This happens if one of the presuppositions with $A_{i,j} = 1$ is not fulfilled, indicated by $m_j = -1$. Because we set $c \gg 1$, the tanh function becomes $\approx -1$ in that case and the product term is $\approx 0$.

In the dynamics (5), $m_i$ acts as a short-term memory for the activation of an EBS: a deactivation $n_i \to 0$ is followed by $m_i \to -1$ on the slow time scale $\tau_i \gg 1$, while an activation $n_i^2 \to 1$ is copied almost instantaneously by $m_i \to 1$ on the fast time scale $\tau_n \approx 1$. Through this mechanism, a presupposition $j$ for the EBS $i$ can activate the EBS $i$ for a time $\tau_i$ even though the underlying EBS $j$ may already be switched off. This makes the switching process between mutually presupposing EBS more stable, and even short activations of presuppositions can be used for slow switching.
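
The interplay of competition, refractory term and short-term memory described by Eqs. (1) to (5) can be explored numerically. The following is a minimal sketch, not the authors' implementation: it integrates the five equations with an explicit Euler scheme in plain Python. All numerical values (time scales, steepness `c`, noise level, step size, initial conditions) and all names (`simulate`, `rho`, `tau_m`, `state_at`, the two-behavior scenario) are assumptions chosen for illustration; the text only states which time scales are fast and which are slow. Behavior 0 stands for a recognition behavior, behavior 1 for an approach behavior that presupposes it and excludes it.

```python
import math
import random


def simulate(context, gamma, A, t_end, dt=0.01, tau=0.1,
             tau_rho_fast=1.0, tau_rho_slow=5.0,
             tau_n=1.0, tau_m=20.0, c=10.0, sigma=1e-3, seed=0):
    """Explicit Euler integration of Eqs. (1)-(5); returns [(t, n), ...]."""
    rng = random.Random(seed)
    k = len(gamma)
    n = [0.0] * k      # activations: all behaviors start switched off
    rho = [1.0] * k    # refractory terms: nothing was deactivated recently
    m = [-1.0] * k     # short-term memory: no behavior has been active yet
    trace = []
    for step in range(int(round(t_end / dt))):
        C = context(step * dt)
        # Eq. (4): sensor context gated by the remembered presuppositions
        I = []
        for i in range(k):
            prod = 1.0
            for j in range(k):
                prod *= 1.0 + 0.5 * A[i][j] * (math.tanh(c * m[j]) - 1.0)
            I.append(C[i] * prod)
        # Eq. (2): competitive advantage
        alpha = [2.0 * rho[i] * I[i] - 1.0 for i in range(k)]
        # Eq. (1): competitive dynamics with a small noise sample per step
        dn = []
        for i in range(k):
            competition = sum(gamma[i][j] * n[j] ** 2 for j in range(k)) * n[i]
            xi = rng.gauss(0.0, sigma)
            dn.append((alpha[i] * n[i] - abs(alpha[i]) * n[i] ** 3
                       - competition + xi) / tau)
        # Eq. (3): refractory term, fast decay and slow recovery
        drho = [(1.0 - rho[i]) * I[i] / tau_rho_slow
                + rho[i] * (I[i] - 1.0) / tau_rho_fast for i in range(k)]
        # Eq. (5): memory, fast charging and slow forgetting
        dm = [(1.0 - m[i]) * n[i] ** 2 / tau_n
              + (1.0 + m[i]) * (n[i] ** 2 - 1.0) / tau_m for i in range(k)]
        n = [n[i] + dt * dn[i] for i in range(k)]
        rho = [rho[i] + dt * drho[i] for i in range(k)]
        m = [m[i] + dt * dm[i] for i in range(k)]
        trace.append(((step + 1) * dt, list(n)))
    return trace


def state_at(trace, t):
    """Activation vector of the sample closest to time t."""
    return min(trace, key=lambda sample: abs(sample[0] - t))[1]


# Hypothetical scenario: behavior 0 = "recognize door", 1 = "approach door".
gamma = [[0, 1], [1, 0]]   # the two behaviors exclude each other
A = [[0, 0], [1, 0]]       # behavior 1 presupposes earlier activity of 0


def context(t):
    # The sensor context of behavior 0 vanishes at t = 20; behavior 1 stays possible.
    return [1.0 if t < 20.0 else 0.0, 1.0]


trace = simulate(context, gamma, A, t_end=60.0)
phase1 = state_at(trace, 15.0)   # behavior 0 active, behavior 1 suppressed
phase2 = state_at(trace, 25.0)   # behavior 1 active via the memory of behavior 0
phase3 = state_at(trace, 60.0)   # memory has faded, behavior 1 is off again
```

Because the active fixed points are $n_i = \pm 1$ and the sign is decided by the noise, the magnitudes `abs(n[i])` are the quantities to inspect. With these assumed parameters the run shows a short behavioral sequence: the presupposed behavior is active first, the dependent behavior takes over once its competitor switches off, and it is released again when the memory of the presupposition has decayed.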

3 Distributed Behavioral Control

Following the ideas of behavior-based robotics [5][10], we propose to control the robot by a network of communicating modules. Each of these modules is part of a loop of sensing, control and the feedback effect of control on the sensory input. Since we have to implement this architecture of coupled dynamical systems as concurrently running, communicating processes on conventional computer hardware, we must map the theoretical sketch of continuous data flow onto these technical boundary conditions. In the case of our robot, the hardware is a network of Pentium processor boards linked by Fast Ethernet and running the real-time operating system QNX.

3.1 Components

We can define four types of processes in our system:

1. Behavioral Modules: Behavioral modules receive a stream of sub-symbolic data from a sensor, another behavioral module or a perceptual process (see below). By analyzing this input stream, a module calculates its sensor context C_i, which codes whether the implemented behavior is currently appropriate, and sends it to the arbitrator process. Whenever the module is activated by the arbitrator, it computes an output that is sent to other behavioral modules or perceptual processes.

2. Arbitrator: The arbitrator process receives the sensor context from all behavioral modules, iterates the dynamical system (as described in Section 2) and sends the calculated EBS values back to all behavioral modules.

3. Perceptual Processes: Perceptual processes simply perform a data transformation from input to output (e.g. image processing functions). They work purely data-driven and independently of the arbitration system.

4. Hardware Interfaces: Hardware interfaces are covered by server processes that receive messages containing control data and send their output (e.g. acquired images, the arm's joint angles, etc.) into freely addressable channels. In this way all sensors and actors are accessible to every module in the network, while the system for behavioral organization can avoid colliding requests from different modules by means of mutual exclusions. For instance, the vision interface is controlled by messages defining the view direction, image size, color mode, etc., and sends back the acquired images on a specified channel. Like the perceptual processes, these interface modules work independently of the arbitrator.

The presented arbitration system for behavioral organization and action selection forms the upper level of this control system, switching the control flow connections between the underlying behavioral modules and perceptual processes (see Fig. 1).

3.2 PLANET - A Communication Platform for Distributed Robot Control

The proposed architecture requires an efficient implementation of communicating control processes. Since we want to realize fast feedback both in acting and sensing on the level of behavioral modules and in the reactive action selection, the communication must be asynchronous, flexible and extremely fast. To achieve this we developed PLANET, a platform for network communication and distributed robot control [1]. PLANET runs many behavioral modules concurrently in a computer network, connected by transparent communication channels. A configuration file defines the modules that have to be started on specified nodes in the network and the process interconnection scheme for the application. Each channel can have an arbitrary number of senders and receivers
[Figure 1 diagram. Labels: Sensors, Actors, behavioral modules M1-M3, Arbitrator; per-behavior quantities Sensor Context (C1-C3), Working Memory (M1-M3), Input (I1-I3), Refr. Term (R1-R3), comp. Adv. (α1-α3), Activation (N1-N3); matrices A and γ.]
Figure 1: Robot control architecture: A set of behavioral modules (M1, M2, M3) performs the data processing from sensors to actors. The arbitration system receives the sensor context from these modules, evaluates the dynamical system for action selection and controls the data flow on the level of the behavioral modules by activating or suppressing their output.
connected. This offers the capability of broadcast messaging: for example, we can define one channel that transmits the EBS vector to all behavioral modules in the application. Although all communication between modules is based on messages, the short cycle times allow closed-loop control in a quasi-continuous way. The size of a single message is limited only by the machine's memory size. This is of great importance because the sensor system of our robot is limited to the active vision head, so that processing sensor data in practice often means transmitting and processing stereo images.
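
The channel concept can be illustrated with a small in-process model. This is only a sketch of the idea of many-to-many channels with broadcast delivery; it does not reproduce PLANET's actual interface, which is not described here. The class `Channel`, its methods and the helper `latest` are hypothetical names, and a real implementation would deliver messages between processes over the network rather than between queues in one process.

```python
from collections import deque


class Channel:
    """Hypothetical many-to-many channel: every message reaches every receiver."""

    def __init__(self, name):
        self.name = name
        self._inboxes = []

    def connect_receiver(self):
        # Each receiver gets its own queue, so a slow reader never blocks a sender.
        inbox = deque()
        self._inboxes.append(inbox)
        return inbox

    def send(self, sender, payload):
        # Any number of senders may call this; delivery is a simple broadcast.
        for inbox in self._inboxes:
            inbox.append((sender, payload))


def latest(inbox):
    """Drain the inbox and return only the newest message (None if empty)."""
    newest = None
    while inbox:
        newest = inbox.popleft()
    return newest


# One channel carries the EBS vector from the arbitrator to all modules.
ebs_channel = Channel("ebs")
inbox_a = ebs_channel.connect_receiver()
inbox_b = ebs_channel.connect_receiver()

ebs_channel.send("arbitrator", [1.0, 0.0, 0.0])
ebs_channel.send("arbitrator", [0.0, 1.0, 0.0])

state_a = latest(inbox_a)
state_b = latest(inbox_b)
```

Reading only the newest message is one possible way to approximate quasi-continuous control with discrete messages: a module that runs slower than the arbitrator skips stale EBS vectors instead of working through a backlog.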

3.3 Behavioral Modules and Coupled Dynamical Systems

[Figure 2 diagram. Labels: Arbitrator, C_i, n_i, Behavioral Module i, data input D_in^i, data outputs D_out,1^i and D_out,2^i.]
Figure 2: Behavioral module
Behavioral modules (Fig. 2) must obey some design principles to work in our architecture. A behavioral module must be driven by a stream of sub-symbolic data (continuous variables, images, obstacle positions, etc.) and must produce output of the same kind. A module receives and processes incoming data at all times, whether or not its behavior is activated. Whenever a module's EBS value is in the stable fixed point $n_i = 0$ (the corresponding behavior is not active), the module must not produce any output and thus cannot affect any other behavior or actor.

Since we understand a behavioral module as part of a closed loop of continuous control, it is misleading to speak of "actions" on this level, because this suggests an exchange of commands or events between the modules. In contrast, our control system is a network of loosely coupled dynamical systems that share state descriptions or stimuli via their interfaces, affecting the stable states of the receiving modules. It is therefore advisable to associate a strength or confidence measure with such stimuli; deactivating a behavior can then be interpreted as setting the strength of its stimulus on other modules to zero.

To match the idea of coupled dynamical systems, each behavioral module must keep its latency minimal: the internal processing must be a quasi-instantaneous mapping of input data and internal states to the output, which must then be sent to the connected modules. Furthermore, we can adjust the time scales of behaviors in isolated processes by means of the real-time concepts of the underlying operating system QNX. This guarantees fixed hierarchies of time scales in concurrently running behaviors, or even an intelligent switching of hierarchies by setting the time scales at run time (see [12], pp. 44-45 for an example).
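
The design principles for behavioral modules can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' module code: the class `ApproachTarget`, the use of a target bearing as input, the proportional steering gain, the threshold `min_strength`, and the choice of the magnitude of the EBS value as the strength of the emitted stimulus are all hypothetical.

```python
class ApproachTarget:
    """Hypothetical behavioral module: steer toward a visible target."""

    def __init__(self, index, gain=0.5, min_strength=0.05):
        self.index = index                # position of this behavior in the EBS vector
        self.gain = gain                  # assumed proportional steering gain
        self.min_strength = min_strength  # below this the module counts as inactive

    def sensor_context(self, bearing):
        # C_i in [0, 1]: here simply whether a target bearing is available.
        return 0.0 if bearing is None else 1.0

    def step(self, bearing, ebs):
        # Incoming data is processed on every cycle, active or not,
        # so the sensor context is always up to date for the arbitrator.
        context = self.sensor_context(bearing)
        strength = min(1.0, abs(ebs[self.index]))
        if bearing is None or strength < self.min_strength:
            output = None                 # inactive: nothing is sent to other modules
        else:
            # The stimulus carries its strength, so receivers can weight it.
            output = (self.gain * bearing, strength)
        return context, output


module = ApproachTarget(index=1)
inactive = module.step(0.4, [1.0, 0.0])    # behavior switched off by the arbitrator
active = module.step(0.4, [0.0, -1.0])     # n_i = -1 also denotes an active behavior
no_target = module.step(None, [0.0, 1.0])  # active, but nothing to approach
```

Returning the sensor context on every call, while emitting an output only above the strength threshold, mirrors the two requirements in the text: the module always processes its input, but an inactive module never influences other behaviors or actors.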

4 Results

The presented approach to behavioral organization proved to be powerful, robust and easy to use in our experiments. After the required basic behaviors have been identified, setting up the matrices $A$ and $\gamma$ is straightforward. Since every entry can be understood as a local rule concerning two behaviors that defines a mutual exclusion or a presupposition, the suitable value is obvious in most cases. The control path towards the behavioral goal, as well as possible exceptions, is implicitly coded in $A$ and $\gamma$. Extending an existing application turned out to be fairly easy, since we only have to integrate a new module into the process interconnection scheme and expand the matrices by one row and column, which determine the interaction with existing behaviors.

We embedded this system for behavioral organization into a distributed software architecture to control a vision-guided robot with multiple degrees of freedom. An arbitrator process controls the data flow in several concurrently running control loops, realizing the described organization scheme. In contrast to most other implementations of architectures in behavior-based robotics, our system controls a full-sized service robot in human environments, realizing tasks such as navigation, grasping or door passing. It combines the system for behavioral organization with a platform for distributed control on powerful hardware, allowing visual robot control through stereo image processing. An elaborated application is given in [13].
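
Expanding the matrices by one row and column can be sketched as follows. The function `add_behavior` and its parameters are hypothetical helpers introduced for illustration, and the example behaviors are assumptions. The sketch also assumes that exclusions are entered symmetrically, which matches the notion of mutual exclusion but is not stated explicitly in the text.

```python
def add_behavior(gamma, A, excludes=(), requires=(), required_by=()):
    """Return copies of gamma and A extended by one behavior (index k).

    excludes    -- existing behaviors that must not be active with the new one
    requires    -- existing behaviors whose earlier activity the new one presupposes
    required_by -- existing behaviors that presuppose the new one
    """
    k = len(gamma)
    # New column: existing rows gain one entry each.
    new_gamma = [row + [1 if i in excludes else 0] for i, row in enumerate(gamma)]
    new_A = [row + [1 if i in required_by else 0] for i, row in enumerate(A)]
    # New row: the interactions of the new behavior with all existing ones.
    new_gamma.append([1 if j in excludes else 0 for j in range(k)] + [0])
    new_A.append([1 if j in requires else 0 for j in range(k)] + [0])
    return new_gamma, new_A


# Existing application: 0 = search, 1 = target acquisition (mutually exclusive).
gamma = [[0, 1],
         [1, 0]]
A = [[0, 0],
     [0, 0]]

# Hypothetical new behavior 2 = grasp: excluded by search,
# presupposes earlier target acquisition.
gamma2, A2 = add_behavior(gamma, A, excludes={0}, requires={1})
```

Each new entry is a local rule between the new behavior and one existing behavior, so the existing entries are copied unchanged and the original matrices are not modified.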

5 Conclusion

This work presents an architecture for robot control by means of dynamical systems. We describe an approach to behavioral organization and action selection using competitive dynamics, and its software realization, which controls a vision-guided anthropomorphic robot through a network of communicating processes. Taking into account the current sensor signals, logical presuppositions based on a short-term memory, and rules for mutually exclusive behaviors, the system generates reasonable behavioral sequences by activating and deactivating a subset of the control modules. Since all functional elements of this scheme for behavioral organization are formulated as dynamical systems, the selection of behaviors is stable even for noisy sensor data and remains robust in dynamic environments. The system for behavior selection thus inherits the properties of robot control by means of dynamical systems and integrates sets of interacting behavioral modules in a homogeneous formalism.

References

[1] Alberts et al. Entwicklung einer Kommunikationsplattform zur Steuerung eines autonomen mobilen Roboters. Technical report, Fachbereich Informatik, Universität Dortmund, 1997.
[2] T. Bergener, C. Bruckhoff, P. Dahm, H. Janßen, F. Joublin, and R. Menzner. Arnold: An anthropomorphic autonomous robot for human environments. In SOAVE'97, Selbstorganisation von adaptivem Verhalten, 1997.
[3] T. Bergener and P. Dahm. A Framework for Dynamic Man-Machine Interaction Implemented on an Autonomous Mobile Robot. In ISIE'97, IEEE International Symposium on Industrial Electronics, 1997.
[4] E. Bicho and G. Schöner. Dynamic approach to autonomous robotics demonstrated on a low-level vehicle platform. Robotics and Autonomous Systems, (in press), 1997.
[5] R.A. Brooks. A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, RA-2(1), 1986.
[6] P. Dahm and C. Bruckhoff. Autonomous decision making in local navigation. In From Animals to Animats 5: Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior (SAB 98), (to appear). MIT Press, 1998.
[7] J. Kosecka and L. Bogoni. Application of discrete event systems for modeling and controlling robotic agents. Int. Conference on Robotics and Automation, San Diego, 1994.
[8] P. Maes. How to do the right thing. Connection Science Journal, 1(3):291-323, 1989.
[9] H. Neven and G. Schöner. Neural dynamics parametrically controlled by image correlations organize robot navigation. Biological Cybernetics, 75:293-307, 1996.
[10] N. J. Nilsson. Teleo-reactive programs for agent control. Journal of Artificial Intelligence Research, 1:139-158, 1993.
[11] G. Schöner, M. Dose, and C. Engels. Dynamics of behavior: Theory and applications for autonomous robot architectures. Robotics and Autonomous Systems, 16:213-245, 1995.
[12] A. Steinhage. Dynamical Systems Generate Navigation Behavior (Ph.D. thesis). Number ISBN 3-8265-3508-1 in Berichte aus der Physik. SHAKER-Verlag, Aachen, Germany, 1998.
[13] A. Steinhage and T. Bergener. Dynamical Systems for the Behavioral Organization of an Anthropomorphic Mobile Robot. In SAB '98, Simulation of Adaptive Behavior (to appear), 1998.
[14] A. Steinhage and G. Schöner. Self-calibration based on invariant view recognition: Dynamic approach to navigation. Robotics and Autonomous Systems, 20:133-156, 1997.