
Dynamical Systems for the Behavioral Organization of an Anthropomorphic Mobile Robot

Axel Steinhage, Thomas Bergener

Institut für Neuroinformatik, Ruhr-Universität Bochum, 44801 Bochum, Germany

Phone: ++49 (0)234 700 7969, Fax: ++49 (0)234 709 4209 Email: [email protected], [email protected]

Topic area: Action selection, Autonomous robots

Short paper

Abstract: We present a generative approach to behavioral organization which uses continuous dynamical systems for switching between multiple actions of an anthropomorphic robot. The logical context of the possible actions is coded in matrices of parameters for a system of continuous differential equations. The sensor context is represented by a set of variable parameters depending on the sensor inputs. The switching between the behaviors is the result of nonlinear phase transitions in the solution of the underlying dynamical system. The stability of the overall system is guaranteed even for many different behaviors by keeping coupled behaviors on separated timescales. This is demonstrated on the anthropomorphic robot Arnold for the example of approaching a door. The robot searches for the door visually, recognizes it, points at it, and approaches it. The dynamical systems approach allows for a flexible change of a running behavioral sequence after unexpected perturbations brought about by dynamic changes in the environment.


1. Introduction

Autonomy essentially means generating behavior that is appropriate for the situation the system is currently in. Behavioral stability in this context means that, from a set of possible behaviors, the one most adequate for the current situation is selected, and that a sufficient dynamic change of the environment leads to a categorical change of the selected behavior. In general, these environmental changes lead to changing sensor information, and therefore the action selection mechanism must be controlled by the continuous sensor information. For example, the target acquisition behavior requires that the target is detected by the corresponding sensors; if changes in the scene cause an occlusion of the target, the target acquisition behavior must be replaced by a behavior which tries to find the target again.

In some situations, sensor information alone is not sufficient to decide whether a certain behavior must be activated. This is the case when a task cannot be fulfilled by a single behavior, but requires multiple parallel behaviors or even a whole sequence of behaviors acted out in a logical order. In that case, in addition to sensor information, the activation of a certain behavior must be controlled by a mechanism which selects the correct behavior in the sequence depending on the preceding behaviors. This mechanism must also decide which behaviors must run in parallel and which are mutually exclusive.

In this paper we present such a mechanism. It is based on the dynamical systems approach to behavior generation (see (Schöner et al., 1995) for a review, (Steinhage, 1998) for an extended application). In this approach, behaviors are generated by evolving the solution of a nonlinear dynamical system in time. The behavioral state of the system is projected onto so-called behavioral variables, which must be selected according to some basic design principles. One of these principles is that the tasks to be fulfilled must be expressible as stable fixed points (attractors) of the dynamical system, and the time scale of the dynamics must be selected such that the behavioral variables are in these attractors at all points in time. While the feasibility of this approach for generating single sensor-driven behaviors has been demonstrated in many applications before (e.g. (Schöner and Dose, 1992), (Bicho and Schöner, 1997)), we want to show how the concept can be scaled up to the generation of behavioral sequences and to action selection. As an example case we apply the approach to the problem of detecting, identifying and approaching a door, implemented on an anthropomorphic mobile robot.

2. The anthropomorphic robot ARNOLD

Arnold is an autonomous anthropomorphic robot for human environments (Fig. 1) (see (Bergener et al., 1997) for a detailed description of the robot and its behavioral capabilities). Equipped with a mobile platform, a three degree of freedom (DoF) stereo-vision head and a seven DoF robot arm, Arnold is able to perform simple service tasks in human indoor environments. Grasping objects and navigating, as well as pressing light switches or door handles, are examples of basic tasks that can be combined to perform complex jobs. The robot's only sensor is a double stereo camera system that is mounted on a head with pan, tilt and vergence. Two pairs of CCD cameras are used for foveal color vision and monochrome periphery sensing. The arm is anthropomorphic in its mobility, with wrist, elbow and shoulder, and lifts weights of up to one kilogram at a distance of 60 centimeters in front of the body. Two industrial Pentium boards running the real-time operating system QNX control the entire robot. Rechargeable battery packs make the robot self-sufficient for more than two hours.

Arnold's control architecture is based on the works of Brooks (Brooks, 1986) and Braitenberg (Braitenberg, 1984). A network of distributed control processes establishes paths of subsymbolic data flow from the sensors to the different actuator devices. Most of the involved behavioral modules are designed or coupled following the dynamic approach as described in (Schöner et al., 1995) (see (Bergener and Dahm, 1997) for a sample application).

Figure 1: Robot Arnold

3. The Behaviors

To realize a reactive sequence for approaching a certain door, a set of behavioral modules and the necessary sensor capabilities had to be developed. Since we did not want to realize a global navigation module that includes the target's position, the target door must be recognized from image features. This means we have to detect the target using two basic perceptual capabilities:

- Find any door in the visual field and estimate its position relative to the robot.
- Classify a door in the visual field as 'target' or 'some other door'.

The first is solved using the spatial geometry of doors, i.e. a pair of vertical door posts at a certain distance. We therefore acquire stereo image pairs with a horizontal view direction and search for nearly vertical lines in the images using an Adaptive Hough Transformation. Stereo matching of these lines in each image pair gives each door post's position relative to the robot. Pairs of vertical structures that are 80 ± 10 centimeters apart are candidates for the further check.

To check whether the detected door is the target, we use an image classification module as described in (Kreutz et al., 1996). Autocorrelation features are extracted from a Laplace image pyramid and coded in a feature vector. A linear Bayes classifier is trained with feature vectors calculated from image sets of the target and other doors in the robot's environment. The images are acquired from different view directions and distances to make the classification invariant with respect to these parameters.

3.1 Visual search and target tracking

The visual search for the target door and the tracking of the target are two basic behaviors (B1 and B2) for the chosen task. In the first behavior B1, the robot's head is rotated counterclockwise in steps of 20 degrees. Stereo images are acquired at every position and sent to the door-detection and classification modules. The targeting behavior B2 directs the head towards the nearest door in the visual field and repeatedly acquires new pairs of stereo images. In this way the head fixates the target even when the robot moves.

3.2 Pointing

Since the robot has no display or speech device, we want it to point at a detected door before it proceeds. Hence we define the behaviors pointing B3 (the arm moves up and points at the door), parking B4 (the arm moves to the parking position close to the body), and safe B5 (the arm is locked in the parking position).

3.3 Target acquisition

Target acquisition should be the final behavior in a successful sequence of actions. The robot is moved towards the target and stopped in front of the door (behavior local navigation B6).

3.4 Door recognition

The image classification module was trained with images of several doors in our lab, each door representing one class. Since we want to switch the classification module on or off whenever it is needed, we denote this capability as behavior B7. To interact with other behaviors, we formulate the recognition of the target door as behavior B8. For the case that a scene is classified as some other door, we define the behavior B9. A behavior B10 denotes the internal state in which the classifier is switched off or the classification result is not the target door.

4. Integrating the behaviors

To generate appropriate overall behavior, the described sub-behaviors must be organized in a sequence such that the required task can be fulfilled (Steinhage and Schöner, 1997a). While the active sub-behaviors are controlled by sensor inputs, the behavioral sequence depends additionally on logical conditions. Approaching the door, for instance, only makes sense if the door has already been detected by the visual search. The classical approach to incorporating logical dependencies between a number of instances is to assign a symbol to each of the instances and to design a symbolic algorithm which switches the instances on and off (Poole, 1995). The logical dependencies are encoded in the program structure of the symbolic algorithm. The most prominent examples of a discrete symbolic algorithm controlling sub-behaviors that consist of continuous dynamical systems are the so-called Hybrid Systems (Lemmon et al., 1993), (Brockett, 1993).

For our purpose, this approach has some disadvantages (see (Steinhage, 1998) for a criticism). Scaling up the system to additional sub-behaviors requires a complete reprogramming of the symbolic algorithm, as the logical interaction with the existing parts of the program structure must be captured. More importantly, as the sub-behaviors are controlled by continuous dynamical systems, their organization through a discrete symbolic algorithm induces stability problems. As we have already explained, stable behavior is generated if the control variables of the sub-behaviors are in stable states at all points in time (Steinhage and Schöner, 1997b). These stable states are attractor states of the underlying continuous differential equation. In a symbolic algorithm the sub-behaviors are represented by symbols. The behavioral sequences are generated by switching the sub-behaviors on and off through manipulation of the symbols in the algorithm. Unless the algorithm simulates a continuous dynamical system, the symbols are switched instantaneously between their on and off states. As behavioral organization implies that the sub-behaviors are activated depending on the activation states of the other sub-behaviors, the instantaneous switching means a discontinuity for the dynamical system which controls the activation of the sub-behaviors. For the differential equation this may cause instabilities.

To avoid these complications, we developed a mathematical concept which organizes the behaviors through a sub-symbolic continuous dynamical system. If the sub-behaviors are likewise described by continuous dynamical systems, the overall system which integrates and organizes all sub-behaviors can be entirely expressed by a system of continuous differential equations. This avoids the difficulties brought about by the combination of continuous dynamical systems and discrete symbolic algorithms.

4.1 The competitive dynamics

To make the problem of behavioral organization accessible to continuous dynamical systems, we assign a continuous variable $n_i$ to every sub-behavior. This variable describes the state of activation of the corresponding behavior $B_i$: $n_i = 0$ means the corresponding behavior is not active, while $n_i = 1$ stands for an activated behavior $B_i$. A continuous dynamical system which controls this variable $n_i$ therefore has to have two stable states: one for deactivation and one for activation of behavior $B_i$. A possible choice is the following differential equation:

$$\tau \dot{n}_i = \alpha_i n_i - |\alpha_i| n_i^3 - \sum_{j \neq i} \gamma_{i,j} n_j^2 n_i + \xi_t \qquad (1)$$

Here, $\xi_t$ is a Gaussian white noise term that prevents the system from remaining in unstable states. To understand the characteristics of (1), we neglect the competitive term $\sum_{j \neq i} \gamma_{i,j} n_j^2 n_i$ at first. Depending on the so-called competitive advantage $\alpha_i$, the dynamical system relaxes into the corresponding stable fixed point on the timescale $\tau$: for $\alpha_i > 0$, the dynamics relaxes to $|n_i| = 1$, and for $\alpha_i < 0$ the system relaxes to $n_i = 0$. Through this mechanism, the sub-behaviors can be switched on and off.
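The relaxation behavior described above can be illustrated numerically. The following is a minimal sketch, not the implementation used on the robot: it integrates (1) for a single behavior without the competitive term, using a simple Euler scheme. The function name `simulate_activation`, the step size, the value of the timescale and the scaling of the noise term are all assumptions made for illustration.

```python
import math
import random


def simulate_activation(alpha, n0=0.1, tau=0.1, dt=0.001, steps=5000,
                        noise=0.0, seed=0):
    """Euler integration of tau * dn/dt = alpha*n - |alpha|*n**3 + noise.

    Single behavior, competitive term neglected. All parameter values
    are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    n = n0
    for _ in range(steps):
        drift = alpha * n - abs(alpha) * n ** 3
        # Deterministic step plus a Gaussian perturbation scaled by sqrt(dt)
        # (the noise scaling is an assumption of this sketch).
        n += (dt / tau) * drift + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return n


on = simulate_activation(alpha=1.0)    # positive competitive advantage
off = simulate_activation(alpha=-1.0)  # negative competitive advantage
# Starting exactly at the unstable state n = 0, the noise lets the system escape.
escaped = simulate_activation(alpha=1.0, n0=0.0, noise=0.05, seed=1)

print(f"alpha=+1: n = {on:.3f}")
print(f"alpha=-1: n = {off:.3f}")
print(f"alpha=+1 from n=0 with noise: |n| = {abs(escaped):.3f}")
```

With a positive competitive advantage the activation approaches magnitude one, with a negative one it decays to zero, and the third run shows the role of the noise term: without it, a state starting exactly at $n_i = 0$ would stay there even when that fixed point is unstable.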

4.2 The competition matrix

Some of the described sub-behaviors cannot run simultaneously. For example, the behaviors tracking and visual search are mutually exclusive: tracking should only be activated after the visual search is finished. Other behaviors can run simultaneously, for instance tracking and recognition. To incorporate these logical conditions in our concept, a competition matrix $\gamma_{i,j} \in \{0, 1\}$ is defined. If $\gamma_{i,j} = 1$ and $|n_j| = 1$, the state $n_i$ remains in the stable fixed point $n_i = 0$ even for $\alpha_i > 0$. This means that an active behavior $B_j$ inhibits the activation of behavior $B_i$. Only when $\alpha_j < 0$ switches off behavior $B_j$ is behavior $B_i$ free to be activated. If $\gamma_{i,j} = 0$, behavior $B_i$ can be activated independently of behavior $B_j$.
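The inhibition mechanism can be sketched for two behaviors. The example below is an illustration under stated assumptions rather than the robot's actual configuration: index 0 stands for visual search and index 1 for tracking, the competition matrix contains a single one-way entry (active search inhibits tracking), and the function name `simulate`, the parameter values and the noise scaling are hypothetical. The inhibited behavior is given a competitive advantage of 0.5; linearizing (1) around $n_i = 0$ shows that the inhibition holds as long as the competitive advantage stays below $\gamma_{i,j} n_j^2$.

```python
import random


def simulate(alpha, gamma, n0, tau=0.1, dt=0.001, steps=5000,
             noise=0.01, seed=0):
    """Euler integration of the competitive dynamics for several behaviors.

    tau * dn_i/dt = alpha_i*n_i - |alpha_i|*n_i**3
                    - sum_{j != i} gamma[i][j] * n_j**2 * n_i + noise
    Parameter values and noise scaling are assumptions of this sketch.
    """
    rng = random.Random(seed)
    n = list(n0)
    for _ in range(steps):
        updated = []
        for i in range(len(n)):
            # Inhibition received by behavior i from all other active behaviors.
            inhibition = sum(gamma[i][j] * n[j] ** 2
                             for j in range(len(n)) if j != i)
            drift = alpha[i] * n[i] - abs(alpha[i]) * n[i] ** 3 - inhibition * n[i]
            updated.append(n[i] + (dt / tau) * drift
                           + noise * dt ** 0.5 * rng.gauss(0.0, 1.0))
        n = updated
    return n


# Index 0: visual search, index 1: tracking (illustrative assignment).
# gamma[1][0] = 1 means that active search inhibits tracking.
gamma = [[0, 0],
         [1, 0]]

# Phase 1: search is active, tracking has a positive competitive advantage
# but stays off because it is inhibited.
phase1 = simulate(alpha=[1.0, 0.5], gamma=gamma, n0=[1.0, 0.0])

# Phase 2: the competitive advantage of search turns negative, search
# switches off and tracking is free to activate.
phase2 = simulate(alpha=[-1.0, 0.5], gamma=gamma, n0=phase1, seed=1)

print("search on:  |n| =", [round(abs(x), 2) for x in phase1])
print("search off: |n| =", [round(abs(x), 2) for x in phase2])
```

The sign of an activated variable is not determined by (1), so the sketch reports magnitudes only.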

4.3 The competitive advantage

The competitive advantage has the following form: i = 2i i ? 1 (2) (3) _ = (1 ? i ) i + i ( i ? 1) i

;2 ;1 ;1 > 1(5) 1 + Ai;j i = C i 2 j 2 2 (6) m_ = (1 ? mi )ni + (1 + mi )(ni ? 1) i

m;1 m;1