Journal of Intelligent and Robotic Systems 36: 407–450, 2003. © 2003 Kluwer Academic Publishers. Printed in the Netherlands.


Self-Organizing Feature Maps for Modeling and Control of Robotic Manipulators

GUILHERME DE A. BARRETO and ALUIZIO F. R. ARAÚJO
Department of Electrical Engineering, University of São Paulo, 13566-590, São Carlos, São Paulo, Brazil; e-mail: {gbarreto;aluizioa}@sel.eesc.sc.usp.br

HELGE J. RITTER
Neuroinformatics Group, University of Bielefeld, 33501 Bielefeld, Germany; e-mail: [email protected]

(Received: 19 February 2002; in final form: 5 September 2002)

Abstract. This paper presents a review of self-organizing feature maps (SOFMs), in particular those based on the Kohonen algorithm, applied to adaptive modeling and control of robotic manipulators. Through a number of references we show how SOFMs can learn the nonlinear input–output mappings needed to control robotic manipulators, thereby coping with important robotic issues such as excess degrees of freedom, computation of inverse kinematics and dynamics, hand–eye coordination, path planning, obstacle avoidance, and compliant motion. We conclude the paper by arguing that SOFMs can be a much simpler, feasible alternative to MLP and RBF networks for function approximation and for the design of neurocontrollers. Comparisons with other supervised/unsupervised approaches and directions for further work in the field are also provided.

Key words: compliant motion, hand–eye coordination, kinematics and dynamics, obstacle avoidance, path planning, redundancy, robotics, self-organizing feature maps.

1. Introduction

For many years, industrial robots have been successfully applied in several industrial applications (Golnazarian and Hall, 2000). Most of them are engaged in limited tasks in which they perform simple operations in an unchanging environment. However, many robot control problems demand the execution of complex tasks characterized by: (i) a real-world system that integrates perception, decision making, and execution; (ii) complex domains, in which the expense of using actual robotic hardware often prohibits the collection of large amounts of training data; and (iii) real-time requirements in which decisions must be made within critical time constraints. To cope with these demands, robots must be provided with autonomy. Hence, the robot controller should be capable of self-generation of actions, automatic execution and control of an action plan, sensing, and reaction to unexpected situations.

In addition, a number of different components contribute to task complexity. For example, signals from various sources, such as vision, joint position, touch, and force, need to be integrated and interpreted. Also, robot tasks are usually specified in Cartesian coordinates, while robot movements are governed by their actuator variables (Craig, 1989; Sciavicco and Siciliano, 2000). Thus, sensor readings and task specifications need to be translated into a form that can be used for control purposes, because the coordinates of sensory signals differ from the coordinates of the movements they drive. These mappings are often highly nonlinear, and it is difficult (if not impossible) to derive them analytically (Torras, 1995; Massone, 1995). Furthermore, because of environmental changes or robot wear-and-tear, the mappings may vary in time; in such cases, one would like the control structure to adapt to these variations. Thus, to be autonomous devices, robots should have the abilities of learning (acquisition of knowledge/skills through experience) and adaptability (rapid adjustment of the system to environmental changes).

Artificial neural networks (ANNs) are a class of computational models that can represent nonlinear input–output mappings. ANNs are able to learn through examples, adapt themselves, and perform parallel processing (Medler, 1998). These properties make neural networks very attractive for adaptive robot control (Miller et al., 1990; Hunt et al., 1992; Balakrishnan and Weil, 1996): they can learn the mappings between various movement variables without accurate knowledge of the system parameters or of the analytical equations governing the system (Zomaya and Nabhan, 1994; Tzafestas, 1995; Prabhu and Garg, 1996; Golnazarian and Hall, 2000).
Among the different neural-network learning paradigms, unsupervised learning has characteristics that make it appealing for robotics. Learning in unsupervised neural networks emerges without the need for an external “teacher” to provide the desired response of the network. This self-organization ability can substantially reduce the robot programming burden, which accounts for about one third of the total cost of an industrial robot system (Heikkonen and Koikkalainen, 1997). Moreover, unsupervised models are often fast, encouraging their use in incremental and online learning. Self-organizing feature maps (SOFMs) are important unsupervised ANN models that have shown great potential in application fields such as motor control, pattern recognition, and optimization, and have also provided insights into how mammalian brains are organized (Kohonen, 1997). Most work on SOFMs has focused on robotic systems that are solely sensory in nature, applying the maps to clustering of sensory data (Vesanto and Alhoniemi, 2000). More recently, several studies have also proposed SOFMs for the difficult tasks of nonlinear modeling and control of robots. The main motivation for the present work comes from the fact that several review papers published in the 1990s (Bekey, 1992; Zomaya and Nabhan, 1994; Tzafestas, 1995; Balakrishnan and Weil, 1996; Hutchinson et al., 1996;



Rylatt et al., 1998; Golnazarian and Hall, 2000) on the field of robot control do not contemplate the use of unsupervised learning in robotics, even though several such models have been available since the late 1980s. A few other papers (Hunt et al., 1992; Torras, 1995; Prabhu and Garg, 1996; Kohonen et al., 1996) have very briefly described applications of unsupervised learning to robot modeling and control, without emphasizing its main advantages over supervised methods, and hence give only a limited view of the potential of the unsupervised approach. The present paper fills this gap by reporting applications of SOFMs to modeling and control of robotic manipulators. The paper has additional goals: (i) to discuss important issues in manipulator control and how they are traditionally tackled; (ii) to discuss the role of learning and, more specifically, the use of artificial neural networks in adaptive robot control; (iii) to compare the performance of the reviewed models with other learning paradigms; (iv) to show that SOFMs can be a feasible alternative to Multilayer Perceptron (MLP) and Radial Basis Function (RBF) networks for the design of neurocontrollers; and (v) to suggest directions for additional work in the field.

The remainder of the paper is organized as follows. In Section 2, we briefly describe some basic issues in robotics, in particular those related to the computation of the forward and inverse kinematics and dynamics of manipulators. Section 3 presents some neural-network approaches to adaptive control. The Kohonen algorithm and its variants are described in Section 4. In Section 5, we review the application of SOFMs to kinematics-based robot control and how they tackle the issues of adaptability and redundancy. In Section 6, we discuss the issues of path planning and trajectory formation using SOFMs. In Section 7, we review the application of SOFMs to dynamics-based robot control.
In Section 8, we discuss the main features and results obtained by SOFMs and compare them with other neural models. We conclude the paper in Section 9 by suggesting directions for further research.

2. On Robotic Problems

This section gives a brief overview of some important concepts in robotics. No attempt will be made to cover the whole range of this enormous field; we shall instead concentrate primarily on those techniques that provide the basis for the work set out in subsequent sections. Further information is available in survey papers and books on robotics (Craig, 1989; Murray et al., 1994; Tzafestas, 1995; Deo and Walker, 1995; Prabhu and Garg, 1996; Vukobratovic, 1997; Golnazarian and Hall, 2000; Sciavicco and Siciliano, 2000).



The basic problem in robotics is how to move a robot arm from one working location to another, i.e., how to generate a robot trajectory. The working environment of the robot (its workspace) allows several different ways to perform a certain movement. It is widely accepted that the trajectory generation process involves three fundamental stages, performed on digital computers at a certain path update rate: task planning, trajectory planning, and motion control (Sciavicco and Siciliano, 2000; Prabhu and Garg, 1996).

In the task-planning stage, high-level planners manage information concerning the job to be performed, providing data about the required initial, intermediate, and final positions of the end-effector and robot joints. At each of these points, specifications for the velocities and accelerations and appropriate constraints (such as limits on velocity and acceleration or the need to avoid obstacles) are provided.

The trajectory of a robot refers to the time history of position, velocity, and acceleration for each degree of freedom (joint) of the robot. In this sense, trajectory planning involves finding the sequence of points (given in joint coordinates, for example) through which the manipulator end-effector must pass, considering the task plan. The intermediate points are linked through straight-line, circular, or higher-order interpolation techniques (e.g., spline techniques) to produce smooth and coordinated joint movements.

The final step in the motion-control problem consists of finding the joint torques that cause the arm to follow the specified path while satisfying the imposed constraints. This is achieved by using a suitably controlled actuator (e.g., an electric motor) at each joint of the robot. The aim of the control is to ensure that the robot maintains a desired dynamic behavior (e.g., follows a prespecified path).
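The trajectory-planning stage described above can be illustrated with a minimal sketch (not from the paper): a cubic polynomial segment, one of the standard interpolation choices mentioned above, linking two joint positions with prescribed boundary velocities.

```python
# Sketch (illustrative, not from the paper): a cubic polynomial segment
# linking two trajectory points of a single joint, with specified boundary
# velocities.  The coefficients follow the standard cubic-interpolation
# solution obtained from the four boundary conditions.

def cubic_segment(q0, qf, v0, vf, T):
    """Return q(t), qd(t), qdd(t) for a cubic joint trajectory of duration T."""
    a0 = q0
    a1 = v0
    a2 = (3.0 * (qf - q0) / T**2) - (2.0 * v0 + vf) / T
    a3 = (-2.0 * (qf - q0) / T**3) + (v0 + vf) / T**2

    def q(t):
        return a0 + a1 * t + a2 * t**2 + a3 * t**3   # position

    def qd(t):
        return a1 + 2 * a2 * t + 3 * a3 * t**2       # velocity

    def qdd(t):
        return 2 * a2 + 6 * a3 * t                   # acceleration

    return q, qd, qdd

# Rest-to-rest motion from q = 0 to q = 1 rad in 2 s:
q, qd, qdd = cubic_segment(q0=0.0, qf=1.0, v0=0.0, vf=0.0, T=2.0)
```

Linking several such segments, one per pair of intermediate points, yields the smooth, coordinated joint movements referred to above.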
The robot control design problem is, in general, very difficult, since the dynamics of the links are nonlinear and coupled. The main control techniques include: PID local control, computed-torque control, resolved motion rate control, robust (sliding-mode) control, self-tuning control, and model reference adaptive control (Tzafestas, 1995).

2.1. ON THE FORWARD AND INVERSE KINEMATICS

Robot motion planning and control critically depend on the availability of accurate mappings from the Cartesian or visual spaces to joint or motor spaces. A generic coordinate transformation between the spaces just mentioned is called a sensorimotor mapping (Massone, 1995); it relates various sensory modalities (e.g., vision, tactile and force sensors) to appropriate motor commands. Examples of such mappings are the inverse kinematics and dynamics.

Kinematics is the science of motion that treats it without regard to the forces that cause it. Forward kinematics involves the computation of the position and orientation of the end-effector of the manipulator given the joint angles. Usually, the desired trajectories and associated constraints are specified in the Cartesian workspace, requiring the solution of the inverse problem. This is solved



by an inverse kinematic mapping, which refers to the problem of determining the robot joint angles to achieve a desired position and orientation of the robot end-effector. For non-redundant robots, at any joint position, an algorithm computes the velocity that causes the manipulator end-effector to move toward the next desired spatial location. The success of the inverse kinematic method depends on efficient matrix inversion and accurate knowledge of the robot kinematic parameters. In the absence of such knowledge, it may be necessary to estimate the parameters of the robot arm before trajectory planning can begin. For redundant manipulators, finding the inverse kinematic mapping for a given end-effector position is hard because this is an ill-posed problem in the sense that many solutions are possible. The inverse kinematics problem for redundant manipulators is an instance of the broader degrees of freedom (DOF) problem, first articulated by Bernstein (1967). This problem arises when more variables exist than are strictly required to complete a task.
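As a concrete illustration (a hypothetical two-link planar arm, not an example from the paper), the forward kinematics below is a closed-form map from joint angles to end-effector position; with equal link lengths, the mirrored "elbow-up" configuration reaches the same Cartesian point, showing why the inverse mapping admits multiple solutions even without redundancy.

```python
import math

# Sketch (hypothetical 2-link planar arm): forward kinematics maps joint
# angles (theta1, theta2) to the end-effector position (x, y).

def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """End-effector (x, y) of a planar arm with link lengths l1, l2."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

# Two distinct joint configurations reaching the same Cartesian point
# (for l1 = l2, the mirror configuration (theta1 + theta2, -theta2)
# gives the same position):
p_down = forward_kinematics(0.3, 0.8)        # "elbow-down" solution
p_up = forward_kinematics(0.3 + 0.8, -0.8)   # "elbow-up" solution
```

For a redundant arm (more joints than task coordinates), the solution set is not merely finite but a continuum, which is the DOF problem discussed above.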

2.2. ON THE FORWARD AND INVERSE DYNAMICS

Dynamics involves the study of the forces and torques required to cause motion and is, like kinematics, a field of study in itself. The forward dynamics problem arises when positions must be found given forces or torques. Inverse dynamics involves the mapping from desired joint positions, velocities, and accelerations to the joint torques that produce them. When the manipulator is commanded to follow a desired path, generally subject to optimization constraints on performance, such as minimum overshoot, minimum energy, or minimum time, the actuator torque functions are calculated using the dynamic equations of motion of the manipulator, obtained usually from the closed-form Lagrange–Euler formulation or the forward-backward recursive Newton–Euler formulation (Sciavicco and Siciliano, 2000). These formulations take into account inertia effects, Coriolis and centrifugal forces, friction in the joints, gravity, and backlash in the case of gear-driven robots. Hence, the inverse dynamics problem is more difficult to solve than the inverse kinematics problem.

The conventional way of controlling manipulators employs a simple proportional-integral-derivative (PID) controller, which is completely error driven. These controllers are based on the assumption of fixed robot parameters and are inadequate, since the parameters of a robot depend on its configuration. If available, an accurate dynamic model of the mechanical manipulator may be used to calculate the joint drive torques for a desired trajectory; this is computed-torque control (Sciavicco and Siciliano, 2000). This control scheme suppresses disturbances and tracks desired trajectories uniformly in all configurations of the manipulator. However, its performance depends on fast computation of the dynamic model and on the accuracy of the estimates of the parameters appearing in it (Prabhu and Garg, 1996).
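A minimal sketch of the computed-torque law for a single joint may clarify the scheme; the scalar model terms M, C, and g and the gains kp and kv are illustrative assumptions, not values from the paper.

```python
# Sketch (assumed 1-DOF model): computed-torque control uses the dynamic
# model to cancel the nonlinearities and adds a PD term on the tracking
# error, so that the closed-loop error dynamics become linear.

def computed_torque(q, qd, q_des, qd_des, qdd_des, M, C, g, kp=100.0, kv=20.0):
    """Joint torque: tau = M*(qdd_des + kv*ev + kp*ep) + C*qd + g."""
    ep = q_des - q      # position tracking error
    ev = qd_des - qd    # velocity tracking error
    return M * (qdd_des + kv * ev + kp * ep) + C * qd + g

# At rest with no error, the law reduces to gravity compensation:
tau_hold = computed_torque(0.0, 0.0, 0.0, 0.0, 0.0, M=1.0, C=0.0, g=9.8)
```

If M, C, and g match the true plant, the error obeys e'' + kv e' + kp e = 0, which is exactly the uniform behavior across configurations claimed above; mismatched estimates degrade it, motivating the adaptive schemes discussed next.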
Since robot parameters depend on the joint configuration, PID controllers are unsatisfactory for robot control: they assume fixed plant



parameters. This limitation may be overcome through an adaptive control strategy based on identification of the robot parameters and subsequent adjustment of the controller gains (Vukobratovic and Tuneski, 1996). The difficulty of obtaining fast and accurate estimates of robot parameters led to adaptive robot control being addressed from the point of view of learning, in the context of neural network models.

3. Neural Approaches to Modeling and Control

From a computational perspective, many aspects of robot modeling and control can be captured by the style of computation of neural-network models. While the correspondence between such a computational approach and brain mechanisms remains an open and controversial issue (Roy, 2000), some of the basic properties of connectionist models match very well the problems that one must face when dealing with sensorimotor transformations and learning. That is, the mapping f: R^p → R^q, x^out = f(x^in), that occurs between a p-dimensional input layer and a q-dimensional output layer of a network can be viewed as a general, nonlinear coordinate transformation. This view becomes particularly important when used in association with learning algorithms.

The most common structure for adaptive control is based on two primary components: (i) a controller that generates the motor commands to be sent to the controlled object, and (ii) a controlled object (for example, a robot arm) that translates a motor command into an actual movement. Conventional adaptive control schemes (Narendra and Annaswamy, 1989) suffer from the drawback that the computational load of real-time parameter identification and the sensitivity to numerical precision and observation noise increase as the number of system variables grows. These limitations can be overcome to a certain extent through the use of learning methodologies, such as those provided by neural network models.

Figure 1. Forward modeling approach.



Figure 2. (a) Generalized inverse modeling; (b) specialized inverse modeling.

Forward modeling. The procedure of training a neural network to represent the forward dynamics of the controlled object (plant) is referred to as forward modeling (Figure 1). The neural network model is placed in parallel with the system, and the error e between the system output y_d and the network response y_o is used to guide the network training. Over a number of learning trials, the forward model learns to emulate the controlled object.

Inverse modeling. Conceptually, the simplest approach to inverse modeling is the so-called direct-inverse modeling, also referred to as generalized inverse learning (Psaltis et al., 1988; Prabhu and Garg, 1996). The approach consists of activating the controlled object (or a model representing it) with randomly generated control commands u (Figure 2(a)). The output of the controlled object is used as the input to the inverse model, which in turn computes an estimated control command u_o. The estimated control command is then compared to the actual control



command, and an error is computed. The error is used to modify the inverse model; for instance, the error guides the updating of the weights of a neural network until the network eventually learns the inverse behavior of the controlled object. The trained network can then be used as a controller, i.e., the inverse model is simply cascaded (as an open-loop system) with the controlled object, constructing an identity mapping between the desired response (the network inputs) and the controlled system output. Although conceptually very simple, this learning scheme has some limitations in practice. In particular, it may not properly control nonlinear redundant manipulators because, instead of exploiting the benefits of redundancy, it tries to get rid of it during training, so that only a single solution is available at run-time.

A second method of inverse learning is known as specialized inverse modeling (Psaltis et al., 1988; Prabhu and Garg, 1996). This approach can be used for on-line learning such that the network operates only in its regions of specialization. It has two phases: during the first phase, the forward model is learned as shown in Figure 1; the second phase consists of training the neural controller (the inverse block in Figure 2(b)). As learning progresses, the controller becomes an inverse model of the controlled object. The weights of the forward model are held fixed in this second stage, while the weights of the controller network undergo adaptation. The specialized modeling approach can cope with redundant robots, i.e., the trajectory generation process can be biased (e.g., through optimization methods) to find a particular inverse with certain desired properties.

The two methods of inverse modeling can be combined in such a way that generalized inverse modeling is used to learn the approximate behavior of the system to be controlled, and specialized inverse modeling is then employed to fine-tune the network to a particular range of operations.
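A toy sketch of direct-inverse modeling may make the data flow concrete (hypothetical scalar plant; the plant equation, model class, and learning rate are illustrative assumptions): random commands excite the plant, and the inverse model is adapted from the command error.

```python
import random

# Sketch (hypothetical scalar plant): direct-inverse modeling.  Random
# commands u excite the plant; the inverse model is trained to reproduce
# u from the observed output y.  Here the inverse model is linear,
# u_hat = w*y + b, fitted by stochastic gradient descent on (u - u_hat)^2.

def plant(u):
    return 2.0 * u + 1.0            # controlled object (unknown to the learner)

w, b = 0.0, 0.0                     # inverse-model parameters
eta = 0.05                          # learning rate
random.seed(0)

for _ in range(5000):
    u = random.uniform(-1.0, 1.0)   # random training command
    y = plant(u)                    # observed plant output
    u_hat = w * y + b               # inverse-model estimate of the command
    e = u - u_hat                   # error used to adapt the inverse model
    w += eta * e * y
    b += eta * e

# The trained inverse is cascaded (open loop) with the plant as a controller:
y_desired = 3.0
u_control = w * y_desired + b       # ideal inverse gives (3 - 1)/2 = 1
```

For this invertible scalar plant the scheme converges to the exact inverse; for a redundant manipulator the averaging over multiple valid commands is precisely the failure mode noted above.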
The above adaptive neurocontrol approaches were conceived for supervised learning procedures, especially those based on MLP and RBF networks (Prabhu and Garg, 1996; Balakrishnan and Weil, 1996). In this paper we show how SOFMs can also learn the nonlinear input–output mappings needed to control robotic manipulators, thereby handling important robotic issues such as excess degrees of freedom, computation of inverse kinematics and dynamics, hand–eye coordination, path planning, obstacle avoidance, and compliant motion. We can say in advance that most of the unsupervised neural models described in this paper use the direct-inverse modeling paradigm. In combination with machine vision, which provides dynamic closed-loop position control of the robot end-effector, these networks perform visual servoing (Hutchinson et al., 1996). This is a difficult robotic problem that can be tackled nicely by SOFMs.

4. Self-Organizing Feature Maps

In this review we are interested in a class of SOFM models based on the Kohonen algorithm (Kohonen, 1990, 1997), also called the self-organizing map (SOM).



The SOM learns from examples a mapping (projection) from a high-dimensional continuous input space X onto a low-dimensional discrete space (lattice) A of N neurons, which are arranged in a fixed topological form, e.g., a rectangular two-dimensional array. The map i*(x): X → A, defined by the weight matrix W = (w_1, w_2, ..., w_N), w_i ∈ X, assigns to each input vector x ∈ X a neuron i* ∈ A. The function i*(x) is usually determined by Euclidean distance:

    ‖w_{i*}(t) − x(t)‖ = min_{i∈A} ‖w_i(t) − x(t)‖,        (1)

i.e., an input vector x(t) is mapped onto the neuron i* for which ‖w_i(t) − x(t)‖ is smallest. Assuming that the initial values of the weights w_i are chosen randomly and that the weight vectors are updated after the presentation of each input vector x(t), the weights of all neurons i in the network are adapted according to the following learning rule:

    Δw_i(t) = η(t) h(i, i*; t) [x(t) − w_i(t)],        (2)

where 0 < η(t) < 1 is the learning rate and h(i, i*; t) = f(‖r_{i*} − r_i‖, t) is the neighborhood function, a scalar-valued function of the lattice coordinates r_i and r_{i*} of neurons i and i*, respectively. A widely used neighborhood kernel is the Gaussian function:

    h(i, i*; t) = exp( −‖r_i − r_{i*}‖² / (2σ²(t)) ),        (3)

where the parameter σ(t) defines the radius of the kernel. For convergence of the map, it is necessary that h(i, i*; t) → 0 as t → ∞. To achieve this, both σ(t) and η(t) are usually monotonically decreasing functions of time. For the robotic applications considered here, the SOM network is often not very large (e.g., a few hundred neurons at most), so that selection of the learning parameters is fairly easy. General hints for training the SOM can be found in (Kohonen, 1997).

The operations defined in (1) and (2) are repeated until a steady state of global ordering of the weight vectors has been achieved. In this case, we say that the map has converged. The resulting map preserves the topology of the input samples in the sense that adjacent patterns are mapped onto adjacent regions of the map. Due to this topology-preserving property, the SOM is able to cluster input information and represent spatial relationships of the data on the map (Vesanto and Alhoniemi, 2000). Despite being very simple, the SOM algorithm is powerful and has become one of the most important ANN architectures. It has been applied to a variety of real-world problems (Kohonen et al., 1996; Alhoniemi et al., 1999; Simula et al., 1999) and its computational properties are well understood. Also, there are several theoretical studies on global ordering and convergence of the SOM (Cottrell, 1998).

4.1. LEARNING INPUT–OUTPUT MAPPINGS WITH THE SOM

Commonly, the SOM is used to learn only the topology of sensory inputs by performing clustering on these data (Kung and Hwang, 1989; Faldella et al., 1997),



and is then used in control basically as a classifier: the final sensory map classifies new incoming data. For the sensorimotor learning and robot control tasks we are interested in here, the networks must be able to process sensory inputs and also to produce motor outputs. In robotics, an input data sample x^in(t) can represent the current spatial position of the end-effector of the robot at time t, and the output data sample x^out(t) is the control action required to attain that position. This control action can specify a value for an angular displacement, a force, or a torque.

There are two main approaches for collecting input–output robotic data. In the first one, an operator guides the robot arm through a well-defined trajectory (e.g., a circular one), stopping at some desired positions to store the input–output variables of interest in the computer's memory. This trajectory is then used to train the network. In the second one, the operator commands the robot to execute untargeted (random) movements that provide the robot controller with a number of coherent input and output patterns. The random movements of the arm should lead to uniform sampling of the workspace, and hence to an unbiased ability to reach a given target. The vector x^out may also contain variables associated with a given sensorimotor transformation, such as the geometric parameters of the robot arm or the Jacobian matrix (or its inverse).

Once these data samples, {x^in(t), x^out(t)}, t = 1, ..., N, have been collected, learning takes place through a mechanism of associative memory. To this end, the network input vector x(t) and the weight vectors w_i(t) are defined as:

    x(t) = [x^in(t); x^out(t)]   and   w_i(t) = [w_i^in(t); w_i^out(t)].        (4)

In words, the input data sample x^in(t), occurring at time t or earlier, is associated with the output data sample x^out(t) that occurs at time t.
The word “earlier” generalizes the definition in (4) since it allows the inclusion of systems with a delay between the application of the input signal and the response of the plant. Then, the SOM acts as a kind of associative layer which integrates sensory and motor signals. This strategy inspired the design of unsupervised neural-network models that we describe in this paper from Section 5 onwards. It is important to note that when using supervised models the error signals shown in Figures 1 and 2 are available directly at the output of the network and are explicitly used in a network learning rule. In the unsupervised case, the error signals are not computed directly, but implicitly through the use of the definitions in (4) and the SOM’s learning rule in (2). For this reason, when unsupervised neural models are used in modeling and control, they are usually referred to as self-supervised models (Nehmzow, 1998). This type of learning is controlled by knowledge of the external world provided by sensors and the consequences of actions performed in it. The winning neuron can be determined on the basis of the entire vector x(t) or, more usually, on the basis of the sensory inputs xin (t) only. Both cases can be



handled by defining a projection matrix P which selects different subspaces within the augmented vector (Walter, 1996). For example, let p be the dimension of the sensory vector x^in, i.e., p = dim(x^in). If the first p elements of the main diagonal of P are 1's, while all other elements are zero, the following identity holds:

    ‖Px(t) − Pw_i(t)‖ ≡ ‖x^in(t) − w_i^in(t)‖.        (5)

The output signals x^out are used only to adjust the output part, w^out, of the weight vector w_i. After the learning phase is finished, an estimate of the output vector for a given input vector is found by recalling the output part of the weight vector of the winning neuron:

    x̂^out(t) = w_{i*}^out(t) ≡ (I − P) w_{i*}(t).        (6)
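The associative use of the SOM described by (1)-(6) can be sketched as follows (a hypothetical one-dimensional sensorimotor mapping; the target function, map size, and parameter schedules are illustrative assumptions): the winner is found from the sensory part of the weight vectors only, both parts are updated, and the motor part of the winner is recalled as the output estimate.

```python
import math
import random

# Sketch (hypothetical mapping): a 1-D SOM trained on concatenated
# sensorimotor samples [x_in, x_out].  The winner search uses the sensory
# part only (the role of the projection P); both weight parts are updated
# with the same neighborhood, and recall reads out w_out of the winner.

random.seed(1)
N = 50                                       # number of neurons
w_in = [random.random() for _ in range(N)]   # sensory part of the weights
w_out = [random.random() for _ in range(N)]  # motor part of the weights

def winner(x_in):
    """Equation (1) restricted to the sensory subspace, as in (5)."""
    return min(range(N), key=lambda i: abs(w_in[i] - x_in))

T = 10000
for t in range(T):
    x_in = random.random()                   # e.g. an end-effector coordinate
    x_out = math.sin(math.pi * x_in)         # e.g. the associated command
    s = winner(x_in)
    sigma = max(0.5, 10.0 * (1.0 - t / T))   # shrinking neighborhood radius
    eta = max(0.01, 0.5 * (1.0 - t / T))     # decaying learning rate
    for i in range(N):
        h = math.exp(-((i - s) ** 2) / (2.0 * sigma ** 2))  # Equation (3)
        w_in[i] += eta * h * (x_in - w_in[i])    # Equation (2), sensory part
        w_out[i] += eta * h * (x_out - w_out[i]) # Equation (2), motor part

# Recall, Equation (6): estimate the output for a new input.
x_hat = w_out[winner(0.5)]                   # approximates sin(pi/2) = 1
```

Note that no explicit output error is ever computed: the motor weights simply follow the motor samples associated with the winning sensory region, which is the self-supervised character discussed above.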

Thus, the network can learn different types of nonlinear associative mappings, x^out = f(x^in), providing a mechanism for flexible representation of continuous input–output relations. It has been demonstrated through methods of stochastic approximation that this unsupervised associative memory scheme reduces the function approximation error as the network training proceeds (Barreto and Araújo, 2002). This associative memory ability also supports pattern completion after the network is trained, depending on the specification of the matrix P (Figure 3). In this figure, the letters on the left-hand side of the network denote the input subspaces, while the letters on the right-hand side denote the estimated subspaces. For example, for the network structure in Figure 3(a), we have:

    x^in(t) ≡ Px(t) = [A, *, *]^T  −→  x̂^out(t) ≡ (I − P)w_{i*}(t) = [*, B̂, Ĉ]^T,        (7)

where the asterisk (*) denotes the unused or unknown variables, that is, those variables for which the corresponding elements of P are zero. The hat (ˆ) symbolizes the estimated variables. Two other examples of possible input–output mappings that can be estimated using the same trained network are given in Figures 3(b) and (c). This pattern completion property has also been shown empirically to be robust to noise (Araújo and Barreto, 2002), confirming a theoretical result by Kosko

Figure 3. Pattern completion ability of the SOM.



(1990) which states that globally stable feedback systems, such as competitive learning algorithms, are structurally stable. Structural stability allows such neural models to be perturbed, e.g., by noise without changing their qualitative equilibrium behavior. Thus, the ability of the SOM to approximate nonlinear input–output mappings, combined with the structural stability of unsupervised learning, makes it potentially useful in several robotic tasks. 4.2. SOM - RELATED NEURAL NETWORKS In this section we briefly describe five unsupervised neural networks which improve the performance of SOFMs in function approximation tasks, and hence in robotics. For the first three algorithms the topology of the array is not defined beforehand as in the standard SOM, but rather created incrementally in an adaptive fashion as the networks are trained. This flexibility in defining the neighborhood relations among neurons provides lower quantization errors. As the SOM, these three neural networks still produce quantized output values as defined in (6). However, depending on the desired degree of accuracy, plain vector quantization methods have poor performances in function approximation because of the discrete representation of the data. To alleviate this problem, the last two algorithms produce continuous output values based on interpolation techniques. We discuss only the properties that differentiate these learning algorithms from the standard SOM. Similar or equal properties will not be mentioned to avoid being redundant. The Neural Gas algorithm. The “Neural Gas” algorithm (NGA) utilizes a ranking of the weight vectors to find the winner neuron for a given input vector (Martinetz and Schulten, 1991). The rank index describes the neighborhood relations among neurons. 
Thus, the weight changes are not determined by the relative distances between the neurons within a topologically pre-structured lattice as in the SOM, but by the relative distances between the neurons within the input space. After the presentation of an input vector x(t), the neural units are sorted in increasing order as follows: wµ1 (t) − x(t) < wµ2 (t) − x(t) < · · · < wµN (t) − x(t).

(8)

Then, all the neural units are adjusted according to

Δwµk(t) = ε g(k)[x(t) − wµk(t)],

(9)

where g(k) is a monotonically decreasing function of the closeness rank k, such that g(1) = 1 and g(k) decreases to zero as k increases. The constant ε ∈ [0, 1] scales the overall size of the adaptation step. A convenient choice is g(k) = exp(−k/ρ), where ρ determines the size of the neighborhood region. Initially, the values of ε and ρ are chosen fairly large for rapid, coarse adaptation of many neural units at each adaptation step, and they gradually decrease in the course of the learning phase in order to enable a fine-tuning of the system.
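As an illustration, Equations (8) and (9) amount to a rank-then-update loop. The following minimal Python sketch is ours, not taken from the cited papers; it uses a 0-based rank k so that the closest unit receives the full step g(0) = 1:

```python
import math

def neural_gas_step(weights, x, eps=0.5, rho=2.0):
    """One Neural Gas adaptation step (illustrative sketch).
    weights: list of weight vectors (lists); x: input vector.
    Every unit of rank k moves toward x by eps * exp(-k / rho)."""
    # Rank the units by squared distance to the input, closest first (Eq. 8).
    ranked = sorted(range(len(weights)),
                    key=lambda i: sum((wi - xi) ** 2
                                      for wi, xi in zip(weights[i], x)))
    # Move every unit toward x, scaled by its rank (Eq. 9).
    for k, i in enumerate(ranked):
        g = math.exp(-k / rho)          # neighborhood function g(k)
        weights[i] = [wi + eps * g * (xi - wi)
                      for wi, xi in zip(weights[i], x)]
    return weights
```

In practice both eps and rho are annealed over the training run, as described above.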

Topology-representing networks. The Topology Representing Network (TRN) is a combination of the NGA with a competitive Hebb rule (Martinetz and Schulten, 1994). While the NGA part distributes the input weights according to the input probability distribution, the competitive Hebb rule preserves the topology by introducing neighborhood relations through a winner-take-all (WTA) principle. The WTA scheme is applied to lateral connection weights cµiµj ∈ {0, 1} between neurons µi and µj. In addition, a connection age tµiµj is associated with each cµiµj, and the connection is removed if the age exceeds a certain lifetime without the corresponding connection µi → µj being activated. Initially, cµiµj = 0, indicating that the neighborhood relations are unknown. For each learning step only the connection cµ1µ2 between the units currently ranked µ1 and µ2 in (8) is altered. If cµ1µ2 = 0, then we set cµ1µ2 = 1 and the age of the connection is initialized with tµ1µ2 = 0. If cµ1µ2 ≠ 0, then we refresh the connection age, i.e., tµ1µ2 = 0. At the same time, we increase the age of all other existing connections, tµ1µj = tµ1µj + 1 for all units µj with cµ1µj ≠ 0, and remove the connections cµ1µj whose age exceeds a given lifetime, tµ1µj > T(t), where T(t) = Ti(Tf/Ti)^(t/tmax), Ti = 0.1N and Tf = 2N.

The instantaneous topological map. The Instantaneous Topological Map (ITM) consists of a set of neurons i with weight vectors wi, a set of undirected edges (or lateral connections), and a set of node neighbors N(i) for each node i (Jockusch and Ritter, 1999). The goal of the ITM is to deal with temporally correlated stimuli, such as those generated by sequential robot movements. The ITM starts out with only two connected nodes. The adaptation of the weight vectors is similar to that of the NGA. In addition, the following adaptation steps are necessary:
1. Edge adaptation: (i) Create an edge connecting µ1 and µ2 if it does not already exist.
(ii) For each member µj of N(µ1), check whether wµ2(t) lies inside the Thales sphere through wµ1(t) and wµj(t). If that is the case, remove the edge connecting wµ1(t) and wµj(t). When deleting an edge, check wµj(t) for emanating edges; if there are none, remove that node. The testing formula is: ∀µj ∈ N(µ1), if (wµ1(t) − wµ2(t)) · (wµj(t) − wµ2(t)) < 0, then remove the edge µ1µj.
2. Node adaptation: (i) If the stimulus x(t) lies outside the Thales sphere through wµ1(t) and wµ2(t), and outside a sphere around wµ1(t) with a given radius emax (threshold), create a new node µi with wµi = x(t), and connect nodes µi and µ1. Mathematically: if (wµ1(t) − x(t)) · (wµ2(t) − x(t)) > 0 and ‖x(t) − wµ1(t)‖ > emax, then create a node at x(t). (ii) If wµ1(t) and wµ2(t) are closer than 0.5emax, then remove wµ2(t).

Local linear maps. Local linear mappings (LLMs) constitute an economic way of describing a “well-behaved” function f: ℝ^n → ℝ^m, x^out = f(x^in) (Ritter, 1991).

The underlying principle is to approximate the function f(·) with a set of linear mappings associated with the neurons in the SOM. Each neuron represents a local region of the input space ℝ^n, and consists of a reference vector w^in_i ∈ ℝ^n in the input space and an associated reference vector w^out_i ∈ ℝ^m in the output space, together with a matrix Ai defining a local linear mapping from ℝ^n to ℝ^m. Training is performed as in Section 4.1, with the addition of a learning rule for the matrix Ai. A zero-order approximation of the network output was defined in (6). LLMs can improve the precision of the network output by the introduction of a linear correction term formed by the matrix Ai∗ and the difference x^in − w^in_i∗. Therefore, a first-order approximation of the network output x̂^out is given by

x̂^out(t) = w^out_i∗(t) + Ai∗(t)[x^in(t) − w^in_i∗(t)].

(10)

This expression corresponds to a linear expansion of the exact transformation f(·): the vector w^out_i∗ corresponds to the expansion term of zeroth order, and Ai∗ is an estimate of the Jacobian matrix of f(·) at w^in_i∗. It is worth noting that, while zero-order approximations are usually used for coarse positioning of the robot end-effector, fine positioning is achieved through first-order networks; see Sections 5.1–5.3.

The parameterized self-organizing map. The Parameterized Self-Organizing Map (PSOM) consists of a continuous manifold M = {w(s) ∈ X ⊆ ℝ^d | s ∈ S ⊆ ℝ^m}, parameterized by a continuous variable s ∈ S and described by a smooth vector function w(s) (Ritter, 1993). The goal of the PSOM is to learn an input–output mapping even when few training examples are available. This is possible through the use of efficient interpolation mechanisms and previous knowledge about the topological order of the training examples. The manifold M is constructed such that it passes through the available set of few training samples whose topological order is known in advance. These samples are directly assigned to the reference vectors wi of the PSOM, obeying a given neighborhood relation extracted from their topological order. If a large number of training samples is available, it is also possible to train a SOM with few neurons and use their weight vectors as support points for constructing the manifold M. The function w(s) is usually written as follows:

w(s) = Σ_{i∈A} H(ri, s) wi

(11)

where H(ri, s), H(ri, ·): S → ℝ, is a “basis function” defined for each formal neuron, designed to weight the contribution of its weight vector wi depending on the location of s relative to the lattice position ri. The following conditions must hold:
(i) H(ri, ri) = 1, and H(ri, s) vanishes at all other points of A;
(ii) H(ri, s) must be orthonormal, i.e., H(ri, rj) = δij ∀i, j ∈ A, to make the manifold M pass through all supporting neurons in the lattice;
(iii) Σi H(ri, s) = 1, ∀s.
A simple construction of basis functions H(ri, s) becomes possible for the case of a multidimensional rectangular grid. In this case, the set of functions H(ri, s) can be constructed from products of one-dimensional Lagrange interpolation polynomials.
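For a one-dimensional chain of nodes, such a construction can be sketched as follows. The fragment below is our own illustration (function names are hypothetical, not from the PSOM literature); the Lagrange polynomials satisfy conditions (i)–(iii) above by construction:

```python
def lagrange_basis(nodes, i, s):
    """1-D Lagrange polynomial for node i: equals 1 at nodes[i]
    and 0 at every other node, as required of H(r_i, s)."""
    h = 1.0
    for j, rj in enumerate(nodes):
        if j != i:
            h *= (s - rj) / (nodes[i] - rj)
    return h

def psom_manifold(nodes, w, s):
    """w(s) = sum_i H(r_i, s) * w_i for a 1-D chain of nodes
    with scalar reference 'vectors' w_i (Eq. 11)."""
    return sum(lagrange_basis(nodes, i, s) * w[i]
               for i in range(len(nodes)))
```

With the wi sampled from a smooth function, the manifold reproduces it exactly up to the polynomial degree; multidimensional grids use products of such one-dimensional factors.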

The response of the PSOM is determined by the value w(s) at the continuous best-match location s∗ ∈ S, defined by

s∗ = arg min_{s∈S} dist(w^in(s), x^in)

(12)

where the distance dist(·, ·) is used to find the best-match location s∗ ∈ S and is usually the Euclidean norm. Thus, the discrete winner search of the standard SOM is replaced by the continuous minimization problem of determining s∗ using, e.g., the Levenberg–Marquardt scheme.

5. SOM for Kinematic-based Robot Control

In this section we begin to describe a series of studies involving the application of SOFMs, and of the key concepts presented in the previous sections, to the adaptive control of robot manipulators. The performances of zero- and first-order SOFMs are compared on several robotic tasks, such as the approximation of the forward and inverse kinematics of redundant and non-redundant robots, visuomotor (or hand–eye) coordination, visual servoing, and reaching tasks. We also discuss their ability to adapt to sudden changes in the configuration of the robot or its environment. In subsequent sections, we present SOFMs for path planning and trajectory formation during reaching and grasping tasks, so that obstacle avoidance and recalibration of parameters can be made automatically.

5.1. ZERO-ORDER SELF-ORGANIZING MAPS: COARSE POSITIONING

A zero-order map based on the SOM was developed by Coiton et al. (1991) to learn inverse kinematics and applied to move a non-redundant manipulator towards a stationary target. Each neuron received two kinds of data: (1) the Cartesian coordinates of the end-effector and (2) the joint angles, which were used to define, respectively, the arm position and the motor signal needed to attain that position. The network thus had six input units: three corresponding to the Cartesian space coordinates (x, y, z) and three corresponding to the joint angles (θ1, θ2, θ3). The entire augmented input vector was used to find the winner at time step t.
The network weights were updated using a Mexican hat-like neighborhood function, so that neurons in the immediate neighborhood of the winning neuron were adjusted in an excitatory fashion, while more distant neurons were updated in an inhibitory manner. The neighborhood size was decreased during training. This model built up an association between those two types of input by placing an object between the grippers of the end-effector and moving it randomly in space. This object later became the target to be reached during the tests. The network thus associated Cartesian
space inputs with robot arm motor angles, explicitly and accurately solving the inverse kinematics problem without any a priori model of the arm kinematics or of the characteristics of the sensors.

Jones and Vernon (1994) described how the association of motor and visual stimuli enabled the SOM, controlling an RTX SCARA-configuration robot arm (3 DOF), to learn hand–eye coordination (visual servoing), so that the arm reached a desired position and tracked a visually presented target. Their approach assumed no a priori model of the arm kinematics or of the imaging characteristics of the cameras. No explicit representation, such as homogeneous transformations, was used for the specification of robot postures. Camera calibration and triangulation were done implicitly as the system adapted and learned its hand–eye coordination by experience. For the training phase, an infra-red LED was placed between the robot arm grippers. This LED was viewed by two stationary filtered cameras which could detect only infra-red light. Both images were transferred to the controlling computer memory, where the (i, j) positions of the center of the LED in the camera image planes were computed. Sensory data were given by the image plane (retinal) coordinates of the position of the LED (i1, j1, i2, j2), while motor data were given by the corresponding vertical translation, shoulder and elbow robot arm motor positions (z, θs, θe) which caused the LED to be in that particular position. Hence, the input to the SOM comprised seven variables (i1, j1, i2, j2, z, θs, θe). When training was completed, the LED was removed from the grippers of the robot end-effector and placed somewhere within the predefined workspace, in front of the robot arm. The LED, thus positioned, became the target for the robot to reach. The (i, j) coordinates of the center of the LED in the camera image planes were found once again.
The controlling computer “interrogated” the robot arm and found its current vertical translation, shoulder and elbow positions (z, θs, θe). A winning neuron was found, and the weights corresponding to the motor inputs produced a set of motor positions which resulted in the robot end-effector being closer to the target. This process was then repeated, i.e., the camera inputs (which should be the same as before if the target is stationary) and the new joint positions of the robot arm were fed into the trained network, a winning node was found, and its weights associated with the motor inputs were used to command the robot arm to a new position which should be even closer to the target. This process continued until the latest position to which the robot arm was moved was the same as the previous position, i.e., no more movements were required. The system required up to four movements to reach the target when a 20 × 20 network was employed. The error between the achieved position and the position of the target was, on average, approximately 3 cm; the smallest disparity achieved in tests was 1.28 cm. In terms of practical usefulness, however, an error of approximately 3 cm over a workspace of width 34 cm, depth 9 cm and height 35 cm is inaccurate when compared with traditional methods. On the other hand, this is an adaptive, self-calibrating system which learns by experience, and which does not involve calculations such as those required by camera calibration models, triangulation
methods, and the development of homogeneous transformations. The system could, at least in principle, track a moving object, although, for this implementation, not in real time because the robot arm moves quite slowly.

Kihl et al. (1995) used the SOM in a visual servoing scheme similar to that of Jones and Vernon (1994) and implemented it as a transputer-based system. Of particular interest in this work is the fact that the authors did not rely on the conventional approach of generating training data by moving the robot arm randomly to several thousands of positions, where each time an image is taken to produce the visual input. They argued that this procedure takes time and also puts stress on the robot arm. Instead, they commanded the arm through random arm trajectories and recorded the correlated image positions on the fly; it took only a few minutes to record sufficient data to generate a set of 900 trajectories with 6.7 points per trajectory on average. This set represents the input–output space properly but partially loses its uniform spatial distribution. An alternative is to record trajectories by sweeping the joint workspace regularly and then randomize selected points from the trajectories; this method, however, does not offer the possibility of training the network on-line. The mean positioning error was 0.013 m for a workspace surface of approximately 0.175 m2.

Zeller et al. (1996, 1997) presented a method for motion planning of a pneumatically driven robot with visual sensor feedback. For this robot, the exact mathematical relationship between joint space, sensor space and control signals is not known. Thus, a TRN model with 750 neurons was trained with 5000 random moves within the workspace, by sending random pressure values p = (p1, p2, p3) to the robot and observing the end-effector position as well as reading out the encoder values.
Camera and encoder readings at time t were collected in a vector u(t) and used to train the weights w^in_µk and w^out_µk as follows:

Δw^in_µk(t) = ε g(k)[u(t) − w^in_µk(t)]  and  Δw^out_µk(t) = ε g(k)[p(t) − w^out_µk(t)].

(13)

To control the robot after learning, the input weights w^in_µk were used to find the neuron µ1 whose weight vector was closest to the presented target position u_target, while the output weights w^out_µ1 of this winner neuron generated the pressure values to drive the robot. The practical feasibility of the approach was demonstrated using an industrial robot simulator and a pneumatically driven robot arm with visual sensing. A similar approach was developed by Zeller et al. (1995), who used the SOM instead of the TRN model. As a way to enhance the positioning ability, Zeller et al. (1995) incorporated additional processing stages into their basic SOM-based algorithm. Such stages involve repeating a movement towards a particular target until the robot reaches some predetermined accuracy, as Jones and Vernon (1994) and Kihl et al. (1995) did. The final error was typically 0.3% of the workspace (against 12% without repeated practicing) for a simulated planar arm.
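The zero-order control loops of this section share the same skeleton: observe, find the winner from the input weights only, command the winner's output weights, and repeat until the commanded position stops changing. The sketch below is a simplified illustration under our own assumptions (the `observe`/`command` interfaces and all names are hypothetical; real systems feed back both target and current arm readings):

```python
def zero_order_control(w_in, w_out, observe, command,
                       max_steps=10, tol=1e-3):
    """Iterative zero-order positioning (illustrative sketch).
    observe() returns the current sensory vector; command(p) sends
    the motor/pressure vector p to the robot.  Stops when the
    commanded output no longer changes (no movement required)."""
    prev = None
    for _ in range(max_steps):
        u = observe()
        # Winner: unit whose input weights are closest to the reading.
        i = min(range(len(w_in)),
                key=lambda k: sum((a - b) ** 2
                                  for a, b in zip(w_in[k], u)))
        p = w_out[i]
        if prev is not None and all(abs(a - b) <= tol
                                    for a, b in zip(p, prev)):
            break                      # converged: same command twice
        command(p)
        prev = p
    return prev
```

The repeated corrective movements of Zeller et al. (1995) correspond to the iterations of this loop.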

Buessler et al. (1999) proposed a SOM-based scheme to learn hand–eye coordination in a modular fashion. A decomposition of the problem allows the use of several SOMs of reduced dimensionality, decreasing computational costs. Each module learns both the direct and the inverse transformations between its inputs and outputs through a bidirectional learning algorithm (Buessler and Urban, 1998). The authors considered a robotic arm, fixed upside down to the ceiling, moving in its 3-D workspace. The cameras of a robotic head followed its displacement and provided image-based position information. The active vision head had three motor angles, αP, αT and αV, defined respectively as the pan and tilt of the whole head and the vergence, determining the camera configuration α = [αP, αT, αV]. A 3-D target point was projected on the image planes of the two cameras, forming the right and left image coordinates of the projected points, V = [xR, yR, xL, yL]. Thus, any 3-D target point P was represented by its image coordinates V for a given camera configuration α. The particular camera configuration that fixated a point P was denoted by α^F(P); P was considered fixated when its image appeared at the center of the image plane of both cameras. The robot arm configuration was denoted θ = [θ1, θ2, θ3]. The performance of network learning was evaluated on estimating α^F and the angular positions θ. For a given 3-D point P, the learning vector was formed in the following way: V, α and θ were acquired, then the point was fixated and α^F was acquired. A new point could then be presented and the procedure continued. For the learning phase, the training sets were presented in random order. During the testing phase, the bidirectional learning scheme provided results slightly worse than those provided by the conventional zero-order SOM described previously in this section.

5.2. ZERO-ORDER MAPS WITH GLOBAL INTERPOLATION: THE PSOM MODEL

It is worth pointing out that the results provided by zero-order maps are similar to those obtained by the (supervised) counterpropagation network (Wu et al., 1993). However, zero-order maps have limited positioning precision due to the discretization of the space. The maximal errors occur mostly near the envelope limiting the workspace. These errors can be reduced by prolonging the training time and increasing the size of the training set. An alternative is to improve positioning through the use of interpolation techniques over the discrete neurons of a SOM network. This is exactly the approach underlying the PSOM model. Walter (1996) investigated the interpolation ability of the PSOM for the inverse kinematics problem of a 3-DOF PUMA robot. The input vector consisted of three joint angles and the Cartesian position of the end-effector, (θ1, θ2, θ3, px, py, pz). The sets of joint angles were obtained by positioning the robot arm at several points along a rectangular grid over the workspace, while the spatial positions were computed using the forward kinematics equations. To specify a PSOM with 27 (3 × 3 × 3) neurons, only 27 training samples were required,
which were directly assigned to the reference vectors wi. The PSOM was then used to obtain the corresponding joint angles for a further 200 randomly chosen intermediate Cartesian positions. The network achieved a mean Cartesian position deviation (Δr) of 17 mm, with a normalized root mean square (NRMS) error of 0.041. The results can be significantly improved by considering more training vectors: e.g., a 4 × 4 × 4 PSOM produced (Δr, NRMS) = (2.4 mm, 0.0061) and a 5 × 5 × 5 PSOM produced (0.11 mm, 0.00027). Walter and Ritter (1995, 1996) used the PSOM to learn the forward and inverse kinematics of a 3-jointed robot finger, taken from a three-fingered modular hydraulic robot hand developed by Menzel et al. (1993). To generate the data, the three joints were moved within their maximal extensions. Each sample consisted of a 9-dimensional data tuple composed of the joint angles, the piston displacement and the fingertip position. The fingertip traced out an envelope with a banana-like shape, which was subsequently discretized by means of a 10 × 10 × 10 mesh. The corresponding joint-angle space was observed to have a cubic topology. A total of 1000 training samples were available, but a set of only 27 data points was found to be sufficient to construct a 3 × 3 × 3 PSOM. Then, to visualize the inverse kinematics ability, a set of workspace finger points of known arrangement (extracted from the other 983 samples) was presented to the network and the corresponding joint angles and piston displacement were recovered. In particular, the banana-like workspace yielded a rectangular grid of joint angles. When measuring the mean Cartesian deviation the authors obtained a result of 1.6 mm, or 1.0% of the maximum workspace length of 160 mm. In view of the extremely small training set, this is a quite remarkable result. Nevertheless, it can be further improved by supplying more training points.
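The continuous best-match search of Equation (12) can be illustrated for a single parameter dimension. The sketch below is ours and substitutes a plain ternary search for the Levenberg–Marquardt scheme mentioned in Section 4.2 (names and the example manifold are hypothetical, chosen for illustration only):

```python
def best_match(w_in_of_s, x_in, lo, hi, iters=60):
    """s* = argmin_s dist(w_in(s), x_in), Eq. (12), for a scalar s.
    w_in_of_s(s) returns the input-space part of w(s).  Assumes the
    squared distance is unimodal on [lo, hi]."""
    def dist2(s):
        return sum((a - b) ** 2 for a, b in zip(w_in_of_s(s), x_in))
    # Ternary search: shrink [lo, hi] around the minimum.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dist2(m1) < dist2(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)
```

The PSOM response is then w(s∗), i.e., the interpolated reference vector at the continuous winner location.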

5.3. FIRST-ORDER SELF-ORGANIZING MAPS: FINE POSITIONING

A more computationally efficient way to improve robot positioning is through first-order maps. Ritter et al. (1989), Martinetz et al. (1990), and Martinetz and Schulten (1993) used a three-dimensional lattice, each node in the lattice corresponding to a neuron, to learn the nonlinear transformation θ(V) which specifies the joint angles that take the tip of a 3-DOF robot arm to the target point defined by the retinal coordinates V provided by two cameras. The approach used the LLM network to provide suitable control signals, resulting in smooth movements over the entire workspace. Thus, the relation θ(V) was piecewise linearized, whereby the origin of the linearization was determined by the position of a neuron in the lattice. There were three types of adjustable weights associated with each neuron i in the lattice: “retinal” weights wi, “joint angle” weights θi and “inverse Jacobian” matrix weights Ai. During the learning stage, randomly chosen joint angles are sent to the arm controller to execute many random trial motions. For each trial motion, the cameras input the visual information about the robot’s end-effector position in the
workspace. The algorithm then searches for the winning node i∗ whose visual vector wi∗ is closest to the cameras’ input. The linearization parameters Ai∗ and θi∗ are read from the winner location ri∗ in the lattice. The robot movement is divided into two parts: first, an initial movement is computed from the linearization parameters; then, a small number of corrective control steps are taken in order to further improve the positioning accuracy. The number of additional fine movements decreases as learning proceeds and the movements become more and more precise. Finally, the winning neuron and its neighboring nodes adjust their retinal weight vectors, joint angle vectors and inverse Jacobian matrices. The capabilities of this algorithm were demonstrated using computer simulations. Extensive experimentation has shown that the network self-organizes into a reasonable representation of the workspace in about 30,000 learning cycles. The accuracy of the method is limited by the image-processing resolution, not by the control algorithm. Walter and Schulten (1993) implemented the above procedure on a real PUMA 560 robot. They also used the LLM procedure with the NGA network in place of the SOM, and compared their performance. They reported for both networks an impressive positioning accuracy of 1.3 mm, which is about 0.1% of the linear dimension of the workspace of the robot arm. They concluded that the SOM is slightly more dependent on proper tuning of the learning parameters than the NGA. A hierarchical extension of the previous model, in which a 2D-SOM is associated with each node of the 3D lattice instead of a single neuron, was designed by Ritter et al. (1992) to include the control of a gripper. The extended learning procedure makes the 3D network converge to a discrete representation of the workspace, while the 2D subnet represents the gripper orientation space.
Based on this model, de Angulo and Torras (1996, 1997) presented a method to automatically calibrate a commercial space robot after it undergoes wear or damage, which works on top of the nominal inverse kinematics embedded in its controller. They modified the model in order to suit a more practical setting: instead of learning the whole mapping, their algorithm learns only the appropriate corrections with respect to the inverse kinematics embedded in the controller, which is thus maintained. The other modifications enhance the cooperation between neurons, speeding up learning by a factor of 70 (nearly two orders of magnitude).

Hesselroth et al. (1994) investigated the NGA model in controlling a 5-joint, pneumatically driven robot arm which shares essential mechanical characteristics with skeletal muscle systems, employing agonist–antagonist pairs of rubber actuators mounted at opposite sides of rotating joints. The control scenario involved positioning the arm (3-DOF) and its gripper (2-DOF) at target positions with the help of feedback signals from two video cameras. For this purpose, a network with 200 neurons was used to represent the three-dimensional workspace embedded in a four-dimensional system of coordinates from the two cameras, and to learn a three-dimensional set of pressures corresponding to the end-effector positions, as well as a set of 3 × 4 inverse Jacobian matrices used to interpolate between these positions.
The gripper orientation was achieved through adaptation of a 2 × 4 inverse Jacobian matrix. It is worth emphasizing that the position as a function of the supplied pressure is nonlinear, nonseparable, and exhibits hysteresis because of the properties of the rubber-tube actuators of the robot. Nevertheless, through the learning algorithm, the position could be controlled to an accuracy of about one pixel (≈3 mm) after only 200 learning steps, and the orientation could be controlled to two pixels after 800 learning steps. This was achieved through employment of the LLM approach. The repeated application of corrective movements in each positioning and grasping step leads to a very robust control algorithm, since the Jacobians learned by the network only have to satisfy the weak requirement that each correction yields a reduction of the distance between gripper and target. By comparing the networks described in this section with those in Section 5.1, one can note that all these networks learned the transformation between the Cartesian coordinates, or the visual image, of the target point and the actuator settings for the required arm position. Each winner neuron represented this nonlinear transformation for its neighborhood in the lattice. However, the models in this section also learned the inverse Jacobian matrix that gave the linear part of the first-order expansion of the transformation about the winner neuron. This way, the nonlinear transformation was approximated by an adaptive superposition of many linear mappings, each one valid only locally. The first-order approximation approach yields a considerably higher accuracy with the same number of neurons as a zero-order approximation network.
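At command time, this first-order superposition reduces to Equation (10) evaluated at the winner. A minimal sketch of that step (our own hypothetical names; plain lists stand in for the retinal and joint-angle weight vectors):

```python
def llm_command(w_in, w_theta, A, v_target):
    """First-order motor command, Eq. (10):
    theta = theta_i* + A_i* (v_target - w_i*),
    where i* is the unit whose retinal weights w_in[i*]
    are closest to the target coordinates v_target."""
    i = min(range(len(w_in)),
            key=lambda k: sum((a - b) ** 2
                              for a, b in zip(w_in[k], v_target)))
    d = [t - w for t, w in zip(v_target, w_in[i])]
    # Zeroth-order term plus the local linear (Jacobian) correction.
    return [th + sum(a_jk * d_k for a_jk, d_k in zip(row, d))
            for th, row in zip(w_theta[i], A[i])]
```

The corrective fine movements described above simply re-apply this command to the residual error until the target is reached.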

5.4. DEALING WITH REDUNDANCY

The 3-DOF robots discussed in the previous sections could reach any target location within the workspace by a unique set of joint angles because they were non-redundant. Common methods for the control of an arm with redundant degrees of freedom eliminate the under-determination of the control problem by selecting a joint angle configuration that optimizes an objective function in addition to reaching the target point (Massone, 1995). Psychophysical experiments show that humans also prefer arm configurations that minimize certain “costs”. A suitable form of the cost function is obtained if the arm is supposed to be as “lazy” as possible while moving (Cruse et al., 1990), i.e., the change of the joint angles of the arm should be as small as possible. This is a sensible, real-life requirement since it reduces both the wear and tear and the energy consumption. In robotics, the cost function often also serves to smooth the movements of the robot arm: for two adjacent target points an algorithm could select two completely different arm configurations when there are redundant degrees of freedom, and by employing a cost function one ensures that adjacent target points will be assigned similar arm configurations. The assignment of similar joint angles to adjacent target points is, in fact, one of the main features of the learning algorithm of the SOM and related networks. In contrast to other common methods, one does not have to
minimize an explicitly formulated cost function. By the construction of a topology-preserving map between the input signal space and the neural net it is ensured that adjacent target points always activate adjacent elements in the network. In addition, the learning step in (2) forces adjacent lattice nodes to adapt their outputs towards similar values. At the end of the learning phase the output values will vary smoothly from node to node. Both features bring about a continuous and smooth transformation from the input signal space of target points to the output space of joint angle sets. This transformation guarantees smooth and efficient movements of the arm. To confirm this property, Ritter et al. (1992) used the algorithm described in Section 5.2 to drive a redundant (5-DOF) robot along a trajectory of predefined target points. By exploiting the topology-preserving features of the SOM, the robot indeed performed a smooth motion while moving its end-effector along the diagonal of a cubic workspace. The learning algorithm used those arm configurations, out of the many possible ones, that lead to smooth changes in the joint angles as the arm moves. This was achieved without the explicit minimization of a cost function. It is noteworthy that the positioning error after 6000 learning steps reaches the same minimal value of 0.0004 length units as for the 3-DOF robot described in Section 5.2, in spite of the increased number of degrees of freedom. Walter and Ritter (1995) and Walter (1996) described a hierarchically structured approach for context-dependent learning which uses several PSOMs in a modular fashion. This method is motivated by a decomposition of the learning phase into two different stages: a longer initial learning phase “invests” effort into a gradual and domain-specific specialization of the system.
This investment learning does not yet produce the final solution, but instead pre-structures the system such that the subsequently final specialization to a particular solution within the chosen domain can be achieved very rapidly. The investment learning phase is realized by a set of PSOMs (called T-PSOMs) which learn a set of prototypical basis mappings attempting to cover the range of tasks in the given domain. The capability for subsequent rapid specialization within the domain is then provided by an additional mapping that takes a set of observations that is sufficient to characterize a task instance into a suitable combination of the previously learned prototypical basis mappings. The construction of this additional mapping again is solved with a PSOM (called Meta-PSOM) that interpolates in the space of prototypical basis mappings that were constructed during the “investment phase”. The approach was tested in learning 3D visuomotor maps for a redundant (6-DOF) PUMA 560 robot manipulator seen by a pair of movable cameras. The task of the two MetaPSOMs (one for each camera) was to rapidly re-learn the required visuomotor transformations needed for end-effector positioning whenever the cameras had been repositioned. Walter (1998) further investigated the control of a redundant manipulator using the PSOM. It is of particular interest to intentionally utilize the redundancy to solve auxiliary goals. Auxiliary goals can be formulated as additional cost function terms

and can be activated whenever desired, depending on the task at hand. The cost function terms can be freely constructed with various functional forms and are supplied during the learning phase of the PSOM. If one anticipates useful auxiliary functions, the input vector x(t) in (4) can be augmented in advance, enabling the construction of reconfigurable optimization modules. These can later be activated on demand (through the projection matrix P) and show the desired performance. For example, different optimization goals, such as finding the target joint configuration that results in the shortest path, that moves the joints softly, or that lies far from the joint range limits, can co-exist within the same PSOM. This procedure allows one to define priorities among goal functions, which are solved according to their rank. To demonstrate the impact of the auxiliary constraints, an augmented PSOM was engaged to re-arrange the robot arm into a more suitable configuration. The initial starting position was already a solution of the desired end-effector position. The PUMA manipulator turns from a joint configuration close to the range limits to a more comfortable configuration, while keeping the same tool center point.
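To make the idea of switchable auxiliary cost terms concrete, the following sketch selects among redundant joint configurations by activating different cost terms on demand. This is not the original PSOM implementation; all names, cost terms, and numerical values are illustrative assumptions.

```python
import numpy as np

def joint_limit_cost(q, q_min, q_max):
    """Penalize configurations close to the joint range limits."""
    mid = 0.5 * (q_min + q_max)
    span = q_max - q_min
    return np.sum(((q - mid) / span) ** 2)

def displacement_cost(q, q_current):
    """Penalize large joint motions away from the current configuration."""
    return np.sum((q - q_current) ** 2)

def select_configuration(candidates, q_current, q_min, q_max, weights):
    """Pick the candidate minimizing the weighted sum of active cost terms.

    `weights` plays the role of the activation mechanism: setting an
    entry to zero switches that auxiliary goal off."""
    w_limit, w_disp = weights
    costs = [w_limit * joint_limit_cost(q, q_min, q_max)
             + w_disp * displacement_cost(q, q_current)
             for q in candidates]
    return candidates[int(np.argmin(costs))]

# Three made-up solutions of the same inverse kinematics problem:
candidates = [np.array([0.1, 2.9, -0.2]),   # second joint close to its limit
              np.array([0.8, 1.5, 0.4]),    # "comfortable" mid-range pose
              np.array([1.2, 1.4, 0.9])]
q_current = np.array([0.9, 1.4, 0.5])
q_min, q_max = np.array([0.0, 0.0, -1.0]), np.array([1.5, 3.0, 1.0])

# Activate only the joint-limit-avoidance goal:
best = select_configuration(candidates, q_current, q_min, q_max, (1.0, 0.0))
# best is the mid-range pose [0.8, 1.5, 0.4]
```

Changing the weight vector at run time corresponds to activating a different optimization module, in the spirit of the reconfigurable goals described above.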

5.5. DEALING WITH ADAPTABILITY

Another desired property for autonomous robot learning is the so-called adaptability, i.e., the ability of the robot to adapt to new, unexpected changes of its environment or geometry. Many of the neural network models that have been utilized in robotics have a limited adaptability because learning usually takes place during a training phase of limited duration, after which the neural network is no longer allowed to learn. This constraint is necessary to ensure that stable learning can take place. However, this standard procedure of "freezing" the network weights after learning poses serious problems for robotics applications in an unstructured or nonstationary environment. If the environment data presented to the network during learning are not representative of the environment encountered during operation, the network performance is likely to be poor. One alternative is to maintain the learning rate at a very low constant value, as done by Ritter et al. (1992) with the 5-DOF simulated robot described in Section 5.4. After 7000 trial movements, they extended the last arm segment by 10% of the length of the robot arm to test the adaptational capability of the system. The positioning error initially increases but then decays again with additional adaptation steps until the previous value of 0.0004 length units is regained. Walter and Schulten (1993) performed this test on a real PUMA robot, allowing the network to adapt itself to a sudden elongation of 100 mm (10% of the linear dimensions of the workspace) of the last arm segment. If the final learning rate is too low, the system takes a long time to adapt itself and learns the new mapping poorly. On the other hand, if the final learning rate is too large, the system is able to adapt itself quickly, but never stabilizes enough to perform precise positioning. To allow the network to react autonomously to different failure events, Arras et al. (1992) proposed an automatic annealing procedure

that couples the learning rate to the average positioning error ε, obtained from the available camera feedback, as follows:

ε(t + 1) = ε(t) + γ (‖V − Vc‖ − ε(t)),   (14)

where V is the image coordinate of the target position and Vc is the image coordinate of the end-effector after the execution of a coarse movement. Initially, ε(0) = 1. The learning rate η and the neighborhood size σ are adjusted based on this error:

η(t + 1) = η(t) + β((ηh − ηl) tanh(λε) + ηl − η(t)),   (15)
σ(t + 1) = σ(t) + β((σh − σl) tanh(λε) + σl − σ(t)),   (16)

where λ > 1 is the gain of the function relating the positioning error ε to η and σ. The constants ηh and σh represent the upper bounds of η and σ, while ηl and σl correspond to their lower bounds. The constants 0 < γ, β ≪ 1 control the velocity of the adjustment of η(t) and σ(t) as a function of ε(t). This approach results in a "closed-loop" control with no distinction between a learning phase and a testing phase. It enables the network to adapt itself to changing conditions quickly by raising the learning rate when learning needs to be done (high ε), and by practically shutting off learning when the robot system works suitably (low ε).

6. Path Planning and Trajectory Formation

The SOFMs described so far have learned some type of coordinate transformation, such as forward and inverse kinematics: given a point in one specific space, find the corresponding point(s) in the other space of interest. These networks can now be used for path-planning or trajectory formation tasks. After the mapping has been established, we want to generate a path from any initial position to a given target, e.g., to guide the end-effector of a robot manipulator in the presence of obstacles within the workspace. In the following we describe three ways of performing path-planning with SOFMs: (1) specifying optimization criteria, (2) using diffusion techniques, and (3) using potential-field methods.

Optimal trajectories. Using the TRN model, Zeller et al. (1996, 1997) showed that a locally optimized path can be determined by minimizing the Euclidean distance dE from the current position, x^in_cur, to a given target position, x^in_target. The motion plan can be generated as follows:

1. Read the current position x^in_cur and the target position x^in_target.
2. Find the corresponding nearest weight vectors w^in_cur and w^in_target.
3. Plan to move from the current unit w^in_cur to a neighboring unit i with c_cur,i > 0 that satisfies

dE(w^in_i, w^in_target) = min_{j ∈ F_i} dE(w^in_j, w^in_target).   (17)

4. If w^in_cur = w^in_target then stop; otherwise continue with step 1.
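The greedy scheme of steps 1-4 can be sketched as follows. The weight array `w_in` and the neighbor dictionary (standing for the sets F_i of units with c_cur,i > 0) are illustrative assumptions for a trained map; the collision check of the obstacle case is omitted.

```python
import numpy as np

def nearest_unit(x, w_in):
    """Best-matching unit for a position x in input space."""
    return int(np.argmin(np.linalg.norm(w_in - x, axis=1)))

def greedy_path(x_cur, x_target, w_in, neighbors):
    """Steps 1-4: repeatedly step to the neighbor nearest the target."""
    i_cur = nearest_unit(x_cur, w_in)
    i_target = nearest_unit(x_target, w_in)
    path = [i_cur]
    while i_cur != i_target:
        # Step 3: among the neighbors F_i of the current unit, pick the
        # one whose weight vector is closest to the target's weight vector.
        i_cur = min(neighbors[i_cur],
                    key=lambda j: np.linalg.norm(w_in[j] - w_in[i_target]))
        path.append(i_cur)
    return path

# Toy 1-D chain of five units with chain-graph neighborhood:
w_in = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
path = greedy_path(np.array([0.1]), np.array([3.9]), w_in, neighbors)
# path visits units 0 -> 1 -> 2 -> 3 -> 4
```

In the toy chain the greedy rule trivially reaches the target; on general graphs with obstacles it is only locally optimal, which motivates the diffusion-based alternative described next in the text.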

Here F_i denotes the set of all nodes which are neighbors of neuron i as defined by the network topology. In the presence of obstacles within the workspace, step 3 has to check whether a move would result in a collision and avoid it. Finally, if the motion plan meets a given goal, the movement can be initiated using the corresponding network output values, x̂^out = w^out_i, i = i_cur, i_1, ..., i_target, to generate the sequence of motor commands that drive the robot from the starting to the target position. For other optimization criteria which can equally be used in this scheme see Flash and Hogan (1995).

Diffusion-based trajectory formation. Zeller et al. (1997) used the diffusion-based path finding algorithm suggested by Ritter and Schulten (1987) on the TRN model. According to this algorithm the target neuron i_target is the source of a diffusing substance. The goal is to find a sequence of neurons i_0, i_1, ..., i_n on the network leading from the current position x^in_cur to the target position x^in_target. A function f_i(t), initially set to f_i(0) = 0 for i ≠ i_target, represents the concentration at each neuron of the network; it is held constant at the target neuron while diffusing through the links of the network. The motion plan can then be generated as follows:

1. Read the current position x^in_cur and the target position x^in_target.
2. Find the corresponding nearest weight vectors w^in_cur and w^in_target.
3. Detect obstacles in the workspace and eliminate those neurons i which are covered by an obstacle.
4. Define diffusion on the network lattice and iterate the following equation until the variations of f_i are small enough:

f_i(t + 1) = 1,   if i = i_target,
f_i(t + 1) = (1 / (N_i + 1)) Σ_{j ∈ F_i} f_j(t),   otherwise.   (18)

5. Follow the steepest gradient of f_i(t) from the current unit i_cur to the target unit i_target.

Here N_i is the number of neurons in F_i. Since the network graph is finite, the algorithm is guaranteed to terminate, yielding a proper path {i_cur, i_1, ..., i_target}. The path is the shortest in the sense that it takes a route that maximizes the increase of f_i(t) at each step. Following the steepest gradient of f_i(t) from i_cur to i_target eliminates the problem of local minima associated with many potential-field path planning methods. While other relaxation procedures can be used as well, this simple relaxation scheme serves the path-planning purpose and is computationally inexpensive (complexity O(N²), with N being the number of nodes in the network), an important step towards real-time control, e.g., in the presence of moving obstacles. A similar diffusion-based path planning scheme has been proposed by Szepesvári and Lörincz (1998).
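A minimal sketch of this diffusion scheme follows; the toy graph, the fixed iteration count, and all names are illustrative assumptions rather than the original implementation.

```python
import numpy as np

def diffuse(neighbors, i_target, blocked=(), n_iter=200):
    """Relax the concentration f on the network graph as in Eq. (18)."""
    n = len(neighbors)
    f = np.zeros(n)
    f[i_target] = 1.0
    for _ in range(n_iter):
        new_f = np.zeros(n)
        for i in range(n):
            if i == i_target:
                new_f[i] = 1.0          # source held constant at the target
            elif i in blocked:
                new_f[i] = 0.0          # neurons covered by an obstacle
            else:
                F_i = neighbors[i]
                new_f[i] = sum(f[j] for j in F_i) / (len(F_i) + 1)
        f = new_f
    return f

def ascend(f, neighbors, i_cur, i_target):
    """Step 5: follow the steepest gradient of f up to the target unit."""
    path = [i_cur]
    while i_cur != i_target:
        i_cur = max(neighbors[i_cur], key=lambda j: f[j])
        path.append(i_cur)
    return path

# 2x3 grid of units, target at unit 5, unit 4 blocked by an obstacle:
#   0 - 1 - 2
#   |   |   |
#   3 - 4 - 5
neighbors = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
             3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}
f = diffuse(neighbors, i_target=5, blocked=(4,))
path = ascend(f, neighbors, i_cur=3, i_target=5)
# path routes around the obstacle: 3 -> 0 -> 1 -> 2 -> 5
```

Because the blocked unit keeps zero concentration, the gradient ascent detours around it, illustrating why this scheme avoids the local-minima problem of naive potential fields.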

Potential field methods. A model for motion planning, called Self-Organizing Body-Schema (SO-BoS), was proposed by Morasso and Sanguineti (1995). SO-BoS combines two components: (i) a forward kinematic model, represented as a SOFM, which can estimate the sensory consequences of a given motor pattern, and (ii) a real-time model inversion method, based on a gradient-descent mechanism in a potential field, which drives the output motor pattern toward a configuration of minimum cost. Unlike previously described attempts to use unsupervised techniques for learning sensorimotor mappings, SO-BoS overcomes the redundancy problem by using a forward modeling instead of a direct-inverse modeling approach, i.e., it learns, during exploratory movements, the motor-sensory transformations (which are always well-defined) instead of the sensorimotor transformations (which are usually ill-defined for redundant systems). Moreover, unlike formulations of the forward modeling approach that use multilayer networks, SO-BoS can perform task optimization in an on-line way, exploiting the local computation of gradient fields in a SOFM. The gradient-descent mechanism, operating on the SO-BoS map, is driven by a potential function Φ_target = Φ_target(μ), defined by

Φ_target(μ) = (1/2) ‖x_c − x_ef(μ)‖²,   (19)

which measures the distance between the current position x_ef of the end-effector and the current position of a time-varying virtual target x_c. This virtual target is needed to specify timing aspects of the trajectory formation. As regards the evolution of the target x_c = x_c(t) in time, the Vector Integration to Endpoint (VITE) scheme by Bullock and Grossberg (1988) can be used:

ẋ_c = Go(t)(x_T − x_c),   (20)

where x_T is the final position of the target. By controlling the gating action of the Go(·) signal, it is possible to obtain trajectories with different speed profiles. The gradient-descent strategy of SO-BoS is, in fact, a mechanism that performs kinematic inversion. It operates equally well with redundant and non-redundant systems, implicitly computing an inverse matrix: the inverse Jacobian of the direct kinematic function, for non-redundant systems, or the Moore–Penrose pseudo-inverse matrix, for redundant systems. In contrast, the models reported in Martinetz et al. (1990), Ritter et al. (1992), Walter and Schulten (1993) and Hesselroth et al. (1994) explicitly calculate the inverse Jacobian matrix. One of the nice features of performing the inversion via gradient descent in a SOFM is its computational robustness, even in the vicinity of kinematic singularities, such as the boundary of the workspace, which tend to make conventional inversion methods unstable. The SO-BoS framework is further characterized computationally in (Morasso et al., 1997).

Glasius et al. (1996) presented a two-layered neural network for trajectory formation and obstacle avoidance. Each layer consists of a modified SOM whose neurons have continuous dynamics. Stability of this approach is guaranteed through the

existence of a global Lyapunov function (Grossberg, 1988). When the first layer, the sensory map, receives sensory information about the robot environment, the activity pattern in this map starts to evolve towards a landscape with hills and valleys. The locations of the tops of the hills in the map topographically coincide with target locations in the robot workspace, and the bottoms of the valleys coincide with obstacle positions. Each path is determined by the neural activity gradient on the activity landscape in the sensory map. Therefore, the model is able to find the shortest possible path. The second layer, the motor map, receives inputs through feedforward connections from the sensory map. Additional inputs provide information about the current state of the system; this extra input could originate from sensors in the actuators of the limb or manipulator. Over time, a cluster of activity shifts over the motor map, directed by the input from the sensory map, until it reaches the target location. The movement of the cluster is a smooth path corresponding to a smooth motion in workspace. No supervisor (teacher) is required. The output of the motor map was used to simulate the control of the actuators of a multi-joint manipulator. The system is able to reach a target even in the presence of an external perturbation. Yang and Meng (2001) proposed an unsupervised path-planning algorithm very similar to, yet more general and more computationally efficient than, that of Glasius et al. (1996). Compared with the classical potential field approaches described in (Khatib, 1986), the models by Glasius et al. (1996) and Yang and Meng (2001) do not suffer from undesired local minima (e.g., due to concave-shaped obstacles) and are able to react to unforeseen external forces. These models are also capable of choosing one out of multiple targets depending on the specification of initial conditions.
For redundant manipulators, they will first try to make a smooth path from the present configuration to the closest target configuration. If an obstacle obstructs the path to the closest target in configuration space, the basin of attraction belonging to the closest peak is reduced in size while those belonging to the other peaks are enlarged. The model will then choose the next nearest target configuration which is not obstructed.
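A toy sketch combining the VITE target dynamics of Eq. (20) with gradient descent on the potential of Eq. (19) is given below. The Go(t) profile, the gains, and the trivial effector model x_ef(μ) = μ are assumptions made purely for illustration.

```python
import numpy as np

def go_signal(t, g0=2.0):
    """A simple monotonically increasing gating signal (assumed shape)."""
    return g0 * t / (1.0 + t)

def vite_reach(x_start, x_T, dt=0.01, steps=1000, k=5.0):
    x_c = x_start.copy()    # time-varying virtual target
    mu = x_start.copy()     # motor coordinate; here x_ef(mu) = mu
    for n in range(steps):
        t = n * dt
        # Eq. (20): the virtual target integrates toward the final target.
        x_c = x_c + dt * go_signal(t) * (x_T - x_c)
        # Gradient of Eq. (19) w.r.t. mu is -(x_c - x_ef(mu)); descend it.
        mu = mu + dt * k * (x_c - mu)
    return mu

x_start = np.array([0.0, 0.0])
x_T = np.array([1.0, 0.5])
x_final = vite_reach(x_start, x_T)
# x_final approaches the final target x_T
```

Shaping go_signal (e.g., making it bell-shaped instead of monotone) changes the speed profile of the trajectory, which is the role of the gating action described above; for a real arm, mu would be the map coordinate and x_ef(mu) the learned forward kinematics.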

7. SOFMs for Dynamic-Based Robot Control

The SOFMs described in Section 5 explore the kinematic aspects of a robot arm. In this section, we describe SOFMs that learn to control a robot arm or gripper based on force/torque data. The few models to be described were categorized into two classes according to the nature of the relation between the robot and its environment. The first one concerns non-contact (unconstrained) motion in a free workspace, without any relevant environmental (external) influence exerted on the robot. In practice, a limited number of simple robotic tasks, such as pick-and-place, spray painting, gluing or welding, belong to this group. In contrast, many advanced robotic applications, such as cutting, drilling, insertion, fastening, contour following, drawing, and assembly, require the manipulator to be mechanically coupled

to other objects. These tasks include phases where the robot end-effector must come into contact with objects in its environment, produce certain forces upon them, and move along their surfaces. The terms constrained motion or compliant motion are commonly used to refer to this type of task.

7.1. LEARNING THE INVERSE DYNAMICS IN NON-CONTACT TASKS

The SOFMs discussed so far provided the required joint angles for any desired Cartesian position of the end-effector. Setting the joints to these angles is left to the joint servo motors of the arm. Usually, the servo motors are part of a simple PID control loop with fixed gain parameters, and try to achieve the target position by applying the motor torque in the opposite direction to any angular deviation from the target joint settings. For slow movements, this is an appropriate strategy because the individual joint movements can then be regarded, to a good approximation, as independent of one another. However, for rapid movements, the inertia of the arm segments leads to a coupling between the movements of different joints. In this situation, a single motor can no longer determine its torque from the present and given joint position alone; rather, its torque must also depend on the movements of all joints. Thus, to achieve the desired movement, the Newtonian equations of motion of the arm, i.e., its dynamics, must be included as well. Fortunately, a real-time computation of the arm torques is possible by means of recent algorithms whose computational effort grows only linearly with the number of arm joints. However, the inertia tensors required for such a computation are often known only imprecisely. This is due to the fact that the spatial mass distribution of the arm, which is described by these tensors, is in general very complicated for real systems. Hence adaptive algorithms with the ability to learn these properties of the arm are highly desirable. Ritter et al. (1992) showed how a SOM-based algorithm can also be applied to the problem of moving a robot arm while accounting for the arm dynamics. They control a 3-DOF robot arm by means of briefly applied torque pulses at its joints in such a way as to accelerate its end-effector to a prescribed velocity. Motions generated this way are termed ballistic movements.
During the remaining time, the arm moves freely. The relationship between arm configuration, desired velocity, and required torque pulse is to be learned by a network, again through trial movements. If the end-effector is initially at rest, a briefly applied torque pulse, d(t) = τ · δ(t), imparts to it the acceleration a = ẍ. Here τ = (τ1, τ2, τ3) denotes the three torques acting on the three joints of the arm. The learning algorithm developed by Martinetz et al. (1990) to learn the inverse kinematics (see Section 5.2) was adapted to learn the inverse dynamics. As before, they defined a vector w_i and a matrix A_i for each lattice neuron i. This time, however, each w_i specified joint angles instead of camera positions of the target. The matrix A_i, in the course of learning, converged to the transformation matrix connecting the desired end-effector acceleration a with the required torque amplitudes τ.
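A toy sketch of this kind of dynamics learning is given below. The linear plant, the names, and the winner-only update are illustrative assumptions; the error-driven correction mirrors the improved-estimate step of Eq. (21) described next, and a full SOM would also update the neighborhood of the winner.

```python
import numpy as np

def plant_B(theta):
    """Hypothetical configuration-dependent inertia-like matrix:
    the 'true' map from acceleration to torque, tau = B(theta) a."""
    return np.diag(1.0 + 0.5 * np.cos(theta))

def improve(A_i, tau, a):
    """Rank-one correction of a torque map from the observed pair
    (applied torque tau, realized acceleration a)."""
    return A_i + np.outer(tau - A_i @ a, a) / (a @ a)

rng = np.random.default_rng(0)
n_units, dof = 10, 3
w = rng.uniform(0.0, np.pi, size=(n_units, dof))     # joint-angle prototypes
A = np.stack([np.eye(dof) for _ in range(n_units)])  # per-unit torque maps

for _ in range(3000):
    theta = rng.uniform(0.0, np.pi, size=dof)        # random trial configuration
    z = rng.normal(size=dof)                         # desired acceleration
    i_star = int(np.argmin(np.linalg.norm(w - theta, axis=1)))  # winner
    tau = A[i_star] @ z                              # attempted torque pulse
    a = np.linalg.solve(plant_B(theta), tau)         # acceleration obtained
    A[i_star] = improve(A[i_star], tau, a)           # correct the torque map
    w[i_star] += 0.05 * (theta - w[i_star])          # prototype update
```

A useful property of the correction is that the updated map is exactly consistent with the most recent observation (A_new a = τ), so repeated trials with varying accelerations gradually align each A_i with the local dynamics.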

The training phase of the robot consisted of a sequence of random movements. For each trial, a random target configuration, θ, was obtained by requiring the end-effector to be at some randomly chosen position within the working area. From there, the end-effector is to be moved with a prescribed acceleration z (also chosen at random during the learning phase). On the basis of the initial configuration θ, the system selects the transformation matrix A_i*, for which ‖w_i* − θ‖ = min_i ‖w_i − θ‖. The robot then performs the movement resulting from the torque amplitude given by τ = A_i* z. From the end-effector acceleration a actually obtained, an improved estimate A* for A_i* is derived as follows:

A* = A_i* + (τ − A_i* a) aᵀ / ‖a‖²,   (21)

and, taking into account the input quantity θ, the learning steps

Δw_i(t) = h(i, i*; t)[θ − w_i(t)]  and  ΔA_i(t) = h(i, i*; t)[A* − A_i(t)]   (22)

are carried out for the variables w_i and A_i, respectively. Subsequently, the next trial movement is executed. Extensive simulations have demonstrated the efficiency of the method and the accuracy of the learned inverse dynamics. Behera et al. (1995) also proposed unsupervised learning schemes to learn the inverse dynamics of a single-link robot manipulator using the SOM and NGA networks.

7.2. LEARNING THROUGH COMPLIANT MOTION

Robot tasks involving contact between parts and, particularly, insertion and extraction, are very common in assembly and manufacturing in general. A robot arm must grasp a part, carry it to its destination, and insert it adequately. This is a very error-sensitive task. Due to uncertainty, the part may be badly inserted or extracted, or even damaged. Errors are typically caused by a deficient geometric model of the environment and uncertainty in the initial pose (sensing) and control. In order to cope with these kinds of errors, the development of sensors and their integration into robots is of fundamental importance in robotic assembly. Force and torque sensors must be used to monitor the correctness of the contact state between the arm and its environment. There are complex relations between the force and torque magnitudes that correspond to the different contact states. A formal analysis of these relations using other techniques will not generally be possible in real-world manufacturing situations. Cervera et al. (1996) used a two-dimensional SOM to learn different tasks involving motion in contact, particularly the peg-in-hole insertion task, and complex insertion or extraction operations in a flexible manufacturing system. In the general case, there are six input signals: the three force (Fx, Fy, Fz) and three torque (Tx, Ty, Tz) components acting on the robot wrist. After training, the SOM will be organized into well-defined regions of cells (clusters), which are related to the contact

states of the system. If one identifies the state associated with each cluster, one will be able to detect errors in an assembly insertion/extraction task by observing the network response to the on-line measured force signals, i.e., which of the map clusters is more active, and thus which type of contact state is occurring. In the two-dimensional peg-in-hole insertion task, a rectangular peg has to be inserted into a vertical rectangular hole. A force sensor was simulated and assumed to be attached to the upper face of the peg. When there is contact between the peg and the surface, it is possible to compute the reaction forces measured by this sensor. In the simulations, relative contact positions between the peg and the hole were randomly chosen, and the appropriate reaction forces and torques were calculated. These were the inputs to the neural network. Training usually took from ten to twenty thousand learning steps. For the complete insertion operation experiment, a real robot was used. The uncertainty was simulated by introducing small position errors in the arm. The network was trained with samples of 100 sensor signals for each insertion operation. After training and labelling the clusters in the map, the network was successful in detecting errors for both applications. If an unknown state is detected, the training process might be re-run to learn this new state on-line. Thus, new error states are dealt with as soon as they appear. Minimal feedback from the operator is required in order to provide a label for the new state. The low computational complexity of this approach makes it feasible for use in a real-time environment. Jockusch (2000) proposed a layered control architecture based on the ITM network. It was designed to allow versatile control of a PUMA robot and the TUM hydraulic hand (Menzel et al., 1993).
The author designed a dynamic (force-based) approach for controlling the arm-hand system instead of exploring conventional kinematic (position- or velocity-based) approaches. One reason for this is the absence of reliable positional sensors. The main reason, however, is that the author aims at performing intelligent compliant grasping of objects. Positional control can still be programmed as a special case. The force-controlled system is equipped with a state-machine layer which reacts to sensor patterns and external commands. Its task is to manage controller settings (states) and to change them every time a set of transition criteria is met. The states are labeled to provide an abstract interface in which client applications or users trigger an action by naming the appropriate state. The set of states implements reflex-like behavior patterns: sensory input of a specific pattern triggers a certain behavior within a short reaction time. This reflex-like behavior, which is missing in standard systems, makes the system responsive and produces the desired behavior in most situations. The state machine allows high-level layers (e.g., a path-planning layer) to integrate their algorithms into the state machine by contributing new states and state transitions. The path-planning layer drives the state machine when triggered to find a path to a given target. Since the data from the sensors have a trajectory-like (sequential) structure, conventional unsupervised learning algorithms, which are usually based on the assumption of statistical independence of their training patterns, cannot be

applied. In that case, a path-finding algorithm based on the ITM network is used, feeding the state machine with successive intermediate states to reach the given target.

8. Discussion

SOFMs as look-up tables. The unsupervised neural networks presented in this paper for sensorimotor control of arm movements can learn arbitrary, continuous, nonlinear input-output relations. In such networks, each node location is responsible for a small subregion of the input signal space. In the inverse kinematics problem, for example, after presentation of an input vector (target Cartesian position), an output vector is chosen from the neurons in the map defining the joint angles or torques that are needed to reach the target. Other control signals can equally be stored at that node location. In this sense, one can say that SOFMs work as look-up tables (Kohonen, 1997). The look-up table approach in robotics is based on addressing a position in a table that contains the value of the function of interest (Raibert and Horn, 1978). To address this position, an algorithm that detects features in the input signals can be used. In the case of SOFMs, this algorithm is provided by the best-matching unit search. However, it has to be mentioned that these mappings do not merely provide us with a massive look-up table that links motor and spatial positions. The major difference from a conventional look-up table is that the one based on SOFMs is self-programming, i.e., the table entries are found by unsupervised learning and should evolve into a good statistical representation of the input space. The use of SOFMs offers the advantage of a flexible (adaptive) coupling between input values and table entries (network neurons), such that entries which are neighbors in the table are assigned to input values which are neighbors in the input space, based on a given metric.

Are topological constraints really necessary for robot control?
It should be emphasized that, in order to learn input-output relations, no topological constraints on the weight vectors of the neurons are necessary in principle, and any classical vector quantization (VQ) algorithm could be used to describe the space, whereby the desired transformation could also be tabulated. There is not even a strict need to use VQ, since a regular three-dimensional lattice can define the codebook vectors over the space (Kohonen, 1997). It is worth remembering that, no matter what kind of VQ algorithm we use, keeping a "list" of vectors that span the input/state space we wish to map means that we have to accept a quantization error introduced by the spacing of those vectors. Geometrical descriptions of the robot workspace do not have this drawback, but there is no easy way of building such descriptions for arbitrary environments. The quantization error is therefore a small price to pay for the simple and general formulation of topological maps produced by the networks described. The quantization error can be considerably reduced, if enough memory is

available, by increasing the number of neurons in the network, or by using first-order SOMs or the PSOM. Nonetheless, it seems profitable to first learn and use adaptive SOFMs, for the following reasons:

(i) Topological constraints, such as those imposed by the neighborhood function, significantly speed up learning (due to similar corrections being made at nearby nodes in parallel), and the mapping becomes smooth much faster than with traditional VQ methods. It even turns out that, without the transfer of learning from the winning node to neighboring entries, the map does not converge towards a stationary state (Ritter et al., 1989; Martinetz and Schulten, 1993).

(ii) It is possible to optimally distribute the weight vectors over the robot workspace, with a higher density of lattice points where the control must be more accurate.

(iii) Topology-preserving maps are very robust: a fault in a few neurons does not impair the correct functioning of the system, since neighboring neurons having similar weight vectors can assume the control task with minor errors (graceful degradation).

(iv) Neighboring neurons can be connected by edges which represent possible pathways from one neuron to the other. This viewpoint also allows us to benefit from graph theory, which provides efficient path-finding algorithms (Section 6).

SOFMs for optimal decision control. It is worth noting that SOFMs can simplify the use of neural networks in optimal decision control. In this application, the state space is partitioned into regions (feature space) corresponding to various control situations (pattern classes). The realization of the control surface is accomplished through a training procedure. Since the optimal surface is in general nonlinear, it is necessary to use an architecture capable of approximating a nonlinear surface. This "optimal" surface is not known a priori but is defined implicitly by a training set of points in state space whose optimal control action is known. Hunt et al.
(1992) suggested the approach in Figure 4(a). The state space is first quantized into elementary hypercubes in which the control action is assumed constant. This process is carried out with a learning vector quantization (LVQ) architecture. It is then necessary to have an MLP network, trained with the backpropagation algorithm, acting as a classifier. The LVQ and MLP architectures can, for example, be replaced by a single SOM, as shown in Figure 4(b), to design nonlinear controllers (Sbarbaro and Bassi, 1995; Delgado, 2000). The SOM inherently performs vector quantization of the state space, and the control actions can be learned as well. It is also possible to design SOM-based controllers by using the concepts in Section 4.1. The advantages of these approaches are clear: (i) training can be done with unlabelled data, i.e., no a priori knowledge of the control surface is required; (ii) less computational effort is needed, because SOFMs are substantially easier to train

Figure 4. Optimal decision control using (a) LVQ and MLP and (b) the SOM.

than supervised networks, such as the MLP and RBF networks; and (iii) several optimal control strategies can be defined simultaneously and changed on demand.

Stability of SOFM-based controllers. Stability is a crucial issue in designing robot controllers. In this sense, it is generally required to prove that the output of the controller is bounded and/or converges to a desired steady-state solution. The vast majority of applications of SOFMs to robot control have been mostly experimental. As pointed out by Prabhu and Garg (1996), these empirical demonstrations are valuable in that they allow one to deal with complex nonlinear systems which cannot be directly handled by conventional linear approaches. The natural evolution of this relatively young field of SOFM-based modeling and control requires attempts to address analytical issues, such as stability, controllability and observability of the system. The works by Glasius et al. (1996), Yang and Meng (2001) and Delgado (2000) can be considered important steps in that direction. Based on the Cohen-Grossberg theorem (Cohen and Grossberg, 1983; Grossberg, 1988), Glasius et al. (1996) and Yang and Meng (2001) proved the stability of their path-planning algorithms (see Section 6). This theorem describes a general principle for designing content-addressable memory (CAM) networks that admit a global Lyapunov function. Once a number of mild technical conditions hold (e.g., symmetry of the weight matrix), it is fairly simple to rigorously prove the stability of

440

G. DE A. BARRETO ET AL.

the neurocontroller. This approach is particularly important because it allows one to understand each robot path as being determined by the neural activity gradient on the activity landscape in the sensorimotor map found by SOFM algorithms. Delgado (2000) proposed two applications of the SOM algorithm in the context of nonlinear control, one in approximate feedback linearization and the other in optimal control. The author included a proof of the closed-loop stability of the controller, and the results were illustrated with simulations of a single-link manipulator.

Generalized versus specialized inverse modeling. Massone (1995) and Hunt et al. (1992) listed a series of drawbacks of the direct-inverse modeling paradigm. These limitations, however, apply in most cases to supervised approaches (e.g., those using MLP and RBF networks). In the following, we describe these limitations and show how they can be overcome to some extent in terms of SOFMs.

(i) In the case of multiple sensory channels, a number of different MLP networks would have to be trained independently; this would pose the additional problem of deciding how to integrate the different motor outputs.

Comment. We have shown throughout the paper how easily several sensory and motor signals, originated from the circular-reaction strategy, can be simultaneously integrated within a single SOFM and learned with no need of an external supervisor, achieving high-accuracy input–output mappings. Indeed, this is one of the main advantages of using this type of network for sensorimotor learning.

(ii) The direct-inverse modeling approach has clear difficulties if the motor-sensory transformation is many-to-one (redundant), as is usually the case in motor control. To solve redundant or ill-posed inverse problems, one must generally specify some additional constraint or optimization criterion that must be satisfied to find the solution.

Comment.
This is not necessarily true when we use SOFMs for the control of redundant manipulators. In Section 5.4, we have shown that the topology-preserving property of SOFMs can provide suitable constraints automatically and, therefore, implicitly allows the control of a redundant manipulator by minimizing the changes in the joint angles of the arm, which reduces both wear and tear and energy consumption. It is also possible to define optimization criteria explicitly, using for example the PSOM network or the SO-BoS model.

(iii) A possibility that could allow one to maintain a direct-inverse model for redundant robots would be to implicitly embed the optimization criterion in the training set. The network will learn to associate a unique motor pattern with any given stimulus, consistently with the distribution of training patterns, but this method does not allow any kind of task-dependent adaptation.

Comment. Most of the models described in this review resolve the redundancy at training time, getting rid of any extra degrees of freedom. This means that, during run-time, the manipulator will always adopt the same joint-angle configuration for a given target position. Biological systems, however, exploit the redundancy to adapt easily to a variety of tasks. One alternative in terms of unsupervised
networks is given by the PSOM through the use of a time-varying subspace selection matrix P(t), as described in Section 5.4. Another alternative is provided by DeMers and Kreutz-Delgado (1996), who proposed a method generalizing the approach by Ritter et al. (1992). By analyzing the topology of the joint space of a 3-DOF planar manipulator with rotational joints, they demonstrated that one can learn and parameterize the global topological structure of many-to-one maps and apply the direct-inverse method and SOFMs to the inverse kinematics problem of redundant manipulators without having to enforce a task-specific regularization at training time. The resulting solution can be flexibly chosen at run-time to meet any redundancy-resolving side task or criterion.

(iv) Using the direct-inverse model of a system as a controller (Figure 5(a)) relies heavily on the fidelity of the inverse model used as the controller. For general-purpose use, serious questions arise regarding the robustness of direct-inverse control. This lack of robustness can be attributed primarily to the absence of feedback.

Comment. This problem can be alleviated to some extent by using on-line learning strategies, so that the parameters of the inverse model can be adjusted on-line with error signals provided by video cameras. A realization of this adaptive closed-loop scheme in terms of SOFMs is possible, for example, through the use of the strategies in (14)–(16) (see Figure 5(b)).

Figure 5. Direct-inverse control with a SOFM (a) without feedback and (b) with on-line correction scheme.

Comparison with other unsupervised models. Few other unsupervised networks have been proposed to learn inverse mappings. Kuperstein and Rubistein (1989)
and Kuperstein (1991), extending the work of Grossberg and Kuperstein (1986) on unsupervised adaptive sensorimotor control, proposed the INFANT neural network model to achieve hand–eye coordination. The robot not only had camera-like eyes providing visual input, but could also adjust the orientation of its eyes through different tensions of the eye muscles. Motor signals were first autonomously generated through circular reaction to move the arm to certain locations in the robot workspace. The images captured by the two "eyes" were combined to produce visual maps, and the muscle activations of the two eyes were combined to produce gaze maps. The visual maps and gaze maps were then combined to compute the necessary motor signal. In the INFANT model, each topographic map was only one-dimensional and had a fixed topographic ordering, which was imposed initially. This means that the topographic maps did not develop in the course of learning, which required more prior knowledge about the structure of the input data the network had to process. Also, the use of fixed topographic maps did not provide an automatic increase in the resolution of the representation of those movements which had been trained more frequently. Unlike in the SOM, cooperation among neighboring neurons played no special role in the convergence of the algorithm. Furthermore, as the maps in the INFANT model were one-dimensional and their outputs for each actuator were summed linearly, they could approximate only a restricted class of control laws accurately.

9. Conclusion and Further Work

In this review, we reported several approaches for using SOFMs for adaptive modeling and control of robot manipulators (see the appendix for a summary table).
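As a concrete illustration of one scheme discussed above, direct-inverse control refined by on-line error correction as in Figure 5(b), consider the following minimal numerical sketch. The planar two-link arm, the numerically estimated Jacobian, and the gain eta are assumptions made here for illustration only; the reviewed SOFM models instead learn a local inverse Jacobian around each winner unit:

```python
import numpy as np

L1, L2 = 1.0, 1.0  # link lengths of a toy planar 2-link arm (assumed)

def fk(theta):
    """Forward kinematics: stands in for the camera observing the end effector."""
    return np.array([L1 * np.cos(theta[0]) + L2 * np.cos(theta[0] + theta[1]),
                     L1 * np.sin(theta[0]) + L2 * np.sin(theta[0] + theta[1])])

def jacobian(theta, eps=1e-5):
    """Numerical Jacobian of fk (SOFM schemes learn this locally instead)."""
    J = np.zeros((2, 2))
    for j in range(2):
        d = np.zeros(2)
        d[j] = eps
        J[:, j] = (fk(theta + d) - fk(theta - d)) / (2 * eps)
    return J

def corrected_inverse(theta0, x_target, eta=0.5, steps=30):
    """Refine a coarse inverse-model output with the observed position error."""
    theta = theta0.copy()
    for _ in range(steps):
        err = x_target - fk(theta)                       # visual error signal
        theta = theta + eta * np.linalg.pinv(jacobian(theta)) @ err
    return theta

# A coarse inverse model (e.g., the winner unit of a SOFM) proposes theta0;
# the feedback loop drives the residual positioning error toward zero.
x_target = np.array([1.2, 0.7])
theta0 = np.array([0.3, 1.0])          # imperfect stored joint vector
theta = corrected_inverse(theta0, x_target)
residual = np.linalg.norm(x_target - fk(theta))
```

Without the correction loop, the accuracy of such a controller is bounded by the fidelity of the stored inverse model; with it, coarse quantization errors can be compensated at run-time.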
From a roboticist's point of view, interpreting and using a topology-preserving map literally can provide new possibilities for adaptively tackling several important problems that so far remain without a completely satisfactory solution, such as learning of inverse mappings, path planning, obstacle avoidance, compliant motion, and fault detection. From the foregoing, one can easily write down a general, systematic procedure for the design of neurocontrollers based on such neural networks. This approach is as simple as, and much faster than, those based on supervised networks like the MLP and RBF. Furthermore, the inherent topology-preserving clustering property of SOMs produces well-defined, localized regions over the map that allow a clear interpretation of the learned mapping. Based on the models described in this review, one can list some common properties that make SOFMs an effective choice for controlling a robot manipulator:

1. Simple overall structure. SOFMs consist of a number of simple artificial neurons whose learning capabilities extract topological relationships from unlabelled data. These neurons compete globally for the right to respond to a given input sample vector, but cooperate locally to transfer knowledge through the neighborhood function. This competition–cooperation strategy gives rise to the emergence of a well-defined and robust network structure representing a given input–output mapping.

2. Fast learning. Weight updating in SOFMs is also very simple and, in general, the network produces a useful representation of the knowledge gathered from input signals within a few thousand iterations of the learning process.

3. Robustness. Kosko (1990) has proved that competitive learning models are structurally stable, i.e., their qualitative dynamic behaviors are insensitive to small perturbations (e.g., noise). Furthermore, due to the topology-preserving property of SOFMs, the resulting controller is very robust to faults in individual neurons, since the network can still control the manipulator: neighboring neurons take over the task of the collapsed ones. Small changes in the neuronal population cause small changes in the resulting control action. This property is also useful from the point of view of safety, since it can prevent the robot from executing sudden movements that could be harmful to humans.

4. Graph usefulness. Since the organized structure produced by the network after adaptation is an abstract representation of the robot workspace, it can be used directly for path finding. A path can easily be created from a starting point by using different criteria such as shortest path, obstacle avoidance, or minimum joint-angle changes.

5. Adaptability. The parameters of the arm can vary smoothly over time (e.g., aging) or suddenly (unexpected changes in payload). SOFMs are able to incorporate this new knowledge, for example, through adaptive learning rates, without major changes in their overall structure.
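Properties 1 and 2 above can be made concrete with a minimal sketch of sensorimotor learning by motor babbling: random joint vectors are issued, the sensed end-effector positions train the map, and each unit simultaneously stores the associated joint vector. The grid size, learning schedules, and toy two-link arm are assumptions for illustration; the sketch deliberately omits the local linear output maps used by the LLM-type models reviewed:

```python
import numpy as np

rng = np.random.default_rng(0)
L1 = L2 = 1.0  # toy planar 2-link arm (assumed)

def fk(theta):
    """Forward kinematics: the sensory consequence of a motor command."""
    return np.stack([L1 * np.cos(theta[..., 0]) + L2 * np.cos(theta[..., 0] + theta[..., 1]),
                     L1 * np.sin(theta[..., 0]) + L2 * np.sin(theta[..., 0] + theta[..., 1])],
                    axis=-1)

# Each unit of the N x N lattice holds a workspace position w (input weight)
# and a joint vector t (motor weight): the sensorimotor association.
N = 15
grid = np.stack(np.meshgrid(np.arange(N), np.arange(N), indexing='ij'),
                axis=-1).reshape(-1, 2).astype(float)
lo, hi = np.array([0.2, 0.2]), np.array([np.pi - 0.2, np.pi - 0.2])
t = rng.uniform(lo, hi, size=(N * N, 2))   # random initial motor map
w = fk(t)                                  # consistent initial sensory map

for step in range(6000):
    frac = step / 6000
    lr = 0.5 * (0.02 / 0.5) ** frac        # decaying learning rate
    sigma = 4.0 * (0.5 / 4.0) ** frac      # shrinking neighborhood radius
    theta = rng.uniform(lo, hi)            # motor babbling (circular reaction)
    x = fk(theta)                          # observed end-effector position
    win = np.argmin(np.sum((w - x) ** 2, axis=1))                            # competition
    h = np.exp(-np.sum((grid - grid[win]) ** 2, axis=1) / (2 * sigma ** 2))  # cooperation
    w += lr * h[:, None] * (x - w)         # drag sensory weights toward the sample
    t += lr * h[:, None] * (theta - t)     # ...and the motor weights with them

def inverse(x_target):
    """Zeroth-order inverse kinematics: joint vector of the winning unit."""
    return t[np.argmin(np.sum((w - x_target) ** 2, axis=1))]

x_t = fk(np.array([0.6, 1.2]))                 # a reachable target
err = np.linalg.norm(x_t - fk(inverse(x_t)))   # error limited by grid resolution
```

The same lattice also hints at property 4: since neighboring units code neighboring workspace positions, a path between two targets can be read off as a chain of units on the map.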

In addition, the following properties should be pursued in further work in this field, in order to give more flexibility to the learning algorithms of self-organizing maps engaged in control tasks:

1. Continuous learning. With the exception of the ITM model, all the other algorithms were designed to be used in two stages: training and execution. In a robot control scenario, it would be highly desirable not to implement a (re)training phase, which implies stopping the use of the map. Instead, the neural network should continue to adapt incrementally (on-line) using the steady stream of input stimuli. The use of an adaptive learning rate (Section 5.5) is an example of how a simple strategy can add some degree of autonomy to the robot learning task, avoiding the retraining of the neural network.

2. Flexible number of neurons. Finding the optimal number of nodes that produces a sufficiently accurate input–output mapping means having to run many tests. Since the number of neurons is strongly related to the speed of the algorithm, and having to run many tests is incompatible with the search for on-line learning algorithms, this is an important issue to be solved. These problems can be somewhat alleviated by networks like the LLM and the PSOM, which were originally motivated by the SOM with the aim of obtaining good map accuracy even with a small number of units. Another solution could use a network
able to grow or shrink as necessary, like the ITM (Jockusch and Ritter, 1999) or the Incremental LLM (Cimponeriu and Kihl, 1998).

3. Processing of temporal data. In many situations, the data originating from a manipulator are not uniformly distributed across the input space but, instead, have a trajectory-like form. For such sequential data, none of the network models introduced here performs well (with the exception of the ITM), since re-ordering of the training set is not allowed. A possible solution might be the inclusion of dynamics (time) into the neural network model and the corresponding change of the winner-selection criteria or of the adaptation algorithm, in order to cope with input data of a temporal nature. A review of temporal variants of the SOFMs described in this paper can be found in Barreto and Araújo (2001a). The ability of these networks to capture temporal information can be of great utility in several robotic tasks, such as path planning. Indeed, it has been our current focus of research to use such temporal SOFMs, together with the definition in (4), to model also input–output mappings with memory (Barreto and Araújo, 2001b, 2002). In these papers, we apply the SOM to time series prediction and to the modeling of a hydraulic manipulator. The accuracy of the SOM in approximating such nonlinear dynamic mappings is similar to that of the MLP network, and better than that of RBF networks. We also verified that the SOM is less sensitive to weight initialization than both supervised models. See also Martinetz and Schulten (1993) for numerical results comparing the NGA models with the MLP and RBF networks in time series modeling.

4. Performance comparison. We also suggest as further work systematic quantitative comparisons of the self-organizing algorithms presented in this review, both among themselves and with more traditional adaptive approaches from control theory (Vukobratovic and Tuneski, 1996).
This is needed in order to establish their usefulness as a solid paradigm for adaptive control of robotic manipulators. Also, performance comparisons involving some of the methods described here for visual servoing with other self-organizing architectures, such as the self-organizing invertible map (SOIM) (Srinivasa and Sharma, 1997), would be highly desirable. The SOIM uses the Fuzzy ART network (Carpenter et al., 1991), which is a vector quantization algorithm but does not have the property of topology preservation. This comparison could further elucidate whether neighborhood preservation provides any other advantages for the task in addition to those presented in Section 8.

5. Biological modeling. Most of the networks presented in this paper were used in engineering-oriented scenarios. We believe that SOFMs can be used for biological sensorimotor modeling as well. For example, in the case of an "anthropomorphic" arm, output values describing muscle contractions could be used instead of joint angles. This would allow the comparison with other unsupervised models, such as the VITE (Bullock and Grossberg, 1988), the VAM (Gaudiano and Grossberg, 1991) and the DIRECT (Bullock et al., 1993) models, on more
common ground. Such self-organizing sensorimotor models can be of great utility to research in robotics in general. As a matter of fact, Murray et al. (1994) pointed out that our impatience with the pace of robotics research, and our expectations of what robots can and cannot do, are in large part due to our lack of appreciation of the incredible power and subtlety of our own biological motor control systems.

Acknowledgements

The authors would like to thank FAPESP for its financial support under the grants #98/12699-7 and #00/12517-8.

Appendix

Table I. Summary table with the SOFMs described in this paper

Authors | SOFM | Task | Robot | Adaptive
Coiton et al. (1991) | SOM | Inverse kinematics | Non-redundant | No
Jones and Vernon (1994); Kihl et al. (1995); Buessler et al. (1999) | SOM | Visual servoing | Non-redundant | No
Zeller et al. (1996, 1997) | TRN | Visual servoing | Non-redundant | No
Walter (1996); Walter and Ritter (1995, 1996) | PSOM | Inverse kinematics | Non-redundant | No
Ritter et al. (1989); Martinetz et al. (1990) | LLM | Visual servoing | Non-redundant | Yes
Martinetz and Schulten (1993) | LLM/NGA | Visual servoing | Non-redundant | Yes
Walter and Schulten (1993); Ritter et al. (1992) | LLM | Visual servoing | Redundant | Yes
de Angulo and Torras (1996, 1997) | LLM | Robot calibration | Redundant | Yes
Hesselroth et al. (1994) | NGA/LLM | Inverse kinematics | Redundant | No
Walter (1998) | PSOM | Task-priority strategy | Redundant | No
Arras et al. (1992) | LLM | Visual servoing | Non-redundant | Yes
Morasso and Sanguineti (1995); Morasso et al. (1997) | SOM/TRN | Path planning | Redundant | No
Glasius et al. (1996); Yang and Meng (2001) | SOM | Path planning | Redundant | No
Behera et al. (1995) | SOM/NGA | Inverse dynamics | Non-redundant | No
Cervera et al. (1996) | SOM | Compliant motion | Non-redundant | No
Jockusch (2000) | ITM | Compliant motion | Redundant | Yes
Delgado (2000) | SOM | Optimal control | Non-redundant | No
Kuperstein and Rubistein (1989); Kuperstein (1991); Grossberg and Kuperstein (1986) | INFANT | Visual servoing | Redundant | No
Barreto and Araújo (2002) | SOM | Dynamic modeling | Non-redundant | No
References

Alhoniemi, E., Hollmén, J., Simula, O., and Vesanto, J.: 1999, Process monitoring and modeling using the self-organizing map, Integrated Comput. Aided Engrg. 6(1), 3–14. Araújo, A. F. R. and Barreto, G. A.: 2002, Context in temporal sequence processing: A self-organizing approach and its application to robotics, IEEE Trans. Neural Networks 13(1), 45–57. Arras, M. K., Protzel, P. W., and Palumbo, D. L.: 1992, Automatic learning rate adjustment for self-supervising autonomous robot control, in: K. Schuster (ed.), Applications of Neural Networks, VCH Verlag, Weinheim. Balakrishnan, S. N. and Weil, R. D.: 1996, Neurocontrol: A literature survey, Math. Comput. Modelling 23(1/2), 101–107. Barreto, G. A. and Araújo, A. F. R.: 2001a, Time in self-organizing maps: An overview of models, Internat. J. Comput. Res. 10(2), 139–179. Barreto, G. A. and Araújo, A. F. R.: 2001b, A self-organizing NARX network and its application to prediction of chaotic time series, in: Proc. of the IEEE-INNS Internat. Joint Conf. on Neural Networks (IJCNN'01), Vol. 3, Washington, DC, pp. 2144–2149. Barreto, G. A. and Araújo, A. F. R.: 2002, Temporal associative memory and function approximation with the self-organizing map, in: IEEE Workshop on Neural Networks for Signal Processing, IEEE Press, New York. Behera, L., Gopal, M., and Chaudhury, S.: 1995, Self-organizing neural networks for learning inverse dynamics of robot manipulator, in: Proc. of the IEEE/IAS Internat. Conf. on Industrial Automation and Control, pp. 457–460. Bekey, G. A.: 1992, Robotics and neural networks, in: B. Kosko (ed.), Neural Networks for Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, pp. 161–187. Bernstein, N. A.: 1967, The Coordination and Regulation of Movements, Pergamon Press, Oxford. Buessler, J.-L. and Urban, J.-P.: 1998, Visually guided movements: Learning with modular neural maps in robotics, Neural Networks 11(7/8), 1395–1415. Buessler, J. L., Kara, R., Wira, P., Kihl, H., and Urban, J.
P.: 1999, Multiple self-organizing maps to facilitate the learning of visuo-motor correlations, in: Proc. of the IEEE Internat. Conf. on Systems, Man, and Cybernetics, Vol. III, Tokyo, Japan, pp. 470–475. Bullock, D. and Grossberg, S.: 1988, Neural dynamics of planned arm movements: Emergent invariants and speed-accuracy properties during trajectory formation, Psycholog. Review 95, 49–90. Bullock, D., Grossberg, S., and Guenther, F.: 1993, A self-organizing neural model of motor equivalent reaching and tool use by a multijoint arm, J. Cognitive Neurosci. 5(4), 408–435. Carpenter, G. A., Grossberg, S., and Rosen, D. B.: 1991, Fuzzy ART: Fast stable learning and categorization of analog patterns by an adaptive resonance system, Neural Networks 4, 759–771.

Cervera, E., Pobil, A. P., Marta, E., and Serna, M. A.: 1996, Perception-based learning for motion in contact in task planning, J. Intelligent Robotic Systems 17(3), 283–308. Cimponeriu, A. and Kihl, H.: 1998, Intelligent control with the growing competitive linear local mapping neural network for robotic hand–eye coordination, in: Proc. of the 2nd Internat. Conf. on Knowledge-based Intelligent Electronic Systems (KES’98), Vol. 3, Adelaide, Australia, pp. 46–52. Cohen, M. A. and Grossberg, S.: 1983, Absolute stability of global pattern formation and parallel memory storage by competitive neural networks, IEEE Trans. Systems Man Cybernet. 13, 815–826. Coiton, Y., Gilhodes, J. C., Velay, J. L., and Roll, J. P.: 1991, A neural network model for the intersensory coordination involved in goal-directed movements, Biological Cybernet. 66, 167–176. Cottrell, M.: 1998, Theoretical aspects of the SOM algorithm, Neurocomputing 21, 119–138. Craig, J. J.: 1989, Introduction to Robotics: Mechanics and Control, 2nd ed., Addison-Wesley, Reading, MA. Cruse, H., Wischmeyer, E., Brüwer, M., Brockfeld, P., and Dress, A.: 1990, On the cost functions for the control of the human arm movement, Biological Cybernet. 62, 519–528. de Angulo, V. R. and Torras, C.: 1996, Automatic recalibration of a space robot: An industrial prototype, in: Proc. of the Internat. Conf. on Artificial Neural Networks (ICANN’96), Bochum, Germany, pp. 635–640. de Angulo, V. R. and Torras, C.: 1997, Self-calibration of a space robot, IEEE Trans. Neural Networks 8(4), 951–963. Delgado, A.: 2000, Control of nonlinear systems using a self-organising neural network, Neural Comput. Appl. 9(2), 113–123. DeMers, D. and Kreutz-Delgado, K.: 1996, Canonical parameterization of excess motor degrees of freedom with self-organizing maps, IEEE Trans. Neural Networks 7(1), 43–55. Deo, A. S. and Walker, I. D.: 1995, Overview of damped least-squares methods for inverse kinematics of robot manipulators, J. 
Intelligent Robotic Systems 14(1), 43–68. Faldella, E., Fringuelli, B., Passeri, D., and Rosi, L.: 1997, A neural approach to robotic haptic recognition of 3-d objects based on a Kohonen self-organizing feature map, IEEE Trans. Industr. Electronics 44(2), 267–269. Flash, T. and Hogan, N.: 1995, Optimization principles in motor control, in: M. A. Arbib (ed.), The Handbook of Brain Theory and Neural Networks, MIT Press, Cambridge, MA, pp. 682–685. Gaudiano, P. and Grossberg, S.: 1991, Vector associative maps: Unsupervised real-time error-based learning and control of movement trajectories, Neural Networks 4, 147–183. Glasius, R., Komoda, A., and Gielen, S.: 1996, A biologically inspired neural net for trajectory formation and obstacle avoidance, Biological Cybernet. 84, 511–520. Golnazarian, W. and Hall, E. L.: 2000, Intelligent industrial robots, in: R. L. Shell and E. L. Hall (eds), The Handbook of Industrial Automation, Marcel Dekker, New York. Grossberg, S.: 1988, Nonlinear neural networks: Principles, mechanisms, and architectures, Neural Networks 1, 17–61. Grossberg, S. and Kuperstein, M.: 1986, Neural Dynamics of Adaptive Sensory-Motor Control, Elsevier, Amsterdam. Heikkonen, J. and Koikkalainen, P.: 1997, Self-organization and autonomous robots, in: O. Omidvar and P. van der Smagt (eds), Neural Systems for Robotics, Academic Press, New York, pp. 297–337. Hesselroth, T., Sarkar, K., van der Smagt, P., and Schulten, K.: 1994, Neural network control of a pneumatic robot arm, IEEE Trans. Systems Man Cybernet. 24(1), 28–38. Hunt, K. J., Sbarbaro, D., Zbikowski, R., and Gawthrop, P. J.: 1992, Neural networks for control systems – A survey, Automatica 28(6), 1083–1112. Hutchinson, S., Hager, G., and Corke, P.: 1996, A tutorial on visual servo control, IEEE Trans. Robotics Automat. 12(5), 651–670.

Jockusch, J.: 2000, Exploration based on neural networks with applications in manipulator control, Unpublished doctoral dissertation, Faculty of Technology, Neuroinformatics Group, University of Bielefeld, Bielefeld, Germany. Jockusch, J. and Ritter, H.: 1999, An instantaneous topological mapping model for correlated stimuli, in: Proc. of the Internat. Joint Conf. on Neural Networks (IJCNN'99), Washington, pp. 529–534. Jones, M. and Vernon, D.: 1994, Using neural networks to learn hand–eye co-ordination, Neural Computing Appl. 2, 2–12. Khatib, O.: 1986, Real-time obstacle avoidance for manipulators and mobile robots, Internat. J. Robotics Res. 5, 90–98. Kihl, H., Urban, J.-P., Gresser, J., and Hagmann, S.: 1995, Neural network based hand–eye positioning with a transputer-based system, in: Proc. of the Internat. Conf. on High Performance Computing and Networking, Milan, Italy, pp. 281–286. Kohonen, T.: 1990, The self-organizing map, Proc. IEEE 78, 1464–1480. Kohonen, T.: 1997, Self-Organizing Maps, 2nd extended ed., Springer, Berlin/Heidelberg. Kohonen, T., Oja, E., Simula, O., Visa, A., and Kangas, J.: 1996, Engineering applications of the self-organizing map, Proc. IEEE 84(10), 1358–1384. Kosko, B.: 1990, Unsupervised learning in noise, IEEE Trans. Neural Networks 1(1), 44–57. Kung, S.-Y. and Hwang, J.-N.: 1989, Neural network architectures for robotic applications, IEEE Trans. Robotics Automat. 5(5), 641–657. Kuperstein, M.: 1991, INFANT neural controller for adaptive sensory-motor coordination, Neural Networks 4, 131–145. Kuperstein, M. and Rubistein, J.: 1989, Implementation of an adaptive neural controller for sensory-motor coordination, IEEE Control Systems Mag. 9(3), 25–30. Martinetz, T. M. and Schulten, K. J.: 1991, A "neural-gas" network learns topologies, in: T. Kohonen, K. Makisara, O. Simula, and J. Kangas (eds), Artificial Neural Networks, North-Holland, Amsterdam, pp. 397–402. Martinetz, T.
and Schulten, K.: 1993, A neural network for robot control: Cooperation between neural units as a requirement for learning, Comput. Electrical Engrg. 19(4), 315–332. Martinetz, T. and Schulten, K.: 1994, Topology representing networks, Neural Networks 7(3), 507–522. Martinetz, T. M., Ritter, H. J., and Schulten, K. J.: 1990, Three-dimensional neural net for learning visuomotor coordination of a robot arm, IEEE Trans. Neural Networks 1(1), 131–136. Massone, L. L. E.: 1995, Sensorimotor learning, in: M. A. Arbib (ed.), The Handbook of Brain Theory and Neural Networks, MIT Press, Cambridge, MA, pp. 860–864. Medler, D. A.: 1998, A brief history of connectionism, Neural Computing Surveys 1, 61–101. Menzel, R., Woelfl, K., and Pfeiffer, F.: 1993, The development of a hydraulic hand, in: Proc. of the 2nd Conf. on Mechatronics and Robotics, Duisburg/Moers, Germany, pp. 225–238. Miller, W. T., Sutton, R. S., and Werbos, P. J.: 1990, Neural Networks for Control, MIT Press, Cambridge, MA. Morasso, P. and Sanguineti, V.: 1995, Self-organizing body schema for motor planning, J. Motor Behavior 27(1), 52–66. Morasso, P., Sanguineti, V., and Spada, G.: 1997, A computational theory of targeting movements based on force fields and topology representing networks, Neurocomputing 15, 411–434. Murray, R. M., Li, Z., and Sastry, S. S.: 1994, A Mathematical Introduction to Robotic Manipulation, CRC Press, Boca Raton, FL. Narendra, K. S. and Annaswamy, A. M.: 1989, Stable Adaptive Systems, Prentice-Hall, Englewood Cliffs, NJ. Nehmzow, U.: 1998, Self-supervised and supervised acquisition of smooth sensory-motor competences in mobile robots, in: H. Cruse, H. Ritter, and J. Dean (eds), Prerational Intelligence in Robotics, Kluwer Academic, Dordrecht.

Prabhu, S. M. and Garg, D. P.: 1996, Artificial neural network based robot control: An overview, J. Intelligent Robotic Systems 15, 333–365. Psaltis, D., Sideris, A., and Yamamura, A. A.: 1988, A multilayered neural network controller, IEEE Control Systems Mag. 8(2), 17–21. Raibert, M. H. and Horn, B. K. P.: 1978, Manipulator control using the configuration space method, Industr. Robot 5, 69–73. Ritter, H.: 1991, Learning with the self-organizing map, in: T. Kohonen, K. Makisara, O. Simula, and J. Kangas (eds), Artificial Neural Networks, Vol 1, North-Holland, Amsterdam, pp. 379–384. Ritter, H.: 1993, Parametrized self-organizing maps, in: Proc. of the Internat. Conf. on Artificial Neural Networks (ICANN’93), Springer, Berlin, pp. 568–575. Ritter, H. and Schulten, K.: 1987, Planning a dynamic trajectory via path finding in discretized phase space, in: Parallel Processing: Logic, Organization, and Technology, Vol. 253, Springer, Berlin, pp. 29–39. Ritter, H., Martinetz, T., and Schulten, K.: 1989, Topology conserving maps for learning visuomotor coordination, Neural Networks 2, 159–168. Ritter, H., Martinetz, T., and Schulten, K.: 1992, Neural Computation and Self-Organizing Maps: An Introduction, Addison-Wesley, Reading, MA. Roy, A.: 2000, Artificial neural networks: A science in trouble, SIGKDD Explorations 1(2), 33–38. Rylatt, M., Czarnecki, C., and Routen, T.: 1998, Connectionist learning in behaviour-based mobile robots: A survey, Artificial Intelligence Rev. 12, 445-468. Sbarbaro, D. and Bassi, D.: 1995, A nonlinear controller based on self-organizing maps, in: Proc. of the IEEE Internat. Conf. on Systems, Man, and Cybernetics, Vancouver, CA, pp. 1774–1777. Sciavicco, L. and Siciliano, B.: 2000, Modelling and Control of Robot Manipulators, 2nd edn, Springer, Berlin. Simula, O., Ahola, J., Alhoniemi, E., Himberg, J., and Vesanto, J.: 1999, Self-organizing map in analysis of large-scale industrial systems, in: E. Oja and S. 
Kaski (eds), Kohonen Maps, Elsevier, Amsterdam, pp. 375–387. Srinivasa, N. and Sharma, R.: 1997, SOIM: A self-organizing invertible map with applications in active vision, IEEE Trans. Neural Networks 8(3), 758–773. Szepesvári, C. and Lörincz, A.: 1998, An integrated architecture for motion-control and path-planning, J. Robotic Systems 15(1), 1–15. Torras, C.: 1995, Robot control, in: M. A. Arbib (ed.), The Handbook of Brain Theory and Neural Networks, MIT Press, Cambridge, MA, pp. 820–823. Tzafestas, S. G.: 1995, Neural networks in robot control, in: S. G. Tzafestas and H. B. Verbruggen (eds), Artificial Intelligence in Industrial Decision Making, Control and Automation, Kluwer, Dordrecht, pp. 327–387. Vesanto, J. and Alhoniemi, E.: 2000, Clustering of the self-organizing map, IEEE Trans. Neural Networks 11(3), 586–600. Vukobratovic, M.: 1997, How to control robots interacting with dynamic environment, J. Intelligent Robotic Systems 19(2), 119–152. Vukobratovic, M. and Tuneski, A.: 1996, Adaptive control of single rigid robotic manipulators interacting with dynamic environment – An overview, J. Intelligent Robotic Systems 17(1), 1–30. Walter, J.: 1996, Rapid Learning in Robotics, Cuvillier Verlag, Göttingen. Walter, J.: 1998, PSOM network: Learning with few examples, in: Proc. of the Internat. Conf. on Robotics and Automation (ICRA’98), Leuven, Belgium, pp. 2054–2059. Walter, J. and Ritter, H.: 1995, Investment learning with hierarchical PSOM, in: D. Touretzky, M. Mozer, and M. Hasselmo (eds), Advances in Neural Information Processing Systems 8 (NIPS’95), MIT Press, Bradford, pp. 570–576. Walter, J. and Ritter, H.: 1996, Rapid learning with parametrized self-organizing maps, Neurocomputing 12, 131–153.

Walter, J. A. and Schulten, K. J.: 1993, Implementation of self-organizing networks for visuo-motor control of an industrial robot, IEEE Trans. Neural Networks 4(1), 86–95. Wu, C. M., Jiang, B. C., and Wu, C. H.: 1993, Using neural networks for robot positioning control, Robotics Comput.-Integrated Manufacturing 10(3), 153–168. Yang, S. X. and Meng, M.: 2001, Neural network approaches to dynamic collision-free trajectory generation, IEEE Trans. Systems Man Cybernet. B 31(3), 302–318. Zeller, M., Wallace, K. R., and Schulten, K.: 1995, Biological visuo-motor control of a pneumatic robot arm, in: Dagli, Akay, Chen, Fernandez, and Gosh (eds), Intelligent Engineering Systems through Artificial Neural Networks, ASME Press, New York, pp. 645–650. Zeller, M., Sharma, R., and Schulten, K.: 1996, Topology representing network for sensor-based robot motion planning, in: Proc. of the World Congress on Neural Networks (WCNN’96), San Diego, CA, pp. 100–103. Zeller, M., Sharma, R., and Schulten, K.: 1997, Motion planning of a pneumatic robot using a neural network, IEEE Control Systems Mag. 17, 89–98. Zomaya, A. Y. and Nabhan, T. M.: 1994, Trends in neuroadaptive control for robot manipulators, in: R. Dorf and A. Kusiak (eds), Handbook of Design, Manufacturing and Automation, Wiley, New York, pp. 889–917.