1st Reading July 29, 2007 21:59 00117

International Journal of Neural Systems, Vol. 17, No. 4 (2007) 329–341 © World Scientific Publishing Company

TOWARDS REASONING AND COORDINATING ACTION IN THE MENTAL SPACE

VISHWANATHAN MOHAN
Doctoral School on Humanoid Technologies, Italian Institute of Technology, Via Morego 30, Genova, Italy
[email protected]

PIETRO MORASSO
Neurolab, DIST, University of Genova, Via Opera Pia 13, Genova 16145, Italy
[email protected]
www.biomedica.laboratorium.dist.unige.it/neurolab.html

Unlike a purely reactive system, where the motor output is exclusively controlled by the actual sensory input, a cognitive system must be capable of running mental processes that virtually simulate action sequences aimed at achieving a goal. The mental process either attempts to find a feasible course of action compatible with a number of constraints (internal, environmental, task-specific, etc.) or selects one from a repertoire of previously learned actions, according to the parameters of the task. If neither reasoning process succeeds, a typical backup strategy is to look for a tool that might allow the operator to match all the task constraints. This further necessitates the capability to alter one's own goal structures to generate sub-goals which must be successfully accomplished in order to achieve the primary goal. In this paper, we introduce a forward/inverse motor control architecture (FMC/IMC) that relaxes an internal model of the overall kinematic chain to a virtual force field applied to the end effector in the intended direction of movement. This is analogous to the mechanism of coordinating the motion of a wooden marionette by means of attached strings. The relaxation of the FMC/IMC pair provides a general solution for mentally simulating the action of reaching a target position, taking into consideration a range of geometric constraints (range of motion in the joint space, internal and external constraints in the workspace) as well as effort-related constraints (range of torque of the actuators, etc.).
If the forward simulation is successful, the movement is executed; otherwise the residual "error", or measure of inconsistency, is taken as a starting point for breaking the action plan into a sequence of sub-actions. This process is achieved using a recurrent neural network (RNN) which coordinates the overall reasoning process of framing and issuing goals to the forward/inverse models, searching for alternative tools in solution space, and forming sub-goals based on past context knowledge and present inputs. The RNN + FMC/IMC system is able to successfully reason about and coordinate a diverse range of reaching and grasping sequences with and without tools. Using a simple robotic platform (5 DOF Scorbot arm + stereo vision), we present results of reasoning and coordination of arm/tool movements (real and mentally simulated), specifically directed towards solving the classical two-stick paradigm from animal reasoning at a non-linguistic level.

Keywords: Forward/inverse models; passive motion paradigm; recurrent neural networks; mental simulation; animal reasoning; two-stick paradigm.


1. Introduction

Considerable progress in brain science, coupled with a rapid increase in robots' computing capabilities and in the quality of their mechanical components, has generated tremendous interest in the scientific community in the development of artifacts that evolve their cognition and motor control autonomously and develop new skills in structural coupling with their environments. One of the central aspects of higher-level cognitive behavior is the ability to build and manipulate mental representations of the body and environment in order to evaluate the consequences of one's actions in a world of distinct behavioral options.5,7 By means of these mental representations it is possible to decouple behavior from direct control by the environment and to react to features that are not present at the moment, i.e. to plan ahead. On the other hand, tasks and goals (stack two cylinders, bring the red ball here, etc.) are specified at a rather high, often symbolic level, and the motor system faces the daunting task of eventually working out the problem at a much more detailed level in order to specify the activations which lead to joint rotations, a movement trajectory in space, and interaction forces.10 At the same time, the solution must be compatible with a multitude of constraints: internal, external, task-specific, and their possible combinations. To further complicate things, there is no single solution to the problem of motor control: in fact, there may be countless ways of doing the same task. Hence arises the problem of tackling kinematic redundancies and, if possible, exploiting them effectively. If these considerations are needed for making a single movement, then the task of coordinating sequences of movements to achieve goal-directed behavior also necessitates the ability to decide what to do next based on past decisions.
Hence, before making a movement aimed at achieving a task, human beings either run a mental process that attempts to find a feasible course of action or select it from a repertoire of previously learned actions, according to the parameters of the task. If neither reasoning process succeeds, a typical backup strategy is to look for a tool that might allow the operator to match all the task constraints. A cognitive robot should support a similar reasoning system.4 A central element of the proposed architecture is a coupled pair of controllers: FMC (forward motor controller: it maps tentative trajectories in the joint space into

the corresponding trajectories of the end-effector variables in the workspace) and IMC (inverse motor controller: it maps desired trajectories of the end-effector into feasible trajectories in the joint space). If we consider the arm as the target system, the FMC predicts the next state (position and velocity) given an initial state and a motor command. The IMC does the inverse, i.e. it takes a goal end-state as input and produces a sequence of motor commands. That such forward models of the motor system occur in the brain has been demonstrated by numerous authors.11 Further work in the area has resulted in proposals of methods by which such forward models can be used in planning7 (where the actual motor action is inhibited during the running of the forward model) or in developing a model of the actions of another person.12 On the other hand, recent studies on animal reasoning are very relevant in guiding how we may develop primitive reasoning and imaginative abilities in artificial systems. These processes are increasingly well documented in chimpanzees, for instance in tool use and in imitation,14 and in apes in terms of the results of language-learning experiments. However, an even larger circle of animals appears to be involved in some form of cognitive behavior, such as: (a) the reasoning powers of crows, woodpeckers, etc., which can fashion tools appropriate for extracting food from receptacles, e.g. by bending pieces of wire into suitable hooked shapes to retrieve the food;13 (b) reasoning powers of a social nature in corvids, such as caching food and reburying it in a different site if they see that they have been observed by another of their species; (c) the use of appropriate stones as tools by Egyptian vultures to crack open ostrich eggs;15 (d) bait-fishing in herons. In this paper, we will consider in more detail a family of simple architectures employing internal models capable of exhibiting some form of reasoning power and goal-oriented behavior.
We begin with the description of an FMC/IMC pair architecture that provides a general solution for mentally simulating the action of reaching a target while being implicitly knowledgeable of geometric as well as effort-related constraints. The proposed model operates with any degree of redundancy and can deal concurrently with geometric constraints (range of motion in the joint space, internal and external constraints in the workspace) and effort-related constraints (range of torque of the actuators, etc.). It operates by alternating two


basic operations: (1) relaxation in the configuration space (for reaching a target pose); (2) relaxation in the null space of the kinematic transformation (for producing the required interaction force). If either relaxation settles into a state different from the goal state, the mismatch can trigger a further level of reasoning. This process involves integrating body/environment-state signals (reach error, gripper occupancy, availability of tools, goals, past actions), evaluating the options ahead, and making appropriate choices of the next action. We present a recurrent-neural-network-based architecture that encodes additional knowledge related to the causal nature of the task and operates on the forward/inverse models and other modules to coordinate the successful realization of goals (or to report that a solution does not exist). Using a simple robotic platform adopted in the EU-funded GNOSYS project (5 DOF Scorbot arm + stereo vision), we present results of reasoning and coordination of arm/tool movements (real and mentally simulated), specifically directed towards solving the classical two-stick paradigm from animal reasoning at a non-linguistic level.

2. The FMC/IMC Pair

2.1. General formulation

Let x be the vector that identifies the pose of the end-effector of a robot in the workspace and q the vector that identifies the configuration of the robot in the joint space: the kinematic transformation x = f(q) is characterized, at each time instant of the planning process, by the following equation

x˙ = J(q) · q˙   (1)

where J is the Jacobian matrix of the kinematic transformation. In general, we shall consider redundant robots, i.e. robots in which the dimensionality of q is (much) greater than the dimensionality of x. An inverse motor controller is one that, given a desired law of motion to a target in the workspace, computes a coordinated joint rotation pattern that implements it. We build the IMC by using the passive motion paradigm (PMP: Mussa-Ivaldi et al., 1988), which consists of the following steps:

(a) Define a virtual attractive force field toward a designated target xT:

F = Kx (xT − x)   (2)

(b) Map the force vector into an equivalent torque vector (principle of virtual work):

T = J^T F   (3)

(c) Relax the arm configuration in the applied field:

q˙ = B · T   (4)

(d) Map the arm movement into the workspace:

x˙ = J · q˙   (5)

where Kx is a stiffness matrix and B is a viscosity matrix. Figure 1 illustrates the method. The algorithm always converges to a "reasonable" equilibrium state, whatever the degree of redundancy of the robot: if the target is within the workspace of the robot, it is reached; if it is not reachable, the robot fully extends the arm to a position that is at a minimum distance from the target. Moreover, the timing of the relaxation process can be controlled by using a TBG (time base generator)2 and the concept of terminal attractor dynamics3: this can be simply implemented by substituting the relaxation Eq. (4) with the following one:

q˙ = Γ(t) · B · T   (6)

where a possible form of the TBG, the time-varying gain that implements the terminal attractor dynamics, is the following (it uses a minimum-jerk generator with duration τ):

Γ(t) = ξ˙ / (1 − ξ)   (7)

where

ξ(t) = 6(t/τ)^5 − 15(t/τ)^4 + 10(t/τ)^3   (8)

Fig. 1. The FMC/IMC pair architecture. J: Jacobian matrix of the kinematic transformation; Kx, Kq: (virtual) stiffness matrices that generate attractive force fields in the task and joint spaces, respectively; B: (virtual) viscosity for damping the relaxation to the equilibrium state; Γ(t): central pattern generator for coordinating the two relaxations and setting the time to equilibrium.
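As a concrete illustration, the relaxation of Eqs. (2)-(5) can be sketched for a planar 3R arm. This is a minimal sketch, not the authors' implementation: link lengths, gains, and the step size are illustrative choices, and the TBG gain Γ(t) of Eqs. (6)-(8) is omitted in favor of plain fixed-step integration.

```python
import numpy as np

# Minimal sketch of the PMP relaxation (Eqs. 2-5) on a planar 3R arm.
# Link lengths, gains and step size are illustrative, not from the paper;
# the TBG gain Gamma(t) of Eq. (6) is omitted here for simplicity.

L = np.array([1.0, 1.0, 1.0])      # link lengths
Kx = 5.0 * np.eye(2)               # virtual task-space stiffness (Eq. 2)
B = np.eye(3)                      # virtual joint-space mobility (Eq. 4)
dt = 0.01                          # integration step

def fkin(q):
    """Forward kinematics x = f(q) of the planar 3R chain."""
    a = np.cumsum(q)
    return np.array([np.sum(L * np.cos(a)), np.sum(L * np.sin(a))])

def jacobian(q):
    """Jacobian J(q) of Eq. (1)."""
    a = np.cumsum(q)
    J = np.zeros((2, 3))
    for i in range(3):
        J[0, i] = -np.sum(L[i:] * np.sin(a[i:]))
        J[1, i] = np.sum(L[i:] * np.cos(a[i:]))
    return J

def pmp_reach(q, x_target, steps=5000):
    """Relax the chain toward the target by iterating Eqs. (2)-(5)."""
    for _ in range(steps):
        F = Kx @ (x_target - fkin(q))   # Eq. (2): virtual force field
        T = jacobian(q).T @ F           # Eq. (3): principle of virtual work
        q = q + dt * (B @ T)            # Eq. (4): joint-space relaxation
    return q

q_end = pmp_reach(np.array([np.pi/4, np.pi/4, 0.0]), np.array([1.0, 1.0]))
print(fkin(q_end))   # the end effector settles near the target (1, 1)
```

Note the characteristic PMP property visible here: no explicit inversion of J is ever computed, so the loop is well defined even at singular or redundant configurations.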

In general, a TBG can also be used as a computational tool for synchronizing two plans, coordinating two arms or even the movements of two robots.

2.2. FMC/IMC pair: dealing with constraints

The FMC/IMC model can be further extended in order to deal simultaneously with multiple task-dependent constraints, concurrently with the trajectory formation process. In other words, a non-redundant system is just the limit case. We consider the following two cases in the block diagram shown in Fig. 1. First relaxation: to the target pose, in agreement with the joint limits. Second relaxation (dashed line): in the null space, in order to produce the target effort in agreement with the actuator limits. Relaxation is achieved as follows:

(1) Keep the robot away from the joint limits (an internal constraint): this can be expressed by adding an attractive force field in the joint space, Kq(qref − q), where qref is a set of joint angles in the middle of their range of motion (an alternative implementation would be a repulsive field from the joint limits).

(2) After reaching the target pose, perform a movement in the null space of the kinematic transformation[a] in such a way that, for a given desired effort FT, the torque required of each actuator is within the allowed limits: this is implemented by introducing FT as an additional force drive in the IMC and saturating the computed torque vector according to the actuator constraints.
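The two constraint mechanisms above amount to a small modification of the torque of Eq. (3). The sketch below illustrates this under assumed gains and limits (Kq, q_ref, T_max are illustrative values, not the paper's):

```python
import numpy as np

# Sketch of the constrained relaxation of Sec. 2.2: the torque of Eq. (3)
# is augmented with a joint-space attractive field Kq (q_ref - q) that keeps
# the arm away from its joint limits, and the result is saturated to the
# actuator limits. Gains and limits are illustrative, not from the paper.

Kq = 0.5 * np.eye(3)          # joint-space stiffness toward mid-range
q_ref = np.zeros(3)           # middle of the +/-90 deg range of motion
T_max = 5.0                   # per-actuator torque limit

def constrained_torque(J, F, q):
    """Task-space force field plus joint-limit field, then saturation."""
    T = J.T @ F + Kq @ (q_ref - q)
    return np.clip(T, -T_max, T_max)

# A large task force is clipped to the actuator range:
J = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
print(constrained_torque(J, np.array([100.0, 0.0]), np.zeros(3)))
# -> [5. 0. 0.]
```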

2.3. FMC/IMC pair: neural network solution

The scheme of Fig. 1 can be implemented in a neural way by training a multi-layer feed-forward network to learn the kinematic transformation x = f(q). From this trained network it is possible to extract the Jacobian matrix in a simple way. Suppose, for example, that we use a three-layer network (Fig. 2); the input-output relations can then be written as

x = f(q) ⇒ { hj = Σi wij qi ;  zj = g(hj) ;  xk = Σj wjk zj }   (9)

where hj is an intermediate variable, zj is the output of the hidden layer, g(.) is a sigmoid nonlinearity, wij and wjk are the connection weights from the input to the hidden layer and from the hidden to the output layer, respectively, and qi and xk are the inputs and outputs of the ANN. After training, we can extract the Jacobian matrix (linking the proximal and distal spaces in the FMC/IMC) from the neural network in the following way:

Jki = ∂xk/∂qi = Σj (∂xk/∂zj)(∂zj/∂hj)(∂hj/∂qi) = Σj wjk g′(hj) wij   (10)

The incremental displacement and force vectors can then be obtained from the Jacobian as follows:

x˙ = J · q˙ ⇒ x˙k = Σi Jki q˙i = Σi [ Σj wjk g′(hj) wij ] q˙i   (11)

T = J^T · F ⇒ Ti = Σk Jki Fk = Σk [ Σj wjk g′(hj) wij ] Fk   (12)

Fig. 2. Neural network for mapping the kinematic transformation.

[a] The null space of the kinematic transformation x = f(q) is characterized by the equation J(q) · q˙ = 0.
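Equation (10) can be checked numerically: for a one-hidden-layer network, the Jacobian is W2 · diag(g′(h)) · W1. A minimal sketch (with random weights standing in for trained ones):

```python
import numpy as np

# Sketch of Eq. (10): extracting the Jacobian of a one-hidden-layer
# MLP x = f(q). The weights here are random stand-ins for trained ones.

rng = np.random.default_rng(0)
W1 = rng.normal(size=(30, 3))   # input -> hidden weights (w_ij)
W2 = rng.normal(size=(2, 30))   # hidden -> output weights (w_jk)

def g(h):
    """Sigmoid nonlinearity."""
    return 1.0 / (1.0 + np.exp(-h))

def mlp(q):
    """Eq. (9): forward pass x = f(q)."""
    return W2 @ g(W1 @ q)

def mlp_jacobian(q):
    """Eq. (10): J_ki = sum_j w_jk g'(h_j) w_ij."""
    h = W1 @ q
    gp = g(h) * (1.0 - g(h))    # sigmoid derivative g'(h)
    return W2 @ np.diag(gp) @ W1

q = np.array([0.3, -0.5, 0.8])
J = mlp_jacobian(q)

# Cross-check against central finite differences
eps = 1e-6
J_fd = np.column_stack([(mlp(q + eps * e) - mlp(q - eps * e)) / (2 * eps)
                        for e in np.eye(3)])
print(np.max(np.abs(J - J_fd)))   # tiny: analytic and numeric Jacobians agree
```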


The procedure can easily be generalized to a network with more than one hidden layer. Prior to adapting the forward/inverse models to the GNOSYS robotic environment, we performed initial simulation experiments on a planar robot with three revolute joints and unit link lengths. Assuming, for each joint, a range of motion of ±90°, we sampled the configuration space with about 60,000 samples. After a few trials we chose a (3, 30, 15, 2) multi-layer perceptron (two hidden layers of 30 and 15 units) to approximate the forward kinematic transformation. Figure 3 illustrates the reaching trajectories obtained by the FMC/IMC pair, without application of the joint reference fields and the external force drive (for null-space movements), starting from an initial configuration of (π/4, π/4, 0). Of the four targets in the picture, three are within the workspace and one (−3, 4) is not reachable. Note that when the desired target is outside the workspace, as for (−3, 4), the manipulator nevertheless tries to follow the desired coordinates as well as possible and fully extends the arm to a position that is at minimum distance from the target, finally pointing in the direction of the desired endpoint coordinates. Although there is no solution to the problem, the network "does its best." The timing of the relaxation process is set as tf = 2, and is controlled by using a neural time

Fig. 3. Relaxation to the target pose using the FMC/IMC pair.


base generator. Also note the relatively straight-line motion of the end effector (with a bell-shaped velocity profile), which agrees with experimental evidence of kinematic invariances in human reaching movements.24 Figure 4 (top panel) shows the different solutions reached after adding attractive force fields in the joint space and specifying different end-effector orientations for reaching the target (1, 1). Finally, we simulated virtual movements in the null space in such a way as to minimize the overall effort required of all the actuators while satisfying the required target effort. These are coordinated movements of the body/arm/gripper that keep the same pose and are characterized by the equation J(q) · q˙ = 0. Note that this is only possible for a redundant system, and the effort constraint FT is turned on only after the target has been reached. Figure 4 (bottom panel) shows

Fig. 4. Top panel: solutions with different constraints (joint constraints; orientation constraints). Bottom panel: relaxation in the null space.


the final trajectories reached by the FMC/IMC pair after performing null-space movements. In summary, the forward/inverse model pair provides a general solution for mentally simulating a reaching action to a target position, taking into consideration a range of geometric constraints (range of motion in the joint space, internal and external constraints in the workspace) as well as effort-related constraints (range of torque of the actuators, etc.). The system, powered by the passive motion paradigm, exploits redundancy efficiently and allows one to predict where the tool/end-effector is going for a given articulation and what kind of effort is required for a given end-point interaction. If the forward simulation is successful, the movement is executed; otherwise the residual "error", or measure of inconsistency, can be taken as a starting point for breaking the action plan into a sequence of sub-actions. We must also note that the same system is used both for mentally testing the reachability of targets and for actual motion execution. The relaxation implied by the PMP always converges, even if the explicit inverse transformation fails.

3. Robotic Platform, 3D Reconstruction and Robot-Camera Integration

3.1. Robotic platform

After initial simulation experiments on a planar 3-link arm, we carried out preliminary experimental activities on a simple robotic platform of the GNOSYS consortium (Fig. 5) consisting of a 5 DOF arm and 2 cameras for stereo vision. The goal was to test the forward/inverse models in a real environment and thereby use them for further experiments on animal reasoning tasks. The setup was designed in such a way that the arm and cameras shared enough common workspace to test a range of reaching, grasping, reasoning and tool-use situations. A three-layer MLP was trained to map the forward kinematics of the arm, from which the Jacobians were extracted using Eq. (10).

3.2. Stereo camera calibration and 3D reconstruction

We used the DLT (Direct Linear Transform)8,9 for carrying out camera calibration and 3D reconstruction. The projection equations (13) express the mapping of the 3D coordinates of a generic point in space (x, y, z) into the corresponding coordinates on the plane of the camera (u, v). This equation is nonlinear with respect to both the coordinates and the 7 configuration parameters: camera position [xo, yo, zo]; camera orientation [rij]; principal distance d.

u − uo = −d [r11(x − xo) + r12(y − yo) + r13(z − zo)] / [r31(x − xo) + r32(y − yo) + r33(z − zo)]
v − vo = −d [r21(x − xo) + r22(y − yo) + r23(z − zo)] / [r31(x − xo) + r32(y − yo) + r33(z − zo)]   (13)

The trick of the DLT is to transform Eq. (13), nonlinear in 7 independent unknowns, into Eq. (14), which is linear in 11 parameters that, on the other hand, are not independent:

u = (L1 x + L2 y + L3 z + L4) / (L9 x + L10 y + L11 z + 1)
v = (L5 x + L6 y + L7 z + L8) / (L9 x + L10 y + L11 z + 1)   (14)

Camera calibration then consists of presenting to the cameras a number of "control points", i.e. targets of which we know, by design, both the Cartesian coordinates (xi, yi, zi) and the corresponding camera coordinates (ui, vi). The robot itself is used for generating the training set by means of "babbling" movements. Once the training set is obtained, the L matrix of the DLT is estimated for each camera by means of the least squares method: L^C1, L^C2. After both cameras have been calibrated, the 3D coordinates of a target point (ignoring optical distortions for the moment) can be reconstructed by means of Eqs. (15) and (16). For each camera C (C = C1, C2) and measured image point (u^C, v^C), Eq. (14) can be rearranged as a pair of equations linear in (x, y, z):

(L1^C − u^C L9^C) x + (L2^C − u^C L10^C) y + (L3^C − u^C L11^C) z = u^C − L4^C
(L5^C − v^C L9^C) x + (L6^C − v^C L10^C) y + (L7^C − v^C L11^C) z = v^C − L8^C   (15a)

Stacking the four equations for the two cameras gives, in vectorial form,

Y = A · X   (15b)

where X = [x, y, z]^T, Y = [u^C1 − L4^C1, v^C1 − L8^C1, u^C2 − L4^C2, v^C2 − L8^C2]^T and A is the corresponding 4 × 3 coefficient matrix, from which we can recover the target coordinates:

X = (A^T A)^(−1) A^T · Y   (16)

Once the 3D coordinates of markers or salient image points in the robot reference system have been estimated by means of Eq. (16), they can be fed directly as input to the FMC/IMC pair for testing the reachability of targets and/or executing an actual movement of the arm. We also note that recognition of salient points in the image is beyond the scope of this article; for our purpose we use the recognition modules already present in the integrated GNOSYS brain. The interested reader may refer to Ref. 18 for further information about the visual modules. Figure 6A shows the robot generating the training set during calibration (the point of interest is a red sticker on the gripper) and Fig. 6B the initiation of movement (towards the red sticker) after 3D reconstruction from image-plane coordinates, using the FMC/IMC pair.
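The reconstruction of Eqs. (14)-(16) is easy to sketch in code. The 11-parameter vectors below are illustrative stand-ins for calibrated L matrices, chosen only to demonstrate the round trip (project a known point with two cameras, then recover it):

```python
import numpy as np

# Sketch of DLT 3D reconstruction (Eqs. 14-16). The L vectors are
# illustrative stand-ins for the calibrated matrices L_C1, L_C2.

def project(Lp, X):
    """Eq. (14): map a 3D point X to image coordinates (u, v)."""
    x, y, z = X
    den = Lp[8] * x + Lp[9] * y + Lp[10] * z + 1.0
    return ((Lp[0] * x + Lp[1] * y + Lp[2] * z + Lp[3]) / den,
            (Lp[4] * x + Lp[5] * y + Lp[6] * z + Lp[7]) / den)

def triangulate(Lc1, Lc2, uv1, uv2):
    """Eqs. (15)-(16): least-squares recovery of (x, y, z) from two views."""
    A, Y = [], []
    for Lp, (u, v) in ((Lc1, uv1), (Lc2, uv2)):
        A.append([Lp[0] - u * Lp[8], Lp[1] - u * Lp[9], Lp[2] - u * Lp[10]])
        A.append([Lp[4] - v * Lp[8], Lp[5] - v * Lp[9], Lp[6] - v * Lp[10]])
        Y += [u - Lp[3], v - Lp[7]]
    A, Y = np.array(A), np.array(Y)            # Eq. (15b): Y = A X
    return np.linalg.solve(A.T @ A, A.T @ Y)   # Eq. (16)

# Round trip: project a known point with two cameras, then reconstruct it.
L_C1 = np.array([1.0, 0.1, 0.0, 0.5, 0.0, 1.0, 0.1, -0.2, 0.01, 0.0, 0.02])
L_C2 = np.array([0.9, -0.1, 0.2, 0.0, 0.1, 1.1, 0.0, 0.3, 0.0, 0.015, 0.01])
X_true = np.array([0.4, -0.3, 1.2])
X_rec = triangulate(L_C1, L_C2, project(L_C1, X_true), project(L_C2, X_true))
print(X_rec)   # recovers [0.4, -0.3, 1.2] up to numerical precision
```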

4. Reasoning and Coordination of Body/Arm/Tool Movements for Reaching/Grasping/Tool Use

The forward/inverse model (Sec. 2) provides a dynamic approach that integrates motor redundancy and a range of internal/external constraints, and allows virtual predictions of where the tool/end-effector is going for given command patterns and what kind of effort is required for a given end-point interaction. In this section we introduce a recurrent-neural-network-based coordinator as an extension to the FMC/IMC pair, so as to enable 'running forward in the mind' a sequence of actions and their ensuing consequences, hopefully leading to a final desired state. For simplicity, this section is split into sub-sections that outline different aspects of the overall architecture, the information flows and the experimental paradigms considered.

Fig. 5. GNOSYS robotic platform.

Fig. 6. Panel A: generation of visual targets for camera calibration. Panel B: test of reaching using FMC/IMC after 3D reconstruction.

4.1. Experimental scenarios based on low-level animal reasoning

Recent ideas on animal reasoning are very relevant in guiding how we may develop primitive reasoning and imaginative abilities in artificial systems. By the term animal reasoning we refer to the ability of animals to draw conclusions from facts and initiate suitable actions when trying to achieve a certain goal, usually (but not always) of a food nature. A general discussion of observed reasoning abilities in animals was presented in the introductory section. We turn here to analyse more specific cases (suited to the robotic platform), hopefully making clear how the FMC/IMC pair, with suitable additions, may be useful in solving them. The two-stick paradigm (Fig. 7) is a paradigm employed by experimenters working with chimpanzees. It is a slightly more complicated version of the task in which the animal reasons about using a nearby stick as a tool to reach a food reward that is not directly reachable with its end effector. The task of using even one stick is non-trivial in the sense that it involves understanding the relative discrepancy in distance, selection of


Fig. 7. Two-stick paradigm.

a tool of suitable length, and then the crucial process of forming sub-goals: grasp the tool, update the kinematics of the arm, and then start all over again. The two-stick paradigm involves two sorts of sticks: Stk1 (short) and Stk2 (long), one of each being present on a given trial; only the small one is immediately available, and the food reward is only reachable by means of the larger stick. We also assume that the chimpanzee is sequestered in a cage and cannot move around beyond the enclosure. To obtain the food reward (the main goal of the plan), the animal goes through the following sequence of virtual or real actions:

I. It attempts to reach the food directly: it fails.

II. It visually searches for a tool of suitable length: it identifies stick Stk2.

III. It initiates, as the first sub-goal, a plan to grasp the long stick Stk2: again this leads to failure, triggering a further search.

IV. It detects the presence of the small stick which, if grasped, can enable it to reach the longer stick: it initiates the second sub-goal, i.e. to grasp the small stick Stk1. This plan is successful.

V. It comes back to sub-goal 1, with a change: now it tries to reach the long stick using the short one. Since there is already a stick in its hands, it cannot grasp the longer one as of now; using the short stick, it must maneuver the large stick within reach of the end effector.

VI. It releases Stk1 and reaches Stk2, which is now directly accessible. It grasps Stk2.

VII. It uses the large stick to maneuver the food within reach, releases Stk2 and obtains the food, thus attaining the main goal.

Fig. 8. Goal representation.

The task of using two sticks contains within itself various other smaller reach/grasp sequences, including the task of using one stick. The paradigm is also interesting in the sense that it necessitates the integration and use of several components: drives, rewards, goal lists, vision, movement, tool use and forward/inverse models, to mention a few.

4.2. Neural net for coordinating action sequences

In this section, we briefly outline the architecture of a neural-network-based coordinator operating along with the FMC/IMC pair to reason about a diverse range of reaching and grasping sequences with tools. The forward/inverse models take as input the Cartesian-space coordinates (reconstructed using Eqs. (15) and (16)) of the object of interest. This is the 'where' part of the problem. At convergence it is possible: (1) to test whether the target is reachable; (2) if not, to estimate the size of the required tool; (3) to make predictions on the required effort. What other modules are needed to supplement the forward models so as to achieve reasoning, mainly for a diverse range of reach/grasp/tool-use sequences? Firstly, we need a long-term memory where goals, locations of tools, etc. are stored (in case the robot already has a record of tool locations; otherwise vision can be used to search for tools; in any case the contents have to be stored somewhere, at least temporarily), and a short-term memory where goals are loaded for subsequent execution (possibly based on some priority function: user requests, self-motivation, etc.). In these preliminary experiments, we decided to have a relatively simple representation of goals, as shown in Fig. 8. We should note here that even though the reaching action is a subset of grasping, it is convenient to state this explicitly in the goal structure (there may be cases where only reaching and the application of some interaction force are needed, like pressing a switch; in case the switch is not reachable,


a sub-goal to grasp an available tool initiates the grasp sequence). To coordinate the different processes involved in achieving the goal, we use a Jordan-type recurrent neural network, shown in Fig. 9. The input layer receives the following information:

1: Grasp/reach goal: indicates whether the object has to be reached or grasped (in which case the necessary gripper control modules need to be activated if reaching was successful).

2: Reach error: the error signal is important information for deciding whether to initiate the real movement (by calling the arm server and passing the motor commands generated by the IMC). This signal comes from the virtual simulation of reaching using the FMC/IMC pair. As mentioned earlier, if the target is outside the workspace, the FMC/IMC pair still does its best (it gives a solution that is at minimum distance from the target, pointing in the same direction). At convergence, the discrepancy in distance can be calculated and thereby exploited to search for appropriate tools in the solution space. The error signal, along with the occupancy signal (from the gripper), is also used to initiate the tool search, goal sequencer or sub-goal generation unit (in the case that the reach error is zero and the goal is to grasp an object, but the gripper is already occupied by another tool).

3: Availability of tools: a winner in solution space indicates the availability of a possible tool. If a tool is available, its location is passed to the sub-goal generation unit (to create a new goal for the FMC/IMC) and the length of the tool is passed on to the kinematics adjustment unit (in case it is grasped successfully).

4: Gripper state: feedback from the gripper, used for initiating the kinematics adjustment and sub-goal generation processes.

Fig. 9. RNN for coordinating the reasoning process.

The network also takes as input the activations of the RNN neurons at the previous time step (t − 1), considered as context knowledge. The output is the action at time t, directed towards accomplishing the goal stored in working memory. The activations at the output of the RNN are fed to an additional array of eight neurons which control the initiation of different actions, as illustrated in Fig. 9. They are briefly outlined below:

WML: loads the first goal (the one with the highest priority) from long-term memory as the current goal.

ANN KA: a trained neural network that supplements the FMC/IMC pair by providing the adjustment terms in the kinematics if reaching is to be done with a tool (stick). This is a function of gripper occupancy.

FMC/IMC pair: takes as input the coordinates in space to be reached, the constraints, etc., and outputs an estimate of the error.

GS: on successful execution of the current goal, the next goal of highest priority is placed at the initial position in long-term memory, to be subsequently loaded by WML. A generated sub-goal is always given the highest priority.

TSU: searches for the tool which approximately matches the residual discrepancy error generated by the FMC/IMC system. This can be a memory or a self-organizing map; a winner here outputs the length of the tool and its location in space. The location in space will be used by the SGU, while the length will be used by ANN KA if the tool is grasped successfully in subsequent reaching actions.


SGU: frames the sub-goal of grasping the tool, in the goal format described earlier, and sends it with highest priority to the LTM.

GC: on a successful grasp, activates the gripper occupancy input so that the kinematics adjustment for the tool is taken into account in the next action.

SGUG: this action is initiated if the current goal is to grasp a tool but the gripper is already occupied by another tool. In this case the tool to be grasped is brought to a location near the body, the tool in the gripper is released, and a sub-goal to reach the tool again (now at its new position) is framed and loaded into long-term memory.

The result of the action at time t is fed back to the RNN as S_t, and this continues until the goal is executed successfully or the impossibility of solving the goal is reported (in which case the robot can be motivated to go into exploration mode). Once the RNN is trained by back-propagating errors over sets of action sequences, it is able to coordinate the eight processes for a variety of reaching tasks. Figures 10 and 11 show some of the sequences of actions generated in different scenarios. If a goal is successfully accomplished, the entire sequence can be memorized as a motor program for later reference.
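The feedback scheme described above can be sketched as a Jordan-type network in which the output at time t-1 is concatenated with the current input. Layer sizes, the random weights, the ordering of the input signals, and the winner-take-all readout over the eight action neurons are illustrative assumptions here, not the trained controller used in the experiments:

```python
import numpy as np

ACTIONS = ["WML", "ANN_KA", "FMC_IMC", "GS", "TSU", "SGU", "GC", "SGUG"]

class JordanRNN:
    """Jordan-type recurrent network: the output at t-1 is fed back
    as context alongside the current input."""

    def __init__(self, n_in=4, n_hidden=12, n_out=8, seed=0):
        rng = np.random.default_rng(seed)
        # Random weights as placeholders; the actual controller is
        # trained by back-propagating errors over action sequences.
        self.w_ih = rng.normal(0.0, 0.1, (n_hidden, n_in + n_out))
        self.w_ho = rng.normal(0.0, 0.1, (n_out, n_hidden))
        self.state = np.zeros(n_out)   # S_t, the fed-back output

    def step(self, x):
        h = np.tanh(self.w_ih @ np.concatenate([x, self.state]))
        y = np.tanh(self.w_ho @ h)
        self.state = y                 # becomes the context at t+1
        return y

net = JordanRNN()
# assumed input ordering: goal type, reach error, tool signal, gripper state
y = net.step(np.array([1.0, 0.3, 0.0, 0.0]))
print("next action:", ACTIONS[int(np.argmax(y))])
```

With trained weights, the winner among the eight output neurons would select which module (WML, ANN KA, FMC/IMC, GS, TSU, SGU, GC, SGUG) fires next.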

Below we list a few sample sequences of actions generated by the trained RNN+FMC/IMC system in different situations:
• Normal successful reach with unoccupied gripper: A-C-A
• Normal successful reach with tool: A-B-C-A
• Normal successful grasp with unoccupied gripper: A-C-E-A
• Normal successful grasp with already occupied gripper: A-B-C-G-A-C-E-A
• Unsuccessful reach/grasp, no tool available: A-C-D-A
• Initially unsuccessful reach, with possibility of success with a tool: A-C-D-F-A-C-E-A-B-C-A
• Two-stick paradigm: A (main goal) - C (target not reachable) - D - F (tool available; sg1: grasp tool 1) - A (sg1) - C - D (tool in sg1 also not reachable) - F (second stick available, by which the tool in sg1 can be reached; sg2: grasp tool 2) - A (sg2) - C - E (tool 2 grasped) - A (back to sg1) - B (kinematics adjustment with tool 2) - C - E - G (sub-goal to bring tool 1 within direct reach; release tool 2; sg3: grasp tool 1 at its new proximal xyz position) - A (sg3) - C - E (tool 1 grasped successfully) - A (back to main goal) - B (kinematics adjustment with tool 1) - C - A (new goal, if any).
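To make the causal structure of these traces concrete, the following toy model (our simplification, not the trained RNN) regenerates some of the sequences above. Reaching is reduced to a scalar distance, tools to scalar length extensions, and grasping a detected tool is assumed to succeed; the letter-to-module mapping is inferred from the annotations in the two-stick trace:

```python
def coordinate(goal_dist, arm_len=1.0, tools=()):
    """Toy regeneration of the action traces: A = load goal (WML),
    C = FMC/IMC virtual reach, D = tool search (TSU), F = sub-goal
    framing (SGU), E = grasp (GC), B = kinematics adjustment (ANN KA)."""
    trace = ["A"]                       # WML loads the current goal
    reach = arm_len
    tools = list(tools)
    while True:
        trace.append("C")               # mental simulation of the reach
        error = max(0.0, goal_dist - reach)
        if error == 0.0:
            trace.append("A")           # success: load the next goal
            return trace, "done"
        trace.append("D")               # search a tool covering the error
        usable = sorted(t for t in tools if t >= error)
        if not usable:
            trace.append("A")
            return trace, "impossible"
        tool = usable[0]                # closest match to the discrepancy
        tools.remove(tool)              # erase used tool from solution space
        reach += tool                   # grasp of the tool assumed to succeed
        trace += ["F", "A", "C", "E", "A", "B"]   # sub-goal, grasp, adjust
```

For example, `coordinate(0.8)` yields the normal reach A-C-A, `coordinate(2.5)` yields A-C-D-A (no tool available), and `coordinate(1.5, tools=(0.6,))` reproduces the tool-assisted trace A-C-D-F-A-C-E-A-B-C-A.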

Fig. 10. Panels 1–4: The main goal is to reach the red ball, but the attempt is unsuccessful (panel 1). A blue stick is detected and a sub-goal is formed to grasp it (panel 2). After grasping the tool, the kinematics is updated (panel 3); the earlier goal is attempted again, and this time the ball is reached successfully (panel 4). The panels on the right-hand side show a similar use of tools in animals: a chimp using a stick to extract yoghurt from a tube, and a woodpecker finch prying grubs out of a branch with a cactus spine.


Fig. 11. Sequences of actions generated in different scenarios.

We also tried a variant of the two-stick case in which there are two small sticks, both magnetized. The robot tries to reach the target unsuccessfully, then grasps the first stick. The target is still not reachable, so the new discrepancy in distance is taken as a measure to search for a tool, and the robot finds the second small stick (this also involves erasing from the solution space the memory of the tool which has already been used). The robot reaches the second stick from the top (attractive force fields in joint space constrain the joint angles), increases the effective length of the tool (kinematics update again), and then tries to solve the previous goal. This time it is successful. Figure 12 illustrates the process.

Fig. 12. Use of two small sticks to reach a target.

5. Concluding Remarks

In this article, we present results showing how some of the non-linguistic reasoning paradigms observed in animals can be reproduced using a combination of the FMC/IMC and RNN systems. The FMC/IMC pair is part of an internal body model (the arm, in our case) that represents the set of all geometrically possible solutions for mentally reaching a target in space. The proposed FMC/IMC architecture sculpts a trajectory in real time, implicitly taking into account multiple task-dependent geometric as well as effort-related constraints. The same model is used both for virtually performing the action and for initiating the real movement. The RNN, on the other hand, encodes additional knowledge related to the causal nature of the task and operates on the forward/inverse models and the other modules to coordinate the successful realization of goals (or to report that no solution exists). This process involves integrating body/environment-state signals (reach error, gripper occupancy, availability of tools, goals, past actions, etc.), evaluating the options ahead, and making appropriate choices for the next action. Using a simple robotic platform, we present results on the reasoning and coordination of arm/tool movements (real and mentally simulated), specifically directed towards solving the classical two-stick paradigm of animal reasoning at a non-linguistic level. By using a real system (in contrast to a computer simulation), we also attempt to solve the engineering (and software) problems, in addition to the conceptual/scientific ones, that arise when many real subsystems have to interact and pass information simultaneously in complex tasks. Much work lies ahead in suitably extending this preliminary model: guidance of attention, incorporation of rewards and values, use of superior learning strategies, the ability to use complex tools, and suitable adaptation of the FMC/IMC architecture to more complex robotic embodiments.
As a final note, future efforts will also be directed towards extending the present version of the FMC/IMC system to reason and achieve goals at three distinct levels: (1) Simplest: using only single learnt chunks of object/action sequences to directly attain the goal being attempted; (2) Intermediate: gluing together (by concatenation) different sequences of object/action pairs so as to obtain a final goal, in the process reaching several recognizable sub-goals, noted as part of the overall task decomposition; (3) Highest: developing newly learnt sequences (by trial and error, analogical methods, language, or other means), so that they can be used to reach sub-goals that lead to the overall goal.

Acknowledgements

This work was partly supported by the EU FP6 project GNOSYS and by a doctoral fellowship to V. Mohan from the Italian Institute of Technology. The authors would like to thank J. G. Taylor and S. Kasderidis for fruitful interactions. We would also like to express our gratitude to V. Spais and N. Chatzinikolaou for providing support with the robotic platform and the low-level software communication. This work benefits from all our partners in the GNOSYS project; thanks to all of them.

