Proceedings of the Third International Symposium on Robotics and Automation. September 1-4, 2002, pp 169-174.

Self-organized multi-modular Robotic Control

José Negrete-Martínez 1,2, Roberto Cruz 3
1 Departamento de Biología Celular y Fisiología, IIBm, Unidad periférica en Xalapa, UNAM. 2 Facultad de Física e Inteligencia Artificial, UV. 3 Departamento de Electrónica, ININ. Calle Sebastián Camacho #5, Xalapa Centro, 91000 Veracruz, México. [email protected]

Abstract

In the present paper we show that a set of self-organizing modules can produce complex Behaviors in a robot consisting of a Car that carries an articulated arm. The free end of the arm has an IR sensor that senses an environmental IR Beacon. The self-organization is implemented through self-inactivating modules. Every module moves, stepwise, a different motor (or set of concurrent motors) of the robot in one direction or the other. The modules first move and then determine whether the distance from a robot sensor to a selected object has been shortened. If this is the case, the module's next movement is in the direction of the previous movement. If this distance is longer than the previous one, the next movement is in the opposite direction. But if the movement produces no significant change in distance, the module self-inactivates. The self-organization resides in the modules' sustained activity. The modules were programmed in the C language on a PC. The Main program calls the modules in a cyclical order. However, the sustained activity of the modules does not depend on this calling order. Actually, the Main program could call the modules in any order and the free end of the arm would eventually reach the Beacon from any initial position and initial attitude. However, nesting some modules in pairs, together with a certain calling order of the pairs, greatly improved the effectiveness of the Behavior. Since the calling order does not determine the sustained activity of any module or nested pair, the Main program does not actually select actions, but induces them. Thus, the robot control is not strictly hierarchical. We have implemented a structure that could be called a Central Nervous System Seed for a robot: a modular, self-organizing, attentional, polyfunctional and potentially growing robotic control system.

1. Introduction

We postulate that a simple Set of Self-Organizing Modules (SOM), implemented with self-inactivating modules with no interconnections between them, can produce complex behaviors in robots beyond navigation. A previous communication on the subject can be seen in [1].

2. The implementation of a SOM

We have implemented a hybrid SOM which is both mechatronic and computational. The main mechatronic part is a three-wheeled car that carries on its deck an articulated arm with a grasping tool on its free end. Other mechatronic parts include the servo motors that move each of the two wheels and the five articulations of the arm, the computer-servo interface (DS), a single infrared sensor (IRS) moved by one of the servos, an Amplifier-Filter-Detector circuit (A-F-D) for the IRS signal, and the AD interface between the A-F-D and the computer. See Figure 1. Finally, there is a mechatronic part consisting of a set of relays commanded by the computer through a DA interface.

There is a Beacon in the robot's environment that produces a flickering IR light. This light generates a corresponding square-wave voltage in the IRS. This voltage wave is amplified and then filtered (in order to eliminate the high-frequency noise). Finally, the wave is filtered again (Detection), leaving its envelope as the output of the A-F-D (the amplification parameter in the A stage is controlled by the relays). See Figure 1.

Figure 1 also shows the basic software organization scheme of any module in the SOM. The software part of the module is a Process implemented as a function in the C language on a PC. Each module, during its activation, reads the light intensity coming from the IR Beacon. The module has a Payoff(i) function that calculates the difference of light intensity before and after the module's motor action (the payoff value of the movement). A Decision(i) function uses this value to determine the direction of the next movement. The module repeatedly executes the movement or its opposite until the payoff value is near zero; in that case, the module self-inactivates. The movement is actually a step movement, in one direction or the other, of the joints of the robot. In the case of the wheel motors, the modules produce a fixed rolling-time movement, either forward or backward.
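The move-evaluate-decide cycle described above can be sketched in C, the language the modules were written in. The struct fields, the EPSILON threshold and the convention that a shorter distance yields a higher sensed intensity are our assumptions for illustration, not the authors' actual code:

```c
#include <math.h>

/* Toy sketch of one self-inactivating module (names and the EPSILON
 * threshold are assumed for illustration; this is not the authors' code). */
typedef struct {
    int    active;     /* 1 while the module keeps itself running        */
    int    direction;  /* +1 or -1: direction of the next step movement  */
    double step;       /* step size typical of this module               */
    double position;   /* last commanded motor position (Last Action)    */
    double last_light; /* light intensity read after the previous step   */
} Module;

#define EPSILON 0.01   /* "near zero" payoff threshold (assumed value)   */

/* One activation: the caller passes the light intensity measured after
 * the previous movement; the module computes the payoff, self-inactivates
 * if it is near zero, reverses direction if it is negative, and otherwise
 * commands the next step in the current direction. */
void module_update(Module *m, double light)
{
    if (!m->active)
        return;

    double payoff = light - m->last_light;  /* shorter distance -> more light */
    m->last_light = light;

    if (fabs(payoff) < EPSILON) {
        m->active = 0;                      /* self-inactivation */
        return;
    }
    if (payoff < 0)
        m->direction = -m->direction;       /* try the opposite direction */

    m->position += m->direction * m->step;  /* step movement sent to DS */
}
```

Note that the module carries no reference to any other module: the only coupling between modules is through the World, via the light intensity each one reads.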

[Figure 1 diagram: Beacon B → IRS → A-F-D → AD → Software (memory m, Payoff(i), Decision(i)) → DS → Motor(i); the Mechatronic and Software parts and the step action are labeled.]

Figure 1. Implementation of a generic module. The module has a Mechatronic part and a Software part. The Mechatronic part has: an infrared sensor IRS that receives the signal of a Beacon B; an Amplifier-Filter-Detector (A-F-D) that receives the IRS output signal; AD, an analog-to-digital transducer to the computer; and DS, a digital-to-servomotor-position transducer from the computer. The Software part has: a memory register m; a Payoff(i) function that calculates the sign of the difference of light intensity before and after the module's motor action; and a Decision(i) function that adds a signed step to the present position of Motor(i). The Decision(i) function calls the Payoff(i) function. In every module the Decision function makes use of an algorithm (related to the 'Two-Armed Bandit' learning algorithm [2]) that inputs the payoff value and its previous output value. The Decision function output is the same as the last output when the payoff is positive, but switches to the opposite sign if the payoff value changes to negative. The output value calculated by the algorithm is multiplied by a step value typical of every function and then added to the Last Action value stored in the function (see Figure 2). However, if the payoff value is near zero, the function self-inactivates before calculating anything else.
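The decision rule in the caption reduces to two small functions. A minimal sketch, assuming a "near zero" threshold EPS (the actual threshold value is not given in the paper):

```c
#include <math.h>

#define EPS 0.01  /* assumed "near zero" payoff threshold */

/* Output of the Decision algorithm in {+1, -1, 0}: keep the last output
 * while the payoff is positive, switch sign when it turns negative, and
 * return 0 (self-inactivation) when the payoff is near zero. */
int decision_output(double payoff, int last_output)
{
    if (fabs(payoff) < EPS)
        return 0;
    return payoff > 0 ? last_output : -last_output;
}

/* New action = Last Action + step * output, as in Figure 2. */
double next_action(double last_action, double step, int output)
{
    return last_action + step * output;
}
```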

[Figure 2 diagram: the Payoff value feeds the Decision algorithm; its output is multiplied by step and added to the Last Action to produce the new action.]

Figure 2. Detail of the decision and action taken by a module. The output of the Decision algorithm [+1, -1, 0] is multiplied by the parameter step and the result is added to the value of the Last Action, in order to produce the new action.

The Main program calls a sequence of modular functions in an endless cycle. We also implemented some nesting among several pairs of modules: namely, one module called another before its own activation took place. In Figure 3 we represent a generic nested pair, a Rotating module and a Probing module. In the figure, the Probing(i) module is a function that calls its Rotating(j) module. The Probing module calculates its own payoff when the Rotating module self-inactivates. Depending on the pair, the Rotating module produces horizontal rotation of the sensor, rotation of the waist, or rotation of the whole Car. Depending on the Probing module of the pair, the module rolls the car forward or backward, or moves the shoulder, the elbow or the wrist.
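The Main program's role can be illustrated with a toy scheduler: it merely cycles through the modules in a fixed calling order, and whether anything happens on a given call is decided by each module itself. The counter-based stand-in for a module's sustained activity is our simplification:

```c
/* Toy model of the Main program's endless cycle.  A real module decides
 * its own activity from the payoff; here each stand-in module simply has
 * a number of steps left before it self-inactivates. */
typedef struct { int steps_left; } ToyModule;

/* Calling a module: it acts (returns 1) only if it is still active. */
static int call_module(ToyModule *m)
{
    if (m->steps_left > 0) {
        m->steps_left--;
        return 1;
    }
    return 0;          /* self-inactivated: the call induces nothing */
}

/* Cycle through the modules in a fixed calling order until one full
 * pass produces no activity; returns the number of passes made, or -1
 * if max_passes is exceeded. */
int run_until_quiescent(ToyModule mods[], int n, int max_passes)
{
    for (int pass = 0; pass < max_passes; pass++) {
        int any_active = 0;
        for (int i = 0; i < n; i++)
            any_active |= call_module(&mods[i]);
        if (!any_active)
            return pass;
    }
    return -1;
}
```

The point of the sketch is that the scheduler never commands a particular module to act: permuting the calling order changes only how quickly quiescence is reached, not which modules sustain activity.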

[Figure 3 diagram: two chained module blocks. Rotating(j): Beacon B → IRS → A-F-D → AD → memory m, Payoff(j) → DS → Motor(j). Probing(i): Beacon B → IRS → A-F-D → AD → memory n, Payoff(i) → DS → Motor(i).]


Figure 3. Generic scheme of an associated pair of modules. The Probing(i) function calls the Rotating(j) function and then its own Payoff(i) function, which, with the help of the value stored in n, calculates the new payoff value for the Probing(i) function. If the payoff value is near zero, this function self-inactivates; otherwise it moves the motor in one direction or the other, depending on the sign of the previous step action taken. At this stage of the implementation, the Probing functions also self-inactivate when an activation time (assigned to the function) expires.

A different pairing of modules, rotating-rotating, was designed in order to produce an Orienting Reaction of the car in the direction of the Beacon. The rotating module of the wheels calls the module rotating the sensor motor, and this module transmits to the rotating module of the wheels the angle to be rotated.
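The nesting discipline can be sketched as follows. The Rotating partner is reduced to a step counter and the threshold EPS is an assumed value, so this illustrates only the calling order of Figure 3, not the authors' implementation:

```c
#include <math.h>

#define EPS 0.01  /* assumed "near zero" payoff threshold */

typedef struct { int steps_left; } Rotating;  /* toy rotating partner */
typedef struct {
    int    active;
    double position;    /* Last Action of the probing motor */
    double step;
    double last_light;  /* stored intensity (the register n) */
} Probing;

/* One call of the nested pair: while the Rotating module is still active
 * it receives the step; only after it self-inactivates does the Probing
 * module compute its own payoff against the stored intensity and move. */
void probing_call(Probing *p, Rotating *r, double light)
{
    if (!p->active)
        return;
    if (r->steps_left > 0) {      /* nested call: Rotating acts first */
        r->steps_left--;
        return;
    }
    double payoff = light - p->last_light;
    p->last_light = light;
    if (fabs(payoff) < EPS) {
        p->active = 0;            /* self-inactivation */
        return;
    }
    p->position += (payoff > 0 ? p->step : -p->step);
}
```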

3. Some Experiments

After the Orienting Reaction pair, the modules were called in different orders in different experiments. In all of them the Grasping tool eventually reached the Beacon from any initial position and initial attitude. However, the most efficient order in our robot was obtained by calling the nested pairs in the following order: {Orienting Reaction Pair}, {Elbow, Roll}, {{Waist, Wrist} Shoulder}. Notice that {Elbow, Roll} is a Car Reaching Action and {Waist, Wrist} is an Arm Pointing Action, while {{Waist, Wrist} Shoulder} is an Arm Approach Action. The aforementioned sequence, repeated in a cycle, produces a complete Grasping Behavior. Figure 4 shows some 'snapshots' of this experimental sequence.

Figure 4. Three 'snapshots' of the experiment {Orienting Reaction}, {Elbow, Roll}, {{Waist, Wrist} Shoulder}. The picture at the bottom of the left column shows the attitude and position of the robot immediately after the Orienting Reaction pair. The following picture (upper part of this column) shows the position of the robot mainly due to the repeated action of the {Elbow, Roll} pair. The last picture in the column shows the result of the whole self-organized sequence.

Changing the sign of the step parameter in the Probing functions can drastically transform the Grasping Behavior into a Running-away Behavior. When the Orienting Reaction was kept intact, a very interesting conflicting behavior was seen between turning toward the Beacon and running away from it.

Figure 5. Running-away Behavior of the robot, obtained by changing the sign of the step parameter in the Probing functions while the Orienting Reaction remained intact. The robot was initially placed near the Beacon. In the picture one can see the moment at which the robot distances itself from the Beacon while "keeping an eye" on it.

When an obstacle hinders the rolling of the car, the Arm still tries to reach the Beacon. See Figure 6.

Figure 6. Three 'snapshots' of an experiment with an obstacle in the path of the Car. The position and attitude of the robot after its Orienting Reaction can be seen in the lower part of the left column. The top picture in this column shows the robot near the obstacle, with little change in its initial attitude. The picture immediately above shows that when the obstacle stops the wheels, the robot tries to reach the Beacon by extending its arm toward it.


4. Discussion

The Grasping Behavior can be made to emerge in many variants by changing the Main program or by placing obstacles in the path of the robot. We sequenced the {Orienting Reaction}, {Elbow, Roll}, {{Waist, Wrist} Shoulder} nested pairs in the Main program in search of a more efficient Grasping Behavior. The sequence can be translated into words: unconditionally orient first, then try to reach the Beacon with the Car, then try an Arm Approach while pointing to the Beacon with the help of the torsion of the Waist. We did not experiment with the actual grasping of the Beacon, which would have required adding tactile sensors to the Grasping tool and dealing with the transfer of the robot's attention to another sensorial modality.

We anticipated two very important problems in the robot's Grasping Behavior: the "Intentional Tremor" and the Too-Soon-or-Too-Late Final-Grasping Action. The Intentional Tremor is the increase of oscillating movements of the joints and wheels as the sensor comes closer to the Beacon. The Tremor is not caused by sensor saturation, because the Main program avoids such saturation by decreasing the amplification in the A-F-D when the sensed light intensity is very high. We should mention that the implementation of this "pupillary" reflex was another engineering problem that had to be solved outside the general architecture. The Too-Soon-or-Too-Late Arm Approach was a persistent problem. It was, however, greatly attenuated by the sequencing and pairing mentioned in the previous paragraphs. These two movement abnormalities are characteristic of patients with lesions in the Cerebellum. This comparison with human pathology suggests that producing smooth and precise Grasping Behavior within this architecture requires higher-order modules that would correct the tremor and the over/undershooting.

In our implementation every module is attentive.
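The "pupillary" reflex mentioned above amounts to a gain schedule: when the detected intensity approaches saturation, the relays select a lower amplification step in the A stage of the A-F-D. A minimal sketch, with the threshold and gain levels as assumed values:

```c
/* Hypothetical gain control for the A stage of the A-F-D: the relays
 * select one of several amplification levels; when the detected light
 * nears saturation the Main program switches one level down.  The
 * threshold and level count are assumed, not taken from the paper. */
#define SATURATION_LEVEL 0.9   /* normalized intensity near saturation */

int adjust_gain(double intensity, int gain_level)
{
    if (intensity > SATURATION_LEVEL && gain_level > 0)
        return gain_level - 1;  /* command relays: lower amplification */
    return gain_level;          /* keep the current amplification */
}
```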
When we nest a pair of modules, we are implicitly separating overt Orienting Attention from its motor consequences. In our architecture, robot attention is a function distributed among the modules, an organization well suited to Parasuraman's [3] statement on human attention: "[Attention] is not a single entity, but a finite set of brain processes that interact mutually and with other brain processes in the performance of perceptual, cognitive and motor skills." The facile transformation of a Grasping Behavior into a Running-away Behavior in our reported experiments endorses Parasuraman's opinion that no brain function is as crucial as Attention to the performance of the other brain tasks [3].

The Orienting Reaction module is an Attentive module, but of a different kind. The module appeared as the solution to the problem of how to start the expected behavior. The result of implementing the Orienting Reaction was the introduction of a low-level reflex activity, passing angular information (proprioceptive information would be the biological term) from one Rotating module to the other. Perhaps the lesson here is that no proper Attention can exist without the participation of proprioceptive information. This, in addition, brings us to consider the possibility that the Last Action variable present in every module also represents some of the proprioceptive information in question. The unsolved conflict between the Orienting Reaction and the Running-away Behavior shown in our experiments can be equated to the behavioral conflicts commonly reported in ethology [4]. This conflict persistence has for the robot, as for animals, a potentially adaptive value (the robot runs away without "losing sight" of the Beacon).

5. Conclusions

In our endeavor to implement a Robotic Control out of Self-organizing Modules we have constructed a structure that could be conceptualized as a Central Nervous System Seed because of the following properties:

• Modular.

• Self-Organizing (self-rewiring [5]), through self-inactivation of the modules.

• Attentional. The set generates (makes emerge) complex Open Attention.

• Poly-functional. The set, through homogeneous modification of parameters in some modules, generates different potentially adaptive behaviors.

• Evolving. By simply adding new modules, the emergence of still more potentially adaptive behaviors can probably be brought about.

We have implemented this self-organizing CNS for a robot in which there is no way to effectively command actions by calling modules. Since the calling does not determine the sustained activity of any module, the Main program of this CNS is not actually an Action Selector but rather an Action Inducer toward a Preference Order. This seems to be the case in biological CNS control [4]. A robot control of this kind is not a hierarchical control of servomechanisms, as is the case in some related works [6, 7]. The "Rotating-Probing" pairing and its sequential calling produce an autonomous behavior that could be equated with Clark's 'Wideware' [8], in the sense that no module in a pair is isolated from the World and each pair exploits the global benefit created by its partner in the pair and by other pairs.

6. References

[1] Negrete-Martínez, J. A Multi-Process 'Central Nervous System' for a Robot. In Proceedings of the 2nd International Symposium on Robotics and Automation, ISRA'2000, pp. 421-425, 2000.
[2] Kaelbling, L. P. Learning in Embedded Systems. MIT Press, Cambridge, MA, 1993.
[3] Parasuraman, R. The Attentive Brain: Issues and Prospects. In R. Parasuraman (Ed.), The Attentive Brain. MIT Press, Cambridge, MA, 2000.
[4] Tyrrell, T. The Use of Hierarchies for Action Selection. Adaptive Behavior 1, 387-420, 1993.
[5] Merzenich, M., Recanzone, G. H., Jenkins, W. M., and Nudo, R. J. How the brain functionally rewires itself. In M. A. Arbib and J. A. Robinson (Eds.), Natural and Artificial Parallel Computation. MIT Press, Cambridge, MA, 1990.
[6] Albus, J. S. Brains, Behavior, and Robotics. Byte Books, Peterborough, NH, 1981.
[7] Powers, W. T. Behavior: The Control of Perception. Aldine, Chicago, 1973.
[8] Clark, A. Where Brain, Body, and World Collide. In G. M. Edelman and J. P. Changeux (Eds.), The Brain. Transaction Publishers, New Brunswick, USA, 1998.
