Proceedings of the 2003 IEEE International Conference on Robotics & Automation Taipei, Taiwan, September 14-19, 2003

Robust Modeling of Dynamic Environment based on Robot Embodiment

Kuniaki NODA 1), Mototaka SUZUKI 1), Naofumi TSUCHIYA 1), Yuki SUGA 1), Tetsuya OGATA 1)2), and Shigeki SUGANO 1)

1) Humanoid Robotics Institute, Waseda University, 3-4-1 Ookubo, Shinjuku, Tokyo 169-8555, Japan
{march, mottaka, tsuchiya, ysuga, sugano}@paradise.mech.waseda.ac.jp

2) RIKEN Brain Science Institute, 2-1 Hirosawa, Wako-shi, Saitama 351-0198, Japan
[email protected]

Abstract - Recent studies in embodied cognitive science have shown that more complex and nontrivial behaviors can emerge from quite simple designs if the designer properly takes the dynamics of the system-environment interaction into account. In this paper, we report classification experiments on several objects using the human-like autonomous robot “WAMOEBA-2Ri”. In modeling the environment, we focus not only on its static aspects but also on its dynamic aspects, including those of the system itself. The visualized results of the experiment show that integrating the multimodal sensor dataset acquired through system-environment interaction (“grasping”) enables robust categorization of several objects. Finally, in the discussion, we demonstrate a possible application of extending this approach so that “invariance in motion” emerges as a consequence.

1. Introduction

Our goal is to build robots that can move around and select actions to achieve a variety of purposes in an environment that changes every moment, and thereby survive. In particular, the abilities to behave appropriately toward many ends in our habitat and to communicate with us smoothly are most strongly required. On the other hand, for roughly 550 million years almost all creatures on the earth have behaved adaptively to feed, to protect their bodies from predators, and to pass on their genes. Therefore, when we design a robot that can generate adaptive behaviors, a bio-inspired approach, which abstracts principles from living creatures, is one of the most promising ways to do so [1]. It is most important for such an adaptive robot to perceive changes in its environment appropriately, and to behave rationally in response to that perception in order to adapt to its ecological niche. R. Brooks argued that evolution has concentrated its time on the ability to move around in a dynamic environment, sensing the surroundings to a degree sufficient to achieve the necessary maintenance of life and reproduction, and that this is the much harder part [2]. In addition, recent studies of sensory-motor coordination show how an agent reduces its sensor input space so that it can abstract regularities from the sensor inputs. For designing an agent capable of sensory-motor coordination, the evolutionary approach is also quite encouraging [3][4]. The essence of sensory-motor coordination is that active reaching out to the environment by an embodied agent makes the sensor inputs more structured and makes learning easier, which allows the cost and complexity of the subsequent computations to be reduced [5]. To generate adaptive behavior toward an object in a changing environment, an agent has to acquire meaningful information and couple it with its motor system. Moreover, to behave adaptively toward the object, the agent has to have a model of it based on the multimodal inputs from the robot's own sensors. For example, P. Dario et al. have shown that the integration of visual and tactile information can be successfully applied to a specific disassembly task [6]. However, having such models or representations is not the ultimate purpose of an adaptive robot in a dynamic environment; considering behaviors in the real world, it is neither necessary nor desirable to model the environment minutely [2]. The purpose of having a model of the environment that captures the spatiotemporal (i.e., spatial and temporal) correlation of multimodal sensory inputs is to select an action suited to the situation in the dynamic environment. Based on these ideas, we report our attempt to model the dynamic environment and evaluate it using the human-like autonomous robot WAMOEBA-2Ri (Fig. 1). Our basic modeling idea is to compress the multidimensional aspects of the dynamically changing environment into the robot's own bodily data, which are much easier to handle.



Fig. 1. Human-like autonomous robot WAMOEBA-2Ri, which has 7-degree-of-freedom arms.

Fig. 2. Integration of Kohonen Maps.

2. Modeling of Dynamic Environment

2.1. Active Interaction: Grasping

Proper perception needs proper action. We should not discuss the mechanism of perception only as an independent function, but consider it as one aspect of a system coupled with the agent's behaviors in the environment [6]. Well-suited sensory-motor coordination reduces the complexity of the sensor input space and makes the inputs more structured. For human beings, grasping is one of the most effective behaviors for perceiving an unknown object. When an infant finds something novel in the environment, he or she sometimes grasps it and brings it close to his or her face. This behavior enables the infant to observe the object more closely and to normalize its scale automatically [7]. The infant then starts interacting with it using his or her own body (e.g., turning it, putting it into the mouth, throwing it away). This process is quite dynamic, active, and flexible, and in its course the infant obtains multimodal data about what the object is. Such physical interactions are also quite useful for autonomous robots when they learn or adapt to the dynamic environment. Interaction between the dynamic environment and an autonomous robot with a proper sensory-motor system yields a spatiotemporal correlation of its multimodal sensor data. Only through the multiple sensor inputs acquired by its own behavior can an autonomous robot obtain a model of the changing environment that is meaningful for generating adaptive behaviors.

2.2. Integration of Multimodal Sensor Data

Physical interaction between an autonomous agent and the environment generates coherence between multiple sensor data. This coherence plays an important role in making the agent behave rationally toward the object, because the coherent sensor inputs, each coupled with an actuator, activate those actuators in a temporally coordinated way and consequently enable an adaptive, coordinated behavior using the whole body (Fig. 2).
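As a rough illustration of this idea (not part of the original system), the following Python sketch estimates the temporal coherence between two hypothetical sensor streams recorded during a grasp, e.g., gripper pressure and shoulder torque, by computing their normalized cross-correlation. The signal names, sampling rate, and synthetic data are assumptions for illustration only.

import numpy as np

def normalized_xcorr(x, y, max_lag):
    """Normalized cross-correlation of two 1-D sensor streams for lags in [-max_lag, max_lag]."""
    x = (x - x.mean()) / (x.std() + 1e-12)
    y = (y - y.mean()) / (y.std() + 1e-12)
    lags = np.arange(-max_lag, max_lag + 1)
    corr = []
    for lag in lags:
        if lag < 0:
            c = np.dot(x[:lag], y[-lag:]) / len(x[:lag])
        elif lag > 0:
            c = np.dot(x[lag:], y[:-lag]) / len(x[lag:])
        else:
            c = np.dot(x, y) / len(x)
        corr.append(c)
    return lags, np.array(corr)

# Hypothetical 10 s of data sampled at 100 Hz during a grasp: pressure rises as the
# gripper closes, and shoulder torque rises shortly afterwards as the object is lifted.
t = np.linspace(0.0, 10.0, 1000)
pressure = np.clip(np.sin(0.5 * (t - 2.0)), 0, None) + 0.05 * np.random.randn(t.size)
torque = np.clip(np.sin(0.5 * (t - 2.5)), 0, None) + 0.05 * np.random.randn(t.size)

lags, corr = normalized_xcorr(pressure, torque, max_lag=200)
best = lags[np.argmax(corr)]
print(f"strongest coupling at lag {best / 100.0:.2f} s, correlation {corr.max():.2f}")

A strong peak at a consistent lag is one simple signature of the coherence described above: the two modalities vary together in a way that is structured by the grasping behavior itself.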

3. Experiment

3.1. WAMOEBA-2Ri

For this study we developed an autonomous robot named WAMOEBA-2Ri to investigate robot-human emotional communication and the autonomous intelligence of robot systems, incorporating a mechanism inspired by the endocrine system into both the hardware and the software [8][9]. WAMOEBA-2Ri is a self-contained wheeled robot with 20 degrees of freedom in total. It has sensors that acquire not only external information, through the CCD cameras, the microphones, and the torque sensors, but also internal information such as battery voltage, electrical current, and motor temperature. More detailed hardware specifications of WAMOEBA-2Ri are given in Table 1. WAMOEBA-2Ri evaluates this information using self-preservation functions associated with these sensors. Each function is a kind of fuzzy membership function that evaluates the durability of the robot hardware. Based on the result of this evaluation, WAMOEBA-2Ri continuously controls the sensor ranges, the motor speeds, the cooling fan output, and the power switches of every motor and sensor. WAMOEBA-2Ri communicates with human beings in an emotional way through the expressions produced by these internal and external bodily changes.
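The paper does not give the exact form of these self-preservation functions, so the following is only a minimal sketch of the idea under our own assumptions: a piecewise-linear membership function maps motor temperature to a durability value in [0, 1], which is then used to scale the cooling fan output and cap the driving speed. All names and threshold values here are hypothetical.

def durability(temp_c, safe=40.0, critical=70.0):
    """Fuzzy membership: 1.0 while the motor is cool, falling linearly to 0.0 at a critical temperature (thresholds are hypothetical)."""
    if temp_c <= safe:
        return 1.0
    if temp_c >= critical:
        return 0.0
    return (critical - temp_c) / (critical - safe)

def self_preservation_control(temp_c, max_speed=3.5, max_fan=1.0):
    """Scale actuator limits by the evaluated durability (illustrative policy, not the original one)."""
    d = durability(temp_c)
    fan_output = max_fan * (1.0 - d)   # hotter motor -> stronger cooling
    speed_limit = max_speed * d        # hotter motor -> slower driving
    power_on = d > 0.0                 # cut the power switch at the critical temperature
    return fan_output, speed_limit, power_on

print(self_preservation_control(55.0))  # e.g. (0.5, 1.75, True)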



Table 1. Hardware specifications of WAMOEBA-2Ri.

Dimensions: 1390 (H) x 990 (L) x 770 (W) mm
Weight: approx. 130 kg
Operating Time: approx. 50 min
Max Speed: 3.5 km/h
Payload: 2 kgf/hand
External DOF: Neck 2; Vehicle 2; Arm 4 x 2 = 8; Hand 1 x 2 = 2
Internal DOF: Cooling Fan 10; Power Switches 4
Image Input: CCD Cameras x 2
Audio Input: Microphones x 3
Audio Output: Speaker
External Sensors: Distance Detection - Ultrasonic Sensors x 4; Joint Torque - Torque Sensors x 6; Grip Detection - Photoelectric Sensors x 2, Pressure Sensors x 2; Object Detection - Touch Sensors x 8
Internal Sensors: Temperature - Thermometric Sensors x 8; Battery Voltage - Voltage Sensor; Motor Current - Current Sensor
Material: Duralumin, Aluminum
CPU: Pentium III (500 MHz) x 2
OS: RT-Linux

Fig. 3. A close view of WAMOEBA-2Ri grasping an object. The object can be placed anywhere on the table; capturing it with the two cameras triggers the reaching behavior.


3.2. Grasping

From the point of view of evolution, the ultimate goal of a vision system is not only to recognize objects, but to enable an autonomous agent to generate its behavior effectively in the real world [10]. In this experiment, we designed the grasping behavior as part of a sensory-motor coordination coupled with the sensor input, namely as a reflexive action triggered by the color input of the robot's vision. The vision system of this robot searches for an area whose color is more distinctive than the rest of the scene and whose scale is suitable for grasping. Consequently, the robot can grasp a suitable object wherever it is placed on the table. In designing this vision-reflex system, our methodology is also based on ideas about the evolution of color vision in primates: D. Osorio et al. claimed that the development of color vision enabled primates to discriminate fruits against the background [11]. Discriminating a uniquely colored area from the environment is likewise an important competency for an autonomous robot that moves around in the human habitat. The grasping behavior is shown in Fig. 3.
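The paper does not detail the vision-reflex implementation, so the following Python sketch is only one way such a color-driven reach trigger could look: it scores image regions by how far their mean UV chrominance lies from the scene average, and accepts the most distinctive region if its apparent size falls within a graspable range. The grid size, thresholds, and function names are assumptions for illustration.

import numpy as np

def most_distinctive_region(yuv_img, grid=8):
    """Split a YUV image into grid x grid cells and score each cell by the distance of
    its mean (U, V) chrominance from the whole-image mean (U, V)."""
    h, w, _ = yuv_img.shape
    scene_uv = yuv_img[:, :, 1:3].reshape(-1, 2).mean(axis=0)
    best_score, best_cell = -1.0, None
    for i in range(grid):
        for j in range(grid):
            cell = yuv_img[i * h // grid:(i + 1) * h // grid,
                           j * w // grid:(j + 1) * w // grid, 1:3]
            score = np.linalg.norm(cell.reshape(-1, 2).mean(axis=0) - scene_uv)
            if score > best_score:
                best_score, best_cell = score, (i, j)
    return best_cell, best_score

def should_reach(score, apparent_size, min_score=20.0, min_size=0.02, max_size=0.25):
    """Trigger the reaching reflex only if the region is color-distinctive and its scale is
    graspable (apparent_size is the region's fraction of the image; thresholds are hypothetical)."""
    return score > min_score and min_size <= apparent_size <= max_size

# Example with a random image standing in for one of the two CCD camera inputs.
img = np.random.randint(0, 256, size=(240, 320, 3)).astype(float)
cell, score = most_distinctive_region(img)
print(cell, score, should_reach(score, apparent_size=0.05))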

3.3. Experimental Task: Classification

Categorization is an underlying competency of intelligence. For categorization by a robot, it is important that a meaningful behavior emerges which can exploit the mechanism of sensory-motor coordination. The essential mechanism of categorization is the parallel sampling of the environment by sensor maps through the same or different sensor modalities. The task of this experiment is to classify three objects using the robot's own sensors and actuators on WAMOEBA-2Ri. The objects and the experimental environment are shown in Fig. 4, and the characteristics of the three objects are summarized in Table 2. After grasping an object, the robot holds out its arm horizontally; this behavior, too, is triggered by the two cameras capturing the object. Holding the arm out horizontally maximizes the values of the torque sensor at the shoulder joint, which means that the torque sensor is most sensitive to the differences between the weights of the three objects at that moment. This is a simple example of sensory-motor coordination (we human beings, thanks to our dexterous and quite sensitive hands and arms, hardly need such an effort when comparing the weights of two objects). The input data of the torque sensor maximized in this way, the color data (i.e., the YUV values of the object), the values of the pressure sensors at the gripper, and the width of the gripper (hereinafter, the pressure and width data are combined and called “hand data”) are then integrated using Kohonen maps. Using Kohonen maps allows the robot to organize models of the three objects based on its embodiment without the designer's intervention, owing to the self-organizing characteristic of the maps. To investigate the robustness of this method, we added each type of noise to the integrated dataset of the three objects and examined the robustness of the maps for the categorization task under these various conditions. The noise is generated from random numbers with a maximum amplitude of 0.5 and added to the original normalized data of each sensor modality.
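The paper does not publish its implementation, so the following Python sketch is only a minimal reconstruction of the described architecture under our own assumptions: one small self-organizing (Kohonen) map per modality (color, torque, hand data), whose winner coordinates are fed into an integration map, with uniform noise of amplitude 0.5 optionally added to the normalized inputs. Map sizes, learning rates, and the synthetic data are illustrative only.

import numpy as np

class SOM:
    """A tiny self-organizing (Kohonen) map on a square grid."""
    def __init__(self, grid=6, dim=3, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = grid
        self.w = rng.random((grid, grid, dim))
        self.coords = np.stack(np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij"), axis=-1)

    def winner(self, x):
        d = np.linalg.norm(self.w - x, axis=-1)
        return np.unravel_index(np.argmin(d), d.shape)

    def train(self, data, epochs=50, lr=0.5, sigma=2.0):
        for e in range(epochs):
            a = lr * (1 - e / epochs)                 # decaying learning rate
            s = sigma * (1 - e / epochs) + 0.5        # decaying neighborhood radius
            for x in data:
                win = np.array(self.winner(x))
                h = np.exp(-np.sum((self.coords - win) ** 2, axis=-1) / (2 * s ** 2))
                self.w += a * h[..., None] * (x - self.w)

def add_noise(x, amp=0.5, rng=np.random.default_rng(1)):
    """Uniform random noise with maximum amplitude 0.5, as in the experiment."""
    return np.clip(x + rng.uniform(-amp, amp, size=x.shape), 0.0, 1.0)

# Hypothetical normalized samples for one object: YUV color (3-D), shoulder torque (1-D),
# and "hand data" = gripper pressure and width (3-D here).
rng = np.random.default_rng(2)
color  = np.clip(0.1 * rng.standard_normal((40, 3)) + [0.8, 0.3, 0.4], 0, 1)
torque = np.clip(0.1 * rng.standard_normal((40, 1)) + [0.7], 0, 1)
hand   = np.clip(0.1 * rng.standard_normal((40, 3)) + [0.6, 0.6, 0.2], 0, 1)

# Lower layer: one map per modality; each sample is abstracted to 2-D winner coordinates.
lower = {"color": SOM(dim=3), "torque": SOM(dim=1), "hand": SOM(dim=3)}
for name, data in [("color", color), ("torque", torque), ("hand", hand)]:
    lower[name].train(data)

def integrated_input(c, t, h):
    """Concatenate the normalized winner coordinates of the three lower maps."""
    parts = [np.array(lower[n].winner(x)) / (lower[n].grid - 1)
             for n, x in [("color", c), ("torque", t), ("hand", h)]]
    return np.concatenate(parts)

upper = SOM(dim=6)  # integration layer over the 3 x 2-D abstracted inputs
upper.train(np.array([integrated_input(c, t, h) for c, t, h in zip(color, torque, hand)]))
print("integration-map winner (clean):", upper.winner(integrated_input(color[0], torque[0], hand[0])))
print("integration-map winner (hand noise):", upper.winner(integrated_input(color[0], torque[0], add_noise(hand[0]))))

Because the lower maps reduce every modality to coordinates of the same dimensionality, no single noisy modality can dominate the integration layer by sheer dimension; this is the property examined quantitatively in Section 3.5.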



Fig. 4. The three objects used in the classification experiment. These objects have different characteristics, which are summarized in Table 2. Hereinafter we call these objects a, b, and c from left to right.

Table 2. Characteristics of the three objects. These differences are apparent to us, but not necessarily to other agents.

object  color   stiffness  diameter  weight
a       red     hard       wide      heavy
b       yellow  hard       thin      light
c       green   soft       wide      light

3.4. Result

Kohonen maps displaying the distributions of the firing rates of the neurons in the integration layer are shown in Fig. 5. The three maps in the top row clearly indicate that the multimodal sensor data have been integrated successfully, and even a brief glance at the three maps lets us distinguish the objects. The other maps show that the modeling of the three objects based on the robot embodiment remains robust even when the objects are moved anywhere on the table. The robot could generate embodied models of these objects using its own sensory-motor system in the dynamic environment. By grasping an object, the robot could organize a robust model composed of the color data of the image, the pressure at the hand, the width of the gripper, and the torque loaded on the shoulder by the object. Compared with the integrated data with no noise, the map of the integrated data of object 'b' with hand noise (in the middle of the third row from the top) shows that noise in the hand data has a relatively large influence on the integrated data for 'b'; nevertheless, it is clear that the integration of the multimodal sensor dataset results in a robustly organized map that makes the classification task easy in spite of all the disturbances applied in this experiment.

3.5. Additional Experiment

To investigate the robustness of the multimodal integration examined above, we compared its result with one obtained using only a single-layer Kohonen map to integrate the multimodal sensor inputs. In this comparison, the same type of random noise described in Section 3.3 was added to each sensor modality. The integrated data with no noise were classified clearly even with the single-layer map, but the integrated data with hand noise were disturbed particularly strongly. A visual comparison between the multi-layer Kohonen maps and the single-layer ones shows this difference easily and clearly (data not shown). To investigate the difference quantitatively, however, we calculated the Euclidean distances between the integrated data with no noise and those with each kind of noise in both cases; this calculation amounts to a test of coincidence of the firing rates over all neurons. The result is shown in Fig. 6. The graph shows that the data with hand noise are disturbed much more in the case of the single-layer maps. Additionally, it shows that with single-layer maps the disturbance of the Kohonen map caused by sensor noise grows in proportion to the dimension of the sensor input. In the Kohonen algorithm, the influence on the organized map is determined by the ratio of each component in the input vector. Therefore, in the multi-layer maps, abstracting the sensor inputs in the lower maps and equalizing their dimensions greatly reduces the influence of each kind of noise on the organization of the upper map. This result also means that it is not strictly necessary to use a Kohonen map as the lower layer to equalize the dimensions of the sensor inputs; any other algorithm able to abstract the sensor inputs could play the same role as the Kohonen map in this experiment.
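As a small illustration of this quantitative comparison (our own sketch, not the authors' code), the disturbance of an organized map can be measured as the Euclidean distance between the flattened firing-rate maps obtained from the noise-free and the noisy integrated data. The map size and values below are hypothetical.

import numpy as np

def map_disturbance(firing_clean, firing_noisy):
    """Euclidean distance between two firing-rate maps of the integration layer
    (flattened over all neurons); larger values mean the noise disturbed the map more."""
    return float(np.linalg.norm(firing_clean.ravel() - firing_noisy.ravel()))

# Hypothetical 6 x 6 firing-rate maps for one object with and without hand noise.
rng = np.random.default_rng(0)
clean = rng.random((6, 6))
noisy = np.clip(clean + 0.2 * rng.standard_normal((6, 6)), 0.0, 1.0)
print("disturbance:", map_disturbance(clean, noisy))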


Fig. 5. Kohonen maps displaying the distribution of the firing rates in the integration layer. Columns correspond to objects a, b, and c; rows show the integrated data with no noise, with color noise, with hand noise, and with arm noise.

Fig. 6. Comparison between multi-layer and single-layer Kohonen maps in terms of robustness for classifying the three objects. The vertical axis is the Euclidean distance (disturbance) from the no-noise case; groups correspond to eye noise, hand noise, and arm noise for objects a, b, and c.


4. Discussion

In this experiment, the robot dealt with the static data of the objects and organized a model of the environment by integrating Kohonen maps assigned to each sensor modality. By grasping objects with its own sensory-motor system, the robot could reduce the dynamics of the objects and convert them into simple, static, bodily data from its own body. As a result, the multimodal integrated dataset led to robust categorization of the three objects. However, the environment does not have only such static aspects; it also has invariances (i.e., unalterable characteristics that emerge during change) that are generated in the system-environment interaction. In the case of human beings, some invariances emerge when the dynamical system composed of our bodies and the perceived objects generates an appropriate behavior; in other words, we find “invariance in motion”. For living creatures with the competency of perception, perceiving such macro patterns of the environment dominates their behaviors. It is quite important for robots moving around in a dynamic environment to perceive such invariances directly, because an invariance represents a macro property of the environment. We therefore have to consider how an autonomous agent can acquire such invariances in the dynamic environment using its own competencies. As a simple example that can easily be tried with the robot, we conceived an additional task: shaking the grasped object. First, two objects (e.g., cylinders) with the same weight and different heights are prepared. The shaking behavior reveals a characteristic property of the objects: the difference in their heights appears as a difference in the time-series data of the torque loaded on the moving (shaking) joint. This is caused by the difference between the two objects in the moment of inertia of the dynamical system consisting of the arm and the object. Here the moment of inertia is an invariance that emerges through the behavior of shaking the object; an agent can never perceive this property of the objects without moving dynamically. In this additional experiment, the robot can perceive the difference in moment of inertia between the two objects directly through the shaking behavior, based on its embodiment. This is a simple example of environmental properties, namely invariances, emerging through dynamics. As described above, we have to regard perception for an autonomous robot as a competency for controlling its behavior in accordance with macro properties of the dynamic environment, and explore the further emergence of such dynamic sensory-motor coordination.
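To make the role of the moment of inertia concrete, the following Python sketch (our own illustration, not from the paper) approximates each grasped cylinder as a uniform thin rod pivoting about the shaking joint, neglects the arm's own inertia, and compares the torque needed to impose the same angular acceleration on two cylinders of equal mass but different height. The masses, heights, and acceleration are hypothetical values.

def rod_inertia_about_end(mass_kg, length_m):
    """Moment of inertia of a uniform thin rod about one end: I = (1/3) * m * L^2
    (a rough stand-in for a grasped cylinder pivoting about the shaking joint)."""
    return mass_kg * length_m ** 2 / 3.0

mass = 0.5                               # both cylinders weigh the same (kg)
heights = {"short": 0.10, "tall": 0.30}  # different heights (m)
alpha = 5.0                              # imposed angular acceleration of the shake (rad/s^2)

for name, h in heights.items():
    inertia = rod_inertia_about_end(mass, h)
    torque = inertia * alpha             # tau = I * alpha
    print(f"{name}: I = {inertia:.4f} kg*m^2, required torque = {torque:.4f} N*m")

# Although both cylinders have identical weight (and hence identical static torque when
# held still), the tall one needs nine times the torque during the shake, so the
# difference shows up only in the time-series torque data of the moving joint.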

5. Conclusion

We proposed a robust way to model the dynamically changing environment through physical interaction (grasping in this paper), together with active reaching coupled with the visual input and the abstraction of multimodal sensor data based on robot embodiment. As an evaluation, we reported the results of a classification experiment on objects that could be moved anywhere on the table, using the human-like autonomous robot WAMOEBA-2Ri. Finally, we discussed the possibility of developing this method to a further level in which the dynamic data generated by the negotiation between the system (robot) and the environment are taken into account. We described that one promising approach is to use an autonomous robot with an appropriate sensory-motor system together with the dynamics generated by a variety of active robot-environment interactions. It is only by starting from this stage that we can step forward to the hard problem of generating active and flexible behaviors in autonomous systems, because the essential property common to both characteristics is being dynamic.

References

[1] B. Webb and T. R. Consi, Biorobotics. Cambridge, MA: MIT Press, 2001.
[2] R. A. Brooks, "Intelligence without Representation," Artificial Intelligence, 47, pp. 139-160, 1991.
[3] S. Nolfi, "Adaptation as a more powerful tool than decomposition and integration," in T. Fogarty and G. Venturini (Eds.), Proceedings of the Workshop on Evolutionary Computing and Machine Learning, 13th International Conference on Machine Learning, University of Bari, Italy, 1996.
[4] R. D. Beer, "Toward the evolution of dynamical neural networks for minimally cognitive behavior," in P. Maes, M. Mataric, J.-A. Meyer, J. Pollack, and S. W. Wilson (Eds.), From Animals to Animats 4: Proceedings of the 4th International Conference on Simulation of Adaptive Behavior, pp. 421-429. Cambridge, MA: MIT Press, 1996.
[5] R. Bajcsy, "Active Perception," Proceedings of the IEEE, 76, pp. 996-1005, 1988.
[6] P. Dario, M. Rucci, C. Guadagnini, and C. Laschi, "Integrating Visual and Tactile Information in Disassembly Tasks," Proceedings of the '93 ICAR International Conference on Advanced Robotics, pp. 191-196, 1993.
[7] J. J. Gibson, The Ecological Approach to Visual Perception. Boston: Houghton Mifflin, 1979.
[8] E. Bushnell and J. P. Boudreau, "Motor development in the mind: The potential role of motor abilities as a determinant of aspects of perceptual development," Child Development, 64, pp. 1005-1021, 1993.
[9] S. Sugano and T. Ogata, "Emergence of Mind in Robots for Human Interface - Research Methodology and Robot Model," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '96), pp. 1191-1198, 1996.
[10] T. Ogata and S. Sugano, "Emotional Communication Between Humans and the Autonomous Robot which has the Emotion Model," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '99), pp. 3177-3182, 1999.
[11] R. Pfeifer and C. Scheier, Understanding Intelligence. Cambridge, MA: MIT Press, 1999.
[12] D. Osorio and M. Vorobyev, "Colour vision as an adaptation to frugivory in primates," Proceedings of the Royal Society of London B, 263, pp. 593-599, 1996.


