The International Journal of Virtual Reality, 2008, 7(1):9-14
Utilizing Virtual Environments to Enable Learning in Human-robot Interaction Scenarios

Ayanna M. Howard and Sekou Remy
Georgia Institute of Technology, Atlanta, GA 30332, USA

Manuscript received on 16 July, 2007. E-Mail: [email protected].
Abstract—This paper presents two different approaches for utilizing virtual environments to enable learning for both human and robotic students. In the first approach, we showcase a 3D interactive environment that allows a human user to learn how to interact with a virtual robot before interacting with a physical robot. In the second approach, we present a method that utilizes a simulation environment to provide feedback to a human teacher during a training session in order to concurrently allow adaptation of the learning process for both the teacher and the robotic student. We provide details of the approaches in this paper and report the learning outcomes for the two different scenarios.

Index Terms—Robot learning, virtual instructors, human-robot interaction.
I. INTRODUCTION
As the inclusion of robotics in everyday life begins to become a reality, the capability of both humans and robots to learn from each other gains in importance. Robots that are integrated into dynamic environments, such as the home, must have the ability to be taught new tasks and new objectives by their human users. And yet, we cannot assume that the human user is a skilled expert in robot interaction, or that they understand how to efficiently explain a new task without physically demonstrating the task to their robot assistant.

In the early 1980s, Trist [1] discussed the concept of the socio-technical system in order to formalize the reciprocal interrelationship between humans and machines. The theory states that humans can be made to adjust to technologies, and technologies can correspondingly be made to adjust to humans. These adjustments must occur interdependently in order to maximize the usage of technology within a social organization, since processes that are optimal for humans may not be optimal for machines. Following this socio-technical systems approach, we seek to develop a process that allows both humans and robots to adapt together in familiar ways. This is accomplished by utilizing a virtual environment to enable learning for both human and robotic students.

To create a mechanism for this learning process to occur, virtual environments function as the classroom setting. Assuming that the virtual environment implements the same physical laws as those in the real world, the learning attained thus becomes transferable. This type of training, in a high-fidelity environment, not only provides the ability to record, monitor, and observe the progress and performance of deployed human-robot teams, but also allows the virtual environment to implicitly become the virtual instructor for users operating the simulated robot within it. In essence, through human interaction with the robot, the virtual environment becomes the entity that guides the team in improving their performance.

In this paper, we discuss two different approaches for this integration of human and robot students in a virtual environment. In the first approach, using the concept of exploratory learning within a 3D interactive environment, a training process is discussed that enables a human to learn how to interact with their robot assistant in a virtual environment. In the second approach, a hybrid instruction system is presented that concurrently adapts both human and robot interaction in order to train a robot to perform new tasks. In this work, the human seamlessly transitions between teacher and student roles, while the robot adaptively learns during the teaching process. By introducing these two learning approaches, the overall goal is to provide a basis on which to improve our approaches for human-robot team interaction within the environment [2].
II. A VIRTUAL ENVIRONMENT TO LEARN SKILLS FOR ROBOT INTERACTION
2.1 Exploratory Learning within a Virtual Environment

The concept of exploratory learning is defined as the process of learning new skills, typically through a trial-and-error process of actual use. Exploratory learning approaches using virtual environments have ranged from assisting in learning unfamiliar software applications [3] to learning difficult mathematical concepts [4]. Although research in exploratory learning has been integrated with virtual environments, there is minimal focus on applying it to human-robot interaction (HRI) scenarios. HRI has many unique characteristics [5, 6], including the fact that, although humans might interact with a virtual representation of a robotic system, there is also a remote device with a physical embodiment that is controlled through the interaction. This difference means that the human operator must not only provide direction to the robotic device through the interaction, but must also be capable of receiving feedback to ensure adequate control is applied. In [7], it was suggested that human users prefer to learn novel device usage through exploration in the context of real tasks. For robot applications, allowing the human to learn during real task implementation poses a major challenge due to the
possibility of damaging the robot, issues of intermittent communication delays, and lack of knowledge of robot capabilities. We therefore focus on the use of a virtual environment that allows exploratory learning for robotic devices. This provides a means to allow adequate training for a user to become more effective in implementing a new task using a robotic device, without requiring direct interaction with the hardware. We showcase this environment and discuss its use for increasing operator efficiency [8].

2.2 The Virtual Environment: HumAnS-3D

HumAnS-3D (Fig. 1) is a 3D virtual test environment that has been developed to allow user access to a virtual representation of the world and control of a virtual robot. The graphical user interface is also designed to allow connection of the virtual robot, viewable by the human user, to the real robot for seamless integration with the real-world environment. For our application, we utilize the Sony ERS-7 robot for human interaction.
Fig. 1. HumAnS-3D environment showing the virtual robot, which is connected to the ERS-7 robot. See Color Plate 2.

There are two modes of interaction with which a human user can control the robot. A direct interface consisting of five buttons allows the user to move the robot by turning left, turning right, moving forward, moving backward, or stopping. The second mode of interaction enables human-robot collaboration at a very high level, with the goal of making the human-robot interaction as close as possible to human-human interaction. A human user can activate a number of autonomous robot behaviors using button-based menus on the software panel. The buttons are subdivided into four categories based on the amount of information that must be shared between human and robot.

The first category represents on/off behaviors: A) Grasp/Release allows the user to manipulate objects within the environment; B) Lift/Lower allows the user to move the robot's head position up and down; C) Mate/Unmate commands the robot to slowly approach an object before manipulation; and D) Track/Untrack commands the robot to follow closely behind an object while moving within the environment. Pressing one of these buttons once performs the first action; pressing it again performs the complementary action. The next category of behaviors is more computationally intensive and requires additional user inputs: A) Identify gives known information concerning an object specified by the human user; B) Locate gives position information related to an object specified by the human user; C) Plan plans a navigation path from the current location to an object location specified by the human user; and D) Model creates a map containing the objects within the environment.

Fig. 2. Snapshot of interaction through language discourse in the HumAnS-3D environment.
There are also two traverse buttons, supervised and unsupervised, depending on whether the user wants the robotic platform to plan out its own path within the environment or follow a predetermined course. Finally, the 'Ask for Help' button allows the robot to obtain other necessary
information through discourse with the human user when it is unable to complete a task or encounters unknown situations. As interaction is a key component for learning, one of the software features is a natural language processor, which uses a rule-based approach to ensure correct translation from words to robot actions (Fig. 2).
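To make the rule-based translation concrete, the following Python sketch shows how a typed command might be mapped to the behavior buttons described above. The rule table, function name, and fallback to 'Ask for Help' are illustrative assumptions, not the actual HumAnS-3D implementation.

```python
# Hypothetical keyword rules; the real HumAnS-3D rule set is not reproduced here.
RULES = {
    "grasp": "Grasp/Release", "release": "Grasp/Release",
    "lift": "Lift/Lower", "lower": "Lift/Lower",
    "mate": "Mate/Unmate", "unmate": "Mate/Unmate",
    "track": "Track/Untrack", "untrack": "Track/Untrack",
    "identify": "Identify", "locate": "Locate",
    "plan": "Plan", "model": "Model",
}

def parse_command(utterance: str) -> tuple[str, str | None]:
    """Translate a user utterance into a (behavior, object) pair."""
    words = utterance.lower().split()
    for word in words:
        if word in RULES:
            # Naively treat the final word as the object argument, if any.
            obj = words[-1] if words[-1] != word else None
            return RULES[word], obj
    # No rule matched: fall back to discourse with the user.
    return "Ask for Help", None

print(parse_command("locate the red block"))  # -> ('Locate', 'block')
print(parse_command("grasp"))                 # -> ('Grasp/Release', None)
```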
2.3 Implementation and Analysis

Our focus is to utilize the virtual environment to enable a user to develop the skills necessary to control a mobile robot through implementation of exploratory learning practices. Our experimental operator test set consists of six novice users with no previous experience with either the user interface or the robots themselves. The users were segmented into two groups of three members each: one group that would employ exploratory learning techniques and one that would not. Each member of the exploratory learning group was allowed two minutes to experiment with the interface, which was connected directly to the physical robot.

After learning, each member of the exploratory learning group was given a single novel task: to move the robot to a target block unit positioned within the environment and to push it to a designated goal location (Fig. 3). Each member of the other test group was provided with instructions and the same novel task requirements, but without the benefit of experimenting with the interface. From this test setup, we extract two parameters: task execution time and number of commands issued to the robot. TABLE 1 depicts the average values for each group, where Trained is the designated label for the exploratory learning group. We also document the task execution time and average number of commands associated with an expert user in order to provide a baseline for comparison (TABLE 2). In this case, an expert user (who was not a member of either test group and had previous experience interfacing with the robot and in task implementation) was so designated after the 95% confidence interval of their task-time distribution reached 2.5 seconds, such that

$$1.96\left(\frac{\sigma}{\sqrt{N}}\right) \leq 2.5 \qquad (1)$$

where N is the number of trials and σ is the standard deviation associated with those trials. We use this calculation in order to determine performance convergence for the task. It was calculated that this point occurred after completing 18 iterative trials of the box-pushing exercise.

From this data, we can determine that the benefit of using the virtual environment to improve the skill set of a novice user is a reduction in task learning time of 22.2%. This reaffirms our theory that the use of the virtual environment for learning to control a robotic device provides enough training to allow a user to become more effective in implementing a new task in a novel situation. Thus, we show a case in which exploratory learning can be a suitable substitute for practice when training humans to interact with robots for task achievement.

TABLE 1: EXECUTION PARAMETERS FOR HUMAN CONTROL OF ROBOTS.

            Average Task Time (sec)   Std. Dev.   Average # Commands   Std. Dev.
Untrained   63.00                     4.36        27.33                2.31
Trained     54.67                     6.35        28.33                2.31

TABLE 2: EXECUTION PARAMETERS FOR EXPERT USER OF VIRTUAL ENVIRONMENT.

         Average Task Time (sec)   Average Number of Commands
Expert   46.00                     24
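For concreteness, the convergence criterion of Eq. (1) can be checked over a set of recorded task times as in the Python sketch below; the sample task times are hypothetical and serve only to illustrate the calculation.

```python
import math
import statistics

def has_converged(task_times: list[float], z: float = 1.96,
                  bound: float = 2.5) -> bool:
    """Eq. (1): declare convergence when z * sigma / sqrt(N) <= bound (seconds)."""
    n = len(task_times)
    sigma = statistics.stdev(task_times)  # sample standard deviation over the trials
    return z * sigma / math.sqrt(n) <= bound

# Hypothetical task times (in seconds) from repeated box-pushing trials.
trials = [52.1, 49.8, 47.5, 46.9, 46.2, 46.0, 45.8, 45.9]
print(has_converged(trials))  # True for this illustrative data
```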
Fig. 3. Screen snapshots of block pushing task.
III. A VIRTUAL ENVIRONMENT TO CONCURRENTLY TRAIN HUMANS AND ROBOTS

3.1 Interactive Learning within a Virtual Environment

Much of the focus of learning in virtual environments centers on a human student acquiring or improving expertise within a given set of tasks. The emphasis in the term "Virtual Learning Environment" (VLE) appears to be on establishing a learning environment that, although virtual, has properties or qualities that enhance the student learning process. The improved (or
novel) expertise acquired through the use of the VLE is typically sought after, since studies have shown improved performance when the subject is subsequently required to perform in a real environment (as discussed previously). In Section II, an application was presented that enabled a human user to become more effective in implementing a new task using a robotic platform integrated in a physical environment, after they were provided with practice sessions in a virtual environment. In this section, we present an alternative approach that uses the concept of interactive learning applied within a virtual environment.

Interactivity is a complex term that goes beyond user activities such as clicking a mouse or moving a joystick. Interactivity can be defined as the process of actively engaging with another entity and adapting to it (as the other agent possibly adapts as well) over a definitive time period. Interactive Learning (IL) is an approach that uses interaction as an essential component of learning, such that a teacher both provides instruction and receives real-time feedback from the student during the learning process. This allows the teacher to adapt to the learning scenario and thus account for differences such as the student's rate of learning, the student's previous knowledge base, and the complexity of the subject matter to be learned. IL was demonstrated in [9], in which a user interacted with an adapting robotic agent as they collaborated to perform a novel task. (After the interaction session, the robot was equipped with the capability of performing the new task, and the learning session could be supplemented to improve the robot's performance at any time.)

Analyzing the strategies a human teacher utilizes while interfacing with an "intelligent" robotic agent that learns from the interaction is very important. As VLEs are increasingly used in the 21st century, populating the environment with autonomous, virtually situated agents is likely to be a key enabler in terms of providing avenues for personal entertainment or even for creating realistic worlds. If robotic agents can learn from users in VLEs, this also opens the possibility of training agents in virtual environments and, through a process of reverse immersion, enabling them to function in the real world. To some degree this approach has already been applied [10], but advances in VLE technology in terms of computational complexity, graphics, and physics engines have not yet been capitalized upon in mainstream research [11]. One capability that seems to be lacking is that which allows command, control, and interaction with virtual robots. Interactive learning within a virtual environment is the process we present to enable this type of capability; in this work we report some preliminary findings and future directions.
3.2 The Virtual Test Environment

In this research, we would like to assess the effect interactive learning has on both the human teaching and the robot learning process. As such, a 2D test environment was utilized to evaluate the learning process derived from the interactive learning approach. A key requirement of the virtual environment was to provide a realistic simulation of a robotic platform embedded within a workspace that allows the human teacher to provide instruction, as well as receive feedback from the robotic student. The environment, KIKS 2.2.0 [12], provides a realistic simulation of a Khepera robot, replete with noisy infrared sensors and noisy actuator motors. A keyboard, mouse, and joystick are used to provide input to the environment. Feedback from the environment is provided through the use of dialog boxes and the environment itself. A dashboard is also provided, which depicts current sensor and actuator values for all components on the robot, in addition to virtual sensors such as distance traveled (a value inferred from the wheel encoders).

The motivation for the simulation environment was to create a modular, extensible tool that enabled Interactive Learning with a human user and a pre-selected (virtual) robot. The overall objective was to provide the ability for the human teacher to transfer task knowledge to the robot through interactive demonstration. The task involved demonstrating a maze navigation behavior. Based on data from sensors {S_i}, where i ranged from 1 to 8, the robot was to learn the appropriate motor actuator values (a_L, a_R) such that the robot traversed the maze guided by the wall to the robot's left.

The Interactive Learning process proceeded as follows. The user was initially given the opportunity to teleoperate the robot to gain a sense of how to control it. This involved a test maze scenario in which a yellow light source was made visible and the user had to navigate the robot to it (Fig. 4). When the user reached the light, the next light source was made visible; when the user reached the fifth light source, the training session ended. The goal was not only to provide the user with practice using the joystick and interface, but also to teach the user the behavior that they would in turn have to teach the robot. This practice is in line with the exploratory learning approach discussed in the previous section. During this practice scenario, the robot did not activate its learning process.

Fig. 4. Screen snapshots (panels 1-4) of the maze navigation behavior.
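The learning algorithm used by the robotic student is not detailed in this section; purely as an illustration, the Python sketch below represents the target mapping from the eight infrared readings to the actuator pair (a_L, a_R) as a nearest-neighbor lookup over demonstrated examples.

```python
import math

class WallFollowPolicy:
    """Illustrative sensor-to-actuator mapping: S1..S8 -> (aL, aR).

    The robot's actual learner is not specified in the paper; a
    1-nearest-neighbor lookup over demonstrated (sensors, action)
    pairs is assumed here purely for the sake of the sketch.
    """

    def __init__(self) -> None:
        self.examples: list[tuple[list[float], tuple[float, float]]] = []

    def add_example(self, sensors: list[float],
                    action: tuple[float, float]) -> None:
        """Store one demonstrated pairing of sensor context and actuator values."""
        self.examples.append((sensors, action))

    def act(self, sensors: list[float]) -> tuple[float, float]:
        """Return the action whose stored sensor context is closest to the input."""
        if not self.examples:
            return (0.0, 0.0)  # no demonstrations yet: keep the motors stopped
        _, action = min(self.examples,
                        key=lambda ex: math.dist(ex[0], sensors))
        return action
```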
The IL process was then activated. During this cycle, the user was instructed to teach the robot to demonstrate the behavior by suggesting the appropriate action via the joystick whenever the robot did not demonstrate the proper action. In other words, the human teacher was free to correct the robot by providing the actuator control values (through the joystick and interface) that they felt appropriate, even if they did not utilize the full range of possible values. During this phase, the human user was cautioned not to "confuse" the robot by providing inconsistent instruction (as is expected conduct with a human student/teacher pair). The IL scenario concluded when the robot had finally reached the end goal position, as shown in Fig. 5.
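The teaching cycle just described can be summarized in code. The sketch below is a hypothetical rendering of the loop, reusing the WallFollowPolicy sketch above; the robot and joystick objects stand in for the KIKS simulator interfaces, whose actual API is not reproduced here.

```python
def interactive_learning_session(robot, policy, joystick, at_goal):
    """One IL cycle: the robot acts from its current policy, and any joystick
    trigger from the teacher overrides the action and becomes training data."""
    interactions = 0
    while not at_goal(robot):
        sensors = robot.read_sensors()           # S1..S8 infrared values
        action = policy.act(sensors)             # the student's own suggestion
        correction = joystick.poll()             # None unless the teacher triggers
        if correction is not None:
            action = correction                  # teacher's (aL, aR) overrides
            policy.add_example(sensors, action)  # learn from the correction
            interactions += 1
        robot.set_motors(*action)                # apply (aL, aR)
        robot.step()                             # advance the simulated world
    return interactions                          # teaching effort for this run
```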
3.3 Results and Future Directions

To evaluate the performance of the IL approach, two metrics were utilized: 1) number of interactions per time step and 2) performance versus time step. The first metric was primarily used to evaluate the role of the human teacher; i.e., over time, the human teacher should adaptively reduce the number of teaching suggestions provided to the robotic student. The second metric was used to evaluate the ability of the robotic student to learn using the Interactive Learning method. In this case, performance was calculated as a weighted average of distance from the desired path and the number of time steps needed to complete the given path.

A total of 10 simulation runs were executed to test the performance of the Interactive Learning approach. Fig. 6 shows the results from a typical run. As shown in Fig. 6, interaction decreased as performance increased. The results presented confirm that IL was an effective learning mechanism to transfer the maze navigation behavior. Additional results are documented in [13].
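As a concrete reading of this metric, the sketch below computes a weighted combination of mean path deviation and completion time. The weights, and any normalization that makes the plotted performance in Fig. 6 increase over time, are not reported in the paper, so equal weights and a cost-style score (lower is better) are assumed for illustration.

```python
def performance_score(path_deviations: list[float], steps_taken: int,
                      w_dist: float = 0.5, w_time: float = 0.5) -> float:
    """Weighted average of mean distance from the desired path and the
    number of time steps used; lower is better in this cost-style form.
    The paper's actual weights and normalization are assumptions here."""
    mean_deviation = sum(path_deviations) / len(path_deviations)
    return w_dist * mean_deviation + w_time * steps_taken

# Hypothetical run: small deviations from the taught wall-following path.
print(performance_score([0.5, 0.8, 0.3, 0.4], steps_taken=120))
```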
Fig. 5. Screen snapshot of navigation after Interactive Learning.
Fig. 6. (a) Number of Teaching Interactions vs. Time. (b) Robot Learning Performance vs. Time.

Fig. 7. Screen snapshots of the 3D version of the world. See Color Plate 3.
In the Interactive Learning approach, a 2D test environment was utilized for evaluation of the method. Transition of the
process must naturally now evolve into the 3D virtual world (as shown in Fig. 7). Fig. 7 depicts the same type of robot as shown in the 2D world, but now in a 3D environment. The same user interface was used to enable interaction with both environments. We found that, after training in the 2D world, the robotic student was still able to perform effectively in the 3D world. The transition to a 3D environment, however, raises several interesting questions, such as: 1) Are there any potential advantages in the use of a 3D environment for Interactive Learning? 2) What are the interface tools that would make operation in 3D effective, plausible, and efficient? 3) Does IL provide significant advantages over other learning methods in 3D environments? Future work in this area involves answering these research questions, and thus helping to fill a void in the research on Virtual Learning Environments with respect to the integration of human and virtual instructors in robotic learning scenarios.
IV. CONCLUSIONS

In this paper, we have discussed two different methods for utilizing virtual and human instructors to enable learning for both human and robotic students. In the first approach, we presented the concept of exploratory learning and the use of a 3D interactive environment that allows a human user to learn how to interact with a virtual robot. Results of using this methodology for training before interaction with a physical robot show that the use of the virtual environment for learning to control a robotic device provides sufficient training to allow a user to become more effective in implementing a new task in a novel situation. In the second approach, we presented the Interactive Learning method, which utilizes a simulation environment to provide feedback to a human teacher during a training session in order to concurrently allow adaptation of the learning process for both the teacher and the robotic student. Results show that, after exploratory learning, Interactive Learning was an effective mechanism to transfer knowledge from a human teacher to a robotic student. Future work in this research domain will involve expanding the virtual environment infrastructure and incorporating different modalities of communication, such as speech, to enhance the training process.

REFERENCES
[1] E. Trist. The Evolution of Socio-technical Systems: A Conceptual Framework and an Action Research Program, Issues in the Quality of Working Life, vol. 2, Toronto, 1981.
[2] A. Howard. A Systematic Approach to Assess Performance of Human-automation Systems, IEEE Transactions on Systems, Man, and Cybernetics, Part C, vol. 37, no. 4, pp. 594-601, July 2007.
[3] J. Rieman. A Field Study of Exploratory Learning Strategies, ACM Transactions on Computer-Human Interaction, vol. 3, no. 3, pp. 189-218, 1996.
[4] A. Bunt, C. Conati and K. Muldner. Scaffolding Self-explanation to Improve Learning in Exploratory Learning Environments, Proceedings of the 7th International Conference on Intelligent Tutoring Systems, Maceio, Brazil, 2004.
[5] T. Fong and C. Thorpe. Vehicle Teleoperation Interfaces, Autonomous Robots, vol. 11, no. 1, pp. 9-18, 2001.
[6] J. Scholtz, B. Antonishek and J. Young. Evaluation of a Human-robot Interface: Development of a Situational Awareness Methodology, Proceedings of the 37th Hawaii International Conference on System Sciences, 2004.
[7] A. L. Cox. What People Learn from Exploratory Device Learning, Proceedings of the Fourth International Conference on Cognitive Modeling, Mahwah, NJ, 2001.
[8] A. Howard and W. Paul. A 3D Virtual Environment for Exploratory Learning in Mobile Robot Control, IEEE International Conference on Systems, Man, and Cybernetics, Waikoloa, Hawaii, October 2005.
[9] N. Kubota and D. Hisajima. Interactive Learning for Partner Robot Based on Behavior Knowledge, SICE 2003 Annual Conference, vol. 1, pp. 214-219, 2003.
[10] D. F. Hougen. Learning with Holes: Pitfalls with Simulation in Learning Robot Control, Proceedings of the Twentieth International Conference on Machine Learning (ICML 2003), Washington, DC, USA, August 2003.
[11] E. Champion. Meaningful Interaction in Virtual Learning Environments, IE2005: Proceedings of the Second Australasian Conference on Interactive Entertainment, pp. 41-44, Sydney, Australia, 2005.
[12] T. Nilsson. KiKS is a Khepera Simulator, Master's Thesis, Umeå University, Sweden, 2001.
[13] S. Remy and A. M. Howard. Learning Approaches Applied to Human-robot Interaction for Space Missions, International Journal of Intelligent Automation, special issue on Soft Computing for Space Autonomy, to appear 2008.
Ayanna M. Howard is an associate professor at the Georgia Institute of Technology. Her area of research is centered on the concept of humanized intelligence, the process of embedding human cognitive capability into the control path of autonomous systems. This work, which addresses issues of autonomous control as well as aspects of interaction with humans and the surrounding environment, has resulted in over 60 written works across a number of projects, from autonomous rover navigation for planetary surface exploration to intelligent terrain assessment algorithms for landing on Mars. To date, her unique accomplishments have been documented in over 12 featured articles, including being named one of the world's top young innovators of 2003 by the prestigious MIT Technology Review journal and appearing in TIME magazine's "Rise of the Machines" article in 2004. Dr. Howard received the IEEE Early Career Award in Robotics and Automation in 2005 and is a Senior Member of the IEEE.

Sekou Remy is currently pursuing his doctorate in Electrical and Computer Engineering, with a minor in Industrial and Systems Engineering, at the Georgia Institute of Technology. He also holds a B.S. in Computer Science from Morehouse College. Remy's work focuses on ways to utilize advances in ECE, ISYE, and CS for applications in the field of robotics. With a special interest in human-robot collaboration, his current research focus is on equipping robots to learn from humans in a manner more natural for each.