in: International Symposium on Computational Intelligence in Robotics and Automation, 1997

3D-ICON BASED USER INTERACTION FOR ROBOT PROGRAMMING BY DEMONSTRATION

HOLGER FRIEDRICH, HARRY HOFMANN and RÜDIGER DILLMANN
Institute for Real-Time Computer Systems & Robotics (IPR)
University of Karlsruhe, D-76128 Karlsruhe, Germany
E-Mail: [email protected]

ABSTRACT. Within the last years the Programming by Demonstration (PbD) methodology has gained more and more attention in robotics. However, a high-quality mode of Human-Robot Interaction is crucial for a successful application of the PbD methodology in robotics. Hypotheses derived by the programming system and control knowledge included in a generated robot program have to be checked back with and verified by the user in order to avoid potentially harmful errors that lead to faulty code. This paper describes a method for User-Robot Interaction that is based on 3D-Icons and supports and facilitates the programming process in a robot Programming by Demonstration system significantly.

1 INTRODUCTION

One of the major cost factors involved in robotic applications is the development of robot programs. Especially the use of advanced sensor systems and the existence of strong requirements with respect to the robot's flexibility call for very skillful programmers and sophisticated programming environments. These programming skills may exist in industrial environments, but they are certainly not available if the use of robots in a personal environment is considered. For opening the expected new, mainly consumer-oriented service robot market [8], it is therefore essential to develop techniques that allow untrained users to use such a personal service robot both safely and efficiently. Two basic aspects of the interaction between the robot and the user can be distinguished. Firstly, the user wants to be able to configure and instruct the robot. This requires translating the user's language into the robot's, i.e., compiling user intentions into actual robot programs. Secondly, within this programming process, the system has to generate control knowledge and conditions for control structures such as loops and branches. Moreover, applicability conditions and selection criteria for the objects to be manipulated have to be set up. The correctness of all of these conditions and decision rules is crucial for the correctness and reliability of the resulting robot program (correctness here meaning that the program produces an output which matches the user's intentions). Therefore both interaction aspects have to be taken into account to build robot systems that are able to acquire human action and control knowledge from observing human performance and to establish a suitable communication link to the user.

For meeting these goals the Programming by Demonstration (PbD) paradigm [2] seems to be the right way to follow. However, one crucial aspect for successful application of the PbD methodology is the ability to communicate in a very user-friendly way. This demands interaction modes that ensure a high quality of interaction with respect to user friendliness, robustness, intuitiveness, and efficiency.

In this paper a robust and intuitive Human-Robot Interaction method based on 3D-Icons is presented that enables the setup of a communication link between the user and the robot to be programmed which satisfies these various demands. In the following, firstly an overview and analysis of several PbD systems and the user interaction modes used in these systems is given (the term robot PbD here refers to task-level programming only). Secondly, the robot PbD system under consideration is described and the interaction requirements posed by the system are stated. Thereafter, the 3D-Icon based interaction mode developed is presented in detail. The knowledge representations to be communicated, the concept of interactive 3D-Icons, and the actual realization are given. Finally, an example is shown, and conclusions are drawn.

2 INTERACTION IN PbD SYSTEMS

Obviously, User Interaction is one of the key features in PbD systems, if not the key feature. Nevertheless, these systems have to include other capabilities and features such as Machine Learning, sensor data fusion and processing, etc. as well. Clearly, the focus in system development, and therewith the implemented features and interaction modes, depends very much on the area the systems are to be applied to. This explains the variety of interaction modes and differences between PbD systems in general [7, 10]. In the following, a small selection of current robot and non-robot PbD systems is analyzed in order to give an overview of the User-Interaction techniques employed.

2.1 Robot PbD systems

Due to the nature of robotics, and because of the focus of the researchers, the systems presented in this section vary greatly in the way they interact with the user. The NODDY system developed by P.M. Andreae [1], although designed for instructing robots, is more of a pure Machine Learning system in which the interaction aspect is reduced to giving demonstrations in a 2D simulated environment. Generalizations and the inclusion of program structures based on these demonstration sequences are justified with respect to generalization templates and structural matching algorithms.

However, there is no interaction whatsoever with the user for verification or knowledge acquisition beyond the actual demonstration. The ARMS system presented by A.M. Segre [9] also works on demonstrations given in a simulated environment. Based on state schemata, which are partial descriptions of world states, and operator schemata, which enable the transition from a specific world state to a specified goal state, Explanation Based Learning is

performed in order to generate new operators that enhance a robot's applicability and performance. In this system, again, there is no interaction beyond the actual demonstration, which is given in a simulated environment. Learning by Watching (LBW) as proposed in [4] approaches the robot PbD method from an engineering point of view, concentrating mostly on the acquisition and processing of the initial demonstration, which is given in reality with the user's own hands. High effort is spent on sophisticated and efficient vision sensors and image processing algorithms that enable the analysis of such a demonstration in real-time. Generalizations are reduced to the fact that objects involved in a task are allowed to be positioned differently in the environment when executing an acquired action sequence, compared with the setting that was present during the demonstration. Although this system is highly interactive in the demonstration phase, the interaction is limited to a passive observation of the user by the programming system. Moreover, no further interaction for verification, intention acquisition, or confirmation purposes takes place. In the RPD system described in [3], the demonstrations are given by controlling a real robot arm. Thereafter, the recorded numeric data is transformed into a symbolic operator sequence. Finally, this sequence is generalized, and program control structures like branches are included, based on the operator sequence reflecting the demonstration and the actual user intention. The RPD system provides, and requires, several phases of interaction during the PbD process. Besides the demonstration, the user has the opportunity to confirm or correct the results of the transformation from the numeric sensor data to the symbolic operator level. Finally, the user's intention is acquired. The system actively presents the world state the user has produced with the actions of his demonstration and asks him to identify the intended parts.

2.2 Non-Robot PbD systems

Cima, the PbD system developed by Dave Maulsby [6], provides the algorithms and interfaces for an instructible agent that incrementally learns to carry out repetitive tasks, such as formatting bibliographies in a specific style, by observing the user. The applicability of the system was proven in several non-robotic domains such as graphic editors, text processing, and e-mail administration.

The knowledge about what actions have to be performed on a data set is built up incrementally. Therefore, this system requires more than one demonstration and thus is highly interactive. Once knowledge is gathered, the system starts performing the task automatically. However, whenever the program the system has created for the task turns out to have errors, the user can interact by rejecting actions, giving hints, or making corrections.

Mondrian is an object-oriented graphic editor that learns new graphical editing procedures by example. The system developed by Henry Lieberman [5] uses a simple explanation-based learner. By observing the user and generating explanations for the user's actions, a generalized procedure applicable to a set of objects is defined. As in Cima, interactions in Mondrian can be of various kinds. The user points out the objects to be observed at the beginning. Then the user gives the demonstration. Speech recognition is used to acquire verbal hints. The generalizations inferred on the observed procedure are communicated to the user via speech and graphical output.

2.3 Analysis

When analyzing the interaction requirements of the systems it is obvious that these vary strongly. Most of the robot PbD systems (NODDY, ARMS, LBW) restrict the user interaction to the actual user demonstration. On the contrary, the RPD system as well as the two reviewed non-robot PbD systems need more feedback and acquire information via further user interaction in order to solve their learning/programming task (see also table 1).

However, the conclusion that these systems show an inferior learning performance and thus require more user interaction should not be drawn. Instead, these systems acquire the additional knowledge from their users in order to avoid errors, adapt their knowledge bases, and, in general, generate more robustly exactly the programs and solutions that the user desired and intended. The less interactive systems are much more focused on the pure learning or sensor data processing aspects, but lack the robustness of the more interactive ones. This is especially disadvantageous in the field of robotics, where errors in programs can cause severe damage or even harm personnel. On the other hand, the RPD system, which generalizes exclusively w.r.t. the acquired user intention and thus

does provide robustness, still lacks excellent interaction features. The intention acquisition is done via pop-up menus, using object names to reference objects, to present hypotheses, and to ask for information.

3 THE SYSTEM

After having seen the different, but nevertheless limited, interaction steps and modalities required and/or offered by most robot PbD systems, a new PbD system which employs the 3D-Icon interaction method as one of several interaction mechanisms for interactive robot programming is briefly presented in the next paragraph. Thereafter, the interaction requirements that can be derived from the systems presented in the preceding paragraph and from this new system will be analyzed.

3.1 System Overview

The system is based on the experiences gained, and the algorithms developed, in the RPD system mentioned above [3]. This new development differs in that it does not use a specific real robot device for demonstration purposes, but a data-glove and a trinocular vision system. Currently the vision system is not used in the demonstration process itself. The research in the area of image processing, object detection, etc. is done in parallel and will later be integrated with the learning, simulation, and graphics and data-glove based interaction parts.

Therefore the demonstrations are currently performed in a virtual reality. The user wears the data-glove and manipulates objects directly in the simulated environment, which is displayed via a 3D visualizer. After the demonstration, the sensor values recorded from the data-glove and the initial world model are taken, and the series of world-states that occurred during the demonstration is calculated. The world-states are represented by means of predicates that describe relations between objects, e.g. 'isWestof', 'isPartlyInside', 'PartlyCovers'. A complete world-state description (complete w.r.t. the representation and computation capabilities of the system) consists of a conjunction of all predicates that hold between the objects of the world in that state. Based on this series of world-states and the glove data, the demonstration is mapped onto a series of instantiated symbolic operators that, when executed, produce the same effects as the user achieved with the demonstration. Now the user intention is acquired. Once this is done, the operator sequence is generalized using the algorithms already developed for the RPD system. This leads to generalized robot programs coded as sequences of operators, which themselves can be used as macro-operators for constructing more complex programs.
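The paper gives no code for this computation. As a rough illustration, the following minimal Python sketch shows how such a conjunction of predicates could be derived from a world model; the reduction of objects to centre points, the tolerance value, and the restriction to the six directional predicates are our assumptions, not the system's actual implementation.

```python
from itertools import permutations

# Sketch only: objects are reduced to centre points (x, y, z);
# x grows east, y grows north, z grows up.
world = {
    "puma":  (4.0, 0.0, 0.0),
    "glove": (0.0, 0.0, 0.0),
}

EPS = 0.01  # assumed tolerance below which a coordinate difference is ignored

def isEastof(a, b):   return a[0] - b[0] > EPS
def isWestof(a, b):   return b[0] - a[0] > EPS
def isNorthof(a, b):  return a[1] - b[1] > EPS
def isSouthof(a, b):  return b[1] - a[1] > EPS
def isAboveof(a, b):  return a[2] - b[2] > EPS
def isBelowof(a, b):  return b[2] - a[2] > EPS

PREDICATES = [isEastof, isWestof, isNorthof, isSouthof, isAboveof, isBelowof]

def world_state(world):
    """A world-state description in the sense above: the conjunction of
    all predicates that hold between all ordered pairs of objects
    (complete w.r.t. the predicates modelled here)."""
    state = []
    for (na, a), (nb, b) in permutations(world.items(), 2):
        for pred in PREDICATES:
            if pred(a, b):
                state.append(f"{pred.__name__}({na},{nb})")
    return state

print(world_state(world))
# ['isEastof(puma,glove)', 'isWestof(glove,puma)']
```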

System   | Interaction                                 | Problem
NODDY    | demonstration in simulation                 | unintended generalizations
ARMS     | demonstration in simulation                 | unintended generalizations
LBW      | demonstration with own hands                | low generalization performance
RPD      | demonstration with robot                    | robot difficult to control
         | confirmation of operator sequence           | based on trajectories, difficult to correct
         | supply of intention, menu and name based    | object and predicate names difficult to remember
Cima     | demonstrations, hints, corrections          | unintended generalizations, corrected incrementally
Mondrian | specification of objects, speech and graphics input, speech and graphics output | -

Table 1: Overview of systems, their interaction modes, and remaining/resulting problems

3.2 Interaction Requirements

From the system overview it becomes clear that the interaction points during the acquisition process resemble those of the older RPD system. Interactions occur during:

- the actual demonstration,
- the confirmation and/or correction of the determined operator sequence, and
- the intention acquisition.

The 3D-Icon based interaction proposed in this paper is used for the third interaction phase. Here the system acquires the user's intentions in order to be able to justify and infer generalizations in the next step of the programming process. In the acquisition process the system actively presents the final world-state to the user and asks whether its creation was intended by the user, and thus reflects his intention, or not. Since the intention acquisition is based on the world-state representation, it is clear that the importance of the chosen world-state representation, and of the methods applied, with respect to further user interaction is not to be underestimated.
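The paper describes this acquisition loop only in prose. The sketch below captures the assumed control flow in Python; the function names, the callback structure, and the step width of one predicate are illustrative assumptions, not the system's code.

```python
def acquire_intention(final_state, present, ask_user):
    """Stepwise intention acquisition: present the final world-state one
    predicate group at a time and keep only the confirmed parts.
    final_state: list of ground predicate strings from the demonstration.
    present: callback visualizing one group (e.g. 3D-Icon plus scene view).
    ask_user: callback returning True (intended) or False (incidental)."""
    intention = []
    for group in chunk(final_state, size=1):  # step width: one predicate
        present(group)
        if ask_user(group):
            intention.extend(group)
    return intention

def chunk(items, size):
    """Split a predicate list into presentation steps of the given size."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

if __name__ == "__main__":
    state = ["isEastof(puma,glove)", "isAboveof(cup,table)"]
    show = lambda g: print("showing:", g)
    ask = lambda g: input("intended? [y/n] ").strip().lower().startswith("y")
    print("intention:", acquire_intention(state, show, ask))
```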

From this dependency on the world-state representation, the interaction habits and capabilities of human users, and the nature of robot tasks, several different requirements are posed on an efficient method for interactive intention acquisition:

- Since the world-state (or parts of it), as well as conditions, are presented to the user as disjunctions and conjunctions of the predicates introduced in paragraph 3.1, the chosen interaction method must be capable of communicating these predicates' semantics to the user (a sketch of such a condition representation follows after this list).

- Besides a predicate's semantics, the predicate's arguments, which are objects, also have to be presented and pointed out in a user-friendly way.

- Due to the nature of robots, the presentation of the predicates' semantics as well as of their arguments must be such that their meaning and identity in the environment become clear quickly, unambiguously, and robustly. Otherwise misinterpretations, misunderstandings, and wrong answers by the user might be the consequence, which would again raise the risk of generating faulty, potentially harmful code.

- Since a human user is only capable of responding to a certain amount of information presented at the same time, the intention acquisition process should be an incremental one. The intention should be acquired stepwise, such that the user is not overloaded with questions and facts.

- Finally, the mode of interaction should be based on graphics, since vision is the human sense with the highest bandwidth and the most powerful processing areas in the human cortex.
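As announced in the first requirement, here is a minimal sketch of how conditions built from disjunctions and conjunctions of predicates could be represented and evaluated. The nested-tuple encoding is an assumption for illustration; the paper does not specify its data structures.

```python
# Assumed representation: a condition is either a ground predicate leaf or
# an and/or node over sub-conditions; a world-state is a set of ground
# predicate strings.
def holds(condition, state):
    op, args = condition
    if op == "pred":                      # leaf: a single ground predicate
        return args in state
    if op == "and":
        return all(holds(c, state) for c in args)
    if op == "or":
        return any(holds(c, state) for c in args)
    raise ValueError(f"unknown operator: {op}")

state = {"isEastof(puma,glove)", "isAboveof(puma,glove)"}
cond = ("and", [("pred", "isEastof(puma,glove)"),
                ("or",  [("pred", "isAboveof(puma,glove)"),
                         ("pred", "isBelowof(puma,glove)")])])
print(holds(cond, state))  # True
```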

4 ROBOT PbD & 3D-ICONS

In order to meet all the requirements stated above, an interaction mechanism involving simple 3D-Icons was designed. The basic interaction concept, as well as its realization, is described in the following.

4.1 Concept

Summarizing the analysis of the interaction requirements, a method is needed that communicates the predicates for world-state and condition descriptions robustly to the user, while not overloading the user's senses with too much information at once, and exploiting the human vision capabilities. With respect to these demands, the following concept of a 3D-Icon based interaction method was developed. In order to avoid the presentation of too much information at a time, the acquisition process is done stepwise. In each step the system actively presents a part of the final world-state of the demonstration to the user, given as a conjunction of predicates describing object relations. The user is asked whether the creation of this part was intended, and thus reflects part of his intention, or not. The step width depends on the predicates involved in the world-state description or condition: either one predicate or a conjunction of predicates is displayed in each step. A visualization of the predicate's semantics and of its arguments (objects of the environment) is given. The semantics is displayed using a 3D-Icon, and the arguments are presented using a 3D visualization of the environment they are in. The 3D-Icons are kept simple. Basic shapes like cylinders, cubes, and lines are used in order to give very simple visual examples of the predicates' semantics. This way, a simple 3D scene is shown in which the objects share the same relation as the complex ones in the cluttered environment the demonstration was performed in. For the user, the relation to be expressed becomes clear quickly and easily when looking at the 3D-Icon. The predicates' arguments, which are basically objects in the scene, are visualized separately by presenting the part of the environment the objects are in. Visual effects express the link between the semantics visualized by the simple 3D-Icon and the actual objects in the scene that share this relation.
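To make the concept concrete, the sketch below shows how one interaction step could pair a predicate's exemplifying icon with the scene objects that are its arguments. The icon file names and the dictionary lookup are illustrative assumptions only.

```python
# Assumed icon library: one simple 3D model (basic shapes such as
# cylinders, cubes, and lines) per predicate, exemplifying its semantics.
ICON_LIBRARY = {
    "isEastof":  "icons/east.iv",   # hypothetical file names
    "isWestof":  "icons/west.iv",
    "isAboveof": "icons/above.iv",
}

def build_step(predicate):
    """Derive what the split screen shows for one ground predicate,
    e.g. 'isEastof(puma,glove)': the exemplifying 3D-Icon on one side
    and the highlighted argument objects in the environment on the
    other."""
    name, rest = predicate.split("(", 1)
    args = rest.rstrip(")").split(",")
    return {"icon": ICON_LIBRARY[name], "highlight": args}

print(build_step("isEastof(puma,glove)"))
# {'icon': 'icons/east.iv', 'highlight': ['puma', 'glove']}
```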

4.2 Realization

The realization of the stepwise, 3D-Icon based interaction method for the acquisition of a user's intention was done by extending the 3D visualizer KaVis (Karlsruhe Visualizer). The tool is based on the OpenGL and OpenInventor graphics libraries. The basic concept is that of a frontend visualization process that displays and manipulates OpenInventor models for other applications. The KaVis API library is linked to the application process. The API library functions are called by the application and communicate with the visualization process via the PVM (Parallel Virtual Machine) tool. The 3D-Icons were modeled in OpenInventor using basic shapes. Thus they can be rendered easily and quickly using the visualizer. The visualizer itself was extended with procedures that enable the visualization of a split screen, in which a 3D-Icon is displayed on one side and the OpenInventor models of the part of the environment and the objects involved are displayed on the other. The objects in the scene that share the relation exemplified by the 3D-Icon are highlighted using visual effects like arrows, directed light sources, etc. All these effects are realized by employing the available OpenInventor methods and engines and by temporarily introducing objects, e.g. the arrows, into the world model, which are deleted again after the interaction step. In order to enable the acquisition of the user's reaction to the displayed facts, namely rejection or confirmation, pushbuttons are added to the window widget. Finally, API calls were added to the KaVis API library. Now the robot PbD system simply has to link the API library and call the appropriate functions in order to use the 3D-Icon based interaction for the intention acquisition process.
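The KaVis API itself is not listed in the paper, so the following Python sketch uses invented stand-in calls purely to illustrate the application-side control flow of one interaction step; in the real system the corresponding C++ API calls are forwarded to the visualizer process via PVM.

```python
class MockVisualizer:
    """Text-mode stand-in for the KaVis frontend process; every method
    name here is hypothetical."""

    def show_split_screen(self, icon_model, objects):
        # Real system: render the 3D-Icon on one side of the split screen,
        # the environment part containing the objects on the other.
        print(f"[icon] {icon_model}   [scene] showing {objects}")

    def highlight(self, objects, effect="arrow"):
        # Real system: temporarily add arrows or directed light sources to
        # the world model; removed again after the interaction step.
        print(f"[scene] {effect} markers on {objects}")

    def await_button(self):
        # Real system: Confirm/Reject pushbuttons in the window widget.
        return input("Confirm? [y/n] ").strip().lower().startswith("y")

def interaction_step(vis, step):
    """One acquisition step: present icon and arguments, then return the
    user's confirmation or rejection."""
    vis.show_split_screen(step["icon"], step["highlight"])
    vis.highlight(step["highlight"])
    return vis.await_button()

print(interaction_step(MockVisualizer(),
                       {"icon": "icons/east.iv", "highlight": ["puma", "glove"]}))
```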

5 EXAMPLE

An example of the realization of the 3D-Icon based user interaction in the robot PbD system described above is shown in figure 1. In this case a 3D-Icon is displayed that allows several spatial relationships between two objects to be represented easily at once. The cylinder in the center of the icon represents the reference object, which is the corresponding predicate's second argument. The other cylinders surrounding the central one are divided into three slices each. By colouring the appropriate slice of the respective cylinder, a combination of the spatial relations and the respective predicates 'isEastof', 'isWestof', 'isSouthof', 'isNorthof', 'isAboveof', and 'isBelowof' that may hold between the reference and the first object can be represented clearly.

Figure 1: 3D-Icon representing the spatial relation isEastof(puma,glove)

On the split screen's left side the two objects are displayed in the environment. Thus the user can easily link the represented semantics to the objects involved and can decide whether he/she would like to keep these relations as conditions in the program to be generated or not.
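A possible reading of this slice encoding is sketched below; the exact geometry of figure 1 is not reproduced here, so the assignment of surrounding cylinders to horizontal directions and of the three slices to above/level/below is our assumption for illustration.

```python
# Assumed mapping: one surrounding cylinder per horizontal direction, each
# split into three vertical slices (top / middle / bottom); colouring a
# slice asserts the corresponding combination of spatial predicates
# between the first object and the central reference object.
HORIZONTAL = {"isEastof": "east", "isWestof": "west",
              "isNorthof": "north", "isSouthof": "south"}
VERTICAL = {"isAboveof": "top", "isBelowof": "bottom"}

def slices_to_colour(predicates):
    """Map the set of predicates holding between the first object and the
    reference object to (cylinder, slice) pairs to be coloured."""
    cylinders = [HORIZONTAL[p] for p in predicates if p in HORIZONTAL]
    vertical = [VERTICAL[p] for p in predicates if p in VERTICAL] or ["middle"]
    return [(c, s) for c in cylinders for s in vertical]

print(slices_to_colour({"isEastof", "isAboveof"}))  # [('east', 'top')]
```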

6 CONCLUSIONS

In this paper we presented an interaction method for the acquisition of a user's intention in robot PbD systems. An analysis was given of the special requirements stemming from the field of robotics, of human interaction limitations and capabilities, and of the interaction methods implemented in state-of-the-art robot and non-robot PbD systems. With respect to this analysis, the concept and realization of the 3D-Icon based interaction method were presented. Summarizing, the 3D-Icon based interaction method meets the requirements by offering an easy-to-use, robust, human-oriented interface for intention acquisition, which is mandatory in order to produce correct and reliable robot programs from demonstrations.

ACKNOWLEDGMENT

This work has partially been supported by the Deutsche Forschungsgemeinschaft project "Programming by Demonstration". It has been performed at the Institute for Real-Time Computer Systems & Robotics, Prof. Dr.-Ing. U. Rembold and Prof. Dr.-Ing. R. Dillmann, Department of Computer Science, University of Karlsruhe, Germany.

7 REFERENCES

[1] Peter Merrett Andreae. Justified generalization: Acquiring procedures from examples. Technical Report AI-TR-834, Artificial Intelligence Laboratory, MIT, 1985.
[2] A. Cypher, editor. Watch What I Do: Programming by Demonstration. MIT Press, Cambridge, Massachusetts, 1993.
[3] H. Friedrich, S. Münch, R. Dillmann, S. Bocionek, and M. Sassin. Robot programming by demonstration: Supporting the induction by human interaction. Machine Learning, pages 163-189, May/June 1996.
[4] Yasuo Kuniyoshi, Masayuki Inaba, and Hirochika Inoue. Learning by watching: Reusable task knowledge from visual observation of human performance. IEEE Transactions on Robotics and Automation, 10(6):799-822, December 1994.
[5] Henry Lieberman. Mondrian: A teachable graphical editor. In Watch What I Do: Programming by Demonstration. MIT Press, 1993.
[6] David Maulsby. Instructible Agents. PhD thesis, University of Calgary, Calgary, Alberta, Canada, 1994.
[7] M. Sassin and S. Bocionek. Programming by demonstration: A basis for auto-customizable workstation software. In Proc. of the Workshop Intelligent Workstations for Professionals, München, 1992.
[8] R. D. Schraft. Serviceroboter - ein Beitrag zur Innovation im Dienstleistungswesen. Fraunhofer-Institut für Produktionstechnik und Automatisierung (IPA), 1994. In German.
[9] A. M. Segre. Machine Learning of Robot Assembly Plans. Kluwer Academic Publishers, 1988.
[10] B. Shepherd. Applying visual programming to robotics. In IEEE International Conference on Robotics and Automation, volume 2, pages 707-712, 1993.
