International Journal of Humanoid Robotics – Inaugural Issue. March 2004.

DESIGN PRINCIPLES FOR DEPENDABLE ROBOTIC ASSISTANTS

RAINER BISCHOFF and VOLKER GRAEFE
The Intelligent Robots Laboratory, Bundeswehr University Munich, 85577 Neubiberg, Germany
[email protected], [email protected]
http://www.unibw-muenchen.de/hermes

Received (Day Month Year)
Revised (Day Month Year)
Accepted (Day Month Year)

A large number of functionalities were integrated in a single fully autonomous humanoid robot, HERMES. To evaluate the dependability of this extremely complex machine, and its ability to interact with strangers, HERMES was exhibited in a museum, far away from its home laboratory, for more than six months. During this period the robot and its skills were regularly demonstrated to the public by non-expert presenters for up to 12 hours per day. HERMES also interacted with the visitors, chatted with them in English, French and German, answered questions and performed services as requested. Only three major failures occurred during the six-month period, all of them caused by failures of commercially available modules that could easily be replaced. Key to this success was the dependability that had been designed into HERMES from the very beginning. During the design process certain design principles were followed in both hardware and software. These principles are introduced, and some long- and short-term experiments carried out with the real robot in real environments are presented. By demonstrating HERMES in the museum, at trade fairs and in TV studios – besides our institute environment – we have learned valuable lessons, especially regarding the interaction of a complex robotic assistant with unknown humans. Although we did not quantitatively evaluate the robot’s performance or its acceptance by non-expert users, several qualitative results are given in this paper, and many videos highlighting these results can be downloaded from the HERMES homepage.

Keywords: dependability; cooperative learning; multi-modal communication; long-term experiments

1. Introduction

“Dependability” is a system concept that integrates such attributes as reliability, availability, safety, confidentiality, integrity, and maintainability [Laprie 1992]. The goals behind the concept of dependability are the abilities of a system to deliver a service that can justifiably be trusted, and to avoid failures that are more frequent or more severe, and outage durations that are longer, than is acceptable to the user(s). Since our society largely depends on infrastructures that are controlled by embedded information systems, the dependability concept has been widely employed for these kinds of systems. Although service and personal robots are expected to become an important part of our future society, dependability aspects have been almost entirely neglected by researchers. However, dependability concepts are needed especially for these types of robots because they are intended to operate in unpredictable and unsupervised environments and in close proximity to, or in direct contact with, people who are not necessarily interested in them, or, even worse, who try to harm them by disabling sensors or playing tricks on them.

Dependability has not been a major issue in research institutions so far because it is believed that industrial companies, when they actually market service or personal robot products, will eventually deal with this question. Researchers in laboratories have always been satisfied if their robots performed well once or twice under specific conditions or at end-of-project demonstrations, which enabled them to write a publication about their “perfectly” performing robot.

Figure 1: Humanoid experimental robot HERMES; size: 1.85 m x 0.70 m x 0.70 m; mass: 250 kg

Unfortunately, these “performances” make people (sponsors, the public) believe that most of the robotic community’s problems are already solved, which is certainly not true. On the contrary, much research is still needed to considerably improve not only system reliability and safety concepts, but also design concepts, locomotion and manipulation capabilities, cooperation and communication abilities, and – probably most importantly – adaptability, learning capabilities and sensing skills.

Exhibitions offer excellent opportunities for studying and evaluating a robot’s communication skills and dependability under real-world conditions, especially if the robot is exposed to the public, and allowed to interact with it, for extended periods of time. However, to have a chance of surviving such a long-term test without annoying failures, a robot must be much more dependable than a typical research robot in a laboratory. This requirement is probably the reason why, to the best of our knowledge, only a few research groups have ever undertaken long-term experiments with their robots interacting with strangers outside their own laboratories.

From 1998 to 2002 three museum tour guide robots were running in two different museums in Pittsburgh. They were developed, installed and maintained by the group of Nourbakhsh and logged more than 6 years of operation altogether [Nourbakhsh 2002]. The longest running robot, “Chips”, operated for almost 4 years and accumulated over 500 km in the Carnegie Museum of Natural History in Pittsburgh. The three entertainment robots of Fraunhofer IPA [Graf et al. 2000] are still working in the entrance hall of the telecommunications museum in Berlin and have accumulated over 1000 km and valuable experience in non-expert operation in a typically less crowded environment. During the World Fair Expo 2000 in Hannover, 72 mobile robots (1.6 to 4.5 meters in size) were constantly moving freely on a surface of 5000 m² with speeds up to 0.25 m/s while reacting to the presence of visitors and coordinating themselves in relation to each other [BBM Expo 2000]. Unfortunately, to date we are not aware of any scientific report on this experiment. During the World Fair Expo 2002 in Lausanne, 10 mobile robots with more sophisticated ways of interacting with visitors (via speech and buttons) ran quite reliably over a period of 6 months [Arras et al. 2002]. Similar tests were carried out by Thrun and Burgard with the robots RHINO [Burgard et al. 1999] and MINERVA [Thrun et al. 2000], albeit under the supervision of experts and only for a few days.

Several short-term experiments were performed with the mobile humanoid robot “Robovie” at exhibitions and in laboratory environments [Ishiguro 2001]. More recently, two Robovie robots acted as English-speaking “foreign” children in a Japanese elementary school over a period of 18 days, with an average interaction time of less than 45 minutes (max. 72 minutes) per day [Kanda et al. 2003]. Interacting with the same group of pupils over this period revealed that Robovie made a strong first impression on the subjects (58% of the English utterances occurred during the first three days), but could not maintain long-term relationships.

Long-term experiments with mobile robots in their respective institute environments were carried out by [Simmons et al. 1999] at the Robotics Institute (CMU, Pittsburgh) with the robot XAVIER, one of the first mobile robots controllable via a Web interface, and by a research group at the Institute of Robotics (ETH, Zürich) with a mobile mail distribution system called MOPS [Tschichold et al. 2001]. Commercially available robots that do not possess complex interaction interfaces, but are nonetheless easy to operate and have been exposed to the general public, are the Helpmate robot [King, Weiman 1990], which was installed in dozens of hospitals world-wide, and a cleaning machine equipped with a Siemens Corporation navigation system, still working in a supermarket in the Netherlands [Endres et al. 1998].

The number of research groups taking their autonomous robots outside their protected research environments is increasing, but reports about such experiments are few. Typically, robots are deployed together with their developers, and experiments are limited to a few days. There might be other groups that have been carrying out truly long-term experiments, but the fact that those experiments have not been reported at major conferences shows that integration and dependability issues, as well as long-term experiments, are not yet considered important and interesting problems, either by the robotics research community or by the funding agencies. Moreover, the projects listed above focused primarily on navigation and more or less simple human-robot communication (more complex in the cases of Robovie, MINERVA and RHINO).

Much research effort (and money) has recently been spent on improving mechanics, control and multimodal interaction algorithms for two-legged humanoid robots. The most popular and most frequently cited among them, e.g., Sony’s entertainment robot SDR-4x / QRIO [Fujita et al. 2003], Honda’s marketing robot ASIMO [Sakagami et al. 2002], the (mostly) teleoperated HRP robots ([Inoue et al. 2001], [Yokoi et al. 2003]), Tokyo University’s remote-brain and Hx series of humanoids ([Kanehiro et al. 1998], [Nishiwaki et al. 2000], [Kagami et al. 2002]), and Munich University’s Johnnie [Seara et al. 2003], are marvelous, high-performance, highly integrated devices, but much too sophisticated to be used and maintained by non-expert users. Whenever these robots are shown in their respective laboratories or at exhibitions, several skilled engineers have to carefully start and closely supervise their hard- and software. Moreover, most of these robots depend largely on external computers for executing their runtime software.

We wonder whether service or personal robots will ever become valuable servants of our future society unless more robots are fielded for extended periods of time with a richer set of functionalities and a higher level of human-robot interaction, in a completely integrated manner and in realistic settings, actually performing useful service tasks autonomously. As we pointed out before, dependability is crucial for robot assistants to be able to serve at home or at public or work places and to be accepted by society. It is one aim of this paper to raise awareness of the need for research on integration and dependability, and for long-term experiments. There is no other way to increase the dependability of useful robotic assistants in the long run.

2. Designing for Dependability

The most general design rule to be kept in mind is that a robot system exists for one reason: to provide value to its users. Therefore, before specifying any system requirement, or determining the hardware platform or development processes, one has to ask whether a planned feature contributes to the system’s ability to provide value to the user. All design decisions should be made with this general rule in mind. Furthermore, the dependability of a robot is not something that can be added on after the robot has been designed and built. Rather, it must be designed into the robot from the very beginning and, specifically, we claim that it emerges from the following design principles:

(i) Learning from nature how to design reliable, robust and safe systems;
(ii) Providing natural and intuitive communication and interaction between the robot and its environment;
(iii) Designing for maintainability;
(iv) Caring for a simple, systematic and tidy design;
(v) Optimizing system performance through field tests with novice users.

Although several – if not all – of these design principles might be considered “common sense” by robotics engineers, they have not been explicitly formulated before and are (as a matter of personal experience) not followed by a large part of the research community. We strongly believe that future robotic assistants could benefit from applying these design principles. At least, they have guided us in the design and construction of our humanoid robot HERMES (Figure 1) and are, therefore, explained in greater detail in the sequel. Of course, additional design rules exist that are not mentioned here, but must be followed by the designer of any robotic system with respect to the application domain.

2.1. Learning from nature

Nature has created highly complex, efficient and dependable systems in the form of organisms since the very beginning of life on earth. The design and function of organisms have been optimized under evolutionary pressure over billions of years, a small step at a time. It would be foolish not to apply nature’s solutions to today’s engineering problems, since design constraints and objectives are very similar, e.g., functionality, optimization and cost-effectiveness. Since the appearance of life on earth nature has provably designed reliable, robust and safe “systems” that are well capable of surviving in their respective habitats. Therefore, we strongly believe that robotics could benefit from abstracting good design principles and concepts from nature.

This first design principle essentially means that learning from nature should be taken as a stimulus for independent technological design. Instead of directly copying nature’s ideas (i.e., taking them as blueprints) researchers should try to find analogous means of transferring the ideas into technical solutions. The concept of learning from nature is widely applicable, e.g., to material science, structure and construction, kinematics and dynamics, data and information processing, computation, and energetic processes. The essence of this design principle is multidisciplinarity: to find practical engineering solutions for robotic systems, various disciplines need to contribute, and success can only come from their integration.

To give an example: according to the classic approach, robot control is model-based.

Numerical models of the kinematics and dynamics of the robot and of the external objects that the robot should interact with, as well as quantitative sensor models, are the basis for controlling the robot’s motions. The main advantage of model-based control is that it lends itself to the application of classical control theory and, thus, may be considered a straightforward approach. Its weak point is that it breaks down when there is no accurate quantitative agreement between reality and the models. Differences between models and reality come about easily; an error in just one of the many coefficients that are part of the numerical models can suffice. Organisms, on the other hand, are robust and adapt easily to changes of their own conditions and of the environment. They never need any calibration, and they normally do not know the values of any parameters related to the characteristics of their “sensors” or “actuators”. Obviously, they do not suffer from the shortcomings of model-based control, which leads us to the assumption that they use something other than quantitative measurements and numerical models for controlling their motions. Perhaps their motion control is based on a holistic assessment of situations for the selection of behaviors to be executed. Possibly robotics could benefit from following a similar approach.

Following this line of argumentation we decided that sensing should be based on those senses which have proved their effectiveness in nature. Therefore, vision – the sensor modality that predominates in nature – is also the main sensor modality for our robots. In addition, tactile sensing and hearing, also widespread in nature, greatly improve our robot’s overall sensing abilities. Moreover, we use a calibration-free approach [Graefe 1999] wherever it is practical, giving our robots a high degree of robustness against parameter changes and of adaptability to changing environmental characteristics. HERMES’ system architecture is based on the concepts of skill and behavior – concepts that are used to describe human and animal behavior as well. Furthermore, communication and the interpretation of sensor data (i.e., perception) are based on comparable natural and biological principles.

2.2. Providing natural and intuitive communication and interaction

Any person who might, voluntarily or not, encounter a service robot needs to be able to communicate and interact with the robot in a natural and intuitive way. Therefore, the human communication interface has to be designed in such a way that no training is required for any person who might come into contact with the robot. This can be achieved if the human-robot communication resembles a dialogue that could just as well take place between two humans. If the robot resembles a human, a person can easily derive from everyday experience with humans how a specific interaction, e.g., exchanging objects with the robot, might work. Even if the robot does not have a humanoid shape, a safe, confidence-inspiring interaction can benefit from humanoid characteristics, such as smoothness of movements and compliance of the joints or links.
In general, unexpected robot movements should be avoided. Instead, gentle human-like motions should be generated to enable operators or uninvolved persons to anticipate the robot’s actions. Therefore, it might be useful to additionally visualize the robot’s state or imminent motions in a way that facilitates anticipation, e.g., with the help of facial expressions, postures or even indicators that humans are familiar with from everyday situations. In doing so, the goal should be to exploit people’s own intuition to make the interaction safer.

2.3. Designing for maintainability

Maintainability is the characteristic of design and installation that affects the amount of time and cost necessary to repair, test, calibrate, or adjust an item to a specified condition using defined procedures and resources. Design for maintainability has as its prime objective the design of systems, subsystems, equipment and facilities capable of being maintained in the least amount of time, at the lowest cost, and with a minimum expenditure of support resources. Maintainability must be designed into a system or equipment during the early stages of development to ensure that costly maintenance or redesign is avoided. The overall goal of maintainability is, of course, not limited to robotic systems but should be addressed by all engineering disciplines. An example of a comprehensive standard is the U.S. Department of Energy (DOE) Human Factors/Ergonomics Handbook for the Design for Ease of Maintenance [DOE 2001]. Although it establishes system maintainability design criteria for DOE systems, subsystems, equipment and facilities, design principles for robotic systems can be derived from it, too.

One important guideline in designing systems that are easy and cost-effective to maintain is to separate the complete system into physically and functionally distinct units to allow for easy removal and replacement. The great advantage of this modular approach is that it allows us to design, develop, test and improve system components individually, or even to buy them directly off the shelf, before integrating them into a complex system. If the components themselves are failsafe and need little or no maintenance at all, overall system maintainability is greatly increased. This prime directive applies to sensors, actuators and computers, which should be enclosed in maintenance-free subsystems. Nevertheless, these subsystems should be easily accessible in case they need to be replaced.

Typically, research robots do not possess these minimal characteristics. They usually have a heterogeneous, hard-to-maintain overall structure and are composed of components that happen to be available in the lab but are not necessarily suitable. They are often built into the robot in such a way that no easy replacement is possible in case something breaks. We believe that only a robot that needs little or no maintenance and that can be easily repaired (if ever needed) will be accepted as a co-worker, caretaker or companion. A strictly modular design where all modules have standardized, homogeneous mechanical and electrical interfaces is considered most important. If these modules are connected via powerful communication links they can be almost arbitrarily configured and adapted to changing requirements. This concept of modularity should be pursued both for the construction of the robot body and its sensors, and for the structure of the information processing system. Thus, especially for research purposes, it should be possible, on the one hand, to increase the degrees of freedom of the overall system or to attach more sensors, and, on the other hand, to adapt the processing power by adding computational nodes if this should become necessary. On the basis of off-the-shelf components a new robot can thus be created that is homogeneous, flexible, and easy to maintain.

2.4. Caring for a simple, systematic and tidy design

In engineering parlance, K.I.S.S. means “Keep it simple, stupid”. Scientists use the more scholarly rule of “Ockham’s razor” – all else being equal, the simpler theory is preferable. This most wise principle, attributed to the 14th-century logician and Franciscan friar William of Ockham, states in its original form: “Entities should not be multiplied unnecessarily”.

Adopted by many scientists over the centuries, such as Leibniz, Newton, Einstein, Dirac, and Hawking, to name just a few [Physics FAQ 1996], it can be applied to nearly every facet of life and business. Applied to robotics, K.I.S.S. should read “keep it simple and systematic”, meaning that one should care for a systematic design and prefer planning ahead and simplicity over creeping featurism and steadily growing development complexity. Adding features over time, one by one, without having thought of a flexible and scalable system structure or architecture beforehand, adds unnecessary complexity to the system, making it more and more difficult to understand. “Simple” design, on the other hand, does not mean that useful features (even invisible internal features) have to be discarded in the name of simplicity. Nor does it mean “quick and dirty”. In fact, it often takes much thought and work over multiple iterations to simplify. The payoff is a design that is compact and understandable, and can be extended and reorganized. It is therefore much easier to maintain and test, and also less error-prone.

Furthermore, a clear vision is essential to the success of robot system design. Without conceptual integrity, a robotic system threatens to become a patchwork of incompatible subprojects, held together by the wrong kind of screws. Therefore, a clear sense of a robot system’s architecture is indispensable. Compromising the architectural vision of a robot’s software system weakens, and will eventually break, even the most well designed robot.

It is a matter of personal experience that, especially in research environments, robots often fail because of broken cables and unreliable connections. Such robots often look very cluttered, with cables criss-crossing each other, and circuitry and connectors hidden under bundles of wires. This “patchwork design” not only makes visual inspection difficult, but may also be taken as an indication that the persons who built and maintain the robot have placed little emphasis on a systematic design. Such patchwork often results from creeping featurism, i.e., one feature has been added after another without planning ahead for extensions and scalability in the first place.

Furthermore, there seems to be a strong correlation between the external appearance of an experimental robot and its trustworthiness and reliability. Systematic design leads to an improvement not only of the internal structures (software, hardware, wiring) of a robot but also of its appearance. A “tidy” looking robot appears definitely more trustworthy than a “patchwork robot”. Although software is not visible, any observer wonders whether the structure of the robot’s software resembles the robot’s appearance, e.g., the layout of the robot’s wiring. If, on the other hand, a designer has made an effort to have the robot look tidy, it may be assumed that he also took care to do other things right, such as reliably connecting the different sensors, actuators and peripherals, finding proper ways to route all cables – and maybe even structuring the software in a systematic way.

Of course, good design involves more than esthetic aspects. Industrial designers, e.g., consider all aspects from ergonomics through construction to deployment. Nevertheless, it should be mentioned that only a few research institutes really try to consider these aspects in a holistic fashion to provide a truly robust system.
If robotics researchers placed more emphasis on the ease of maintenance and robustness of their robots, their robots might become more dependable, and their appearance tidier, as side effects. Also, people might feel more comfortable and relaxed if they had to interact with a “tidy” robot, thus improving the overall success rate of interactions.

2.5. Optimizing system performance through field tests with novice users

Even if a robot is composed of already thoroughly tested modules (or components), overall performance must be verified and optimized in a holistic fashion. World-wide accepted test cases, standards and benchmarks must be developed that go beyond the mere feasibility studies mainly conducted so far on an individual basis. It is simply not sufficient to prove the concept of a single functionality, sometimes even under carefully prepared environmental conditions. Creating a real robot working in the real world in real time over an extended period requires extended field trials. It is important that the persons who interact with the robot during these experiments are similarly experienced, or inexperienced, as those persons who will later be expected to interact with the robot or to benefit from its services.

Although service and personal robots are expected to become an important part of our future society, only a few long-term tests of robots have been carried out so far, and dependability issues have been almost entirely neglected by the researchers involved. Robot technology can only mature when researchers start integrating all required technologies in a single robot and test the system for dependability under real-world conditions and in its interactions with persons who were not involved in its development. Only such users can provide the feedback that enables developers to optimize system performance according to user needs.

3. Dependable Robot Hardware

We applied the design principles laid out above in our experimental service robot HERMES. In accordance with the first and second design principles the robot’s design is anthropomorphic, taking a human as a design model (for a detailed argumentation see [Bischoff 1997]). Abstracting nature instead of copying it has led to an anthropomorphic upper body mounted on a wheeled undercarriage. The selected main sensors resemble the three human senses of vision, touch, and hearing.

To comply with the third and fourth design principles we placed great emphasis on modularity and extensibility. These design principles are particularly important for research robots because they have to be developed and improved, both in terms of hard- and software, over many years and by different researchers. Caring for modularity and extensibility simplifies future design modifications enormously without sacrificing a homogeneous overall structure. All drives are realized as modules with compatible mechanical and electrical interfaces; each drive module consists of two cubes rotating relative to each other and containing a motor, a Harmonic Drive gear, power electronics, sensors, a micro-controller, and a communication interface. A single CAN bus connects all modules with the main computer.

HERMES runs on 4 wheels, arranged at the centers of the sides of its base. The front and rear wheels are driven (48 V, 500 W each) and actively steered (24 V, 100 W each); the lateral wheels are passive. The manipulator system consists of two articulated arms with 6 degrees of freedom each on a body that can bend forward (130°) and backward (-90°). Each arm is equipped with a two-finger gripper that is sufficient for basic manipulation experiments. The work space extends up to 120 cm in front of the robot; the deposit area on the back of the robot can easily be reached.
Each arm has a mass of 14.2 kg and a payload of 2.0 kg (with the arm fully stretched out; the shoulder modules are rated 24 V, 200 W each, all other modules 100 W). By activating the modules’ magnetic brakes it is possible to exert much higher forces on objects if only the platform’s degrees of freedom are used (e.g., to open doors).

Although more energy-efficient designs with better mass-to-payload ratios are possible – e.g., DLR’s torque-controlled lightweight arm [Hirzinger et al. 2001], weighing 18 kg, carrying a maximum of 10 kg, and consuming only 100 W on average – and this mechatronic “total design” optimization has clear advantages in terms of performance, we still prefer, in the interest of flexibility, a design based on off-the-shelf modules for our research robot.

HERMES’ undercarriage incorporates four batteries (each weighing 32 kg) which allow continuous operation (consisting of autonomous navigation and manipulation, and human-robot interaction) for a minimum of 2 h (on average 4 h) without recharging. The heavy undercarriage guarantees that the robot will not lose its balance even when the body and the arms are fully extended to the front. Since HERMES’ overall mass of 250 kg and its possible velocity of 2 m/s present a danger to the environment, the maximum speed has been reduced to 1 m/s (in public places) and several hardware measures (see below) ensure proper emergency stopping during navigation and manipulation.

The main sensors are two video cameras mounted on independent pan/tilt drive units in addition to the pan/tilt unit that controls the common “head” platform. The cameras can be moved with accelerations and velocities comparable to those of the human eye. A radio Ethernet interface allows communication via a LAN or the Internet. A wireless keyboard can be used to teleoperate the robot at distances of up to 7 m.

A hierarchical multi-processor system is used for information processing and robot control. The control and monitoring of the individual drive modules is performed by the sensors and controllers embedded in each module. The robot’s “brain” is a network of digital signal processors (DSP, TMS 320C40) embedded in a rugged PC. Sensor data processing (including vision), situation recognition, behavior selection and high-level motion control are performed by the DSPs, while the PC provides data storage and the human interface (Figure 2).

Figure 2: Modular and adaptable hardware architecture for information processing and robot control.
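Because every drive module carries its own microcontroller and hangs on a single CAN bus, commanding a joint from the main computer amounts to writing one short CAN frame. The sketch below illustrates the idea in C; the node ID, command opcode and payload layout are invented for illustration (HERMES’ actual bus protocol is not documented here), and Linux SocketCAN merely serves as a convenient stand-in for the original CAN interface.

/* Illustrative sketch only: commanding one drive module over CAN.
 * Node ID, command ID range and payload layout are assumptions,
 * not HERMES' real protocol; Linux SocketCAN is used as a stand-in. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <net/if.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/can.h>
#include <linux/can/raw.h>

#define NODE_SHOULDER_PITCH 0x12          /* assumed module node ID */

static int open_can(const char *ifname)
{
    int s = socket(PF_CAN, SOCK_RAW, CAN_RAW);
    struct ifreq ifr;
    struct sockaddr_can addr;
    memset(&addr, 0, sizeof(addr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    ifr.ifr_name[IFNAMSIZ - 1] = '\0';
    ioctl(s, SIOCGIFINDEX, &ifr);
    addr.can_family  = AF_CAN;
    addr.can_ifindex = ifr.ifr_ifindex;
    bind(s, (struct sockaddr *)&addr, sizeof(addr));
    return s;
}

/* Send a velocity set-point (assumed: signed 16 bit, 0.01 deg/s units). */
static void send_velocity(int s, unsigned node, short vel)
{
    struct can_frame f;
    memset(&f, 0, sizeof(f));
    f.can_id  = 0x200 + node;             /* assumed command ID range */
    f.can_dlc = 3;
    f.data[0] = 0x01;                     /* assumed opcode: set velocity */
    f.data[1] = (unsigned char)(vel & 0xFF);
    f.data[2] = (unsigned char)((vel >> 8) & 0xFF);
    write(s, &f, sizeof(f));
}

int main(void)
{
    int s = open_can("can0");
    send_velocity(s, NODE_SHOULDER_PITCH, 500);   /* 5 deg/s */
    close(s);
    return 0;
}
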

A number of additional hardware measures that follow the design principles have been taken to enhance reliability and operating safety. They are described in the sequel.

3.1. Modular computer hardware

Ease of maintenance and repair is certainly one of the most prominent features of HERMES. The robot’s 25 drive modules are functionally similar, with almost identical mechanical and electrical interfaces. If any one of these modules should ever fail, it could easily be replaced with a readily available off-the-shelf module. The same holds true for the robot’s brain: each DSP board and the single slot CPU can easily be replaced from stock. The special shielding and ventilation of the rugged PC keeps the processors’ temperatures down and reduces electromagnetic interference to a minimum.

3.2. Cables and connectors

Within HERMES, all signal and power line connectors are secured with snaps, screws or fixtures to their respective housings. Simple plug-in connectors that are the fastest and easiest to use, e.g., simply pushed in or pulled out, are avoided, because they are also the easiest to disconnect accidentally. All connectors are strain-relieved to eliminate the risk of loose or broken cables. Electromagnetic shielding of the cabling has been a major concern, to diminish the effect of the many sources of electromagnetic fields within the robot. Electrical plugs have been carefully selected so that it is physically impossible to insert the wrong plug into a receptacle or to insert a plug the wrong way. Different plugs are used for different power levels (5, 12, 24 and 48 Volts). All connectors, plugs and receptacles are labeled unambiguously and coherently to ensure easy testing and maintenance.

3.3. Power circuitry and emergency stopping

Standard safety regulations for industrial robots require that all consumer loads are disconnected from power in case of an emergency and that all drives are actively braked, e.g., if the bumpers are touched. In this case a human operator is needed to reset the robot. Any kind of intelligent assessment of the prevailing “emergency” situation by the robot is not allowed. However, in normal living environments the robot might need to touch things, or cannot avoid touching them if it wants to continue its given task. Should it not have the ability to intelligently assess the situation? For instance, during simple maneuvers such as turning around a corner it might suffice just to back up a little or to change the steering angle in order to prevent any damage to the walls. Another scenario, e.g., when a person has by accident been squeezed in between the heavy robot and a wall, could require the robot’s modules to enter a compliant mode, in which all joints can be moved manually with ease to prevent further injury to the human, instead of actively braking all drives. We believe that future robots need more intelligent safety concepts than the existing ones to be able to work with, or in close proximity to, humans. It will simply not be safe enough to just follow the existing safety regulations for industrial manipulators or automated guided vehicles. Therefore, our safety concept allows active utilization of the bumpers to enable tactile sensing and to complement missing visual information.

Program failures can be detected by implementing so-called “watchdog” timers on different levels, e.g., in the robot’s microcontrollers, the slot CPU and the DSPs. Any watchdog timer running out causes the robot to stop via electronic emergency switches. So far, these watchdog timers have been implemented on the DSP and PC levels only.
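A minimal sketch of such a multi-level watchdog is shown below. The level names, the timeout values and the emergency_stop() hook are illustrative assumptions; only the principle – every level must periodically prove it is alive, otherwise the robot is stopped – is taken from the text.

/* Minimal watchdog sketch; timeouts and names are assumed values. */
#include <stdio.h>
#include <time.h>

#define N_LEVELS 3

static const char  *level_name[N_LEVELS] = { "microcontroller", "DSP", "PC" };
static const double timeout_s[N_LEVELS]  = { 0.05, 0.2, 1.0 };
static double       last_kick[N_LEVELS];

static double now_s(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

/* Each level calls this from its main loop to prove that it is alive. */
void watchdog_kick(int level) { last_kick[level] = now_s(); }

/* Hypothetical hook that would open the electronic emergency switches. */
static void emergency_stop(const char *who)
{
    fprintf(stderr, "EMERGENCY STOP: level '%s' missed its deadline\n", who);
}

/* Run periodically by an independent monitor task. */
void watchdog_check(void)
{
    double now = now_s();
    for (int i = 0; i < N_LEVELS; ++i)
        if (now - last_kick[i] > timeout_s[i])
            emergency_stop(level_name[i]);
}
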
HERMES possesses two standard emergency switches. One may be activated by pressing a clearly visible red-yellow button on the robot’s cargo area; the other is a wireless emergency switch carried by a human operator. They are connected in series and interrupt only the power circuitry for the motors; the information processing system keeps running as long as the robot is switched on.

Sensor data processing and data logging thus continue, and no time is wasted after an emergency stop on “re-booting” the robot. At a lower level, current sensors in each module check whether the motor current is too high. In that case the power supply for the module is interrupted to prevent damage to the electronic components, and a brake is activated to prevent the arm, or grasped objects, from falling down.

3.4. Artificial skin

A modular approach has also been taken in the design of an artificial skin for the robot. This “skin” is based on conductive foam and serves two purposes: first, it damps accidental and unwanted impacts between the robot and humans or environmental objects; second, it makes it possible to identify the contact locations of, and the forces exerted by, the touched objects. Contact points and forces are measured via a dense grid of electrodes underneath the foam. Pressing the foam results in a higher conductivity of the material (i.e., a lower resistance). The resistance between each electrode and a ground plane is continuously measured (sample rate 50 Hz) and evaluated by dedicated microcontrollers. In case of touch events these microcontrollers first send messages to higher hierarchical computing levels that decide about appropriate reactions based on the robot’s current situation. If for any reason these higher levels do not immediately respond to the message, the microcontrollers will directly stop the associated motor modules.

A bumper consisting of 12 identical modules of the artificial skin surrounding the robot’s undercarriage (at a height of 30-330 mm, each section 200 mm wide) has already been realized. Furthermore, two new two-finger grippers that are completely covered by this conductive foam have been developed and are currently being integrated (Figure 3). In the future it is planned to cover the whole robot structure with this kind of easily replaceable tactile sensing elements. Ideally, these elements will be directly connected to the individual motor modules and linked via a safe bus system to the central information processing unit. Presently, they are connected via a high-speed serial communication bus.

Figure 3: Developed tactile sensing elements: Top Left: one of twelve identical bumper modules covering the undercarriage at a height between 30 and 330 mm; Bottom Left: side view of the bumper module: the electronic circuits are integrated in, and thus, protected by an epoxy layer; modules are connected via a serial bus through a cable duct; Right: Finger module: printed circuit boards (PCBs) are glued with epoxy to an aluminum core; a dense grid of electrodes covers the surface on all four sides and the tip of the finger; PCBs are covered by 3 mm thick conductive foam (not shown).
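The escalation logic just described – report every touch upward, but stop the associated motors locally if the higher levels stay silent – can be condensed into a short sketch. Grid size, threshold, reaction deadline and the stubbed hardware hooks are assumptions for illustration; only the 50 Hz sampling cycle and the escalation rule come from the text.

/* Sketch of the local reflex logic of one skin microcontroller.
 * Grid size, threshold and deadline are assumed values; the hooks are
 * stubbed so that the fragment is self-contained. */
#include <stdio.h>
#include <stdbool.h>

#define ROWS 4
#define COLS 16
#define TOUCH_THRESHOLD 80        /* assumed ADC counts above baseline */
#define REACTION_DEADLINE_MS 40   /* assumed: two 50 Hz cycles */

static int  read_electrode(int r, int c) { (void)r; (void)c; return 0; }
static bool higher_level_has_responded(void) { return false; }
static void send_touch_event(int r, int c, int f)
{ printf("touch at (%d,%d), force %d\n", r, c, f); }
static void stop_associated_motors(void) { puts("local reflex stop"); }

/* Called every 20 ms (50 Hz sampling, as stated in the text). */
void skin_cycle(void)
{
    static int ms_since_event = -1;        /* -1: no unanswered event */
    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c) {
            int f = read_electrode(r, c);  /* pressure lowers resistance */
            if (f > TOUCH_THRESHOLD) {
                send_touch_event(r, c, f); /* escalate to situation module */
                if (ms_since_event < 0) ms_since_event = 0;
            }
        }
    if (ms_since_event >= 0) {
        if (higher_level_has_responded())
            ms_since_event = -1;           /* higher level took over */
        else if ((ms_since_event += 20) > REACTION_DEADLINE_MS)
            stop_associated_motors();      /* nobody reacted in time */
    }
}
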

Another (or a complementary) solution could be to employ slip clutches in the joints of manipulators, or to implement intelligent control algorithms that continuously predict and verify force and torque on all joints. A prerequisite for the latter safety concept would be a lightweight manipulator that allows position, velocity and torque control with minimal control loop cycle times (< 1 ms). This is currently not feasible with off-the-shelf industrial products. Therefore, it is definitely advisable for the time being to use some kind of high-resolution tactile sensors to reliably detect (un)wanted contacts of the robot with its environment.

3.5. Extended dynamic range of CCD cameras

To increase the robustness of the image processing, the robot’s two CCD cameras have been modified to allow their integration time, and thus their sensitivity, to be controlled by the vision system. This enables the robot to reliably detect objects even under uncontrolled and changing lighting conditions, by maintaining a high contrast around tracked features or keeping an average grey level within a region of interest. Automatic gain control, which is usually based on an average grey level within the entire image, does not yield satisfactory results because it cannot cope with the large differences in brightness of natural scenes, which leads to over- or underexposure of regions of interest or tracked objects.
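As a rough illustration, ROI-based exposure control of this kind can be written as a simple proportional law that drives the mean grey level of a region of interest toward a target value by adjusting the integration time. Target, gain and clamping limits below are assumed values, and the camera hooks are stubbed; the actual HERMES controller is not reproduced here.

/* Illustrative ROI-based integration-time control (assumed constants). */
#include <stdio.h>

#define TARGET_GREY 128.0   /* desired mean grey level inside the ROI */
#define K_P 0.005           /* assumed proportional gain */

static double roi_mean_grey(void) { return 100.0; }  /* stubbed vision hook */
static void set_integration_time(double ms)          /* stubbed camera hook */
{ printf("t_int = %.2f ms\n", ms); }

/* One control step: too dark -> longer integration, too bright -> shorter. */
void exposure_step(double *t_ms)
{
    double err = TARGET_GREY - roi_mean_grey();
    *t_ms *= 1.0 + K_P * err;        /* multiplicative update keeps t > 0 */
    if (*t_ms < 0.05) *t_ms = 0.05;  /* assumed sensor limits */
    if (*t_ms > 40.0) *t_ms = 40.0;
    set_integration_time(*t_ms);
}
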

4. Dependable Robot Software and System Architecture

Seamless integration of many – partly redundant – degrees of freedom and various sensor modalities in a complex robot calls for a unifying approach. This complies with the fourth design principle sketched above, emphasizing a systematic, yet simple architectural design, open for extensions and maintaining conceptual integrity. We have developed a system architecture that allows the integration of multiple sensor modalities and numerous actuators, as well as knowledge bases and a human-friendly interface. At its core, the system is behavior-based, which is now generally accepted as an efficient basis for autonomous robots [Arkin 1998]. However, to be able to select behaviors intelligently and to pursue long-term goals in addition to purely reactive behaviors, we have introduced a situation-oriented deliberative component that is responsible for situation assessment and behavior selection. The system architecture has been specified in a way that allows other developers to integrate their functionality with ease.

4.1. System Overview

Figure 4 shows the essence of the situation-oriented behavior-based robot architecture as we have implemented it. The situation module (situation assessment and behavior selection) acts as the core of the whole system and is interfaced via “skills” in a bidirectional way with all hardware components – sensors, actuators, knowledge base storage and MMI (man-machine, machine-machine interface) peripherals. These skills have direct access to the hardware components and thus actually realize behavior primitives. They obtain certain information, e.g., sensor readings, generate specific outputs, e.g., arm movements or speech, or plan a route based on map knowledge. Skills report to the situation module via events and messages on a cyclic or interrupt basis to enable a continuous and timely situation update and error handling. The situation module fuses, via skills, data and information from all system components to make situation assessment and behavior selection possible. Moreover, it provides general system management (cognitive skills). Therefore, it is responsible for planning an appropriate behavior sequence to reach a given goal, i.e., it has to coordinate and initialize the in-built skills.

Figure 4: System architecture of a personal robot based on the concepts of situation, behavior and skill.

By activating and deactivating skills, a management process within the situation module realizes the situation-dependent concatenation of elementary skills that leads to complex and elaborate robot behavior. In general, most skills involve the entire information processing system. However, at a gross level, they can be classified into five categories besides the cognitive skills.

Motor skills control simple movements of the robot’s actuators. They can be arbitrarily combined to yield a basis for more complex control commands. Encapsulating the access to groups of actuators that form robot parts, such as undercarriage, arms, body and head, leads to a simple interface structure and allows easy generation of pre-programmed motion patterns.

Sensor skills encapsulate the access to one or more sensors, and provide the situation module with proprioceptive or exteroceptive data.

Sensorimotor skills combine sensor and motor skills to yield sensor-guided robot motions, e.g., vision-guided or tactile and force/torque-guided motion skills.

Communicative skills pre-process user inputs and generate outputs to the users. They are crucial for providing feedback appropriate to the current situation and the given application scenario.

Data processing skills organize and access the system’s knowledge bases. They return specific information upon request, add newly gained knowledge (e.g., map attributes) to the robot’s data bases, or provide means of more complex data processing, e.g., path planning.

For a more profound theoretical discussion of our system architecture, which is based upon the concepts of situation, behavior and skill, see [Bischoff, Graefe 1999].

4.2. Implementation

A robot operating system was developed that allows sending and receiving messages via different channels among the different processors and microcontrollers. All tasks and threads run asynchronously, but can be synchronized via messages or events. The developed software is highly reusable and open for future extensions since it is written in C with abstraction layers for hardware-specific function calls.
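To make the skill interface concrete, the fragment below sketches, with invented identifiers, how skills might report to the situation module and be activated or deactivated cycle by cycle. It illustrates the architecture only; HERMES’ actual interfaces are richer than this.

/* Illustrative skill interface; all identifiers are invented. */
#include <stdio.h>
#include <stdbool.h>

typedef enum { EV_CYCLIC, EV_INTERRUPT } event_kind;

typedef struct {
    event_kind kind;
    int        skill_id;
    int        code;       /* skill-specific status, e.g., "target lost" */
} event_t;

typedef struct skill {
    const char *name;
    bool        active;
    void      (*step)(struct skill *self);   /* one control cycle */
} skill_t;

static void track_step(skill_t *self)
{ (void)self; /* e.g., one cycle of a vision-guided motion */ }

static skill_t skills[] = {
    { "track_object", true, track_step },
};

/* Core loop of the situation module: assess, select, execute. */
void situation_cycle(const event_t *ev)
{
    if (ev && ev->kind == EV_INTERRUPT)
        skills[ev->skill_id].active = false;  /* e.g., error: deactivate */
    for (unsigned i = 0; i < sizeof skills / sizeof *skills; ++i)
        if (skills[i].active)
            skills[i].step(&skills[i]);
}
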

Overall control is realized as a finite state automaton that does not allow unsafe system states. It is capable of responding to prioritized interrupts and messages. A finite state automaton has been preferred over other architectures because of its simplicity (fourth design principle). After power-up the robot finds itself in the state “Waiting for next mission description”. A mission description is provided as a text file that may be loaded from disk, received via e-mail, entered via keyboard, or result from a spoken dialogue. It consists of an arbitrary number of single commands or embedded mission descriptions that let the robot perform a required task. All commands are written or spoken in natural language and passed to a parser and an interpreter. If a command cannot be understood, or is under-specified or ambiguous, the situation module tries to complement the missing information from its situated knowledge, or asks the user via its communicative skills to provide it.

Motion skills are mostly implemented at the microcontroller level within the actuator modules. High-level motor skills, such as coordinated smooth arm movements, are realized by a dedicated DSP interfaced to the microcontrollers via a CAN bus. Sensor skills are implemented on those DSPs that have direct access to digitized sensor data, especially digitized images.

4.3. Special software measures for enhancing safety and operating robustness

4.3.1. Object-oriented image processing

One apparent difficulty in implementing vision as a sensor modality for robots is the huge amount of data generated by a video camera: about 10 million pixels per second, depending on the video system used. Nevertheless, it has been shown that modest computational resources are sufficient for realizing real-time vision systems if a suitable system architecture is implemented [Graefe 1989]. As a key idea for the design of efficient robot vision systems the concept of object-oriented vision was proposed. It is based on the observation that both the knowledge representation and the data fusion processes in a vision system may be structured according to the visible and relevant external objects in the environment of the robot. For each object that is relevant for the operation of the robot at a particular moment the system has one separate “object process”. An object process receives image data from the video section (cameras, digitizers, video bus etc.) and continuously generates and updates a description of its assigned physical object. This description emerges from a hierarchically structured data fusion process that begins with the extraction of elementary features, such as edges, corners and textures, from the relevant image parts, and ends with matching a 2-D model to the group of features, thus identifying the object.

Recognition of relevant objects is crucial for the robot’s operation. The decision as to which objects have to be detected and tracked is made by the situation module. That module also decides that the robot has to move slower if, e.g., some features are tracked less reliably, and that it has to stop if the features are lost. Based on the type of detected and tracked objects the speed of the robot may be adjusted. For instance, at intersections the robot slows down to minimize the risk of colliding with persons who might suddenly appear and cross the robot’s path.
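The coupling between tracking quality and speed described above can be sketched in a few lines: each object process holds a confidence value, and the permitted speed is derived from the weakest tracker. The structure and thresholds are illustrative assumptions; the policy itself (slow down on unreliable tracking, stop on loss) is the one stated in the text.

/* Toy version of the speed policy; thresholds are assumed values. */
typedef struct { int x, y, w, h; } roi_t;

typedef struct {
    const char *object;      /* the physical object this process tracks */
    roi_t       roi;         /* image region currently watched */
    double      confidence;  /* 0..1, from feature matching */
} object_process_t;

double speed_limit(const object_process_t *p, int n, double v_max)
{
    double worst = 1.0;
    for (int i = 0; i < n; ++i)
        if (p[i].confidence < worst)
            worst = p[i].confidence;
    if (worst < 0.2)            /* features lost (assumed threshold): stop */
        return 0.0;
    return v_max * worst;       /* tracked less reliably: move slower */
}
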
4.3.2. Speaker-independent speech recognition

The robot understands natural continuous speech independently of the speaker and can, therefore, in principle be commanded by any person capable of speech. This is a very important feature, not only because it allows anybody to communicate with the robot without any training with the system, but more importantly because the robot may be stopped by anybody via voice in case of emergency.

Speaker-independence is achieved by providing grammar files and vocabulary lists that contain only those words, and provide only those command structures, that can actually be understood by the robot in the current situation. In the present implementation HERMES understands about 60 different command structures and 350 words, most of them in each of the three available languages English, German and French.

4.3.3. Robust dialogues for dependable interaction

Most parts of robot-human dialogues are situated and built around robot-environment or robot-human interactions, a fact that has been exploited to enhance the reliability and speed of the recognition process by using so-called contexts. These contain only those grammatical rules and word lists that are needed for a particular situation. However, regardless of the situation and the state of the dialogue, a number of words and sentences not related to the current context are available to the user, too. These words make it possible to “reset” or bootstrap a dialogue, to trigger the robot’s emergency stop, and to make the robot execute a few other important commands at any time.

It is important to note that the robot is always in control of the current dialogue and the flow of information towards the user. If the robot is asked by a user to execute a service task it will follow a specific “program” consisting of concatenated and combined skills, thereby tightly coupling acting, sensing and communication in a predefined way. If something goes wrong, i.e., some parameters exceed their bounds, the current command will be canceled by the robot. Canceling a command involves returning to a safe state, which again might involve communication and interaction with the user (for details see [Bischoff, Graefe 2002]).

Obviously, there are some limitations in our current implementation. One limitation is that not all utterances are allowed or can be understood at any moment. The concept of contexts with limited grammar and vocabulary, as implemented up to now, does not allow for a multitude of different utterances on the same topic. General speech recognition is not sufficiently advanced, and compromises have to be accepted in order to enhance recognition in noisy environments. Furthermore, in our implementation it is currently not possible to track a speaker’s face, gestures or posture, which would definitely increase the versatility and robustness of human-robot communication. However, even now HERMES is already able to include gestures in its communicative acts, e.g., when giving directions to a human (Fig. 9d).
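A tiny sketch makes the context mechanism concrete: each context bundles the phrases understood in one situation, and a small set of global commands – including the emergency stop – is merged into whatever context is active. All data below are invented examples.

/* Illustrative context structure; phrases and names are invented. */
#include <stdio.h>

typedef struct {
    const char  *name;
    const char **phrases;   /* command structures active in this context */
    int          n;
} context_t;

/* Understood in every situation, as described in the text. */
static const char *global_phrases[] = { "stop", "emergency stop", "hello" };

static const char *service_phrases[] =
    { "bring me ...", "go to ...", "what can you do" };
static const context_t service_ctx = { "service", service_phrases, 3 };

/* A recognizer would be loaded with exactly these rules; here we only
 * list what would be active. */
void print_active_grammar(const context_t *ctx)
{
    for (int i = 0; i < 3; ++i)
        printf("global : %s\n", global_phrases[i]);
    for (int i = 0; i < ctx->n; ++i)
        printf("%-7s: %s\n", ctx->name, ctx->phrases[i]);
}

int main(void) { print_active_grammar(&service_ctx); return 0; }
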
4.3.4. Learning by doing

Two forms of learning are currently being investigated. Both help the robot to learn from scratch while actually doing a useful task: one, letting the robot automatically acquire or improve skills, e.g., the grasping of objects, without quantitatively correct models of its manipulator or visual system (autonomous learning); two, having the robot generate, or extend, an attributed topological map of the environment over time in cooperation with human teachers (cooperative learning).

The general idea for solving the first learning problem is simple. While the robot watches its end effector with its cameras, like a playing infant watching its hands, it sends more or less arbitrary control commands to its motors. By observing the resulting changes in the camera images it “learns” the relationships between such changes in the images and the control commands that caused them. After having executed a number of test motions the robot is able to move its end effector to any position and orientation in the images that is physically reachable.
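The following self-contained toy program sketches this idea for a two-joint robot observed by one camera: a few small test motions yield a finite-difference estimate of the image Jacobian, which is then used to servo the end effector toward a target pixel position. The linear “camera” below is only a stand-in simulation; the real procedure works on live images and, as stated above, needs no such model.

/* Toy illustration of learning by test motions (not HERMES' actual code). */
#include <stdio.h>

/* Stand-in simulation of robot and camera; unknown to the "learner". */
static double q1, q2;
static void move_joints(double d1, double d2) { q1 += d1; q2 += d2; }
static void end_effector_pixel(double *u, double *v)
{ *u = 320.0 + 80.0 * q1 + 15.0 * q2; *v = 240.0 + 10.0 * q1 + 70.0 * q2; }

/* Estimate the 2x2 image Jacobian from two small test motions. */
static void learn_jacobian(double J[2][2], double dq)
{
    double u0, v0, u1, v1;
    end_effector_pixel(&u0, &v0);
    move_joints(dq, 0.0);   end_effector_pixel(&u1, &v1);
    J[0][0] = (u1 - u0) / dq;  J[1][0] = (v1 - v0) / dq;
    move_joints(-dq, dq);   end_effector_pixel(&u1, &v1);
    J[0][1] = (u1 - u0) / dq;  J[1][1] = (v1 - v0) / dq;
    move_joints(0.0, -dq);  /* return to the start pose */
}

/* One damped servo step toward a target pixel position. */
static void servo_step(const double J[2][2], double tu, double tv)
{
    double u, v;
    end_effector_pixel(&u, &v);
    double eu = tu - u, ev = tv - v;
    double det = J[0][0] * J[1][1] - J[0][1] * J[1][0];
    if (det == 0.0) return;               /* degenerate: re-learn */
    move_joints(0.5 * ( J[1][1] * eu - J[0][1] * ev) / det,
                0.5 * (-J[1][0] * eu + J[0][0] * ev) / det);
}

int main(void)
{
    double J[2][2];
    learn_jacobian(J, 0.01);
    for (int i = 0; i < 20; ++i) servo_step(J, 400.0, 300.0);
    double u, v;
    end_effector_pixel(&u, &v);
    printf("reached (%.1f, %.1f)\n", u, v);   /* converges to (400, 300) */
    return 0;
}
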

If, in addition to the end effector, an object is visible in the images, the end effector can be brought to the object in both images and, thus, in the real world. Based on this concept a robot can localize and grasp objects without any knowledge of its kinematics or its camera parameters. Since no quantitative models of any of the robot’s characteristics are used, the difficult problems of creating such models and maintaining their accuracy are sidestepped. In contrast to other approaches with similar goals, but based on neural nets, no training is needed before the manipulation is started [Graefe 1995].

The general idea for solving the second learning problem is to let the robot behave like a new worker in an office: it can explore, e.g., a network of corridors, ask people for the names of specific points of interest, or let people explain how to get to those points of interest. The geometric information is provided by the robot’s odometry, and relevant location names are provided by the people who want the robot to know a place under a specific name. In this way the robot quickly acquires from scratch the knowledge necessary to deliver personal services, such as: what specific persons call certain places; what the most important places are and how it may get there; where objects of personal or general interest are located; or how specific objects should be grasped. The ability to link, e.g., a person’s name to environmental features requires several databases, and links between them, in order to obtain the desired information, e.g., whose office is located where, and what objects belong to specific persons and where to find them.

Many types of dialogues exist to cooperatively teach the robot new knowledge and to build a common reference frame for the subsequent execution of service tasks. For instance, the robot’s lexical and syntactical knowledge bases can easily be extended, firstly, by directly editing them (since they are text files), and secondly, by a dialogue between the robot and a person that allows new words and macro commands to be added at run time. When gathering new information through dialogues, HERMES expects the person’s statements to be correct, i.e., it will not try to resolve contradictory statements. For instance, if a person claims that his office is room 2455 and later declares room 2454 to be his office, the first entry is removed without further notice or attempts to resolve this apparent contradiction. If a person, indeed, has two offices, he would be required to give them different names. However, the concept of “aliases” exists to allow two or more persons to denominate the same location with different names, e.g., to allow one person to call room 2401 “meeting room” while another prefers to say “conference room”.

To teach the robot names of persons, objects and places that are not yet in the database (and, thus, cannot be understood by the speech recognition system), a spelling context has been defined that mainly consists of the international spelling alphabet. This alphabet has been optimized for ease of use by humans in noisy environments, such as aircraft, and has proved its effectiveness for our applications as well, although its usage is not as intuitive and natural as individual spelling alphabets or a more powerful speech recognition engine would be.
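As an illustration of how such person-specific names might be stored, the sketch below attaches per-person aliases to a map node and silently overwrites an earlier statement by the same person, mirroring the behavior described above. All structures, limits and names are invented.

/* Illustrative alias storage for one map node; all data are invented. */
#include <stdio.h>
#include <string.h>

#define MAX_ALIASES 8

typedef struct {
    char room[16];                    /* canonical node, e.g., "2401" */
    char person[MAX_ALIASES][16];
    char alias[MAX_ALIASES][32];
    int  n;
} place_t;

/* Add this person's name for the place; a newer statement by the same
 * person silently overwrites the older one, as described in the text. */
void set_alias(place_t *p, const char *person, const char *name)
{
    for (int i = 0; i < p->n; ++i)
        if (strcmp(p->person[i], person) == 0) {
            strncpy(p->alias[i], name, sizeof p->alias[i] - 1);
            return;
        }
    if (p->n < MAX_ALIASES) {
        strncpy(p->person[p->n], person, sizeof p->person[p->n] - 1);
        strncpy(p->alias[p->n], name, sizeof p->alias[p->n] - 1);
        p->n++;
    }
}

int main(void)
{
    place_t room = { "2401", {{0}}, {{0}}, 0 };
    set_alias(&room, "Anne", "meeting room");
    set_alias(&room, "Bob",  "conference room");  /* same room, two aliases */
    for (int i = 0; i < room.n; ++i)
        printf("%s calls room %s \"%s\"\n",
               room.person[i], room.room, room.alias[i]);
    return 0;
}
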
5. Experiments and Results

Since its first public appearance at the Hannover Fair in 1998, where HERMES could merely run (but still won “the first service robots’ race”!), quite a number of experiments have been carried out that prove the suitability of the proposed methods. Of course, we performed many tests during the development of the various skills and behaviors of the robot and often presented it to visitors in our laboratory (please check out the many publications and videos presented at www.unibw-muenchen.de/hermes).
presented at www.unibw-muenchen.de/hermes). The public presentations made us aware that the robot needs a large variety of functions and characteristics to cope with different environmental conditions and to be accepted by the general public. In all our presentations we found that the robot's anthropomorphic shape encourages people to interact with it in a natural way. As presented in the preceding sections, HERMES possesses several other promising features, inside and outside, that make it intrinsically more reliable and safer than other robots.

One of the most encouraging results of our experiments is that the calibration-free approach pays off: we observed drifting of system parameters caused by temperature changes, wear of parts and aging. Such drifts could have produced severe problems, e.g., during object manipulation, if the employed methods had relied on exact kinematic modeling and calibration. Since our navigation and manipulation algorithms rely only on qualitatively (not quantitatively) correct information and adapt to parameter changes automatically, the performance of HERMES is not affected by such drifts.

Tactile sensing also greatly improves the system's dependability. Figure 5 shows an example of the tactile bumper sensors' response in case of an accident. In such a simple contact situation HERMES tries to continue to deliver its service, e.g., to transport an object, and does not wait until a human has solved the problem: the robot drives backwards, modifies the steering angle and tries again. More complex contact situations (two or more contact locations) still require, for safety reasons, the help of a human.

The dialogue and associated human-robot interactions depicted in Figure 6 may serve as an example of how HERMES interacts with people. Contexts are actively switched depending on the prevailing situation. Prompts implicitly incorporate the verification of the user's utterance, and the verbosity level is automatically adapted to the user's assumed knowledge and experience. Whenever a command is incomplete (missing command arguments) or ambiguous (too many arguments or an imprecise description), a specific dialogue is initiated to resolve the problem. It is important to note that it is always the robot (except in emergencies) that is in charge of the current dialogue and of the flow of information towards the user. These capabilities enable HERMES to build a common, and thus dependable, reference frame with any person in the shared living and working environment.

Figure 5: Sensor image of the tactile bumpers after touching the corner of two adjacent walls while the robot was trying to turn around it. Color coding: light grey = no touch; the darker the color, the higher the forces exerted during the touch. The rows of the sensor image, from the outer to the inner row, correspond to a covered area from 40 to 320 mm above the ground on the undercarriage.
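The contact-recovery logic described above can be summarized in a few lines. The sketch below is only an illustration of the decision rule, under assumed conventions (contact angles in the robot frame, positive to the left; a hypothetical 15-degree steering offset), not the robot's actual controller.

```python
def recovery_action(contacts_deg, steer_deg, offset_deg=15.0):
    """Decide how to react to a tactile-bumper event.

    contacts_deg: angles (robot frame, degrees, positive = left side)
                  of the touched bumper segments.
    Returns (action, new_steering_angle_deg)."""
    if not contacts_deg:
        return "continue", steer_deg
    if len(contacts_deg) >= 2:
        # Complex contact situation: for safety, a human must help.
        return "stop_and_ask_for_help", steer_deg
    # Simple contact: drive backwards, steer away from the touched side,
    # and try again, so the robot keeps delivering its service.
    side = -1.0 if contacts_deg[0] > 0 else 1.0
    return "back_up_and_retry", steer_deg + side * offset_deg

print(recovery_action([], 0.0))              # ('continue', 0.0)
print(recovery_action([30.0], 0.0))          # ('back_up_and_retry', -15.0)
print(recovery_action([30.0, -120.0], 0.0))  # ('stop_and_ask_for_help', 0.0)
```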


Utterances / Actions (P: Human; H: HERMES) | Activated Context | Comments
P: "Hello!" | idle | robot is interrupting its current task
H: "Hello! What can I do for you?" | service | normal prompt (the robot expects a moderately experienced user)
P: "What can you do?" | service | robot is switching to the verbosity level 'novice user'
H: "I could offer some services such as fetching and carrying objects or simply entertain you!" | service | offering detailed information about capabilities; would have been extended if the next user input had not been understood
P: "Take over glass!" | service | service task not yet completely specified
H: "Should I take over the glass with my left hand?" | confirmation | implicit feedback is given to the user by incorporating 'glass' in the question; context switching for better recognition
P: "No!" | confirmation | now the robot assumes that the right hand should be used
H: "I will take over the glass with my right hand. Please, hand over glass!" | service | bringing the gripper into a position and orientation that allows taking over a glass; if the user wishes to cancel he could say 'no' or 'cancel command'
P: doing nothing | service | robot is waiting for the user to act as previously instructed
H: "Please place the glass between my fingers and touch them!" | service | incremental prompt: more information is given to the user to ensure proper interaction through kinesthetic sensing
P: tricking the robot by touching the fingers but not giving the glass | service | the robot notices that the user did not hand over an object because the gripper closed fully
H: "Please try again to hand over glass or say 'cancel command'" | service | since something went wrong during the first trial, other options are explicitly listed
P: placing glass between fingers, thereby slightly touching them | service | robot notices the touching of its fingers and grasps the glass; if no user interaction had happened, the robot would have returned to its initial state
H: "What else can I do for you?" | service | continuing to offer services to a novice user (prompt would be omitted otherwise)
P: "Place it onto the table!" | service | 'it' is replaced with 'glass' because the glass is the most recent object the user has referred to
H: "I have detected two tables. Please specify one!" | object attrib. | implicit feedback is given to the user by incorporating the number of tables detected in the utterance; context switching
P: "The small one!" | object attrib. | key word spotting for attributes such as left, right, small, ...
H: "I will place the glass onto the small table. Do you want me to do this task?" | confirmation | implicit feedback is given to the user by incorporating the attributes of the table in the utterance; confirmation is required before executing this time-consuming task
P: "Yes, please!" | confirmation | key word spotting for 'Yes' or 'No'
H: "I am approaching the table." (more utterances until task completion) | idle | since the verbosity level is still set to 'novice user', the robot is commenting its actions aloud; otherwise it would execute its task silently

Figure 6: Excerpt from a dialogue between a human and HERMES to take over an object and place it onto a table. In its course HERMES learns that it is interacting with a novice user who needs much feedback to interact correctly. Note how often contexts are switched, depending on the robot's expectations; this makes the speech recognition much more dependable and robust, especially in the presence of ambient noise.
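As an illustration of why context switching helps, here is a minimal sketch of context-restricted recognition, assuming a recognizer that scores an utterance only against the phrases of the currently active context. The contexts and vocabularies below are toy versions of those in Figure 6, not HERMES' actual grammars.

```python
import difflib

CONTEXTS = {
    "idle":          ["hello"],
    "service":       ["what can you do", "take over glass",
                      "place it onto the table"],
    "confirmation":  ["yes", "no", "cancel command"],
    "object_attrib": ["the small one", "the large one",
                      "the left one", "the right one"],
}

def recognize(utterance, context):
    """Match the utterance only against the active context's phrases.

    Restricting the search space this way is what makes recognition
    robust to ambient noise: out-of-context phrases cannot be produced."""
    phrases = CONTEXTS[context]
    best = difflib.get_close_matches(utterance.lower(), phrases,
                                     n=1, cutoff=0.6)
    return best[0] if best else None

# A fragment of the dialogue of Figure 6:
print(recognize("Hello!", "idle"))                   # 'hello' -> 'service'
print(recognize("No!", "confirmation"))              # 'no' -> use other hand
print(recognize("The small one!", "object_attrib"))  # 'the small one'
print(recognize("take over glass", "confirmation"))  # None (out of context)
```

Shrinking the active vocabulary in this way reduces the recognizer's search space, which is what kept recognition usable even amid the noise of a trade-fair hall.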

Autonomously or through dialogues with people, the robot is able to build an attributed topological map of its environment (Figure 7). Since HERMES uses only vision and odometry for navigation, it is limited by its relatively poor perception (compared to humans). Nevertheless, the situation-oriented and skill-based system architecture, together with the cameras' active exposure time control, enables a navigation performance that is more than adequate for our office building environment. Combined visual and tactile sensing is only in its early stages; we expect the robot to perform even more dependably when these senses are fully integrated and combined.
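The active exposure-time control mentioned above can be illustrated by a simple brightness-servo loop: adjust the exposure multiplicatively so that the mean grey value of the monochrome image stays near a set point. The set point, gain and exposure limits below are assumed example values, not the actual camera parameters.

```python
import numpy as np

def adapt_exposure(exposure_ms, frame, target=110.0, gain=0.7,
                   lo=0.05, hi=40.0):
    """Return an updated exposure time from the current frame's brightness.

    frame is an 8-bit monochrome image; target, gain and the exposure
    limits are assumed example values."""
    mean = float(np.mean(frame))
    if mean < 1.0:                        # nearly black: open up quickly
        return min(exposure_ms * 2.0, hi)
    # Brightness scales roughly with exposure time, so correct it
    # multiplicatively; gain < 1 damps oscillations between frames.
    factor = (target / mean) ** gain
    return float(np.clip(exposure_ms * factor, lo, hi))

# Example: a dark exhibition-hall frame drives the exposure time up.
dark = np.full((480, 640), 40, dtype=np.uint8)
print(adapt_exposure(4.0, dark))          # ~8.1 ms
```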


Figure 7: Attributed topological map built by the robot with the help of human teachers through dialogues (e.g., the dialogue depicted in Figure 6). The robot learns what persons call specific places and how the places are connected via passageways. Multiple names are allowed for individual locations, based on users' preferences. Geometric information does not have to be accurate as long as the topological structure of the network of passageways is preserved. (The map has been simplified for demonstration purposes. It deviates significantly in complexity, but not in general structure, from the actual map used for navigation at the institute.)
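A minimal data-structure sketch of such an attributed topological map, including per-name aliases and route finding over the passageway network, might look as follows. The room numbers match the examples given in the text; the API itself is an illustrative assumption, not the robot's actual databases.

```python
from collections import deque

class TopoMap:
    def __init__(self):
        self.nodes = {}    # node id -> attributes (exhibit info, owner, ...)
        self.edges = {}    # node id -> set of neighbouring node ids
        self.aliases = {}  # spoken name -> node id

    def add_place(self, node, **attributes):
        self.nodes[node] = attributes
        self.edges.setdefault(node, set())

    def connect(self, a, b):
        # Passageway; geometry may be inaccurate, only topology must hold.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def name(self, spoken_name, node):
        # Several names may denote one place ("meeting room"/"conference room").
        self.aliases[spoken_name.lower()] = node

    def route(self, start, spoken_goal):
        goal = self.aliases[spoken_goal.lower()]
        parent, frontier = {start: None}, deque([start])
        while frontier:                  # breadth-first search over corridors
            n = frontier.popleft()
            if n == goal:
                path = []
                while n is not None:
                    path.append(n)
                    n = parent[n]
                return path[::-1]
            for m in self.edges[n]:
                if m not in parent:
                    parent[m] = n
                    frontier.append(m)
        return None

m = TopoMap()
for node in ("2401", "2455", "corridor"):
    m.add_place(node)
m.connect("2401", "corridor")
m.connect("2455", "corridor")
m.name("meeting room", "2401")
m.name("conference room", "2401")
print(m.route("2455", "conference room"))   # ['2455', 'corridor', '2401']
```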

In the following we concentrate on demonstrations performed outside the familiar laboratory environment, namely in television studios, at trade fairs and in a museum where HERMES was operated by non-experts for an extended period of time. Such demonstrations, e.g., in television studios, subject the robot to various kinds of stress. First, it may be exposed to rough handling during transportation, but it must still function on the set. Second, the pressure of time during recording in a TV studio requires the robot to be dependable; program adaptation or bug-fixing on location is not possible. HERMES has performed in TV studios a number of times, and we have learned much through these events. We found, for instance, that the humanoid shape and behavior of the robot raise expectations that go beyond its actual capabilities; e.g., the robot is not yet able to act upon a director's command like a real actor (although this was sometimes expected!). It is through such experiences that scientists become aware of what "ordinary" people expect from robots and how far short of these expectations robots sometimes fall.

Trade fairs, such as the Hannover Fair, the world's largest industrial fair, pose their own challenges: hundreds of moving machines and thousands of people in the same hall make an incredible noise, which made it an excellent environment for testing the robustness of HERMES' speech recognition system.

Last but not least, HERMES was field-tested for more than six months (October 2001 to April 2002) in the Heinz Nixdorf MuseumsForum (HNF) in Paderborn, Germany, the world's largest computer museum. In the special exhibition "Computer.Brain" the HNF presented the current state of robotics and artificial intelligence and displayed some of the most interesting robots from international laboratories, including HERMES. We used the opportunity of having HERMES in a different environment to carry out experiments involving all of its skills, such as vision-guided navigation and map building in a network of corridors; driving to objects and locations of interest; manipulating objects, exchanging them with humans or placing them on tables; kinesthetic and tactile sensing; and detecting, recognizing, tracking and fixating objects while actively controlling the sensitivities of the cameras according to the ever-changing lighting conditions. HERMES chatted with employees and international visitors in three languages (English, French and German). Topics covered in the conversations were the various characteristics of the robot (name, height,


Figure 8: HERMES performing at the special exhibition “Computer.Brain”, instructed by natural language commands: taking over a bottle and a glass from a person (not shown), (a) filling the glass with water from the bottle; (b) driving to and placing the filled glass onto a table; (c) interacting with the visitors (here: waving with both arms, visitors wave back!)

weight, age, ...), exhibits of the museum, and current information retrieved from the World Wide Web, such as the weather report for a requested city or current stock values and major national indices. HERMES even entertained people by waving a flag that had been handed over by a visitor; by filling a glass with water from a bottle, driving to a table and placing the glass onto it; and by playing the visitors' favorite songs and telling jokes that were likewise retrieved from the Web (Figure 8). HERMES was able to chart the office area of the museum from scratch upon request and delivered services to a priori unknown persons (Figure 9). In a guided tour through the exhibition HERMES was taught the locations and names of certain exhibits, together with some explanations relating to them. Subsequently, HERMES was able to give tours and explain the exhibits to visitors (Figure 10).

6. Lessons Learned

We found it interesting to observe how HERMES, actually just a laboratory prototype despite its designed-in dependability, survived the daily hard work far away from its "fathers", where no easy access to repair and maintenance was available, and how it got along with strangers and even with presenters who did not know much about robot technology. In fact, we were surprised ourselves that it performed so well. During six months of operation (lasting up to 18 hours a day during video recordings for documentation purposes) only one motor controller, one drive motor and one audio amplifier ceased to function, all of them commercially available and easily replaceable. According to the museum staff, HERMES was one of the few robots at the show that could regularly be demonstrated in action, and among them it was considered the most intelligent and most dependable one. This statement is supported by the

Figure 9: HERMES executing service tasks in the office environment of the Heinz Nixdorf MuseumsForum: (a) dialogue with an a priori unknown person with HERMES accepting the command to get a glass of water and to carry it to the person’s office; (b) asking a person in the kitchen to hand over a glass of water; (c) taking the water to the person’s office and handing it over; (d) showing someone the way to a person’s office by combining speech with gestures (head and arm) generated automatically.


Figure 10: HERMES giving information about its "closest relatives" C3PO and R2D2 from the "Star Wars" movies in the special exhibition "Computer.Brain" at the Heinz Nixdorf MuseumsForum, Paderborn, Germany. While acting as a German- and English-speaking tour guide, HERMES used arm, head, eye and body gestures to indicate to which exhibit its current utterances related. The information about the exhibits had been entered beforehand as attributes into a topological map (cf. Figure 7) which had been acquired by HERMES through direct interaction with the museum staff during a supervised learning phase. Besides exhibit-related information, HERMES could also give more general information about the museum and guide and direct visitors to other locations of interest (e.g., the closest way to the restrooms and exits).

fact that the museum staff never called for advice once the initial setup was done. We had expected to give much more support and had wondered how often we would have to travel from Munich to Paderborn (a six-hour drive, one way) to help. Actually, we were in Paderborn only to set up the robot for the exhibition, to present and document our research work during the first two weeks after the exhibition's opening, and for four days of documentation work after the first two months of the exhibition.

Preparing the robot for the exhibition was indeed fun, but also a lot of work: it made us realize that many operational details had never been documented before, such as powering the robot on and off, charging the batteries, starting the main program and testing functionality. Now they had to be written down in a manual for non-experts, i.e., people with little engineering background. The museum staff had insisted on having such a reference guide, but it shared the fate of most reference manuals in the world: it was almost never looked at, because people prefer to try out how things work instead of studying manuals, which makes the need for inherently safe behavior even more evident.

Being afraid that the robot might come back to our university in pieces, we had made an effort to finish many of the laboratory's research projects before sending HERMES to the museum. Such time pressure actually helped to speed up work on algorithms and implementation details. Although we knew that thorough testing is only possible in different environments with numerous different people interacting with the robot, we had never before been able to do so over an extended period of time. This exhibition gave us the opportunity, and eventually it proved that our concepts and approaches (as presented in chapters 2 and 3) were correct. Consequently, really seeing the robot work in a completely different environment while being operated by non-experts for over six months was certainly the most valuable experience of this long-term experiment.

Some behaviors worked much better in the new environment than in our institute, others worse. For example, navigation worked much better on the one hand because the floor was not as reflective as our institute's floor. On the other hand, the overall lighting conditions were rather poor, and in the actual exhibition area it was almost too dark to navigate by means of vision. Although a large part of the exhibition featured red and yellow walls and a grey floor, it was very difficult for our monochrome


vision system to distinguish between walls and floors. A color vision system and an even higher dynamic range of the cameras would certainly be desirable for our robot.

Children especially liked interacting with the robot. Surprisingly, the robot could understand the children's high voices and their sometimes not fluently spoken phrases. They even hugged the robot, albeit under close supervision of the staff, without being afraid of breaking something or, much more important, of being hurt by such a massive chunk of moving metal. Adults, on the other hand, faced the robot with all due respect.

Some people pushed the robot's emergency button, clearly visible on the back of the robot, and expected something to happen. Since the emergency button only disconnects the motors from the power, but not the computers, a lengthy reboot procedure was not required; the staff just had to pull up the emergency button again to restart the robot. We know now that the state of the emergency button should be monitored by the robot in order to react adequately to such a situation.

The funniest interaction for most of the visitors and the staff alike resulted from touching the tactile bumpers placed around the robot's undercarriage. The robot was programmed to stop moving and to say "Ouch". This simple "emotion" made most people smile and kept them touching the bumpers more than once. On the other hand, behaviors that the developers considered more impressive, such as navigation and manipulation, were taken for granted. Interaction capabilities on top of assumed (normal) behavior are what most people are interested in. Certainly, this does not simplify the robot scientist's work, since his robots obviously have to "compete" with the well-known robots from science fiction movies.

According to a museum press release, more than 80,000 visitors were attracted by the special exhibition "Computer.Brain", 30,000 more than had been hoped for. The maximum capacity of the museum was reached on several days, leading to long waiting lines. This tremendous success is certainly due to the highly interactive character of the exhibition: of the 330 exhibits, 52 were interactive, the most spectacular ones being robots. The exhibition's media presence was remarkable, with 18 independent television broadcasts (not counting reruns) and 11 radio broadcasts, in addition to an uncountable number of newspaper articles. Taking media presence as an important indicator of successful and well-recognized work, our project was indeed quite successful: to our knowledge, HERMES was featured at least six times on TV, twice on radio and 18 times in newspaper articles (most of them during the two weeks after the exhibition's opening).

Lacking links to interested social science researchers, we (as engineers) felt unable to prepare meaningful questionnaires and to systematically observe human-robot interaction. Since generally accepted benchmarks for evaluating robot appearance and performance do not exist, this would clearly have been a subject for independent research. Of course, we would have liked to obtain such results, to also quantitatively prove the success of the robot's deployment, but the qualitative results we obtained are more than encouraging to continue in the proposed directions and to apply for funds for studies with a social science background.
It would probably have been most interesting to see what effects HERMES would have had on people who interacted with it for a few minutes only or were just bystanders, compared with those who were given the chance to interact more seriously for a couple of hours, perhaps after a short introduction to its capabilities. Furthermore, long-term effects on the museum staff could have been evaluated: would they really have been able to establish a long-term relationship with, and show affection to, a rather mechanical-looking robot? The research opportunities offered by a dependable service robot of humanoid shape and characteristics are certainly tremendous, but they could not be the focus of our research.


7. Summary and Conclusions

HERMES, an experimental robot of anthropomorphic size and shape, interacts dependably with people and their common living environment. It has shown robust and safe behavior with novice users, e.g., at trade fairs, in television studios, at various demonstrations in our institute environment, and in a long-term experiment carried out at an exhibition and in a museum's office area.

The robot was largely constructed from readily available modules with standardized and viable mechanical and electrical interfaces (drive modules, computers etc.). Due to its modular structure the robot is easy to maintain, which is essential for system dependability. A simple but powerful skill-based system architecture is the basis for software dependability. It integrates visual, tactile and auditory sensing and various motor skills without relying on quantitatively exact models or accurate calibration. Actively controlling the sensitivities of the CCD cameras makes the robot's vision system robust with respect to varying lighting conditions (albeit not as robust as the human vision system). Consequently, safe navigation and manipulation were realized, even under uncontrolled and sometimes difficult lighting conditions. A touch-sensitive skin currently covers only the undercarriage, but it is in principle applicable to most parts of the robot's surface. HERMES understands spoken natural language speaker-independently and can, therefore, be commanded by untrained humans.

In summary, HERMES can see, hear, speak and feel, as well as move about, localize itself, build maps and manipulate various objects. In its dialogues and other interactions with humans it appears intelligent, cooperative and friendly. In a long-term test (6 months) at a museum it chatted with visitors in natural language in German, English and French, answered questions and performed services as requested by them. Although HERMES is not as competent as the robots we know from science fiction movies, the combination of all aforementioned characteristics makes it rather unique among today's real robots. As noted in the introduction, today's robots are mostly strong with respect to a single functionality, e.g., navigation or manipulation. Our results illustrate that many functionalities can be integrated within one single robot through a unifying situation-oriented behavior-based system architecture. We also believe that our simple design strategies, such as modularity, calibration-free control and truly human-like interaction, would enable other researchers, too, to build similarly dependable robots. Our results suggest that testing a robot in various environmental settings, both short- and long-term, with non-experts having different needs and different intellectual, cultural and social backgrounds, is enormously beneficial for learning the lessons that will eventually enable us to build dependable personal robots.

8. References

Arkin, R. C. (1998): Behavior-Based Robotics. MIT Press, Cambridge, MA.
Arras, K. O.; Philippsen, R.; de Battista, M.; Schilt, M.; Siegwart, R. (2002): A Navigation Framework for Multiple Mobile Robots and its Application at the Expo.02 Exhibition. Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2002), Workshop on Robots in Exhibitions. Lausanne, Switzerland, October 2002.
BBM Expo (2000): Future Intelligence – Frequently Asked Questions. Available at http://www.bbmww.de/expo2/english.zip (last accessed July 21, 2002).
Bischoff, R. (1997): HERMES – A Humanoid Mobile Manipulator for Service Tasks. Proc. of the Intern. Conf. on Field and Service Robotics. Canberra, Australia, Dec. 1997, pp. 508-515.
Bischoff, R.; Graefe, V. (1999): Integrating Vision, Touch and Natural Language in the Control of a Situation-Oriented Behavior-Based Humanoid Robot. Proceedings IEEE Conference on Systems, Man, and Cybernetics (SMC 1999), October 1999, pp. II-999 - II-1004.


Bischoff, R.; Graefe, V. (2002): Dependable Multimodal Communication and Interaction with Robotic Assistants. Proceedings 11th IEEE International Workshop on Robot and Human Interactive Communication (ROMAN 2002). Berlin, Sept. 2002, pp. 300-305.
Burgard, W.; Cremers, A. B.; Fox, D.; Hähnel, D.; Lakemeyer, G.; Schulz, D.; Steiner, W.; Thrun, S. (1999): Experiences with an interactive museum tour guide robot. Artificial Intelligence, Vol. 114, No. 1-2, pp. 3-55.
DOE (2001): Human Factors/Ergonomics Handbook for the Design for Ease of Maintenance. Department of Energy Handbook DOE-HDBK-1140-2001, Feb. 2001. Available at: http://tis.eh.doe.gov/techstds/standard/hdbk1140/hdbk1140.html.
Endres, H.; Feiten, W.; Lawitzky, G. (1998): Field test of a navigation system: Autonomous cleaning in supermarkets. Proc. IEEE Int. Conf. on Robotics and Automation, pp. 1779-1784.
Fujita, M.; Kuroki, Y.; Ishida, T.; Doi, T. T. (2003): Autonomous Behavior Control Architecture of Entertainment Humanoid Robot SDR-4X. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), Las Vegas, Nevada, October 2003, pp. 960-967.
Graefe, V. (1989): Dynamic Vision Systems for Autonomous Mobile Robots. Proceedings of the IEEE/RSJ Intern. Workshop on Intelligent Robots and Systems (IROS 1989). Tsukuba, pp. 12-23.
Graefe, V. (1995): Object- and Behavior-oriented Stereo Vision for Robust and Adaptive Robot Control. Intern. Symp. on Microsystems, Intelligent Materials, and Robots, Sendai, pp. 560-563.
Graefe, V. (1999): Calibration-Free Robots. Proceedings of the 9th Intelligent System Symposium. Japan Society of Mechanical Engineers. Fukui, pp. 27-35.
Graf, B.; Schraft, R. D.; Neugebauer, J. (2000): A Mobile Robot Platform for Assistance and Entertainment. Proceedings of the 31st Intern. Symposium on Robotics (ISR 2000), Montreal.
Hirzinger, G.; Albu-Schaffer, A.; Hahnle, M.; Schaefer, I.; Sporer, N. (2001): On a New Generation of Torque Controlled Light-Weight Robots. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2001), Vol. 4, pp. 3356-3363.
Inoue, H.; Tachi, S.; Nakamura, Y.; Hirai, K.; Ohyu, N.; Hirai, S.; Tanie, K.; Yokoi, K.; Hirukawa, H. (2001): Overview of Humanoid Robotics Project of METI. Proceedings of the 32nd International Symposium on Robotics (ISR 2001), Seoul, Korea, April 2001, pp. 1478-1482.
Ishiguro, H.; Ono, T.; Imai, M.; Maeda, T.; Kanda, T.; Nakatsu, R. (2001): Robovie: A robot generates episode chains in our daily life. Proceedings of the 32nd International Symposium on Robotics (ISR 2001), Seoul, Korea, pp. 1365-1361.
Kagami, S.; Nishiwaki, K.; Kuffner Jr., J. J.; Kuniyoshi, Y.; Inaba, M.; Inoue, H. (2002): Online 3D Vision, Motion Planning and Bipedal Locomotion Control Coupling System of Humanoid Robot: H7. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2002), Lausanne, Switzerland, pp. 2557-2562.
Kanda, T.; Hirano, T.; Eaton, D.; Ishiguro, H. (2003): A practical experiment with interactive humanoid robots in a human society. Proceedings of the 3rd IEEE International Conference on Humanoid Robots, Karlsruhe/Munich, October 1-3, 2003, CD-ROM, Session 4b #3.
Kanehiro, F.; Mizuuchi, I.; Koyasako, K.; Kakiuchi, Y.; Inaba, M.; Inoue, H. (1998): Development of a Remote-Brained Humanoid for Research in Whole Body Action. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 1998), Leuven, Belgium, Vol. 2, pp. 1302-1307.
King, S.; Weiman, C. (1990): Helpmate autonomous mobile navigation system. Proceedings of SPIE Conference on Mobile Robots, Vol. 2352, Boston, pp. 190-198.
Laprie, J. C. (1992): Dependability: Basic Concepts and Terminology. Springer-Verlag.
Nishiwaki, K.; Sugihara, T.; Kagami, S.; Kanehiro, F.; Inaba, M.; Inoue, H. (2000): Design and development of research platform for perception-action integration in humanoid robot: H6. Proceedings of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS 2000), Vol. 3, pp. 1559-1564.
Nourbakhsh, I. R. (2002): The Mobot Museum Robot Installations: A Five Year Experiment. Proceedings IEEE/RSJ IROS 2002, Workshop on Robots at Exhibitions. Lausanne, Switzerland, October 2002.
Physics FAQ (1996): Online Version of Physics Frequently Asked Questions, Version: November 12, 1996. Available at: http://physics.hallym.ac.kr/education/faq/faq.html (last accessed: Feb. 17, 2003).


Sakagami, Y.; Watanabe, R.; Aoyama, C.; Matsunaga, S.; Higaki, N.; Fujimura, K. (2002): The Intelligent ASIMO: System Overview and Integration. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2002), EPFL, Lausanne, Switzerland, October 2002, pp. 2478-2483.
Seara, J. F.; Strobl, K. H.; Martín, E.; Schmidt, G. (2003): Task-Oriented and Situation-Dependent Gaze Control for Vision Guided Humanoid Walking. Proceedings of the 3rd IEEE International Conference on Humanoid Robots, Karlsruhe/Munich, October 1-3, 2003, CD-ROM, Session 3b #6.
Simmons, R.; Fernandez, J.; Goodwin, R.; Koenig, S.; O'Sullivan, J. (1999): Xavier: An autonomous mobile robot on the web. Robotics and Automation Magazine, 1999.
Thrun, S.; Beetz, M.; Bennewitz, M.; Burgard, W.; Cremers, A. B.; Dellaert, F.; Fox, D.; Hähnel, D.; Rosenberg, C.; Roy, N.; Schulte, J.; Schulz, D. (2000): Probabilistic algorithms and the interactive museum tour guide robot Minerva. Int. Journal of Robotics Research, Vol. 19, No. 11, pp. 972-999.
Tschichold, N.; Vestli, S.; Schweitzer, G. (2001): The Service Robot MOPS: First Operating Experiences. Robotics and Autonomous Systems, Vol. 34, pp. 165-173.
Yokoi, K.; Nakashima, K.; Kobayashi, M.; Mihune, H.; Hasunuma, H.; Yanagihara, Y.; Ueno, T.; Gokyuu, T.; Endou, K. (2003): A Tele-operated Humanoid Robot Drives a Backhoe in the Open Air. Proceedings IEEE/RSJ Intl. Conference on Intelligent Robots and Systems (IROS 2003), Las Vegas, Nevada, October 2003, pp. 1117-1122.

Rainer Bischoff received his Dipl.-Ing. degree in electrical engineering from Hannover University, Germany, in 1995. He has been active in measurement science, control engineering and robotics research since 1993, including several long-term research stays at the LEEI in Toulouse, France, at EMCO corporation, Austin, Texas, and at the City University of Hong Kong. He is currently pursuing his Ph.D. degree in robotics at the Bundeswehr University Munich, Germany, and holds a position as Research Advisor at the Intelligent Robots Laboratory of the same university. His research focuses on humanoid, personal and service robotics, in particular on system architecture and integration, sensor-based robot control, and multi-modal human-robot interaction. Since 2002 Mr. Bischoff has been working for the KUKA Roboter corporation, Europe's largest robot manufacturer, coordinating and managing the company's cooperative research projects. Mr. Bischoff is a member of the IEEE and has authored over 30 papers, receiving two best paper awards.

Volker Graefe received his doctor of sciences degree in 1964 in physics, mathematics and oceanography from the University of Kiel, Germany. From 1961 to 1965 he did research in applied physics in Kiel, and from 1965 to 1969 in physical oceanography at the University of Hawaii, Hawaii Institute of Geophysics. From 1969 to 1975 he was a project manager with Krupp, a major German industrial corporation, managing research and development projects in ocean engineering and computer design. Since 1975 he has been a professor at the Bundeswehr University München. From 1975 to 2003 he was the head of the Institute of Measurement Science, and since 2003 he has headed the Intelligent Robots Lab of the same university. His main research interests are robot intelligence, robot vision, robot navigation, robot architecture and human-robot communication; learning robots; calibration-free robots; and humanoid servant robots. He has (co-)authored more than 100 publications in these fields. Since 1993 he has been an Honorary Professor of the Changsha Institute of Technology (China). He received the Nakamura Prize for contributions to the Advancement of the Technology of Intelligent Robots and Systems over a decade in 1997 and the IAPR/MVA prize for the most influential paper of the past decade in 1998. Professor Graefe served as a member of the IEEE IES Administrative Committee from 1993 to 1998 and has served on the IROS Advisory Committee since 1990. He was the General Chairman of the IEEE/RSJ/GI International Conference on Intelligent Robots and Systems (IROS '94) in Munich, Germany.
