Towards a Unified Framework for Human-Humanoid Interaction

K. Kawamura, A. Alford, K. Hambuchen, and M. Wilkes

Center for Intelligent Systems, Vanderbilt University, Nashville TN 37235, USA
Abstract. In order for a humanoid robot to be accepted in society and perform as an intelligent human assistant or companion, it must be equipped with robust human-humanoid interaction (HHI) and task learning mechanisms. This paper describes our effort to achieve a robust HHI mechanism based on a multi-agent architecture called the Intelligent Machine Architecture (IMA), two high-level agents called the Self Agent and the Human Agent, and short- and long-term memories called the Sensory EgoSphere (SES) and the DataBase Associative Memory (DBAM). A case study of handshaking between a human and our humanoid system, ISAC, is given to illustrate key concepts.
1 Introduction

At the Intelligent Robotics Laboratory of the Center for Intelligent Systems at Vanderbilt University we have been developing a humanoid system called ISAC (Intelligent Soft-Arm Control) over the past several years. ISAC was originally developed for a robotic aid system for the physically disabled [8] (Figure 1). Since then it has evolved into a testbed for research in human-humanoid interaction [9, 14]. This paper describes the physical aspects of our humanoid system, the foundations for a parallel, distributed robot control architecture called the Intelligent Machine Architecture (IMA), and a framework for human-humanoid interaction (HHI). Two high-level IMA agents called the Self Agent and the Human Agent are responsible for HHI. Additionally, we are developing two other software modules, the DataBase Associative Memory (DBAM) and the Sensory EgoSphere (SES); these modules will provide the humanoid with a long-term and short-term memory, respectively. The long-term goal of our research is to develop a set of IMA-based tools that allow humans to interface with humanoids as members of a team.
1.1 Current ISAC Humanoid System

ISAC has two 6-degree-of-freedom arms actuated by McKibben artificial muscles. These muscles are pneumatic actuators whose lengths shorten as their internal air pressure increases [8]. They are attached to the arm in antagonistic pairs, approximating the action of human muscles. They have a significantly larger strength-to-weight ratio than electro-mechanical actuators; moreover, they are naturally compliant and are safe for use in close contact with people.
Fig. 1. Original ISAC Feeding the Physically Challenged

ISAC has cost-effective anthropomorphic end effectors, built in-house, that we call the PneuHand I and PneuHand II. Small pistons pneumatically actuate the former, while the latter employs hybrid electric/pneumatic actuation in which the forefinger and thumb have a motor and a piston in series. The motors enable fine control in grasping and the piston provides strength in the grasp. The arm-hand systems have 6-axis force-torque sensors at each wrist, proximity sensors on the palms, and rudimentary touch sensors on each finger. ISAC also employs color, stereo, active vision with pan, tilt, and verge, sonic localization, and speech I/O (Figure 2).
2 The Intelligent Machine Architecture (IMA)

The Intelligent Machine Architecture (IMA) is a software architecture for designing robot systems. Our design process with IMA is to decompose the system into a set of atomic agents. We use the term atomic to mean a fundamental building block and to distinguish IMA atomic agents from the more common use of agent as an autonomous, intelligent entity, whether machine, software, or biological. Although intelligent behavior emerges from the interaction of atomic agents in the IMA system, atomic agents are not, in general, intelligent. They are similar to the agents described by Minsky [11] or to the simple agents that some authors call actors [13, 7]. Within the context of IMA, an atomic agent is one element of a domain-level system description that tightly encapsulates all aspects of that element, much as in object-oriented systems. IMA permits the concurrent execution of agents on separate machines while facilitating extensive inter-agent communication. The loosely coupled, asynchronous operation of decision-making agents simplifies the system model at a high level.
Fig. 2. ISAC Humanoid System
2.1 Taxonomy of IMA Agents

A brief description of the various building blocks of IMA is given below.

Components: These are DCOM objects from which atomic agents are built. DCOM is the Distributed Component Object Model; it is a binary standard for software objects which supports operation across a network.

Atomic Agents: These are composed of components. They have one or more threads of execution and are independent, autonomous entities. Usually, an atomic agent is not able to perform useful activity by itself; thus, collections of agents that communicate and interact are used to achieve useful results. There are four basic types of atomic agents, plus a fifth type that exists as a concession to realistic implementation considerations.

1. Hardware/Resource Agent: Interface to sensor or actuator hardware. Those interfacing to sensors provide other atomic agents with regular updates of sensor data. Those interfacing to actuators accept commands from and provide updates of current state to other atomic agents.
2. Behavior/Skill Agent: Encapsulate basic behaviors or skills, e.g., collision avoidance or visual tracking.
3. Environment Agent: Provide an abstraction for dealing with objects in the robot's environment. This abstraction includes operations that the robot can perform on the object, e.g., "look at."
4. Sequencer Agent: Perform a sequence of operations, often interacting with one or more environment atomic agents or activating and deactivating one or more atomic agents. Sequencer agents may call other sequencer agents, but there should be an identifiable highest-level sequencer agent.
5. Multi-Type Agent: Combine the functionality of at least two of the first four agent types. For example, in the interests of efficiency of implementation it may be advantageous to combine the hardware and behavior types into a single multi-type agent.

Compound Agents: An IMA Compound Agent is an interacting group of atomic agents that are coordinated or sequenced by one or more sequencer agents. The highest-level sequencer agent can be envisioned as the root node of a tree with connections and dependencies on other agents as branches.
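The taxonomy above maps naturally onto a small class hierarchy. The following Python sketch is purely illustrative: the class and method names are our own assumptions, not the actual DCOM-based IMA interfaces, and only the activation bookkeeping that later sections rely on is modeled.

# Minimal sketch of the IMA atomic-agent taxonomy (illustrative names only).
from abc import ABC, abstractmethod


class AtomicAgent(ABC):
    """Independent entity with its own activation level and update step."""

    def __init__(self, name: str):
        self.name = name
        self.activation = 0.0   # raised or lowered by other agents, e.g. the Self Agent
        self.active = False

    @abstractmethod
    def step(self) -> None:
        """One pass of the agent's own thread of execution."""


class HardwareAgent(AtomicAgent):
    """Interfaces to a sensor or actuator and publishes state updates."""

    def step(self) -> None:
        pass  # read the sensor / push the actuator command, then notify observers


class BehaviorAgent(AtomicAgent):
    """Encapsulates a basic behavior or skill such as visual tracking."""

    def step(self) -> None:
        pass  # run one control-loop iteration of the skill


class SequencerAgent(AtomicAgent):
    """Coordinates other agents by activating and deactivating them."""

    def __init__(self, name: str, subordinates: list):
        super().__init__(name)
        self.subordinates = subordinates  # a compound agent is this sequencer plus these agents

    def step(self) -> None:
        for agent in self.subordinates:
            agent.active = agent.activation > 0.5  # simple activation threshold (assumed)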
3 Framework for Human-Humanoid Interaction

For a humanoid robot to become a useful assistant to people, robust human-humanoid interaction (HHI) must be developed. HHI differs from traditional human-robot interaction (HRI) in the sense that a humanoid should recognize not only the physical aspects of humans but also psychological aspects such as frustration, confusion, and joy [12].
Fig. 3. Aspects of a Human-Humanoid Interface

Figure 3 shows aspects of humans and humanoids that we are considering for the design of human-humanoid interaction (HHI). For an HHI interface to be maximally effective, the humanoid should be able to perceive the physical and sensory aspects of a person and to communicate with him or her. Outputs that humans express include vocal utterances, facial expressions, gazes, and gestures. The designer should consider the nature of these modes to enable the system to respond to them. To respond to humans, humanoids have been given the abilities to speak using text-to-speech technology, gesture using their manipulators, and make facial expressions, if so equipped. Cognitive aspects are the most difficult to implement in HHI. How to model human emotion and humanoid affect remains an open research question. Figure 4 shows our implementation of Human-Humanoid Interaction in IMA based on the Human Agent and the Self Agent. These agents have been designed to address certain aspects shown in Figure 3. We will also discuss the Sensory EgoSphere (SES) and the DataBase Associative Memory (DBAM), which are designed to provide the humanoid with short-term and long-term memory.
Fig. 4. Human-Humanoid Interaction in IMA
3.1 The Human Agent
The goal of the Human Agent is to represent the human in HHI. The Human Agent is a compound agent that comprises the atomic agents that are tuned to the features of humans. The role of the Human Agent is to encapsulate the information that the humanoid has determined about the human. This feature allows the system to interact intelligently with the human by sensing these features dynamically. The Human Agent performs the functions of detection, monitoring
and identification of humans throughout its interaction. The current implementation of the Human Agent, shown in Figure 5, includes four modules logically grouped by function for detection, monitoring, identification, and interaction.
Fig. 5. Human Agent and its constituent modules in relation to other agents

The Detection Module covers some of the Sensory aspects shown in the Human column of Figure 3. Infrared motion and sound source localization are used to turn the system's attention toward new humans in its environment. The Monitor Module works to locate and track features of humans (currently faces and hands), thus covering the Physical aspects. Other Sensory aspects are covered by the Interaction Module, which monitors speech and gesture input. Its role is to pass speech commands to the Self Agent and to interpret gesture in the context of speech. We are currently developing the ability to detect a human pointing with his or her finger. This agent allows the human to supplement speech by deictic gesturing. The humanoid system's ability to detect human emotion is currently under development. Our first attempt at developing an indicator of emotional state will be based on speech information, using a measure of the positive and negative word choice and feedback from the human. This technique will rely upon correct identification of the human. The Identification Module is used to identify individuals in the system's environment. This module monitors features, such as a person's height and clothing color information, to identify and detect changes among the people in its environment.
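As a rough illustration of how these four modules might divide the work, the sketch below routes hypothetical perceptual events into a simple model of the human. Every name, event, and signature here is an assumption made for illustration, not the Human Agent's actual interface.

# Illustrative sketch of the Human Agent's module grouping (assumed names).
from dataclasses import dataclass


@dataclass
class HumanModel:
    """What the humanoid currently believes about the person."""
    face_position: tuple = None
    hand_position: tuple = None
    identity: str = None
    last_utterance: str = None


class HumanAgent:
    def __init__(self, self_agent):
        self.self_agent = self_agent
        self.model = HumanModel()

    # Detection Module: infrared motion or sound localization draws attention to a new person.
    def on_motion_or_sound(self, direction):
        self.self_agent.request_look_at(direction)

    # Monitor Module: track physical features, currently faces and hands.
    def on_face(self, position):
        self.model.face_position = position

    def on_hand(self, position):
        self.model.hand_position = position

    # Identification Module: height, clothing color, and similar cues.
    def on_identity(self, person_id):
        self.model.identity = person_id

    # Interaction Module: pass speech to the Self Agent and interpret gesture in context.
    def on_speech(self, text):
        self.model.last_utterance = text
        self.self_agent.handle_command(text, speaker=self.model.identity)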
3.2 The Self Agent

The Self Agent is a compound agent that we developed to address the cognitive aspects for the humanoid in HHI. The Physical and Sensory aspects are mapped to various atomic agents. Part of the Cognitive aspect covered by the Self Agent is to determine the human's goals from the information the Self Agent receives from the Human Agent, and to select actions for the humanoid to achieve these goals. Actions are selected by activating sequencer and environment atomic agents. As another part of the cognitive aspect, the Self Agent integrates failure information from atomic agents and maintains information about the overall state of the humanoid. The failure information is collected from the various atomic agents that evaluate the system status at lower levels. This technique, which we call System Status Evaluation, is discussed in more detail in Section 4. The Self Agent also generates part of the humanoid's communication to the human. The Self Agent does this in response to the human's input, acknowledging receipt of commands. It also takes initiative in generating output when the status information maintained by the Self Agent indicates a problem with the functioning of the humanoid. The current implementation of the Self Agent is shown in Figure 6.
Fig. 6. Self Agent and its constituent modules in relation to the other agents.

The Interaction Module interprets the human's input and generates responses to the human. The interpretation software is based upon the Maas-Neotek robots developed by Mauldin [10, 4]. Textual input is processed, generating a text response or increasing or decreasing the activation level of an atomic agent. The Activator Module is the means by which the Self Agent modifies the activation state of atomic agents. The Activator Module maintains a list of atomic agents, with an associated numerical value that is the Self Agent's contribution to the atomic agent's activation level. This allows the various Self Agent modules to add to or subtract from an atomic agent's activation level. The Interaction Module and Activator Module work together to cause the humanoid to perform an action in response to human input. For example, if the human said, "Please give me a spoon," the Interaction Module would determine which atomic agents should be activated, and increase the activation level of those atomic agents through the Activator Module, causing them to perform their task. The Affect Module is an artificial emotional model used to describe the current state of the humanoid and provide internal feedback to the humanoid's software agents, similar to that described in [3]. Artificial emotions are modeled by numerical variables; a fuzzy evaluator fuzzifies each emotional variable to produce a textual description of the humanoid's emotional state for feedback to the human. Currently, two artificial emotions are used. The first, happiness, reflects the operating status of the humanoid. This variable is affected by the System Status Evaluation (SSE) processes in the atomic agents (see below). When SSE determines that an agent is not working properly, it drives the happiness variable downward. If the happiness variable drops below a threshold, the humanoid reports the problem to the human. Another Affect Module variable is confusion. This variable is affected by the Pronoun Module. The Pronoun Module acts as a conversational referent; that is, a pointer to "what we're talking about." It resolves human phrases such as "it," "this," and "the blue block" to environment atomic agents. The Pronoun Module then acts as a pointer or reference for use by sequencer atomic agents. Using this method, when the human says "Take the thing I am holding" or "Take the blue block," the same sequencer atomic agent is invoked. This task atomic agent's script then refers to the Pronoun Module, instead of a specific environment atomic agent. The Pronoun Module will point to either "the thing the human is holding" or "the blue block," respectively. This concept is similar to the pronome described in [11] and role-passing developed in [6]. If the Pronoun Module cannot resolve the correct binding (for example, there may be no environment atomic agent), then there has been a failure of goal identification, and the module will increase the confusion affect variable. The Pronoun Module also points to atomic agents for purposes of failure description. This is done in conjunction with the Interrogation Module. The Interrogation Module handles discussions about the humanoid's status involving questions from the human. For example, when the human says "What are you doing?" or "What is wrong?", the Pronoun Module should resolve to either the currently active atomic agent or a failed atomic agent, respectively. The Description Module lists these agents and their descriptions (see below). The Interrogation Module informs the human of the descriptions (prefaced by "I am..." or "I couldn't..." as appropriate). Further interrogation by the human, e.g., "Why couldn't you do that?", will cause the Pronoun Module to resolve to the specific atomic agents, from which the failure information can be accessed.
The Description Module provides descriptions of the atomic agents available in the system. This allows agents to supply text strings containing their names and a description of their purpose. The atomic agents also supply information about their status: whether they are active or inactive, and whether they are achieving their goals or not. The Description Module then has text strings containing the names and descriptions of all atomic agents, all active atomic agents, all successful atomic agents, and all failed atomic agents. These text strings can be used to provide feedback to the human, via the Interrogation Module.
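The sketch below illustrates, with assumed names and numerical values, how an Activator Module can combine per-module contributions into an atomic agent's activation level, and how an Affect Module can drive the happiness variable and the reporting behavior described above. It is a sketch of the idea, not the Self Agent's actual implementation.

# Illustrative Activator and Affect Modules (all names and values assumed).
class ActivatorModule:
    def __init__(self):
        # contribution[module][agent]: that module's share of the agent's activation level
        self.contribution = {}

    def adjust(self, module, agent, delta):
        self.contribution.setdefault(module, {}).setdefault(agent, 0.0)
        self.contribution[module][agent] += delta

    def activation(self, agent):
        # An agent's activation level is the sum of all module contributions.
        return sum(per_module.get(agent, 0.0) for per_module in self.contribution.values())


class AffectModule:
    def __init__(self, report_threshold=0.3):
        self.happiness = 1.0   # driven down by System Status Evaluation failures
        self.confusion = 0.0   # driven up when the Pronoun Module cannot resolve a referent
        self.report_threshold = report_threshold

    def on_agent_failure(self, amount=0.2):
        self.happiness = max(0.0, self.happiness - amount)
        if self.happiness < self.report_threshold:
            return "Something is wrong"   # prompts the humanoid to report the problem
        return None

    def describe(self):
        # Crude stand-in for the fuzzy evaluator that maps each variable to text.
        level = "happy" if self.happiness > 0.6 else "unhappy"
        return "I am " + level + (" and confused" if self.confusion > 0.5 else "")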
3.3 Sensory EgoSphere
The Sensory EgoSphere (SES) is a data structure that maintains a short-term memory of sensory data for the humanoid robot. The SES comprises a spatially-indexed database and a geodesic hemisphere interface. The SES interacts with the Human Agent and the Self Agent to register and retrieve sensory data. The SES can also interact with any other IMA agent providing or needing short-term sensory data. Data registration and retrieval are performed using the posting and finding functions. The posting function registers data onto the SES using the location of the humanoid's head. When data is posted, it is indexed in the database using the closest node on the geodesic hemisphere. The finding function retrieves data from the SES by data name, data modality or spatial location. Albus originally conceived the egosphere and defines it as "a spherical coordinate system with the self (ego) at the origin" [1]. Our implementation uses a geodesic hemisphere as a coordinate system. The hemisphere is centered at the humanoid's head and maintains data occurring in the frontal view of the robot. The geodesic hemisphere structure interfaces with the humanoid's environment for accurate spatial registration. The spatial structure of the hemisphere is, however, essentially a topological map. The spatial information is two-dimensional: azimuth and elevation on the hemisphere; no depth (or range) is registered. Data in the humanoid's environment is "projected" onto the SES. The projection location denotes the data's spatial index. Figure 7 shows this topological orientation using objects in the environment. The tessellation of the geodesic hemisphere creates neighborhoods of different depths. These neighborhoods construct the topology of the SES. When data is being retrieved from the SES, neighborhoods of nodes are searched. The nodes index the database structure to retrieve the data. Figure 8 shows an example of a neighborhood data search. In Figure 8, the nodal points represent data located in the database. The names are used to store the data on the SES. This specific example shows a search for Anthony's face. The complete arrows show the path taken. The dashed arrows show alternate paths to the desired data. The first neighborhood from the origin is searched. Only two nodes from this neighborhood can lead to Anthony's face. At each point in the depth 1 neighborhood, the depth 2 neighborhoods are searched. From the path taken, two nodes lead to the desired data. At each point in the depth 2 neighborhoods, the depth 3 neighborhoods are searched.
Fig. 7. Projection of Data onto the SES

From the chosen path, only one node contains the desired data. Therefore, the desired data was found at a depth of 3 from the origin.

Fig. 8. SES Figure
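To make the posting and finding functions concrete, the sketch below is a minimal SES stand-in. It uses a regular azimuth/elevation grid instead of a true geodesic tessellation, and its names and parameters are illustrative assumptions rather than the actual SES interface.

# Minimal SES sketch: post data at the node nearest the head direction; find by
# name, modality, or spatial neighborhood (azimuth/elevation only; no depth).
import math
from collections import defaultdict


class SensoryEgoSphere:
    def __init__(self, az_step=20, el_step=20):
        # Nodes at regular (azimuth, elevation) intervals over the frontal hemisphere.
        self.nodes = [(az, el)
                      for az in range(-90, 91, az_step)
                      for el in range(0, 91, el_step)]
        self.store = defaultdict(list)   # node -> list of (name, modality, data)

    def _closest_node(self, azimuth, elevation):
        return min(self.nodes,
                   key=lambda n: math.hypot(n[0] - azimuth, n[1] - elevation))

    def post(self, name, modality, data, azimuth, elevation):
        """Register data at the node closest to where the head is pointing."""
        self.store[self._closest_node(azimuth, elevation)].append((name, modality, data))

    def find(self, name=None, modality=None, near=None, radius=30):
        """Retrieve data by name, modality, or a neighborhood around (azimuth, elevation)."""
        hits = []
        for node, items in self.store.items():
            if near is not None and math.hypot(node[0] - near[0], node[1] - near[1]) > radius:
                continue
            hits += [d for d in items
                     if (name is None or d[0] == name)
                     and (modality is None or d[1] == modality)]
        return hits


ses = SensoryEgoSphere()
ses.post("Anthony's face", "vision", {"confidence": 0.9}, azimuth=30, elevation=10)
print(ses.find(name="Anthony's face"))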
3.4 Database Associative Memory

Human interaction depends highly on the recall of sequences of actions. The sequences of actions may vary from person to person and throughout an environment. For example, a person will interact differently with an unknown person as compared to a long-time friend. In the same manner, a person will manipulate heavy objects in the environment differently than light objects. For a humanoid robot to interact with various humans or in complex environments, it must be able to recall how to interact. We attend to this issue by implementing a DataBase Associative Memory (DBAM). The DBAM is a data structure that allows a robot to recall sequences of associated actions. The DBAM combines a database for holding information with a spreading activation mechanism for performing recall. An interface to the DBAM communicates with other agents in the humanoid system. This interface allows IMA agents to cue the memory, modify memory and add new information to memory. The database holds action and condition records and the weights and connections used to associate them. The actions are tasks to be performed, while the conditions are states that must be satisfied by associated actions. The weights are values used in the spreading activation mechanism and the connections denote the associations between actions and conditions. The spreading activation mechanism operates on nodes. These nodes reference the action and condition records of the database. The mechanism is fully described in [2]. The mechanism is initialized from the DBAM interface, which begins the spread of activation using condition cues and a goal state. Between the condition cues and the goal state, activation spreads throughout action nodes and other condition nodes by way of the connections. Although connected action records are not set to a value, any connected condition record must receive a true or false value. The Human and Self Agents must decide these values and inform the database. The result of the spreading activation mechanism is the recall of associated actions to reach the goal state. Figure 9 shows an example of a spreading activation network using information from the database. The arrows denote the connections between nodes. Weights are applied to the arrows to spread activation through each connection. In this example, the Human Agent decides the values of the "Person Known" and "Person Unknown" conditions. At the same time, the goal state is set and activation is back-propagated. This back-propagation is guided by the associated desired conditions given by the database. The desired conditions are either "Person says Hi" for the "Person Known" condition, or "Person gives Introduction" for the "Person Unknown" condition.
Fig. 9. Sample DBAM Application

Depending on which conditions are met and which conditions are desired, activation accumulates at one of the action nodes. This continues throughout the network. Eventually, both paths should lead to "Greeting Complete".
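To make the recall mechanism concrete, the toy sketch below spreads activation over the greeting network of Figure 9. The update rule, weights, and decay factor are assumptions made for illustration; the actual mechanism is the one described in [2].

# Toy spreading-activation pass over the greeting network of Figure 9.
import itertools

# (source, target, weight): conditions feed actions and actions feed conditions.
edges = [
    ("Person Known", "Say Hello", 1.0),
    ("Person Unknown", "Greet & Introduce", 1.0),
    ("Say Hello", "Person says Hi", 1.0),
    ("Greet & Introduce", "Person gives Introduction", 1.0),
    ("Person says Hi", "Wave to Person", 1.0),
    ("Person gives Introduction", "Shake Hands", 1.0),
    ("Wave to Person", "Greeting Complete", 1.0),
    ("Shake Hands", "Greeting Complete", 1.0),
]


def spread(cues, steps=4, decay=0.9):
    nodes = set(itertools.chain.from_iterable((s, t) for s, t, _ in edges))
    activation = {n: 0.0 for n in nodes}
    for cue, value in cues.items():      # true/false condition cues from the Human/Self Agents
        activation[cue] = 1.0 if value else 0.0
    for _ in range(steps):
        nxt = dict(activation)
        for src, dst, w in edges:
            nxt[dst] = max(nxt[dst], decay * w * activation[src])   # simple max-propagation
        activation = nxt
    return activation


# With "Person Known" true, activation accumulates along the Say Hello path
# and eventually reaches "Greeting Complete".
result = spread({"Person Known": True, "Person Unknown": False})
print(sorted(result.items(), key=lambda kv: -kv[1])[:4])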
4 System Status Evaluation

The System Status Evaluation (SSE) concept is based on the premise that there exist in our multi-agent system patterns of data flow with associated timing characteristics which must be maintained for correct operation. This idea has been used by Han and Shin to develop the heartbeat algorithm employed in [5]. Our SSE technique analyzes the statistics of agent communication timing, specifically, the delay between successive communication messages. We will describe two inter-agent communication paradigms, and show how an agent determines the status of another agent using statistical measures of the delay. Past communication timing information is recorded to create a histogram, which then is used as an estimator of a probability distribution function for the delay. The distribution function then allows the agent to determine a threshold level for the delay to determine whether the other agent is working properly. Furthermore, in a multi-agent system, failure in one agent can cause a "ripple effect" if one agent's operation relies on the services of another. Therefore, we should consider the effects one agent's failure has on another. The two communication paradigms determine the direction of arcs for a directed causality graph that shows how agent failures can cause abnormal timing to propagate through the system.
4.1 Inter-Agent Communication

In IMA, a communication channel between two agents is called a relationship; the two agents participating in the relationship take on different roles. The two most common relationships in our system are (1) observer and (2) master-slave. In the observer relationship, one agent supplies another with a steady stream of data, often at regular intervals. Figure 10 shows this type of communication. In this figure, agent A_S takes on the source role, sending messages to agent A_O, which takes on the observer role.
Fig. 10. Inter-Agent Communication: Observer Type

We define the message stream as M(k) = {M(0), M(1), ...}. Figure 10 also shows the time delay between two successive messages M(k-1) and M(k), which we define as

Δ(k) = t_k - t_{k-1}    (1)

In the master/slave relationship, two agents act as a "master-slave" pair. The agent in the master role, A_M, sends the agent in the slave role, A_S, a signal to begin an operation. When the slave has completed its operation, it sends an acknowledgement signal. This is shown in Figure 11. In this model, the master sends commands C(k) = {C(0), C(1), ...} and the slave sends replies R(k) = {R(0), R(1), ...}, where t_{R(k)} > t_{C(k)}. Figure 11 shows the time delay between a command and a corresponding reply, which we define as

Δ(k) = t_{R(k)} - t_{C(k)}    (2)
Fig. 11. Inter-Agent Communication: Master/Slave Type
4.2 Using Timing Information for Failure Detection

Agents which take on the observer role in an observer relationship, or agents which take on the master role in a master/slave relationship, can use the communication timing information to determine whether their corresponding observed or slave agents have failed. These agents measure Δ during normal operation and create a histogram of the values. Figure 12 shows an example of such a histogram. This histogram has two spikes: a large one corresponding to a Δ of approximately 200 ms, and a smaller one corresponding to approximately 300 ms. Using the histogram, an agent can estimate F(x, k), the probability that Δ(k) ≤ x. Given this function, the agent can choose a probability threshold P_max that corresponds to the longest acceptable Δ, which we denote Δ_max, for that relationship. If the agent ever observes a communication delay greater than Δ_max, then it can classify the other agent in the relationship as failed.
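A sketch of this check is given below: delays recorded during normal operation build a histogram, the empirical distribution yields Δ_max for a chosen probability threshold P_max, and any later delay beyond Δ_max marks the other agent as failed. The bin width and threshold values are illustrative assumptions.

# Histogram-based delay monitor for SSE (bin width and P_max are assumed values).
class DelayMonitor:
    def __init__(self, bin_ms=10, p_max=0.99):
        self.bin_ms = bin_ms
        self.p_max = p_max
        self.counts = {}          # histogram: bin index -> count
        self.total = 0

    def record(self, delay_ms):
        b = int(delay_ms // self.bin_ms)
        self.counts[b] = self.counts.get(b, 0) + 1
        self.total += 1

    def delta_max(self):
        """Smallest delay whose empirical cumulative probability reaches P_max."""
        cumulative = 0
        for b in sorted(self.counts):
            cumulative += self.counts[b]
            if cumulative / self.total >= self.p_max:
                return (b + 1) * self.bin_ms
        return float("inf")

    def failed(self, delay_ms):
        return delay_ms > self.delta_max()


monitor = DelayMonitor()
for d in [200, 205, 198, 210, 300, 202, 199]:    # delays (ms) observed during normal operation
    monitor.record(d)
print(monitor.delta_max(), monitor.failed(650))  # a 650 ms delay exceeds the learned threshold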
5 Handshaking Demonstration

In order to illustrate the dynamics of Human-Humanoid Interaction, we have implemented a handshaking demonstration in which ISAC interacts with a human in several ways. Each of the atomic agents in the demonstration adheres to a simple script, but the interplay among atomic agents, and with the human, provides for a variety of situations. This demonstration centers on a small number of fundamental behaviors that we are implementing for ISAC. These include:

1. Look at a person who is talking,
2. Track the person visually as he or she moves,
3. Reach out with its hand to meet the person's hand.
Fig. 12. Inter-Agent Communication Histogram

A typical script begins when ISAC detects the presence of a human and begins tracking the human's face. At this point, the interaction could take one of several paths. Figure 13 shows two possible beginnings. If the human is standing outside ISAC's workspace (as in part A of Figure 13), ISAC will ask the human to come closer before attempting to shake hands. However, if the human is standing inside ISAC's workspace, ISAC will initiate handshaking with the human. Figure 14 shows the interconnections among some of the atomic agents in the system. The bottom two layers of atomic agents in Figure 14 are resource and behavior atomic agents. These agents (Left and Right Camera, Pan/Tilt Head, Right Hand, Right Arm, and Tracking) would all be used in almost any task that ISAC would perform. The Human Hand and Human Face agents are environment agents. They encapsulate knowledge and abilities pertaining to specific objects that would appear in the environment, in this case the human's hand and face. They support operations that ISAC can perform on those objects, such as look-at or move-to. Human Face supports a look-at operation, which causes ISAC to track the human's face. Human Hand supports a move-to operation that causes ISAC to move the end-effector of one of its manipulators to the human's hand. The HandShake sequencer agent is activated by the Self Agent in response to input from the Human Agent. Additionally, Figure 14 shows the flow of communication during the handshaking operation. It also shows how groups of agents can form sensori-motor behavior loops. One of these, a visual tracking loop, involves Left Eye, Right Eye, Human Face, Tracking Agent, and Pan/Tilt Head. The other, a visually guided motion loop, involves Pan/Tilt Head, Human Hand, and Right Arm.
Fig. 13. Two Possible Scenarios for Handshaking
Fig. 14. Agents for the Handshaking Demonstration from the Self Agent's Point of View
5.1 Operation of the Atomic Agents

The following figures illustrate the operation of the Hand Shake sequencer atomic agent. The core of the agent is a state machine; the transition diagram for this state machine is shown in Figure 15. When the Human Agent detects the presence of a human, the Self Agent sends an activation signal to the Hand Shake agent. This causes Hand Shake's state machine to move from State 0 (an "idle" state) to State 1. In State 1, the Hand Shake agent sends signals to two other atomic agents: a signal to the Human Hand to activate its "move-to" operation, and a signal to ISAC's Hand agent to activate its "auto-close" behavior.
Fig. 15. State Transition Diagram for Handshaking Sequencer Agent

The Human Hand agent's state transition diagram is shown in Figure 16. When the agent's move-to operation is activated, the agent checks to make sure that the human is inside ISAC's workspace. The agent does this by querying the arm agent for the workspace boundaries; it can then compare the human's position with the boundary information. If the human is inside the workspace, the agent sends commands to the arm agent to move ISAC's arm toward the human's hand. However, if the human is outside the workspace, the agent will cause ISAC to ask the human to move closer.
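The sketch below renders the Hand Shake sequencer's state machine in code. The state names follow Figure 15, but the event strings and the activation interface of the other agents are assumptions made for illustration.

# Illustrative Hand Shake sequencer state machine (events and interfaces assumed).
IDLE, REACHING, SHAKING, OPENING, PROMPTING = range(5)


class HandShakeSequencer:
    def __init__(self, human_hand_agent, hand_agent, speech_agent):
        self.state = IDLE
        self.human_hand = human_hand_agent
        self.hand = hand_agent
        self.speech = speech_agent

    def on_event(self, event):
        if self.state == IDLE and event == "activation":
            self.human_hand.activate("move-to")      # reach toward the human's hand
            self.hand.activate("auto-close")         # close the hand when contact is made
            self.state = REACHING
        elif self.state == REACHING and event == "move-to-complete-and-hand-closed":
            self.state = SHAKING                     # invoke the internal "Shake" behavior
        elif self.state == REACHING and event == "timeout":
            self.speech.say("Don't you want to shake hands?")
            self.state = PROMPTING
        elif self.state == PROMPTING and event == "yes":
            self.state = REACHING                    # try the handshake again
        elif self.state == PROMPTING and event == "no":
            self.state = IDLE                        # report the failure to the Self Agent
        elif self.state == SHAKING and event == "behavior-done":
            self.hand.activate("open")
            self.state = OPENING
        elif self.state == OPENING and event == "hand-open":
            self.state = IDLE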
Fig. 16. State Transition Diagram for Human Hand Environment Agent

(Human): Hello
(ISAC): Shake!
(ISAC): Please move closer
(ISAC): Don't you want to shake hands?
(ISAC): Something is wrong
(Human): What's wrong?
(ISAC): I had a problem trying to shake hands
(Subject): HandShake
(Human): What happened?
(ISAC): I was asking the human to shake hands when timeout happened

Fig. 17. Example Dialogue between Human and ISAC
5.2 Example Dialogue

Figure 17 shows a dialogue between the human and ISAC. During this trial, the human was standing outside ISAC's workspace when he greeted ISAC. ISAC responded by inviting the human to shake hands, and asked him to move closer. However, the human did not approach ISAC. ISAC then asked the human if he really wanted to shake hands. Since the human did not respond, the Hand Shake agent reported a failure to the Self Agent. The Self Agent then caused ISAC to report that something was wrong. ISAC then explained what happened in response to the human's queries. Because the Hand Shake agent fails, the happiness affect variable drops below a threshold, causing ISAC to report a problem ("Something is wrong"). When the human queries the robot about it ("What's wrong?"), the Interrogation Module reads the description of the failed agent from the Description Module and creates a response ("I had a problem trying to shake hands"). The Pronoun Module points to the failed agent, Hand Shake, as the subject of conversation. When the human asks "What happened?", the Interrogation Module asks the subject agent (Hand Shake) what it was doing at the time of failure, and reports this to the human.
6 Conclusion

In this paper, we presented our approach for a unified framework for efficient and effective human-humanoid interaction (HHI). The HHI framework shown in Figures 3 and 4 has been developed on a multi-agent architecture called the Intelligent Machine Architecture (IMA) using our ISAC humanoid system. Our next step is to implement task learning mechanisms. Our approach is to learn sensory-motor coordination (SMC). To achieve this goal, we are developing a control system that will enable ISAC to learn SMC while being led repeatedly by a person through a sequence of behaviors to perform a task. This, we believe, will lead to the successful integration of humanoid robots into society.
References

1. J.S. Albus. Outline for a theory of intelligence. IEEE Trans. Systems, Man, and Cybernetics, 21(3):473-509, 1991.
2. S. Bagchi, G. Biswas, and K. Kawamura. Interactive task planning under uncertainty and goal changes. Robotics and Autonomous Systems, 18(1-2):157-167, July 1996.
3. C. Breazeal (Ferrell). Regulating human-robot interaction using "emotions", "drives" and facial expressions. In Autonomous Agents 1998 workshop "Agents in Interaction - Acquiring Competence through Imitation", 1998.
4. L. Foner. Entertaining agents: A sociological case study. In Proceedings of the First International Conference on Autonomous Agents (AA '97), Marina del Rey, California, 1997.
5. S. Han and K. G. Shin. Experimental evaluation of failure-detection schemes in real-time communication networks. IEEE Transactions on Parallel and Distributed Systems, June 1999.
6. I. Horswill. Real-time control of attention and behavior in a logical framework. In Proceedings of the First International Conference on Autonomous Agents, pages 130-137, February 1997.
7. N. Jennings, K. Sycara, and M. Wooldridge. A roadmap of agent research and development. Autonomous Agents and Multi-Agent Systems, 1(1):7-38, 1998.
8. K. Kawamura, R.A. Peters II, S. Bagchi, M. Iskarous, and M. Bishay. Intelligent robotic systems in service of the disabled. IEEE Transactions on Rehabilitation Engineering, 3(1):14-21, March 1995.
9. K. Kawamura, R. T. Pack, M. Bishay, and M. Iskarous. Design philosophy for service robots. Robotics and Autonomous Systems, 18:109-116, July 1996.
10. M. Mauldin. Chatterbots, tinymuds, and the Turing test: Entering the Loebner Prize competition. In AAAI-94, 1994.
11. M. Minsky. The Society of Mind. Simon and Schuster, New York, NY, 1985.
12. R. Picard. Affective Computing. MIT Press, 1997.
13. G. Richter, F. Smieja, and U. Beyer. Integrative architecture of the autonomous hand-eye robot Janus. In Proceedings of CIRA'97, Monterey, CA, July 1997.
14. D. M. Wilkes, W. A. Alford, M. Cambron, T. Rogers, R. A. Peters, and K. Kawamura. Designing for human-robot symbiosis. Industrial Robot, 26(1):49-58, 1999.
Acknowledgements
This work has been partially funded by the DARPA Mobile Autonomous Robotics Systems (MARS) Program.