Evolution of User Interaction: The Case of Agent Adele
W. Lewis Johnson, Erin Shaw, Andrew Marshall, and Catherine LaBore
Center for Advanced Research in Technology for Education (CARTE), Information Sciences Institute, University of Southern California, 4676 Admiralty Way, Marina del Rey, CA 90292-6695 USA
+1 310 822 1511
[email protected], [email protected], [email protected], [email protected]
ABSTRACT
Animated pedagogical agents offer promise as a means of making computer-aided learning more engaging and effective. To achieve this, an agent must be able to interact with the learner in a manner that appears believable, and that furthers the pedagogical goals of the learning environment. In this paper we describe how the user interaction model of one pedagogical agent evolved through an iterative process of design and user testing. The pedagogical agent Adele assists students as they assess and diagnose medical and dental patients in clinical settings. We describe the results of, and our responses to, three studies of Adele, involving over two hundred and fifty medical and dental students over five years, that have led to an improved tutoring strategy, and discuss the interaction possibilities of two different reasoning engines. With the benefit of hindsight, the paper articulates the principles that govern effective user-agent interaction in educational contexts, and describes how the agent’s interaction design in its current form embodies those principles.
Categories and Subject Descriptors
H.5.2 [Information Systems]: User interfaces; K.3.1 [Computers and Education]: Computer uses in education; I.2.0 [Computing Methodologies]: Artificial Intelligence

General Terms
Human factors
Keywords
Interface agents, proactive and agent-based paradigms, user studies, social intelligence

INTRODUCTION
Animated pedagogical agents, a.k.a. guidebots, are intelligent agents that interact with learners in computer-based learning environments in order to promote learning. Guidebots offer promise as a means of making computer-aided learning more engaging and effective [8]. To achieve this, an agent must be able to interact with the learner in a manner that appears believable, and that furthers the pedagogical goals of the learning environment. Adele, an Agent for Distance Learning Environments, operates within a Web-based simulation environment and is designed for use in health science courses; here, we focus on work to support case-based diagnostic and skills learning in medical and dentistry education.

In a typical use of Adele, shown in Figure 1, students are presented with a computer simulation of a clinical problem. Students are able to examine the simulated patient, ask relevant questions, order and interpret diagnostic tests, and make diagnoses or create treatment plans. Adele monitors the student's actions and provides feedback accordingly. Students can ask Adele for a hint or a rationale for each action. The exercises were developed with physicians and dentists at USC to address the needs of first- and second-year graduate medical and dentistry students. These are relatively highly motivated students; however, we found that even well-motivated students can fail to draw the appropriate lessons from simulated clinical problems, e.g., they may arrive at a diagnosis without thinking carefully about why their proposed diagnosis explains the observed findings. The key question, which is the focus of this paper, is how a guidebot such as Adele should interact with the learner in order to promote learning, and how the choice of a reasoning engine can affect interaction.

In this paper we describe how the interaction model of Adele evolved through an iterative process of design and user testing in our quest for the most effective interaction model, i.e., one that most naturally scaffolds learner activities in order to support problem solving and achieve the learning goals of the exercise. We give the results of three studies of Adele, involving over two hundred and fifty medical and dental students over five years, that have led to an improved interaction strategy. The medical studies have been formally published. The largest of the studies, in dentistry, has not been previously published, nor have any two of these studies been comparatively described. We describe why and how we changed the interaction strategies in response to the study results, and discuss the impact of different reasoning engines on user interaction.
Figure 1. Adele oversees a student working through clinical dentistry and medical cases.
TECHNICAL OVERVIEW OF ADELE
Adele [14] is intended to reinforce the following kinds of learning as students work through clinical problems. She helps learners acquire an understanding of best practice, i.e., the appropriate clinical procedures to follow. She can help students learn how to apply these procedures, e.g., what actions to take in order to obtain desired patient information. And most important, Adele helps learners understand why a diagnostic or therapeutic action should be taken, e.g., what effect it will have and what its significance is. Assigning students clinical problems to solve does not in itself ensure that these types of learning occur: without supervision from a human instructor or an agent such as Adele, students may inadvertently follow improper clinical procedures, or fail to understand why a clinical procedure is advisable.

Because of the emphasis on best clinical practice, in most cases Adele is provided with knowledge of best practice, represented as a hierarchical nonlinear plan [11]. Unlike previous efforts in intelligent tutoring in medicine [1,2], which have relied upon large knowledge bases, Adele encodes only the knowledge needed to tutor a given case. The clinical procedure, consisting of steps and sub-steps, is explicitly authored by instructors on a case-by-case basis so as to accurately reflect the way medical experts analyze the case (Figure 3). As a student works through the problem, a hierarchical task planner evaluates the appropriateness of each step, taking into account the preconditions, effects, and ordering constraints that may exist. Information about the causal relationships between the clinical findings (e.g., an x-ray shows specific lesions) and the hypotheses (i.e., the final and differential diagnoses) is incorporated into the explicitly authored textual hints and rationales associated with steps in the procedure. Hints are used to lead a student through a procedure step by step, while rationales encode the reason for taking a particular step. Students request hints and rationales by pressing the Why? and Hint buttons on the interface, shown in Figure 2.
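To make the planner's role concrete, the following sketch shows how a task planner in this style might judge a step's appropriateness from its authored preconditions and effects. It is an illustration only, not the Adele source; the class and field names are ours, chosen to mirror the attributes of the step listing in Figure 3.

from dataclasses import dataclass, field

@dataclass
class Step:
    name: str
    precond: set = field(default_factory=set)   # facts that must already hold
    effects: set = field(default_factory=set)   # facts asserted when performed
    hint: str = ""                               # authored hint text
    rationale: str = ""                          # authored rationale text

class TaskPlanner:
    def __init__(self, steps):
        self.steps = {s.name: s for s in steps}
        self.state = set()   # facts established so far in the case

    def is_appropriate(self, name):
        # A step is appropriate once all of its preconditions hold.
        return self.steps[name].precond <= self.state

    def perform(self, name):
        if not self.is_appropriate(name):
            # Out-of-order action: Adele can point at the unmet precondition.
            missing = self.steps[name].precond - self.state
            return "Not yet -- first establish: " + ", ".join(sorted(missing))
        self.state |= self.steps[name].effects
        return "ok"

planner = TaskPlanner([
    Step("examine-lesion", effects={"examine-lesion-done"}),
    Step("palpate-axillary", precond={"examine-lesion-done"},
         effects={"palpate-axillary-done"},
         hint="Is lymphadenopathy indicated?"),
])
print(planner.perform("palpate-axillary"))  # rejected: lesion not examined yet
print(planner.perform("examine-lesion"))    # ok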
Although the procedural model is effective in capturing best practice, it suffers from two limitations. One is that it does not in itself provide sufficient medical rationales, and so these rationales must be explicitly added by the case author. A more serious problem, particularly for diagnostic tasks, is that the rationale for which action to take depends upon what actions have been taken before, what information was obtained from those actions, and what diagnostic hypotheses are currently being considered. The best practice procedure takes these factors into account, so as long as the student follows the procedure the rationales can be predicted ahead of time and authored into the case. Once the learner deviates from best practice, though, these pre-authored rationales are no longer of much value. A medical instructor in such a situation would reason about the case at a more fundamental level, assessing the likelihood of different disease hypotheses based upon the information that has been gathered so far and recommending an action that would be most helpful in evaluating these hypotheses. Accordingly, we developed an additional type of reasoning engine for Adele that focuses on the evaluation of hypotheses, called hypothesis-based reasoning (HBR), which is described in detail in [4]. In this approach, information about the causal relationships between clinical findings, patho-physiological states, and diseases is explicitly represented using a Bayesian network, shown on the right of Figure 3. Adele uses the representation to dynamically determine the likelihood of each disease hypothesis, as well as the likelihood of the patho-physiological states underlying them. This approach enables Adele to generate effective hints and explanations regardless of what the student does. At the same time, it gives the learner greater freedom to deviate from best practice if he or she chooses, a feature that may or may not be desirable. A major goal of our evaluations of Adele was to assess these tradeoffs, and to determine what combinations of reasoning methods and interaction strategies were most effective.
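The flavor of HBR's hypothesis evaluation can be conveyed with a toy example. The sketch below is a much-simplified stand-in for the causal network: a naive-Bayes update of disease posteriors from observed findings, whereas Adele's actual network also represents the intermediate patho-physiological states. All diseases, findings, and probabilities here are invented for illustration.

# Toy stand-in for HBR hypothesis evaluation; all numbers are invented.
priors = {"TB": 0.05, "asthma": 0.20, "viral infection": 0.75}

# P(finding is present | disease)
likelihoods = {
    "cough":           {"TB": 0.80, "asthma": 0.90, "viral infection": 0.70},
    "lymphadenopathy": {"TB": 0.70, "asthma": 0.05, "viral infection": 0.40},
}

def posteriors(findings, priors, likelihoods):
    """Update disease probabilities given present/absent findings."""
    post = dict(priors)
    for finding, present in findings.items():
        for disease in post:
            p = likelihoods[finding][disease]
            post[disease] *= p if present else (1.0 - p)
    total = sum(post.values())
    return {d: round(v / total, 3) for d, v in post.items()}

# Observing lymphadenopathy shifts weight toward TB, so Adele can rank the
# hypotheses and comment on how the new evidence affected them.
print(posteriors({"cough": True, "lymphadenopathy": True}, priors, likelihoods))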
Figure 2. GUI buttons used to solicit hints from the agent in different versions of the system. The current version, on the right, differentiates first, subsequent, and final requests for a hint. "More…" is used to provide an extended hint.

INITIAL EVALUATION: CLINICAL MEDICINE
An initial evaluation of Adele (hierarchical plan model) was conducted with a class of second-year medical students as part of an Introduction to Clinical Medicine course at the USC Keck School of Medicine [13]. The students worked through the case on their own time, unmonitored. Over one hundred students worked through the case, although only twenty-five completed the final questionnaire. Two face-to-face evaluations were also conducted.
Our goals for the evaluation were twofold: to discover how the students would react to Adele, and to confirm that the system could support the students. The final questionnaire contained thirty questions in six categories, including System Use, System Components, Rationales, Adele, and Learning. The questions addressed both specific elements of the tutoring system, such as the interface and rationales, and the general reaction of the students to Adele and the concept of the system. We used a 1-5 Agree-Disagree (Likert) response scale for the ratings. The results are given in Table 1.

Analysis of the results

Overwhelmingly, students thought the system was easy to use, yet some found it difficult to figure out what to do next (28%), and how to do it (28%). The latter finding indicated a need for more guidance than the system provided, and subsequently we looked to Adele to fill this role: to suggest actions at an interface level, as opposed to a task level, when a user becomes confused, and to point out the importance of interface elements if they are not utilized. Individual comments ran the gamut, from frustrated students who were unable to use the system to enthusiastic users whose comments sound like testimonials. If there was a consensus among the students, it was that they wanted more cases to work on.

Hints and rationales

Almost all students agreed that Adele's hints were helpful and her rationales useful. They had mixed feelings, though, about when they wanted to hear the rationales. During a one-on-one session before the evaluation, we noticed that the student was not asking "Why?" and was therefore missing much of the knowledge that the domain expert had carefully authored, so we decided to have Adele give some of the rationales automatically, whether a student asks for them or not. We implemented three variations: 1) give a rationale only when asked, 2) give it automatically after a hint, and 3) give it automatically after a user takes a step, and then asked the students which variation they preferred. Variation three was suggested by the physician/instructor, for whom it was the most intuitive strategy. Most students answered that they prefer to hear a rationale only when they ask for it, indicating that students want control over their own learning. However, logging and observation data show that most students never asked for a rationale, and therefore would most likely not ask for one if given the choice. Because most of the wisdom of the instructor is embedded in the rationales, this indicates a critical learning system problem.
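For concreteness, the three delivery variations can be thought of as a small policy, as in the sketch below. This is a reconstruction under our own naming, not Adele's implementation.

# Sketch of the three rationale-delivery variations tested in the study
# (names are ours): ON_REQUEST fires only when the learner presses Why?,
# AFTER_HINT appends the rationale to each hint, and AFTER_STEP volunteers
# it as soon as the step is performed.
from enum import Enum, auto

class RationaleMode(Enum):
    ON_REQUEST = auto()   # variation 1: only when the learner asks
    AFTER_HINT = auto()   # variation 2: automatically after a hint
    AFTER_STEP = auto()   # variation 3: automatically after a step is taken

def deliver_rationale(mode, event, rationale):
    """Decide whether to speak the rationale for the given interaction event."""
    triggers = {
        RationaleMode.ON_REQUEST: {"why-pressed"},
        RationaleMode.AFTER_HINT: {"why-pressed", "hint-given"},
        RationaleMode.AFTER_STEP: {"why-pressed", "step-taken"},
    }
    return rationale if event in triggers[mode] else None

# Under variation 3, merely performing the step surfaces the rationale,
# ensuring learners hear the instructor's reasoning even if they never ask.
print(deliver_rationale(RationaleMode.AFTER_STEP, "step-taken",
                        "Lymphadenopathy occurs in infectious diseases."))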
Adele
The students provided favorable impressions of Adele in their general textual comments. Most students thought Adele was a good distance education tool, and useful as a classroom preparation tool, but she did not suffice as a replacement for a class lecture. Nor was she believable as an attending physician. It was not clear whether students would have preferred a text-only tutor over the persona once general comments were factored in. Not surprisingly, students would have preferred a real voice to a synthesized voice; however, they did not feel strongly about the fact that Adele's lips and voice were not synchronized. (Adele's lips and voice moved at the same time but were not yet phonetically synchronized.)
Evaluation question                                SA%   A%   N%   D%  SD%
The system was easy to use (combined results)       19   56   15    8    2
It was easy to figure out what to do                 8   44   20   24    4
It was easy to figure out how to do it               8   40   24   16   12
Adele is a good distance education tool             28   44   16    8    4
Adele is useful as a clinical preparatory tool      33   38    8   17    4
Adele would be helpful as a class supplement        29   42    4   20    4
Adele is a good substitute for a class lecture       9    9   21   33   29
Adele provided most info of a lecturer               0   24   64    0   12
Adele provided most info of an attending phys        0   26   56    8    8
Adele is believable as an attending physician        0   24   44   24    8
I would like to have more cases available           38   42   13    8    0
Adele's hints are helpful                           42   21   33    4    0
Adele's rationales were useful                      34   34   29    4    0
I prefer Adele give rationales before a step         8   28   40   20    4
I prefer Adele give rationales after a step          9   35   39   17    0
I prefer Adele let me ask for the rationales         8   38   46    4    4
Adele's images and actions were motivating           8   29   42   17    4
Adele is preferable to a text-only tutor            16   16   24   32   12
I prefer a real voice to a synthesized one          24   32   32   12    0
I dislike Adele's unsynchronized lips & voice        4   22   48   22    4

(SA = strongly agree, A = agree, N = neutral, D = disagree, SD = strongly disagree)

Table 1. Evaluation results from the field trial at the School of Medicine.
SECOND EVALUATION: GERIATRIC DENTISTRY
A second evaluation of Adele (hierarchical plan model) was conducted in 1999 with a new case in Geriatric Dentistry at the USC School of Dentistry. One hundred dentistry students and forty dental hygiene students participated in and completed the field trial, which took place in a controlled and monitored setting. Our goals for the second evaluation were similar to those of the first. Most of the questions from the initial evaluation were restated to give us a point of comparison. The results are given in Table 2.
The survey also included seven open-ended questions, four more than the previous survey. As in the original survey, the students were asked to describe the areas of the program they found useful, the areas they would change, and why they liked or disliked the concept of the system. This time they were also asked more specific questions about the agent: to describe how Adele could have been more useful, and why they found her hints, rationales, and How button helpful or not helpful. The detailed comments were used to support the scaled data.
The new domain required two major additions: a point system for scoring the learner's work and a treatment plan exercise. The additions required fundamental changes to the reasoning engine in order to support differently assessed actions (e.g., optional, unnecessary, and contraindicated steps, as well as required ones) but did not affect the interaction model. The treatment plan exercise required a new module for assigning treatments for subsequent visits, which also did not require changes to the interaction model.

We also addressed some of the weak points of the system as gleaned from the initial evaluation. Since it was often unclear how to perform a step that Adele suggested, the need for interface-level guidance was critical, especially since these students tended to be less sophisticated computer users than the medical students. Our solution was to implement a partially ordered help-task planner that allows Adele to explain how to perform the steps she suggests. The information is available on demand via a How button on the agent control panel. (This button is also used to display suggested references when appropriate.)
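The point system can be sketched as a mapping from assessment categories to point values. The categories below come from the text; the numeric values, action names, and function are invented for illustration.

from enum import Enum

class Assessment(Enum):
    REQUIRED = "required"
    OPTIONAL = "optional"
    UNNECESSARY = "unnecessary"
    CONTRAINDICATED = "contraindicated"

# Point values are hypothetical; the paper does not specify them.
POINTS = {
    Assessment.REQUIRED: 2,
    Assessment.OPTIONAL: 1,
    Assessment.UNNECESSARY: 0,
    Assessment.CONTRAINDICATED: -2,
}

def score_case(actions_taken, case_assessments):
    """Sum points for the actions a learner took during the case."""
    return sum(POINTS[case_assessments[a]] for a in actions_taken)

case = {
    "review-medical-history": Assessment.REQUIRED,
    "panoramic-radiograph": Assessment.OPTIONAL,
    "extract-tooth": Assessment.CONTRAINDICATED,
}
print(score_case(["review-medical-history", "extract-tooth"], case))  # 0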
Evaluation question                                SA%   A%   N%   D%  SD%
The system was easy to use (combined results)        8   49   18   18    5
It was easy to figure out what to do                 2   23   11   40   22
It was easy to figure out how to do it               4   19   18   40   22
The How button was useful                           23   47   16    7    5
Adele is useful as a clinical preparatory tool      13   56   17   10    4
Adele would be helpful as a class supplement        15   56   16    9    4
Adele is a good substitute for a class lecture       5   23   43   22    9
I would like to have more cases available           38   42   13    8    0
Adele's hints are helpful                           27   47   15    8    3
Adele's rationales were useful                       6   50   31   10    3
I prefer Adele give rationales before a step        11   53   16   17    3
I prefer Adele give rationales after a step         12   53   18   16    1
I prefer Adele let me ask for the rationales        12   30   20   32    6
Adele's images and actions were motivating          11   48   27    9    5
Adele is preferable to a text-only tutor             9   31   22   28   10
I prefer a real voice to a synthesized one          13   31   42   13    1
I dislike Adele's unsynchronized lips & voice        3   23   43   22    9
I found the sound helpful                          N/A   80  N/A   20  N/A

(SA = strongly agree, A = agree, N = neutral, D = disagree, SD = strongly disagree)

Table 2. Evaluation results from the field trial at the School of Dentistry.
Analysis of the results

The dentistry students, like the medical students, found the system easy to use, even if the new exercises slightly lowered the overall combined usage score. However, when asked to describe things they would like to see changed, the number one response was "make the program more user-friendly". This included making the interface easier to navigate, the case less complicated, and the step sequence less rigid. Slow network speed was also cited as a problem.

Interface assistance

Almost everyone found the new How button useful; however, sixty-two percent still found it difficult to figure out what to do next, and how to do it. These numbers are much higher than the corresponding numbers from the medical school, and we surmise several reasons for the difficulty. First, the dentistry students had less experience using a computer and, as noted in the comments, the multi-window interface is confusing: while attempting to manage multiple windows on screen, several users mistakenly clicked on the 'x' in the top right-hand corner intending to minimize the window and instead exited the case. As a result of this evaluation, we replaced the multiple-window interface with a single-window version. Also, the dentistry case is longer, taking on average an hour and a half to complete, and its step ordering is very restrictive, requiring the students to adhere to best practice, as recommended by the instructor. This was a common complaint. Again, the instructor's recommendation concerning the interaction model disagreed with the learners' preferred interaction style.

It is likely that the difficulty score would have been higher had there been no How button. When asked to comment on whether the How button was useful (not useful) and why, most students responded that they relied on the How button to help them navigate the interface. Nine percent of respondents, however, did not find the How button useful, and five percent were either unaware of the How button or did not use it.

Hints and rationales

Again, students found Adele's hints to be very helpful. When asked to comment on whether the hints were useful (not useful) and why, most students commented that they relied on Adele's hints to sequence through the case. Eight percent of respondents, however, did not find the hints useful, and almost three percent were either unaware of the Hint button or did not use it.

Students also found Adele's rationales helpful, although, again, not as helpful as the hints. Unlike the medical students, these students preferred that Adele provide the rationales automatically. When asked to comment on whether the rationales were useful (not useful) and why, most students wrote that the rationales helped them understand why an action was necessary. Twelve percent of respondents, however, did not find the rationales useful, and almost five percent were either unaware of the Why button or did not use it.

Adele

When students were asked to describe the areas of the program that they found most useful, they cited Adele, and in particular her hint feature, twice as frequently as any other area. (Students also liked the graphics and interactive tools, the information and learning sequence, the patient's medical history, and the case-based clinical model.) Many students opined that they would not have been able to finish the case without her help, but at the same time thought that Adele should be more helpful, especially when they were stuck. When asked what other feedback they would have liked Adele to give them, the most frequent response was "information". Some desired more specific information (unfortunately, no one specified what kind), while others indicated that more general information, e.g., a high-level overview, would be helpful. Many students cited "feedback" as something Adele should provide more of.

These comments led to the creation and addition of milestones: points during the exercise when the student has completed a distinct portion of the task. Adele is authored to provide special feedback at these points, to give learners a sense of accomplishment as well as to reinforce where they are in the case with respect to completing the task. Once again there was no consensus on whether students prefer Adele to a text-only tutor, yet (again) the students commented favorably on Adele in their general comments. We will need to conduct side-by-side comparisons to answer this question conclusively. Like the medical students, these users would have preferred that Adele have a real voice rather than a synthesized one, but were relatively undisturbed by her unsynchronized lips and voice. Most users prefer to have sound, but not all; a few felt strongly enough to comment that no sound would have been better, reinforcing the negative reaction to synthesized speech.
Once again, the consensus among the students was that the concept, if not the reality, of the system is sound, that Adele is a good preparatory tool and class supplement (but not a replacement), and that they would like more cases to work on. When asked whether they liked (disliked) the idea of working through practice cases over the Internet, students overwhelmingly liked the idea, especially if it meant that they could work from home.
(step palpate-axillary
  :effect ((palpate-axillary-done)
           (set palpate-axillary-value))
  :phrase "palpate" "palpating" "the axillary nodes")

(step examine-lymphnodes
  :precond (examine-lesion)
  :steps (and palpate-axillary-nodes palpate-clavicular-nodes)
  :hint "Is Lymphadenopathy indicated?"
  :rationale "Lymphadenopathy occurs in infectious diseases such as TB, viral, fungal, some bacterial infections, and in cancer."
  :verbose "The distribution of the nodes involved gives a clue based on their drainage. For example, the supraclavicular and axillary nodes drain the sternal area."
  :context "lymphnodes")
Figure 3. On the left are two steps in a hierarchical plan: the top step is a sub-step of the bottom step. On the right is a portion of the Bayesian network for HBR. The starting node (center) is labeled "cough".
THIRD EVALUATION: HBR
As described above in the technical overview, a second reasoning engine was developed for Adele, employing hypothesis-based reasoning. This new reasoning method also enabled new methods for interacting with the learner. As the learner works through the case, the engine maintains a current focus of attention: the node in the Bayesian network that represents the hypothesis that the learner is currently investigating. The focus is used to generate hints on demand. When the learner asks for a hint, Adele locates the node in the network for the evidence-gathering step that offers the most utility in affecting the likelihood of the current focus. Adele then constructs a path from the focus node to the selected node. She then selects the node that is nearest on the path to the focus node, and explains its relationship to the focus node. If the student is still unclear about what to do, Adele selects the next node on the path.
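A rough reconstruction of this incremental hinting scheme is sketched below; compare its output with the example dialogue that follows. The causal edges mirror the path in Figure 3, the selection of the highest-utility evidence node is taken as given, and the phrasing is ours.

# Sketch of HBR-style incremental hinting (our reconstruction, not Adele's
# source). Each edge maps an effect to its cause, as in Figure 3.
causes = {
    "cough": "chronic air passage obstruction",
    "chronic air passage obstruction": "chronic bronchial inflammation",
    "chronic bronchial inflammation": "asthma",
}

def hint_sequence(focus, evidence_node):
    """Yield one hint per request, walking from the focus toward the evidence."""
    node = focus
    while node != evidence_node:
        parent = causes[node]   # next node on the path to the evidence
        yield parent.capitalize() + " can cause " + node + "."
        node = parent
    # Only once the path is exhausted is the action itself named.
    yield "Answer: ask the patient about " + evidence_node + "."

for hint in hint_sequence("cough", "asthma"):
    print(hint)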
This approach has the effect of guiding learners in a direction of reasoning that will enable them to decide what to do on their own, and at the same time understand the rationales for the action they choose. The following example interaction illustrates this; the learner is trying to find out what evidence to gather that will help explain a patient's cough. To follow Adele's reasoning, follow the path in Figure 3 from "cough" to "asthma".

Student: What next?
Adele: Chronic air passage obstruction can cause cough.
Student: What next?
Adele: Chronic bronchial inflammation can cause chronic air passage obstruction.
Student: What next?
Adele: Asthma can cause chronic bronchial inflammation.

At this point the learner should ask the patient if they have asthma. If they do not, Adele will ask what their reason is for taking another action. Like the earlier versions of Adele, the HBR version proactively offers comments about the rationales behind actions. Unlike earlier versions, the comments depend upon a model of the learner's knowledge about the domain. If a learner's evidence-gathering action significantly affects the likelihood of the current focus hypothesis and the learner model indicates that he or she may not be aware of it, Adele will volunteer a comment about the likelihood of the current hypothesis. An additional change is the increased use of probing questions that require learners to explain the rationales behind their actions. Earlier versions of Adele also posed pop-quiz questions about observed findings, but these were not necessarily aimed at getting the learners to explain their reasoning. Having learners explain their reasoning has the added benefit of getting them to identify which hypothesis is their current focus of attention, which in turn helps Adele to follow the learner's reasoning.
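The gating rule for volunteered comments might look like the following sketch. The significance threshold and the learner-model query are assumptions; the paper does not specify either.

# Sketch of proactive-comment gating (threshold invented): speak up only when
# an evidence-gathering action moved the focus hypothesis noticeably AND the
# learner model suggests the learner may not know why.
def maybe_volunteer(hypothesis, old_p, new_p, learner_knows_link,
                    threshold=0.15):
    """Return a comment if the shift is large and probably unnoticed, else None."""
    if abs(new_p - old_p) >= threshold and not learner_knows_link:
        direction = "more" if new_p > old_p else "less"
        return ("Note that this finding makes %s %s likely (%.2f -> %.2f)."
                % (hypothesis, direction, old_p, new_p))
    return None   # stay quiet: change was small, or the learner already knows

print(maybe_volunteer("asthma", 0.20, 0.45, learner_knows_link=False))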
The Adele HBR reasoning module was tested at the USC Keck School of Medicine on a diagnostic case in which learners were asked to determine the cause for a patient’s cough. Four medical students worked in pairs on the cough case; the students were observed working through the case and a focus group was conducted immediately afterwards. Although the number of subjects was small, the focus group provided valuable information about learners’ reactions to Adele.
Analysis of the results
The students thought Adele's hints were good; they liked that she tried to lead them in the right direction. This is the first indication that Adele was guiding the students as opposed to instructing them. However, they did not ask for hints frequently, asking mainly after Adele intervened and prevented them from submitting a diagnosis without having collected enough evidence to justify it. This made Adele's ability to volunteer commentary and suggestions very important. The students also liked that Adele asked questions about the motivation for their actions. In fact, it was often the case that Adele's questions prompted the pairs of learners to discuss the questions themselves, learning in the process. Adele thus served as a facilitator rather than an instructor in these instances. One critique that the learners had was that they wanted better explanations when Adele rejected an action because the hypothesis was unlikely. This information can be easily extracted from the Bayesian network model; it is a matter of further enhancing the interaction model to provide these richer explanations.
We must also mention the general reaction of our own group members while working through the HBR model. Most of us felt that the task was challenging and fun, and that it encouraged us to think on our own. This was in contrast to what we experienced with the former interaction models, in which we were told to take an action, as opposed to being led to do so.
DISCUSSION AND CURRENT WORK
In retrospect, we can identify several factors that appear to contribute to the effectiveness of Adele's interaction. One of the biggest is the way in which Adele's interactions influence learner motivation. Researchers who have studied motivation have identified a number of contributing factors, including curiosity, a sense of challenge, self-confidence, and a sense of being in control [12]. Each of these factors is potentially influenced by Adele, but Adele's greatest impact currently is on curiosity and control. Adele's probing questions can help promote curiosity. However, the biggest impact may be on learner control. In the earlier versions of Adele, the dentistry case in particular, learners complained about being constrained to follow best practice, particularly when they did not understand why it was best practice. The HBR engine does not constrain the learner to follow a best-practice strategy, but instead tries to identify the learner's problem-solving strategy (i.e., focus of attention). HBR-Adele may offer evidence for why the learner's strategy is inappropriate, but will intervene and prevent the learner from acting only rarely. Our evidence so far suggests that this is a superior approach. Unfortunately, it is too often the case that agents that assume tutorial functions take away learner control, particularly plan-based systems; the plan-based version of Adele is not unique in that respect.
The overwhelming practice we noticed during the large studies was that students seldom asked for a rationale. The original choice of using a Hint and a Why button, that is, of separating the action and its rationale, came from an earlier pedagogical agent upon which Adele was based [10]. In the HBR version, the Bayesian-network-based reasoning engine obviated the need for an explicitly authored rationale, so the Why button was removed (Figure 2). Instead, the rationale for the best next step is inherent in the suggestion Adele gives when asked "What next?" In fact, the hint is the rationale: the actual action step is not named until all paths have been exhausted. This observation is informing a new version of Adele. Although we may use a Bayesian network approach instead of a hierarchical plan approach to create new cases in the future, there are some drawbacks to the network approach. Creating and validating a probability network is a significant task, even for a domain expert, and modifying a network can be difficult. We also have legacy cases that we cannot convert ourselves. So we decided to try to change the interaction model without changing the reasoning engine.

To achieve this goal we completely removed all hints that had associated rationales, and used the rationales instead. Some of the authored rationales were so long that we broke them up: the first part is used for the hint, and the second part is displayed (not spoken) when the learner presses the new More button. This is similar to the use of mini-lessons in [5]. A sample interaction might look like this:

Student: What next?
Adele: Lymphadenopathy occurs in infectious diseases such as TB, viral, fungal and some bacterial infections.
Student: More…
Adele: The distribution of nodes involved gives a clue to their drainage. For example, the supraclavicular and axillary nodes drain the sternal area.
Student: Answer
Adele: You could palpate the axillary nodes.

The agent interaction model is thus more guide-like, as in the HBR approach. The reasoning engine still focuses on a procedural rather than a diagnostic interpretation of the user's interactions, and thus it will still be unable to evaluate the user's actions in detail when they deviate significantly from standard clinical procedures.

Another change that we have made to the interaction model is to give the learner the option of not receiving the answer, i.e., the best action to take next. In all former interaction models, the learner arrived at the best next step by pressing the What else? button until there were no hints left. However, the learner had no way of knowing when the agent would run out of hints. Now, before the agent tells the student the best next step, the button label changes from "What next?" or "What else?", indicating a subsequent hint, to "Answer", indicating that the student (and instructor) are about to be told the step.
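The button-label progression can be captured in a few lines. This is our reconstruction of the behavior described above, using the labels from Figure 2.

def button_label(hints_given, hints_remaining):
    """Label for the hint button: warn when the next press reveals the answer."""
    if hints_remaining == 0:
        return "Answer"   # next press names the best next step outright
    return "What next?" if hints_given == 0 else "What else?"

remaining = 3   # three relations on the path before the step itself
for given in range(remaining + 1):
    print(button_label(given, remaining - given))
# prints: What next?, What else?, What else?, Answer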
Many learners complained about the quality of the speech synthesis used in Adele. This is a persistent problem, and one that definitely must be addressed in a guidebot that is capable of generating a rich variety of utterances. To address it, we have been developing a new speech synthesis technique, based upon unit selection, that is capable of generating high-quality expressive speech suitable for education and training settings [9]. These techniques should improve the acceptability of guidebots such as Adele.
Finally, questions remain regarding the effect of realizing Adele as an animated character. Although our evaluation instruments did not directly measure this, we received substantial informal feedback regarding Adele's apparent personality and social skills. These issues are discussed elsewhere in these proceedings [7].
CONCLUSION
This paper has presented lessons learned from a number of evaluations and informal assessments of different versions of Adele. This was a sometimes laborious, sometimes serendipitous process; in retrospect, lessons were learned that can be applied to future educational applications of interface agent technology.
ACKNOWLEDGMENTS
CARTE staff Rajaram Ganeshan and Jeff Rickel, and students Rebecca Hwang, Cynthia Chang, Nancy Chan, Ami Adler, and Anna Romero contributed to the work presented here. Dr. Ganeshan and Dr. Beverly Wood of the Keck School of Medicine authored the network for diagnostic tutoring and conducted its study. Ms. LaBore managed the dentistry field trial at ISI. Dr. Roseann Mulligan of the School of Dentistry directed the evaluation, and she and students John Morzov and Kristine Choulakian authored the dentistry case. Maria Henke of the Andrus School of Gerontology coordinated the dentistry development team. Dr. Allan Abbott led the evaluation at the Keck School of Medicine and medical student Michael Hasler authored the medical case. Our collaborators, Drs. Demetrios Demetriades, William La, Wesley Naritoku, Sidney Ontai and Angela Atencio and Leah Flodin at the USC School of Medicine provided indispensable assistance. This work was supported by an internal grant from the USC Information Sciences Institute.
REFERENCES
1. Clancey, W. J. & Letsinger, R. (1984). NEOMYCIN: Reconfiguring a rule-based expert system for application to teaching. In W. J. Clancey & E. H. Shortliffe (Eds.), Readings in Medical Artificial Intelligence: The First Decade. Reading, MA: Addison-Wesley.
2. Clancey, W. (1987). Knowledge-Based Tutoring: The GUIDON Program. Cambridge, MA: MIT Press.
3. Davies, J. R., Gertner, A. S., Lesh, N., Rich, C., Sidner, C., & Rickel, J. (2001). Incorporating tutorial strategies into an intelligent assistant. In Proceedings of IUI '01, 53-56. New York: ACM Press.
4. Ganeshan, R., Johnson, W. L., Shaw, E., & Wood, B. (2000). Tutoring diagnostic problem solving. In G. Gauthier, C. Frasson, & K. VanLehn (Eds.), Intelligent Tutoring Systems: 5th International Conference, ITS 2000, 33-42. Berlin: Springer-Verlag.
5. Gertner, A., Conati, C., & VanLehn, K. (1998). Procedural help in Andes: Generating hints using a Bayesian network student model. In Proceedings of the 15th National Conference on AI, 106-111. AAAI Press.
6. Horvitz, E., Breese, J., Heckerman, D., Hovel, D., & Rommelse, K. (1998). The Lumière project: Bayesian user modeling for inferring the goals and needs of software users. In Proceedings of the 14th Conference on Uncertainty in AI.
7. Johnson, W. L. (2003). Interaction tactics for socially intelligent pedagogical agents. In Proceedings of IUI '03. New York: ACM Press.
8. Johnson, W. L., Rickel, J., & Lester, J. (2000). Animated pedagogical agents: Face-to-face interaction in interactive learning environments. International Journal of Artificial Intelligence in Education 11, 47-78.
9. Johnson, W. L., Narayanan, S., Whitney, R., Das, R., Bulut, M., & LaBore, C. (2002). Limited domain synthesis of expressive military speech for animated characters. In IEEE TTS Workshop.
10. Rickel, J. & Johnson, W. L. (1999). Animated agents for procedural training in virtual reality: Perception, cognition, and motor control. Applied Artificial Intelligence Journal 13, 343-382.
11. Russell, S. & Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice Hall.
12. Sansone, C. & Harackiewicz, J. M. (2000). Intrinsic and Extrinsic Motivation: The Search for Optimal Motivation and Performance. San Diego: Academic Press.
13. Shaw, E., Ganeshan, R., Johnson, W. L., & Millar, D. (1999). Building a case for agent-assisted learning as a catalyst for curriculum reform in medical education. In Proceedings of AIED '99, 509-516. Amsterdam: IOS Press.
14. Shaw, E., Johnson, W. L., & Ganeshan, R. (1999). Pedagogical agents on the web. In Proceedings of the Third International Conference on Autonomous Agents, 283-290. New York: ACM Press.