Research Report
An Exploratory Study Into Interlanguage Pragmatics of Requests: A Game of Persuasion
April 2009 ETS RR-09-13
Hui-Chun Yang, University of Texas at Austin
Diego Zapata-Rivera, ETS, Princeton, New Jersey
As part of its nonprofit mission, ETS conducts and disseminates the results of research to advance quality and equity in education and assessment for the benefit of ETS’s constituents and the field. To obtain a PDF or a print copy of a report, please visit: http://www.ets.org/research/contact.html
Copyright © 2009 by Educational Testing Service. All rights reserved. ETS, the ETS logo, LISTENING. LEARNING. LEADING., TOEIC, and TOEFL are registered trademarks of Educational Testing Service (ETS). C-RATER, TEST OF ENGLISH AS A FOREIGN LANGUAGE and TEST OF ENGLISH FOR INTERNATIONAL COMMUNICATION are trademarks of ETS.
Abstract

Teachability and assessment of interlanguage pragmatics are two crucial but still relatively unexplored areas in second language research. Several instruments, such as multiple-choice discourse completion tasks, written discourse completion tasks, role-play tasks, and discourse self-assessment tasks (J. D. Brown, 2001), have been used to assess students' interlanguage pragmatic skills. This paper describes a new assessment-based gaming environment for English as a second language (ESL) pragmalinguistics, called A Game of Persuasion, that focuses on the speech act of request. The game allows users to engage in interactive written dialogue with an artificial professor (or pedagogical agent) in multiple academic contexts. Students explore contextually and socially appropriate request strategies while the system scores each attempt, assigns points, and provides formative and summative feedback. This paper presents (a) related research on instruments that have been used in the area of interlanguage pragmatics (i.e., requests), (b) our assessment-based gaming environment, and (c) results from a study aimed at exploring usability aspects of the game and the request strategies employed by 30 English language learners (ELLs) who completed 10 written interactive request tasks with and without the presence of a pedagogical agent.

Key words: Interlanguage pragmatics, assessment-based games, pedagogical agents, speech acts
Table of Contents

1. Theoretical Framework
2. A Game of Persuasion
   2.1. IDCT Scenarios
   2.2. Dr. Brown
   2.3. Dialogue Engine
   2.4. Scoring and Dr. Brown's Responses
   2.5. Refining the Scoring Engine
3. Methods
   3.1. The Experiment
   3.2. Computer-Scored Versus Human-Scored Utterances
   3.3. Analysis of Results
4. Discussion and Future Work
References
Notes
List of Tables

Table 1. Examples of the Request Categories Identified in the Dialogue Engine
Table 2. Conversational Cycles
Table 3. Proportion of Request Strategies Used by Two Groups
Table 4. Transition Analysis (First Attempt – Alerter)
Table 5. Transition Analysis (Subsequent Attempts – Nonconventionally Indirect [NCI])
List of Figures

Figure 1. Dr. Brown
Figure 2. Finite-state automata model in the speech act of request (+ points)
Figure 3. Agent-based interactive discourse completion task (IDCT); experimental condition
Figure 4. Text-based interactive discourse completion task (IDCT); control condition
Figure 5. Distribution of scores assigned by the human raters
Figure 6. Distribution of scores assigned by the computer
Figure 7. State transition diagram of request strategies used by students in the agent-based interactive discourse completion task (IDCT) group (n)
Figure 8. State transition diagram of request strategies used by students in the text-based interactive discourse completion task (IDCT) group (n)
Pragmatic competence is essential for all English language learners (ELLs), including immigrant students at the K–12 level, adult English as a second language (ESL) students, and English as a foreign language (EFL) students who are preparing for advanced studies in an English-speaking country. Research to date has shown that ESL and EFL textbooks include a paucity of pragmatic information (Grant & Starks, 2001; Vellenga, 2004) and that considerable inconsistencies exist between the English found in textbooks and that which is documented in corpora of spoken and written English (Bardovi-Harlig, Hartford, Mahan-Taylor, Morgan, & Reynolds, 1991; McCarthy & Carter, 1995). To compound this problem, teachers have also been shown to be ill-prepared to teach pragmatics in class (Ishihara, 2007; Kasper & Schmidt, 1996). To address this dilemma in pragmatic education, we implemented a computer game prototype called A Game of Persuasion that is aimed at assessing the pragmatic competence of ELLs and supporting student learning of pragmatics. Among pragmatic functions, the speech act of request within academic contexts was selected as the focus of our initial investigation. Based on the results of this study, we intend to explore potential applications to other areas of interlanguage pragmatics. A Game of Persuasion offers real-time, written exchange between a student and a pedagogical agent. Pedagogical agents (e.g., Chan & Baskin, 1990; Graesser, Person, Harter, & TRG, 2001; Johnson, Rickel, & Lester, 2000) have been used to facilitate learning by supporting human-like interaction with computer-based systems. Pedagogical agents can act as virtual peers or virtual tutors. In light of these findings, our goal is to design an assessment-based learning tool that allows learners to build a connection between the game and their underlying knowledge of the speech act of request. This paper describes (a) related research on instruments that have been used in the area of interlanguage pragmatics (i.e., requests), (b) the development of our assessment-based gaming environment, and (c) the usability aspects of the game and the patterns of request strategies employed by 30 ELLs who completed 10 written interactive request tasks with and without the presence of a pedagogical agent.

1. Theoretical Framework

Interlanguage pragmatics (ILP) has been defined as the study of language learners' use and acquisition of pragmatic competence in a second language (L2). In addition to the rules of morphology and syntax, successful ELLs need to master the appropriate use of different
expressions to achieve socio-interactional purposes. Following Leech (1983), one's L2 pragmatic ability requires both sociopragmatic and pragmalinguistic knowledge. Sociopragmatic knowledge encompasses knowledge of the social rules governing discourse practices, whereas pragmalinguistic knowledge describes how language learners use linguistic and strategic resources as means to reach their communicative goals. In recent years, speech act theory has attracted a great deal of attention in ILP research, especially the act of request (Byon, 2004; House & Kasper, 1987; Rose, 2000). Requests can be made at different levels of directness depending on the given context. Direct strategies are defined as expressions in which propositional content and linguistic form are consistent, whereas indirect strategies refer to utterances in which the speaker's meaning and the sentence meaning are not identical (Holtgraves, 1986). Conventionally indirect and nonconventionally indirect strategies are two subsets of indirect strategies commonly used to make a request. Speakers using conventionally indirect strategies may seek information or make a request by questioning the hearer's ability, as in, "Can you turn down the music?" Nonconventionally indirect strategies are known as hints, or off-record strategies (Brown & Levinson, 1987), as in, "The music is really loud." In formal settings where a request is addressed to a person who has more power, a higher degree of indirectness is required to avoid possible offense and to prevent infringing upon the interlocutor's face.1 The greater the imposition an act involves, the more indirect the speaker will be. The most common type of research instrument in pragmatics research is the discourse completion task (DCT; Kasper, 2000). The DCT is a questionnaire containing a set of short descriptions of situations or dialogues with a turn or part of a turn missing. Participants read the contextual information given and react to a prompt in written form. Researchers have found that a large amount of data on a variety of pragmatic behaviors can be collected using this method in a short period of time (Beebe & Cummings, 1996). A major limitation of the DCT is that it does not replicate the negotiated and interactive aspects of discourse (Beebe & Cummings, 1996; Wolfson, Marmor, & Jones, 1989); thus, DCT data can provide only samples of speech acts rather than the interactive process that takes place between interlocutors. McNamara and Roever (2006) pointed out that this dilemma poses the question of "construct under-representation" when testing learners' pragmatic competence (p. 64), because learners' linguistic realizations are situated in co-constructed conversational contexts. To gain a better understanding of the dynamics of
interchange and the nature of L2 learners' ability, it is important to recreate an environment that simulates real-life interactions. In an effort to narrow the discrepancy between naturally occurring data and data sets elicited by the DCT, Kuha (1999) ingeniously used a computerized, text-based, interactive version of the DCT, the interactive DCT (IDCT), to elicit data on complaints. Unlike the traditional DCT, in which participants are allowed only one turn, the IDCT allows participants to have a three-turn dialogue in which the complainee's responses are computer-generated. The study showed that the IDCT elicits responses that are significantly longer and show greater variation of outcomes because the IDCT places speech acts within an interactional context. To create a sense of reality, we implemented a pedagogical, agent-based IDCT. Animated pedagogical agents can mimic human gestures and facial expressions to help engage students more in the target tasks. They can also play different roles (e.g., a professor, a nurse, a policeman, or a classmate), enhancing the context in which interactions take place. A number of research projects have addressed issues such as the role that pedagogical agents play in improving student motivation, supporting students' cognitive and affective needs (Johnson et al., 2000; Wang, Johnson, Rizzo, Shaw, & Mayer, 2005), and providing individualized interventions that encourage learning (Conati & Zhao, 2004). Among previous studies implementing pedagogical agents to assist learners, only a few focus on the learning and instruction of foreign and second languages. Mote, Johnson, Sethy, Silva, and Narayanan (2004) implemented the Tactical Language Training System (TLTS) to provide training in basic Arabic language and cultural skills. In another study, Harless, Zier, and Duncan (1999) developed four Virtual Conversations programs that allow learners to have extended, synchronous, face-to-face dialogue with virtual characters who speak Arabic natively. However, to date, no attempt has been made to carry out an ILP study that incorporates pedagogical agents to help students learn pragmatic skills in a second language. This research is informed by the dual needs to evaluate the pragmatic competence of ELLs and to use this information to provide feedback that helps them choose appropriate request strategies for a particular context. Thus, we developed and used A Game of Persuasion to explore the following research questions:
• How usable is the game, and how effective is it in terms of encouraging and maintaining student motivation?
• What are the types of request strategies ELLs use to complete IDCTs with and without the presence of a pedagogical agent?
• What are the distributions of request strategy transitions in the agent-based IDCT group and in the text-based IDCT group?

2. A Game of Persuasion
As an assessment-based gaming environment (Zapata-Rivera, Vanwinkle, Shute, Underwood, & Bauer, 2008), A Game of Persuasion gathers evidence of students' performance on predefined constructs (e.g., knowledge and use of request strategies in particular contexts) through tasks/scenarios that are designed to elicit relevant behaviors.

2.1. IDCT Scenarios

A Game of Persuasion includes 10 scenarios designed to elicit requests that occur in academic settings. The game involves a senior male professor, played by a pedagogical agent (i.e., Dr. Brown), and a nonnative English speaker participant. In all hypothetical situations, participants interact with the professor, who is of higher social standing. Request scenarios were selected based on their frequency, practicality, and potential difficulty for the participants of this investigation: English language learners pursuing academic studies in the United States. Previous studies have noted that context-internal variables, such as power, distance, and degree of imposition, are significant social factors in determining speech act performance (Blum-Kulka, House, & Kasper, 1989; Brown & Levinson, 1987). In a range of everyday encounters in academic contexts, ELLs frequently make requests to professors, and requests to people of higher standing may be more challenging because of sociocultural differences (Byon, 2004; Naiditch, 2006). Although the nature of the request changes, all scenarios share the same internal representation and scoring algorithm (see sections 2.3 and 2.4). Thus, it is possible to generate additional request scenarios without having to make significant changes to the game.

2.2. Dr. Brown

Animated pedagogical agents have been shown to have a facilitative impact on human learning in instructional planning (Baylor & PALS Research Group, 2003), botanical anatomy and physiology (Lester, Stone, & Stelling, 1999), and medical education (Shaw, Ganeshan,
Johnson, & Millar, 1999; Shaw, Johnson, & Ganeshan, 1999). In recent years, pedagogical agents have also been used to motivate and engage students in learning a foreign or second language and culture (Wang et al., 2005; Zapata-Rivera et al., 2008). Students in A Game of Persuasion are asked to make requests to a pedagogical agent (i.e., Dr. Brown). Dr. Brown (see Figure 1) provides textual and verbal feedback (e.g., "Sorry, I can't help you with that. Is there anything else I could help you with?"). Dr. Brown also shows various emotional expressions (e.g., wink, stare, smile, and frown) that are chosen based on how appropriate the student's request is for a particular context. This feedback offers students information about their performance and guides them to explore more effective request strategies. A comment box pops up at the end of each scenario (formative feedback), providing immediate suggestions based on the student's current performance (e.g., "Next time you can try to be more specific about why you want to make this request."). After students complete all the situational tasks, they get a final report (summative feedback) indicating their final score and a list of suggestions for improving strategy use, based on their assessment results.
Figure 1. Dr. Brown.
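The report does not spell out exactly how an utterance's appropriateness is mapped to Dr. Brown's expression and feedback. The sketch below, in Python, is an illustrative assumption only: the function name, thresholds, and expression choices are invented, and the feedback strings are taken from the examples quoted above.

# Hypothetical sketch: map the points earned by a student's turn to an agent
# expression and a feedback line. Thresholds and names are illustrative.
def choose_agent_reaction(points_earned: int, request_granted: bool) -> dict:
    """Pick an expression and a feedback line for the pedagogical agent."""
    if request_granted:
        return {"expression": "smile", "feedback": "All right."}
    if points_earned >= 2:
        return {"expression": "wink",
                "feedback": "Hello, what can I do for you?"}
    if points_earned == 1:
        return {"expression": "stare",
                "feedback": "Sorry, I can't help you with that. "
                            "Is there anything else I could help you with?"}
    # No points: the utterance was unclear or off-topic.
    return {"expression": "frown",
            "feedback": "Sorry, I don't understand what you're trying to say. "
                        "Could you say that again?"}

print(choose_agent_reaction(points_earned=2, request_granted=False))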
2.3. Dialogue Engine

Students' written utterances are classified into a finite set of pre-established categories using keywords. Alerters, head acts, and supportive moves are considered the three major parts in the realization of a request (Blum-Kulka et al., 1989), and hence they are used as descriptors to identify the appropriateness of student utterances. Alerters serve as opening statements, such as greetings, that precede a speaker's request. They attract the hearer's attention and open the way for further conversation. Head acts are classified at three levels of directness: direct, conventionally indirect, and nonconventionally indirect strategies. To increase the likelihood of the hearer's compliance with the speaker's request, it is often necessary to use supportive statements (Faerch & Kasper, 1989; House & Kasper, 1987). The most commonly occurring type of supportive move is the grounder (Byon, 2004; Vine, 2004). Examples of these categories, drawn from the data, are listed in Table 1.

Table 1
Examples of the Request Categories Identified in the Dialogue Engine

Category: Example
Alerters: How are you?
Direct requests: I need to take your writing course this semester.
Conventionally indirect requests: Could you let me take your writing course?
Nonconventionally indirect requests: I am really fascinated by the topic of your writing class.
Grounders: I am planning to graduate this semester but I'm one class short.
Nonsense: Do you have parents?
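To make the keyword-based classification concrete, here is a minimal Python sketch, assuming each category in Table 1 is backed by a small keyword list. The lists and the function name are invented for illustration and are not the study's actual keyword bank.

# Illustrative sketch (not the actual engine): classify a written utterance
# into the request categories of Table 1 by simple keyword matching.
KEYWORDS = {
    "alerter": ["hello", "professor", "how are you", "good morning"],
    "conventionally_indirect": ["could you", "can you", "would you", "may i", "could i"],
    "direct": ["i need", "i want", "let me"],
    "grounder": ["because", "since", "so that", "in order to"],
}

def classify_utterance(utterance: str) -> list:
    """Return the request categories whose keywords appear in the utterance."""
    text = utterance.lower()
    matched = [category for category, words in KEYWORDS.items()
               if any(word in text for word in words)]
    # An utterance that matches no category at all is treated as nonsense.
    return matched or ["nonsense"]

print(classify_utterance("Could you let me take your writing course? "
                         "I need it because I'm one class short."))
# -> ['conventionally_indirect', 'direct', 'grounder']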
The dialogue engine is represented using a finite-state automata diagram (see Figure 2). The finite-state machine represents all the possible turns students might take in completing a task. The goal of the interaction is to convince the professor, Dr. Brown, to grant the request. For each persuasion situation, a student begins at an initial state (S1) and then continues the conversation until he or she arrives at one of the final states of the automata (Yes or No). Both the initial and final states are represented as two concentric circles. Several labeled ovals denote intermediate states (e.g., Alerter, Greeting, Retry, and Direct). Transitions are portrayed as arrows pointing from an originating state to a target state. Each arrow/link that is connected to Dr. Brown's part of the diagram is marked with an action set (e.g., 2x) and the number of points that will be given to the student.
[Figure 2 diagram. Student states: S1; 1a. Alerter / 1b. No Alerter; S2; 3. Direct; 4. Retry; 5. Conventionally Indirect; 6. Nonconventionally Indirect (max 3 times); 9. Nonsense/obscene language (max 3 times); 7. No; 8. Yes. Dr. Brown action sets: 2x. Greetings (e.g., "Hello, what can I do for you?"); 4x. Replies for Clarifications (e.g., "Sorry, I don't understand what you're trying to say. Could you say that again?"); 7x. Rejection to the Request (e.g., "I'm sorry I would not be able to help you."); 8x. Acceptance for the Request (e.g., "All right."). Transitions are labeled with keyword conditions (+KW (object), +KW (object + grounder)) and point values such as 2x (+2), 4x (+0, max +1), 7x (+1, +2), and 8x (+8).]
Figure 2. Finite-state automata model in the speech act of request (+ points). Note. KW = keyword. 2x, 4x, 7x, and 8x refer to some of the possible response categories, and (+ number) represents the number of points assigned.

Actions triggered on Dr. Brown's side help maintain a reasonable dialogue between him and the student by moving the interaction to a different state. If the machine receives a message that contains no keywords belonging to any category, the student is directed to the nonsense state and is asked to clarify the utterance, up to three times. On the fourth occurrence, he or she is forced to exit the situation without obtaining any points. Similarly, if a participant employs a direct or conventionally indirect strategy but does not include the request-item keywords, the program turns down the participant's request. This occurs because the participant has not asked for the item that he or she is supposed to ask for in that scenario, even though he or she may have used a request head act. This state was created to raise participants' awareness of inappropriate strategy use and to encourage them to try a different strategy to convince the agent. Table 2 describes the types of cycles and their definitions.
Table 2
Conversational Cycles

Cycle: Definition
Boorish requester: Direct request strategies (no keyword of request item) used three times
Evasive requester: Conventionally indirect strategies (no keyword of request item) used three times
Equivocal talker: Nonconventionally indirect strategies used three times
Nonsense talker: Nonsense used three times
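The following is a minimal sketch (assumed, not the production engine) of how the state machine in Figure 2 and the cycle limits in Table 2 could be driven in code. Here `next_turn` supplies the student's next written utterance and `classify` is a keyword classifier like the one sketched in section 2.3; the request-object keyword check is omitted and point values follow Figure 2 only loosely.

HEAD_ACTS = {"direct", "conventionally_indirect"}

def run_scenario(next_turn, classify, max_retries=3):
    """Drive one request scenario through the state machine; return (granted, points)."""
    points = 0
    categories = set(classify(next_turn()))            # State S1: opening turn
    if "alerter" in categories:
        points += 2                                     # 2x Greetings (+2)
        categories = set(classify(next_turn()))
    retries = 0
    while True:                                         # State S2: evaluate the request
        head_act = categories & HEAD_ACTS
        if head_act and "grounder" in categories:
            return True, points + 8                     # 8x Acceptance (+8)
        if head_act:
            return False, points + 1                    # 7x Rejection (+1)
        # NCI, nonsense, or grounder-only turn: 4x clarification request
        retries += 1
        if retries > max_retries:
            return False, points                        # forced exit after three tries
        if retries == 1:
            points += 1                                 # 4x earns at most one point
        categories = set(classify(next_turn()))

# Example with scripted turns and the classifier sketched earlier:
# turns = iter(["Hello professor!",
#               "Could you let me take your writing course? "
#               "I need it because I'm one class short."])
# run_scenario(lambda: next(turns), classify_utterance)  # -> (True, 10)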
2.4. Scoring and Dr. Brown's Responses

The scoring criteria were developed based on the appropriateness of students' written utterances to the current context. Appropriateness was defined as the degree to which speech acts were performed at a proper level of directness and politeness according to the situation. The current point scheme is shown in Figure 2 as the numbers that appear in parentheses on the links that trigger Dr. Brown's actions. There are 10 scenarios in the game, and the maximum number of points a student can earn in each scenario is 10, for a total of 100 possible points. The system assigns points by identifying the major linguistic expressions used by students. For example, when students greet the professor or address the professor by his title when entering a new situation, the utterance is identified as an alerter, as seen in the first state. Because using alerters or attention getters when two people meet is recognized as a cultural norm in the United States, students obtain two points (+2) by addressing the hearer; if they fail to do so, they do not receive any points in this state. After receiving an alerter message, Dr. Brown replies to students with one of a range of greeting utterances, as exemplified in the 2x Greetings box (e.g., "Hello, what can I do for you?"). Students can then move on to the next state (S2). Student utterances of request are first classified by the engine into four types: direct strategy, conventionally indirect strategy, nonconventionally indirect strategy, and nonsense/obscene. Given that our scenarios place students in a lower social position than Dr. Brown, conventionally indirect strategies are considered more appropriate than other strategies. Based on these criteria, students who employ conventionally indirect strategies earn more points than those who use other strategies to make the request for each situation. The direct and conventionally indirect strategies are further subcategorized into two components: utterances containing keywords (KW) for the request head act, the request object, and a grounder (+ KW (object + grounder)), and utterances containing only a request head act and
request object (+ KW (object)). Grounders provide a rationale or justification for making requests, and therefore they function as efficient supportive and mitigating modifiers (Faerch & Kasper, 1989; Hassall, 2001). The occurrence of grounders is regarded as an indicator of the appropriateness of requests. Students providing grounders for their direct and conventionally indirect requests receive more points than those who do not. Dr. Brown's statements are randomly selected from a particular action set. Direct and conventionally indirect requests with a grounder are granted, while those with no grounder are rejected by Dr. Brown. However, excessive use of grounders with no request head act may be confusing and inappropriate (Blum-Kulka & Olshtain, 1986; Hassall, 2001). Such responses are classified as nonconventionally indirect and receive a maximum of one point. When that happens, Dr. Brown asks students to clarify themselves. If students' written utterances are unrelated to the given scenario, their sentences are identified as nonsense, and they are forced to exit after three attempts. Similarly, students are forced to exit the scenario after three attempts at using obscene language. In addition, Dr. Brown provides immediate feedback (e.g., "Watch your language") to discourage students from using profane or obscene language. Students can skip scenarios if they do not know how to respond to the presented hypothetical situation. Both formative and summative feedback are provided to the students. In order to call their attention to the use of request strategies and possible modifications for more effective requests, the program displays formative feedback in three forms: visual cues, utterance responses, and pop-up comments with credit points.

2.5. Refining the Scoring Engine

In order to examine the feasibility of the keyword-based scoring engine, it was piloted with several English learners and native speakers of English prior to the study. The pilot data were used to refine the descriptions of the scenarios, the keywords, and the point scheme. This pilot resulted in a more robust scoring engine that included new keywords associated with particular scenarios and request strategies. Although the current version of the game does not make use of sophisticated natural language processing tools, it uses a list of possible erroneous but recognizable spellings of keywords. This design allows the computer system to automatically identify the type of request strategy used by the student even when some of his or her sentences include grammatical or structural errors.
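As a concrete illustration of the misspelling handling described in section 2.5, a small sketch can map recognizable misspellings back to canonical keywords before classification. The variant list below is invented and is not the study's actual data.

# Illustrative misspelling map; the entries are invented examples.
MISSPELLINGS = {
    "professer": "professor",
    "profesor": "professor",
    "becuase": "because",
    "coud": "could",
    "wirting": "writing",
}

def normalize(utterance: str) -> str:
    """Replace known misspellings with their canonical forms, word by word."""
    return " ".join(MISSPELLINGS.get(word, word)
                    for word in utterance.lower().split())

print(normalize("Coud you open the wirting course becuase I need it"))
# -> "could you open the writing course because i need it"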
3. Methods

The current study explores the usability of the game and investigates the types and distributions of request strategies used by ELLs to cope with hypothetical academic situations in the IDCT system with or without the presence of a pedagogical agent. Prior research by Kuha (1999) showed that computerized IDCTs produced responses that were longer and more diverse than those produced by the DCT. In this study, we compare two IDCTs (i.e., agent-based and text-based) and explore possible effects on the selection of request strategies and the produced responses. The agent-based IDCT was introduced not to replace face-to-face role play but to alleviate some constraints in interlanguage pragmatic learning. More specifically, the system was developed to offer language learners an alternative way to acquire pragmatic knowledge through simulated interactions. Participants were 30 adult ELLs at a large southwestern university in the United States, enrolled in the ESL program in spring 2008. Their ages ranged from 18 to 45. Almost all participants were from East Asian countries: 16 students spoke Chinese as their first language, 12 spoke Korean, and 1 spoke Japanese. One student came from South America and spoke Portuguese as her first language. Participants were randomly assigned to two groups: an agent-based IDCT group (experimental condition, Figure 3) and a text-based IDCT group (control condition, Figure 4). Participants in the agent-based IDCT group were asked to interact with an animated pedagogical agent to complete 10 request activities/scenarios, and those in the text-based IDCT group were required to complete the same tasks without the presence of the agent. Their levels of proficiency (as reported on a self-assessment scale or by Test of English as a Foreign Language™ [TOEFL®] scores) ranged from low to intermediate high.

3.1. The Experiment

Data for this investigation were gathered by means of an assessment-based gaming environment (A Game of Persuasion), a background questionnaire, and a usability survey. First, participants played the game for 45 minutes to 1 hour. Then, a 10-minute usability survey and a 5-minute background questionnaire were administered to obtain information regarding participants' experiences with the game and their demographic characteristics.
Figure 3. Agent-based interactive discourse completion task (IDCT); experimental condition.
Figure 4. Text-based interactive discourse completion task (IDCT); control condition.
3.2. Computer-Scored Versus Human-Scored Utterances

Before reporting and analyzing the results of the experiment, we describe the results of a comparison between scores assigned by the scoring engine and by a human rater. To that effect, each student utterance was first classified according to the request categories and combinations of categories. Based on these classifications, points were assigned using the scoring rules presented in section 2.4. The mean scores assigned by the human rater and by the computer system were then compared. The distributions of scores assigned by the human and the computer are presented in Figure 5 and Figure 6. Interrater reliability computed by Pearson product-moment correlation was 0.66 (p < .01), suggesting a moderate level of consistency between the judgments of the human and the computer. The variance interpretation of the correlation coefficient (r2 = 0.44) indicated that 56% of the variance in the human ratings was not accounted for by the computer rater. Human raters tended to assign higher scores to participants in both conditions than did the computer scoring system, with mean differences ranging from 0.85 to 1.1 points out of 10 at the scenario level.2 A number of factors may have contributed to the discrepancies between the computer and the human scoring. As mentioned earlier, one major difference concerned judgments of indirect request utterances. Because indirect request formulas are more implicit, the scoring engine was likely to have difficulty identifying utterances that were not straightforwardly expressed. Another difference lies in language forms. Since not all instances of poor spelling or grammar were included in the database, the computer scoring system could not detect all typos. As a result, erroneous words or phrases tended to be treated as unintelligible, resulting in lower scores for the scenarios in which typos occurred. In addition, generic wording such as "personal business," "something came up," or "not feeling well," used instead of a specific reason for making the request, may have led to a lower rating because the system was likely to consider those cases invalid rationales. Human raters, in contrast, were able to make inferences in all these cases with little difficulty.

3.3. Analysis of Results

First we report on usability issues of the game and then discuss participants' choice of strategies and the ways their choices changed throughout the game.
[Figure 5 histogram: x-axis, Human Scoring (20.00 to 100.00); y-axis, Frequency (0 to 10); Mean = 74.87, Std. Dev. = 21.27, N = 30.]
Figure 5. Distribution of scores assigned by the human raters.
[Figure 6 histogram: x-axis, Computer Scoring (0.00 to 100.00); y-axis, Frequency (0 to 10); Mean = 66.17, Std. Dev. = 22.652, N = 30.]
Figure 6. Distribution of scores assigned by the computer.
Usability information. Most participants (92% or more across the two conditions) thought that the game motivated them to learn more about how to make requests in English. Approximately 83% of participants across the two conditions thought that persuading Dr. Brown helped them improve their knowledge of making requests, and 80% thought it was fun to persuade the professor. The initial tutorial page was reported to be helpful for understanding how to use the system (80% or more across the two conditions). Regarding the appearance of Dr. Brown, more than half of the participants in the agent-based IDCT condition (72%) liked how the professor looked. Approximately 70% of participants across the two conditions thought that the feedback provided by the game helped them think about what kind of request strategy to use in the next scenario, and 75% thought the interface/program was attractive. Eighty percent of agent-based IDCT participants and 86% of text-based IDCT participants agreed with the statement "The Persuasion Game is easy to use." Eighty-five percent of agent-based IDCT participants and 83% of text-based IDCT participants agreed with the statement "The final recommendation page helped me understand my strengths and weaknesses." In addition, some students' comments on the open-ended questionnaire indicated that they enjoyed interacting with the lifelike professor. One student mentioned, "The professor's moving eyes make me feel like I am talking to a real professor." Another reported that she "liked the animation of the professor's facial expressions." However, one student commented on the look of the professor, suggesting that the professor's face be replaced with a friendlier, less serious one. In general, participants seemed to enjoy interacting with both versions of the game. They valued the feedback they received during the game and thought that the game helped them learn to make requests in English.

Performance comparison. As indicated in section 3.2, the overall agreement between the computer and the human rater on the mean scores assigned to students in the two groups was moderate (r = .66). Based on the results of human scoring, there was no significant difference in final scores3 (t(28) = -0.41, p = 0.40) between students assigned to the experimental condition (M = 73.27, SD = 22.62) and those assigned to the control condition (M = 76.47, SD = 20.50).

Response length. The written discourse data revealed a significant difference in response length (t(28) = 1.106, p < .05) between the agent-based IDCT group (M =
679,4 SD = 337.3) and the text-based IDCT group (M = 563.3, SD = 224.3). It seems that the presence of the pedagogical agent helped create a more vivid conversational environment that allowed students to be more actively engaged in the hypothetical dialogues; participants in the agent-based group therefore generally produced longer utterances.

Selection of strategies. All participant responses were coded and analyzed based on the request coding categories (see Table 1) and scoring criteria (see Figure 2) developed for the present study. The linguistic expressions of request used by the participants were classified into different levels of directness: direct, conventionally indirect, and nonconventionally indirect strategies. Each level was further subcategorized into several types of request strategies, taking into account the presence of grounders (i.e., DR = direct request, DRG = direct request with grounder, CI = conventionally indirect, CIG = conventionally indirect with grounder, and NCI = nonconventionally indirect). Regarding the use of request strategies at different levels of directness, participants in both groups made most of their requests by means of conventionally indirect strategies with grounders, to increase the likelihood of the requests being accepted (see Table 3). In this request type (CIG), the participants seemed to use modal verbs to ask questions about permission or ability while, in fact, employing a formula for request behavior. Examples of conventionally indirect strategies with grounders from the agent-based IDCT group and the text-based IDCT group follow:

Example 1: Asking the professor to write a recommendation letter (agent-based IDCT group)
A12: Professor, you are the kindest professor I have ever met. I believe you know I study really hard in your class, so I also get high grade in your class. Now, I want to apply a scholarship, and I need a recommendation letter because you are always satisfied with I study attitude. Could you write the recommendation letter?

Example 2: Asking the professor for permission to take a writing course (text-based IDCT group)
B01: I really have to take a writing course in order to graduate this semester. After my graduation, I could get a job and support my family. Please, could I get your permission to take the writing course that is closed already?
Table 3
Proportion of Request Strategies Used by Two Groups

Request type: Agent-based IDCT group % (n); Text-based IDCT group % (n)
DR: 2.5 (5); 4.3 (7)
DRG: 8.4 (17); 5 (8)
CI: 5.9 (12); 7.5 (12)
CIG: 56.6 (115); 54.7 (88)
NCI: 26.6 (54); 28.5 (46)
Total no. of request strategies: 203; 161
Note. IDCT = interactive discourse completion task, DR = direct request, DRG = direct request with grounder, CI = conventionally indirect, CIG = conventionally indirect with grounder, NCI = nonconventionally indirect.

Over 25% of the requests in both groups were made with nonconventionally indirect request strategies. The most common type of nonconventionally indirect request strategy, the hint, may occur in the form of a statement or a question. An example of a question hint from the agent-based IDCT group and a statement hint from the text-based IDCT group follow:

Example 3: Asking the professor to schedule a make-up exam (agent-based IDCT group)
A05: Actually I wanna to join my brother's wedding. It's a big day in our family. I don't want absent. I am really sorry about that. I think it's a good way, isn't it?
Professor: Sorry, I have no idea what you're talking about. Could you get to the point?

Example 4: Asking the professor for a deadline extension for an assignment (text-based IDCT group)
B01: I have a big problem with my computer.
Professor: I understand your problem but I can't help you with that.
Participants made a small proportion of their requests (less than 9% each) using the three other types of strategies (i.e., DR, DRG, and CI). An example of each strategy is presented below:

Example 5 – Direct Request (DR) Strategy: Asking the professor to schedule a make-up exam
A13: I need to join my brother's wedding party.

Example 6 – Direct Request Strategy with Grounder (DRG): Asking for leave
A13: I can not go to school because my parents will go to see me few days.

Example 7 – Conventionally Indirect (CI) Strategy: Asking the professor for permission to take a writing course
B07: Can you open the course just for one?

Though the two groups show marked similarities with respect to their variation in request strategy selection, the total number of request strategies used by the agent-based IDCT group is approximately 1.3 times the number used by the text-based IDCT group. When examined closely, the data reveal that participants in the agent-based IDCT group more commonly selected DRG, CIG, and NCI strategies in making requests. The chief difference between these three types of strategies and the others is the use of grounders. There was a tendency for participants to incorporate grounders when interacting with the pedagogical agent. This might suggest that the presence of the animated agent increases the perceived threat to face, which could encourage participants to justify their reasons for making requests by using grounders. It was also found that several participants in the agent-based group were equivocal talkers (for the definition, see Table 2) who overused nonconventionally indirect strategies and were not specific about the exact request they were supposed to make. The tendency to make a request indirectly by providing a grounder was apparent in the responses of the students. This finding is consistent with reports of previous ILP studies on Chinese speakers' oral and written requests (Kirkpatrick, 1993; Nash, 1983; Zhang, 1995) and Korean speakers' semantic formulae for making requests (Byon, 2004).
Distributions of strategy transitions. To examine how participants switched from one strategy to another, we computed the probabilities of strategy transitions. Figure 7 and Figure 8 present the paths of strategy transitions over time. Only paths starting from alerters were analyzed, because interactions initiated by alerters accounted for about 95% of all interactions in both conditions, whereas interactions initiated without an alerter accounted for less than 5%. The left part of each figure shows the strategies used by participants on their first attempt, while the right part shows subsequent attempts. The retry states (alerter, NCI, or nonsense), which allow participants to try again, are presented in labeled ovals, whereas the granted states (DRG or CIG) and the rejected states (DR or CI) are presented in labeled rectangles and dashed rectangles, respectively. As indicated in section 2.3, because requests made using an NCI strategy or nonsense were not explicit, the computer system did not reject or grant such requests immediately. Instead, participants were allowed to try different strategies up to three times. Therefore, the subsequent-attempts sections of the two figures present the strategy paths employed by participants following their use of an NCI strategy or nonsense. Only four instances of nonsense were found in the agent-based IDCT condition and one in the text-based IDCT condition; the requests made using nonsense were not granted in either group. With respect to alerters, the percentage of alerters used consecutively is similar for the two groups. In addition to alerters, three kinds of paths are of interest: paths to rejected states, paths to granted states, and paths to retry states. Table 4 shows that the proportion of paths to rejected states in the agent-based IDCT group (6.6%) is smaller than that in the text-based IDCT group (9.4%). As for the links from the alerter to granted states, the agent-based IDCT group (64.7%) and the text-based IDCT group (65.5%) have similar overall percentages. Table 5 compares the percentages of paths made by the two groups on subsequent attempts. It indicates that the percentage of participants who used an NCI strategy on the first attempt and were ultimately rejected was about 2.5 times lower in the agent-based IDCT group (8%) than in the text-based IDCT group (20.5%). The results also showed that the agent-based IDCT group (54%) was higher than the text-based IDCT group (38.2%) with respect to the proportion of participants who employed an NCI strategy on the first attempt and ended up having their requests granted.
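The report does not detail how the transition percentages in Figures 7 and 8 and Tables 4 and 5 were computed. A plausible sketch, assuming each interaction is logged as an ordered sequence of states, is shown below; the function name and the example sequences are invented.

# Assumed sketch of the transition analysis: count how often one state is
# followed by another across logged interactions and convert counts to percentages.
from collections import Counter, defaultdict

def transition_percentages(sequences):
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {state: {nxt: 100.0 * n / sum(followers.values())
                    for nxt, n in followers.items()}
            for state, followers in counts.items()}

logs = [["alerter", "CIG"],          # granted on the first attempt
        ["alerter", "NCI", "CIG"],   # retried, then granted
        ["alerter", "NCI", "NCI"]]   # retried, still indirect
print(transition_percentages(logs)["alerter"])
# -> {'CIG': 33.3..., 'NCI': 66.6...}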
[Figure 7 diagram: transition percentages and counts from the alerter to the DR, DRG, NCI, CIG, CI, and nonsense states on the first attempt, and from the NCI and nonsense states to subsequent states.]
Figure 7. State transition diagram of request strategies used by students in the agent-based interactive discourse completion task (IDCT) group (n). The left part shows the strategies used by participants on their first attempt; the right part shows subsequent attempts. The retry states (alerter, NCI, or nonsense), which allow participants to try again, are presented in labeled ovals, the granted states (DRG or CIG) are presented in labeled rectangles, and the rejected states (DR or CI) are presented in dashed rectangles. Note. CI = conventionally indirect, CIG = conventionally indirect with grounder, DR = direct request, DRG = direct request with grounder, NCI = nonconventionally indirect.
[Figure 8 diagram: transition percentages and counts from the alerter to the DR, DRG, NCI, CIG, CI, and nonsense states on the first attempt, and from the NCI and nonsense states to subsequent states.]
Figure 8. State transition diagram of request strategies used by students in the text-based interactive discourse completion task (IDCT) group (n). The left part shows the strategies used by participants on their first attempt; the right part shows subsequent attempts. The retry states (alerter, NCI, or nonsense), which allow participants to try again, are presented in labeled ovals, the granted states (DRG or CIG) are presented in labeled rectangles, and the rejected states (DR or CI) are presented in dashed rectangles. Note. CI = conventionally indirect, CIG = conventionally indirect with grounder, DR = direct request, DRG = direct request with grounder, NCI = nonconventionally indirect.
Table 4
Transition Analysis (First Attempt – Alerter)

Types of links: Agent-based IDCT % (n); Text-based IDCT % (n)
Rejected: 6.6% (10); 9.4% (11)
Granted: 64.7% (94); 65.5% (76)
Retry: 23.3% (34); 19.9% (23)

Note. IDCT = interactive discourse completion task.
Table 5
Transition Analysis (Subsequent Attempts – Nonconventionally Indirect [NCI])

Types of links: Agent-based IDCT % (n); Text-based IDCT % (n)
Rejected: 8% (4); 20.5% (7)
Granted: 54% (27); 38.2% (13)
Retry: 38% (19); 41.2% (14)

Note. IDCT = interactive discourse completion task.
We cannot draw conclusions about the effect of the pedagogical agent on student performance at the first attempt, because the percentages of paths that led to granted, rejected, and retry states were similar for both groups. However, the fact that a higher percentage of request strategies were granted in the agent-based IDCT group than in the text-based IDCT group on subsequent attempts provides evidence that the agent-based IDCT condition is better than the text-based IDCT condition in terms of guiding and supporting the use of more appropriate strategies (i.e., increasing granted paths and reducing rejected ones). Based on the comments provided by participants (see the usability results in section 3.3), it seems that the presence of the pedagogical agent helps create a conversational environment reflective of real-life situations. The simulated setting allows students to perceive peripheral cues (i.e., aural feedback and facial expressions) as meaningful and thus to modify or refine their request strategies to improve appropriacy based on the given information.
4. Discussion and Future Work

The results of the study offer encouraging evidence that A Game of Persuasion provides estimates of learners' pragmatic competence and individualized lifelike interaction, as well as formative and summative feedback. In an effort to expand ILP research, this game contributes to the field of interlanguage pragmatics by engaging learners with "contextualized, pragmatically appropriate input" (Bardovi-Harlig, 2001, p. 31). Following Bachman and Palmer (1996), pragmatic knowledge is considered an important area of language knowledge that should be a focus in order to gain insight into a fuller range of the components of language ability. Drawing on real-world settings and situations, the authentic and pragmatic nature of the game may allow it to serve as an alternative formative assessment tool to elicit language learners' knowledge of the conventions that determine the appropriate use of expressions. Moreover, while the pragmatic knowledge of ELLs is currently not measured by the TOEFL or the Test of English for International Communication™ (TOEIC®), these kinds of assessment-based tools can be used to implement interactive dialogues in controlled contexts that would help people practice traditional language skills (e.g., listening, reading, or writing). The study also contributes to the understanding of the negotiation process of ELLs with a pedagogical agent in an educational gaming environment. The information obtained from the usability survey suggests that A Game of Persuasion has potential as a learning tool. The program motivates students to engage in a constructive learning process, allows them to recognize the outcomes of applying different request strategies in different scenarios, and ultimately enhances the intake of the preferred forms of request. The current game can be enhanced by adding natural language processing tools (e.g., a spelling engine and a synonyms database) that complement the current keyword-based scoring engine. These tools would make the system more robust. Future work includes the integration of c-rater™ components as part of the system. A further limitation is that written dialogue data present information on the sentences students use to make requests, while intonation and tone are not conveyed in writing. The ability to use appropriate prosody for making requests is considered an integral part of one's pragmatic knowledge. In addition, written responses may reflect what students think they will do or ideally want to do, which may differ from what they do in real-life situations.
Although the difference in the use of request strategies was not significant, more research needs to be conducted in order to better understand how interlanguage features change when different pedagogical agents are incorporated in different situations. This paper reports on an initial exploratory study with a limited sample size. Given the limitations mentioned above, we should not conclude that the agent-based IDCT is ineffective. On the contrary, when we look at the responses of individual students, we see wide variation in receptivity to the pedagogical agent. It is worth examining these individual results more closely in the hope of learning more about what works for whom and why. Insights from individual successes, and also from failures, may suggest avenues for further research. Future development tasks may also include comparing the pragmatic performance of ELLs with varying language and cultural backgrounds in several situations and settings, carrying out randomized controlled studies with a larger population of students, enhancing the dialogue engine to support a wider range of response utterances (e.g., different mitigating supportive moves), and improving the gestures and emotional reactions of the pedagogical agent to increase the interactivity of the program. Further research using a tighter experimental design is clearly warranted to explore the interactions between different context variables and receptivity to different types of pedagogical agents.
References

Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford, England: Oxford University Press.
Bardovi-Harlig, K. (2001). Evaluating the empirical evidence: Grounds for instruction in pragmatics? In K. R. Rose & G. Kasper (Eds.), Pragmatics in language teaching (pp. 13–32). Cambridge, England: Cambridge University Press.
Bardovi-Harlig, K., Hartford, B., Mahan-Taylor, R., Morgan, M., & Reynolds, D. (1991). Developing pragmatic awareness: Closing the conversation. ELT Journal, 45, 4–15.
Baylor, A. L., & PALS Research Group. (2003, July). The impact of three pedagogical agent roles. Paper presented at the second international joint conference on autonomous agents and multiagent systems, Melbourne, Australia.
Beebe, L. M., & Cummings, M. C. (1996). Natural speech act data versus written questionnaire data: How data collection method affects speech act performance. In S. M. Gass & J. Neu (Eds.), Speech acts across cultures: Challenges to communication in a second language (pp. 65–86). Berlin, Germany: Mouton de Gruyter.
Blum-Kulka, S., House, J., & Kasper, G. (1989). Investigating cross-cultural pragmatics: An introductory overview. In S. Blum-Kulka, J. House, & G. Kasper (Eds.), Cross-cultural pragmatics. Norwood, NJ: Ablex Publishing.
Blum-Kulka, S., & Olshtain, E. (1986). Too many words: Length of utterance and pragmatic failure. Studies in Second Language Acquisition, 8, 165–180.
Brown, J. D. (2001). Pragmatics tests. In K. R. Rose & G. Kasper (Eds.), Pragmatics in language teaching (pp. 301–325). Cambridge, England: Cambridge University Press.
Brown, P., & Levinson, S. (1987). Politeness: Some universals in language usage. Cambridge, England: Cambridge University Press.
Byon, A. S. (2004). Sociopragmatic analysis of Korean requests: Pedagogical settings. Journal of Pragmatics, 36(9), 1673–1704.
Chan, T. W., & Baskin, A. B. (1990). Learning companion systems. In C. Frasson & G. Gauthier (Eds.), Intelligent tutoring systems: At the crossroads of artificial intelligence and education (pp. 6–33). Norwood, NJ: Ablex Publishing.
Conati, C., & Zhao, X. (2004, January). Building and evaluating an intelligent pedagogical agent to improve the effectiveness of an educational game. Paper presented at the international conference on intelligent user interfaces, Funchal, Madeira, Portugal.
Faerch, C., & Kasper, G. (1989). Internal and external modification in interlanguage request realization. In S. Blum-Kulka, J. House, & G. Kasper (Eds.), Cross-cultural pragmatics: Requests and apologies (pp. 221–247). Norwood, NJ: Ablex Publishing.
Graesser, A. C., Person, N., Harter, D., & TRG. (2001). Teaching tactics and dialog in AutoTutor. International Journal of Artificial Intelligence in Education, 12, 257–279.
Grant, L., & Starks, D. (2001). Screening appropriate teaching materials: Closing from textbooks and television soap operas. International Review of Applied Linguistics in Language Teaching, 39, 39–50.
Harless, W. G., Zier, M. A., & Duncan, R. C. (1999). Virtual dialogues with native speakers: The evaluation of an interactive multimedia method. CALICO Journal, 16(3), 313–337.
Hassall, T. (2001). Modifying requests in a second language. International Review of Applied Linguistics, 39, 259–283.
Holtgraves, T. M. (1986). Language structure in social interaction: Perceptions of direct and indirect speech acts and interactants who use them. Journal of Personality and Social Psychology, 51, 305–314.
House, J., & Kasper, G. (1987). Interlanguage pragmatics: Requesting in a foreign language. In W. Lorscher & R. Schulze (Eds.), Perspective on language and performance. Tubingen, Germany: Narr.
Ishihara, N. (2007). Web-based curriculum for pragmatics instruction in Japanese as a foreign language: An explicit awareness-raising approach. Language Awareness, 16(1), 21–40.
Johnson, W. L., Rickel, J. W., & Lester, J. C. (2000). Animated pedagogical agents: Face-to-face interaction in interactive learning environments. International Journal of Artificial Intelligence in Education, 11, 47–78.
Kasper, G. (2000). Data collection in pragmatics research. In H. Spencer-Oatey (Ed.), Culturally speaking (pp. 316–341). New York: Continuum.
Kasper, G., & Schmidt, R. (1996). Developmental issues in interlanguage pragmatics. Studies in Second Language Acquisition, 18, 149–169.
Kirkpatrick, A. (1993). Information sequencing in modern standard Chinese. Australian Review of Applied Linguistics, 16, 27–60.
Kuha, M. (1999). The influence of interaction and instructions on speech act data. Unpublished doctoral dissertation, Indiana University, Bloomington.
Leech, G. (1983). Principles of pragmatics. London: Longman.
Lester, J. C., Stone, B. A., & Stelling, G. D. (1999). Lifelike pedagogical agents for mixed-initiative problem solving in constructivist learning environments. User Modeling and User-Adapted Interaction, 1, 1–43.
McCarthy, M., & Carter, R. (1995). Spoken grammar: What is it and how do we teach it? ELT Journal, 49(3), 207–218.
McNamara, T., & Roever, C. (2006). Language testing: The social dimension. Malden, MA: Blackwell Publishing.
Mote, N., Johnson, L., Sethy, A., Silva, J., & Narayanan, S. (2004, June). Tactical language detection and modeling of learner speech errors: The case of Arabic tactical language training for American English speakers. Paper presented at the 2004 symposium on computer assisted learning, Venice, Italy.
Naiditch, F. (2006). The pragmatics of permission: A study of Brazilian ESL learners. Unpublished doctoral dissertation, New York University.
Nash, T. (1983). An instance of American and Chinese politeness strategy. RELC Journal, 14, 87–98.
Rose, K. R. (2000). An exploratory cross-sectional study of interlanguage pragmatic development. Studies in Second Language Acquisition, 22(1), 27–67.
Shaw, E., Ganeshan, R., Johnson, W. L., & Millar, D. (1999, July). Building a case for agent-assisted learning as a catalyst for curriculum reform in medical education. Paper presented at the ninth international conference on artificial intelligence in education, Marseilles, France.
Shaw, E., Johnson, W. L., & Ganeshan, R. (1999, May). Pedagogical agents on the web. Paper presented at the third international conference on autonomous agents, Seattle, Washington.
Vellenga, H. (2004). Learning pragmatics from ESL & EFL textbooks: How likely? Teaching English as a Second or Foreign Language, 8. Retrieved July 20, 2007, from http://www-writing.berkeley.edu/TESL-EJ/ej30/a3.html
Vine, B. (2004). Getting things done at work: The discourse of power in workplace interaction. Philadelphia: John Benjamins Publishing.
Wang, N., Johnson, W. L., Rizzo, P., Shaw, E., & Mayer, R. E. (2005, January). Experimental evaluation of polite interaction tactics for pedagogical agents. Paper presented at the international conference on intelligent user interfaces, San Diego, CA.
Wolfson, N., Marmor, T., & Jones, S. (1989). Problems in the comparison of speech acts across cultures. In S. Blum-Kulka, J. House, & G. Kasper (Eds.), Cross-cultural pragmatics: Requests and apologies (pp. 174–196). Norwood, NJ: Ablex Publishing.
Zapata-Rivera, D., Vanwinkle, W., Shute, V., Underwood, J. S., & Bauer, M. (2008). English ABLE. In R. Luckin, K. Koedinger, & J. Greer (Eds.), Artificial intelligence in education: Building technology rich learning contexts that work (Vol. 158, pp. 323–330). Amsterdam: IOS Press.
Zhang, Y. (1995). Indirectness in Chinese requesting. In G. Kasper (Ed.), Pragmatics of Chinese as native and target language (pp. 69–118). Honolulu, HI: University of Hawaii Press.
Notes

1. Face refers to a person's self-image in response to other people's communication behaviors.
2. The total score for each scenario is 10 points.
3. Total final scores can range from 0 to 100.
4. Number of words in total, including spaces.