

GODFROID & SCHMIDTKE

© Creative Commons Attribution-NonCommercial-ShareAlike 3.0 draft September 12, 2013 4:02 PM


12

What Do Eye Movements Tell Us About Awareness? A Triangulation of Eye-Movement Data, Verbal Reports, and Vocabulary Learning Scores

Aline Godfroid
Jens Schmidtke
Michigan State University

Common wisdom suggests that paying attention is an effective way to acquire new information. In the area of second language acquisition (SLA), Schmidt argued that attention facilitates learning because it leads to noticing, which he defined as the conscious registration of some surface element of language (Schmidt, 1995, 2012). This study triangulates distinct measures of attention and awareness, namely eye-movement recordings and verbal reports, to elucidate the differential contributions of these two mechanisms to receptive vocabulary learning. Advanced EFL learners read 20 English paragraphs embedded with 12 novel pseudowords for meaning, while an eye-tracker recorded their eye movements. Participants’ ability to recognize the pseudowords in context was tested on a surprise posttest. After that, each participant took part in a post-task interview that measured her conscious recollection of reading each of the 12 target words. Results showed that both a participant’s total fixation time on the pseudoword and her recollection of reading the word predicted word recognition. Furthermore, words for which participants reported autonoetic awareness (i.e., retrieval of an episodic memory) were fixated significantly longer than words with reported noetic awareness (i.e., a sense of familiarity) or no awareness. When both fixation times and awareness levels were entered into a single regression model, the awareness codings sufficed to predict word recognition scores. These findings suggest that attention (looking at a word) induced awareness (encoding the what, where, or when of a processing episode), which was itself a strong predictor of vocabulary learning.

The noticing hypothesis (Schmidt, 1990, 1994, 1995, 2001, 2012) has been highly influential in the second language acquisition (SLA) research of the past 20 years as evidenced, for

Godfroid, A., & Schmidtke, J. (2013). What do eye movements tell us about awareness? A triangulation of eye-movement data, verbal reports, and vocabulary learning scores. In J. M. Bergsleithner, S. N. Frota, & J. K. Yoshioka (Eds.), Noticing and second language acquisition: Studies in honor of Richard Schmidt (pp. 183–205). Honolulu: University of Hawai‘i, National Foreign Language Resource Center.


example, by over 2,000 citations of the 1990 paper on Google Scholar. At a general level, it has fuelled SLA researchers’ interest in cognitive processes in SLA (most notably, attention and awareness) and has incited part of the SLA community to delve into the psychological literature on these topics. The influence of the noticing hypothesis also shows in scholars’ sustained efforts to refine its theoretical foundations (Godfroid, Boers, & Housen, in press; Godfroid, Housen, & Boers, 2010; Truscott & Sharwood Smith, 2011). This study continues the latter line of research.

The process of noticing, which is hypothesized to be necessary for adult second-language (L2) learning (Schmidt, 1990), comprises two psychological mechanisms: focal attention and a low level of awareness (e.g., Robinson, 1995, 2003; Schmidt, 1995, 2001). Many SLA researchers have investigated noticing by focusing on a single constituent process, either awareness or attention. We combined distinct measures of attention and awareness in a mixed-method approach to test the extent to which these two measures coincide as predictors of learning and, thereby, provide indirect evidence for the existence of a mediating construct, noticing. Our results suggest that attention and awareness may be closely related but that, due to the different nature of the data collected to measure each mechanism, a mixed-method approach will afford a richer perspective on L2 learners’ cognitive processes than any single measure could.

Noticing as attention and awareness

The noticing hypothesis, put in its simplest form, states that learning can only take place if new linguistic structures are noticed in the input, whereby noticing is defined as “the conscious registration of attended specific instances of language” (Schmidt, 2012, p. 32) or “the conscious registration of the occurrence of some event” (Schmidt, 1995, p. 29). In its strongest form, the hypothesis claims that noticing is a “necessary and sufficient condition for the conversion of input to intake” (Schmidt, 1990, p. 129). Schmidt’s original hypothesis was a reaction against Krashen’s (1981, 1983, 1985) ideas about subconscious language learning driven by simple exposure to comprehensible language input. Thus, Schmidt’s hypothesis must be viewed as an attempt to underscore the importance of focused attention to linguistic form in adult SLA (see also Truscott & Sharwood Smith, 2011, p. 502).

Central to the noticing hypothesis are the psychological constructs of attention and awareness. Learners have to attend to features in the input to notice them, and, according to Schmidt, attending to features is virtually the same as being aware of them: “A low level of awareness, called here ‘noticing,’ is nearly isomorphic with attention and seems to be associated with all learning” (Schmidt, 1995, p. 1) and “focal attention and awareness are essentially isomorphic” (Schmidt, 1995, p. 20). As the review below indicates, second-language researchers appear to subscribe to this view, as, to the best of our knowledge, no SLA study to date has attempted to disentangle attention and awareness empirically or conceptually.

Operationalization of the constructs of attention and awareness in SLA

Attention in SLA

The role of attention in learning a second language had captivated SLA researchers years before Schmidt’s first publication of the noticing hypothesis. Sharwood Smith (1981, 1991), for example, discussed input enhancement (originally dubbed consciousness raising; Sharwood Smith, 1981¹), a technique that consists of highlighting (e.g., bolding or underlining) certain words or structures in a text that the teacher or researcher wants learners to focus on. The function of input enhancement is to make the chosen target features more salient to learners in hopes that learners will notice them. In this sense, input enhancement is primarily a manipulation of stimulus-driven or bottom-up attention (Corbetta & Shulman, 2002) as opposed to top-down allocation of attention initiated by the learner. It is, of course, not guaranteed that this manipulation will draw the learner’s attention to the stimulus (Sharwood Smith, 1991, pp. 120–122) or that the stimulus-driven attention, if any, will be complemented or enhanced by a learner’s top-down attention. Nevertheless, the idea of input enhancement has been taken up in many SLA studies yielding mixed results (see Han, Park, & Combs, 2008; Lee & Huang, 2008, for meta-analyses).

1 In the later publication, Sharwood Smith (1991) preferred the term input enhancement to avoid the terminological and conceptual issues surrounding the term consciousness.

One study that looked at the effect of enhanced vs. unenhanced input is Gass, Svetics, and Lemelin (2003). The authors argued that enhanced input, operationalized in this study by underlining target structures, can be beneficial for learning, because highlighting certain language features may help language learners focus their limited-capacity attention on relevant aspects of the text. Thirty-four college students of Italian were randomly assigned to either a [+focused attention] or a [–focused attention] condition. The former group made significantly more progress from pre- to post-test than the latter, revealing a positive effect of enhanced over unenhanced input. Gass et al. (2003) concluded that “focused attention does seem to be a powerful mechanism for learning” (p. 526). At the same time, though, the authors noted that “there is no claim that [their] experimental procedure unequivocally results in attention’s being focused in one condition and not in the other” (p. 508) and recommended adding a stimulated recall procedure for future research.
Thus an inherent limitation of this kind of research is that inferences about learners’ attention can only be made indirectly by comparing pretest to posttest scores. An important step forward in this respect was Winke (2013), who measured attention online by means of eye-tracking technology. The author tested 55 intermediate ESL learners on their correct use of passive constructions before and after reading a text flooded with sentences in the passive voice. Half of the participants read the text with textual enhancements of the passives (underlining and coloring) and the other half read it without any enhancements. Winke’s eye-movement data revealed that the participants in the [+enhancement] condition looked at the passives longer than those in the [–enhancement] condition did. However, participants in both groups showed only small improvements on a passive form correction test, and their gain scores did not differ significantly from each other. Thus it appears that “noticing” the colored and underlined passive forms did not contribute to form learning in this study, perhaps because the learners did not go beyond the registration of the physical appearance of the passive verbs. More generally, eye tracking can only tell us what participants looked at but not what their internal thought processes were—for instance, whether they really noticed the passive constructions or only their textual layout. Therefore, as we argue in this chapter, combining a measure of attention (eye tracking) with a measure of awareness might provide us with a more complete picture of learners’ cognitive processes in SLA. In the next section we review two ways in which awareness has been operationalized and measured in SLA research.

Awareness in SLA

Studies on awareness in SLA have made extensive use of verbal reports, either concurrent or retrospective. This approach capitalizes on the fact that reportability is a key property of awareness (Baars, 2003; Weiskrantz, 1997) or, in Schmidt’s (2001) words, the assumption that “nothing can be verbally reported other than the current contents of awareness” (p. 20).


Concurrent verbal reports are known colloquially as think-aloud protocols. They consist of a participant saying out loud, in real time, the thoughts that he or she has while carrying out a particular task. Think-aloud protocols gained momentum in the field of SLA following Leow’s (1997, 2000) influential studies. Leow (1997, 2000) added a think-aloud procedure to a traditional pretest, treatment, posttest design. The treatment was designed to induce learners’ awareness of certain target forms and thereby trigger noticing, which was operationalized as “making a verbal or written correction to the targeted form” (Leow, 1997, p. 474). After coding the verbal protocols for awareness, Leow found a positive association between levels of awareness (Leow, 1997) or between unawareness vs. awareness (Leow, 2000) and learning gains observed on immediate and delayed posttests. This led Leow to conclude that awareness is facilitative of SLA (Leow, 1997) or even necessary (Leow, 2000). An exclusive reliance on think-aloud protocols, however, may not warrant the strong conclusion drawn by Leow (2000), given that a lack of verbal report need not imply a lack of awareness. Because participants do not verbalize everything they are aware of (Allport, 1988; Jourdenais, 2001; Robinson, Mackey, Gass, & Schmidt, 2012; Rosa & O’Neill, 1999; Sachs & Polio, 2007), a number of participants who are reportedly unaware may, in fact, have a low-level awareness of the target form that they fail to render in their verbal protocols. The extent to which verbal protocols yield an exhaustive record of learners’ conscious experiences is known as their completeness (Ericsson & Simon, 1993, chapter 3). 
Research on completeness has been overshadowed thus far by research on reactivity (i.e., the question of whether concurrent verbalizations change one’s thought processes; Bowles, 2010; Fox, Ericsson, & Best, 2011); nonetheless, the completeness issue seems just as important, especially if evidence is to be inferred from the absence of a given piece of information from a verbal report. The question of whether learning a second language, or certain second-language features, is possible without awareness remains a debated issue in the field of SLA (e.g., Truscott, 1998; Truscott & Sharwood Smith, 2011). Williams (2004, 2005) reported data from two studies showing that learning without awareness might be possible. In an improved version of his 2004 study design, Williams (2005) presented participants with four novel articles combined with existing English words. Participants were told that the articles indicated the distance of the referent to the subject of the sentence: gi and ro for ‘near’ and ul and ne for ‘far’. Participants were not told, however, that within each distance category, the choice of article was determined by the animacy of the accompanying noun (gi and ul for inanimates, ro and ne for animates). During a learning phase, participants were asked to make judgments about an object’s distance from the subject based on the article. Later, participants were given new and old, animate and inanimate nouns along with two article choices, the one for animate and the one for inanimate objects. Results showed that participants were able to choose the correct article above chance even when they did not report awareness of the underlying animacy rule during a post-task interview. An extension of Williams (2005), Hama and Leow (2010) did not replicate Williams’s findings of unaware (i.e., implicit) learning (see also Faretta-Stutenberg & Morgan-Short, 2011). 
An important part of Hama and Leow’s (2010) argument centered on the time when participants’ awareness of the underlying rule is measured best: online, at the “stage of encoding the incoming information” (p. 466) or off-line, at the “stage of retrieval of stored knowledge of the construct” (ibid.). Hama and Leow (2010) added think-alouds (as a measure of awareness at the time of encoding) to Williams’s original exit questionnaire (a measure of awareness at the time of retrieval). They found a small number of mismatches, namely three, where a participant made reference to animacy at one time of measurement but not the other. While Hama and Leow’s triangulation of two types of verbal report corroborates the


construct validity of either measure as an index of awareness, it does not provide compelling evidence for the superiority of think-alouds over retrospective reports (see also Leung & Williams, 2011, p. 37; Leung & Williams, 2012, p. 638). In particular, unlike Williams’s (2005) participants (who had performed the training task in silence), the unaware, verbalizing participants in Hama and Leow showed no significant memory for article–noun combinations, even for “old” items that they had encountered six times during the training phase. This raises the question of whether the think-aloud requirement prevented participants from rehearsing the article–noun pairs subvocally and thereby interfered with their encoding of the processed information in a more durable memory format (e.g., Baddeley, 2007; Craik & Lockhart, 1972; Craik & Tulving, 1975; Gathercole, 2006).

In another extension of Williams’s (2004, 2005) research, Leung and Williams (2011, 2012) did report further evidence for unaware learning. The major methodological improvement of these studies, compared to Williams (2004, 2005), was that the learning of the target rule was no longer evaluated by means of posttests, but online, from participants’ reaction times (RTs) to increasing numbers of article–noun exemplars. Participants were once again introduced to a set of four novel articles, governed by two semantic dimensions (e.g., distance and animacy), with only one dimension being made explicit to them. They saw two pictures on a screen and heard a sentence containing a novel article plus a noun that referred to one of the entities on the screen. Their task was to respond to the named entity as fast as possible. Learning was measured “by reaction time slowdowns in a block of trials in which some regularity [was] violated” (Leung & Williams, 2012, p. 654), in particular where the mapping between the article and the “hidden” semantic property was reversed (e.g., the word bull appeared with the article for inanimate objects). Leung and Williams (2012, Experiment 1) indeed observed slower responses on violation trials for participants who later reported being unaware of the animacy rule. They took this as evidence for implicit learning. In addition, the authors compared RTs of aware and unaware participants, as established by post-experiment interviews. Only the aware participants who could report the correct form-meaning mapping showed evidence in their RT behavior of correctly anticipating the target object before they actually heard it (very fast button presses). Thus, the RTs served as a cross-validation of participants’ verbal reports.

When comparing Williams (2005) and Leung and Williams (2011, 2012) to Hama and Leow (2010), there seems to be a trade-off between memory decay in the case of retrospective verbal reports and the additional task demands of concurrent verbal reports. Furthermore, either type of verbal report is prone to be incomplete (see above). We believe that triangulating some type of verbal report with eye-movement data may reduce potential omission issues because, unlike verbalizations, participants’ eye movements continue throughout processing and are recorded without interruption. In this sense, eye tracking provides a complete record of participants’ cognitive processing. Unfortunately, eye tracking and think-aloud protocols can only be combined in a between-subjects design, because the requirement to voice one’s thoughts increases time on task (Bowles, 2010) and will thus distort the durations of any eye movements made in the process. In this study, we found it important to collect attention and awareness data for the same participants (i.e., within subjects) and therefore we used retrospective, rather than concurrent, verbal reports.

Attention and awareness in vocabulary learning

The studies reviewed above all deal with the acquisition of some aspect of grammar: either verb morphology (Leow, 1997, 2000; see also Godfroid & Uggen, 2013) or form-meaning connections in articles (Faretta-Stutenberg & Morgan-Short, 2011; Hama & Leow, 2010; Leung & Williams, 2011, 2012; Williams, 2004, 2005). One might argue that grammar


learning involves the learning of abstract rules that cannot possibly be noticed in the input (Truscott, 1998) and that vocabulary learning, the focus of the current study, is very different because it does not involve rules. However, from a usage-based perspective, all learning is initially item based. Linguistic rules are not innate but emerge gradually as abstract knowledge schemata over the piecemeal accumulation of instances in memory (Abbot-Smith & Tomasello, 2006; Ellis, 2002a; Goldberg, 2003). Individual items or exemplars have to be noticed in Schmidt’s (2012) sense (cf. Ellis, 2002a, 2002b) whereas it is usually assumed that the abstraction of regularities can occur without the individual being aware of this process (Williams, 2005). However, an individual can become aware of an underlying linguistic regularity, in which case this knowledge would equal awareness at the level of understanding (Schmidt, 1990, 1995, 2012). In Leow (1997, 2000), for example, participants did not have to generalize the target structure to previously unseen verbs but were only tested on the verb forms that were in the crossword puzzle. In contrast, Williams (2004, 2005) also tested new article–noun combinations, for which participants arguably had to have abstracted the animacy rule to perform above chance. Some participants showed awareness at the level of understanding—those who could report the animacy rule—and others awareness at the level of noticing. Clearly attention to the form and context of a word is necessary to make form-meaning connections, but the question is whether this is a conscious process. Ellis (1994) maintains that vocabulary acquisition involves both conscious and unconscious processes: “Recognition and production aspects of vocabulary learning rely on unconscious processes, whereas meaning and mediational aspects of vocabulary heavily involve explicit, conscious learning processes” (p. 39). 
In this view, reading a word may leave a memory trace of the orthographic representation of that word without any conscious processes, but inferring its meaning is a conscious process. An alternative view is that learning new words might involve an unconscious statistical learning mechanism where the meaning of a word is abstracted over multiple encounters with the word from the contexts where the word occurred. For example, Smith and Yu (2008) showed how infants as young as 12 months can infer the referents of novel words presented in ambiguous contexts over multiple trials. In this study children saw two novel objects, A and B, and heard two novel words, for example, bossa and gasser. On another trial, children saw objects B and C and heard gasser and manu. Thus the probability of gasser referring to object B was higher than that of gasser referring to objects A or C. After a learning phase of less than four minutes involving six novel objects and words, children were shown two objects and heard a word associated with one of the objects while their eye gazes were recorded. They looked at the “right” object significantly longer than at the distractor, thus showing that they were able to learn the associations between the words and their referents over multiple trials despite their ambiguity. Although Smith and Yu (2008) did not address the issue of consciousness in their article, it seems reasonable to assume that 12-month-olds are not yet conscious of their learning in the sense of explicit learning.2 Further evidence for the assumption that word learning can occur implicitly comes from studies investigating people with amnesia. Williams (2005, p. 274) cites three amnesia studies presenting evidence of implicit vocabulary learning in people that had impaired explicit memory. Thus it seems that consciousness, or awareness, might not be necessary for learning

2 Yu and Smith (2007) demonstrated that adults were also able to learn via cross-situational statistics, although with the difference that participants were instructed to learn the words and their referents, which likely led to more conscious processing.


new words, not when it comes to the recognition of a word and possibly not when it comes to mapping form and meaning either.

In addition, there might be alternative ways to characterize awareness in vocabulary learning besides the distinction between noticing and understanding. Tulving (1983, 2002), among others, proposed the existence of two different explicit memory systems, semantic memory and episodic memory. Encoding in or retrieval from semantic memory (i.e., memory for facts) involves noetic consciousness, which is the state of knowing (consciously) that something happened. On the other hand, access to episodic memory is hypothesized to rely on autonoetic consciousness, which is the conscious recollection of the what, where, and when of past experiences.³ To illustrate the reality of this psychological construct, Tulving (2002) gives the example of K. C., a patient with amnesia. After an accident at age 30, his semantic memory for events prior to the accident remained largely intact but his episodic memory was severely impaired. He could still remember detailed facts such as that his parents had a summer cottage and where it was on a map, but he did not have any memory of personal experiences. The difference between noetic and autonoetic awareness can also be seen in classic memory experiments in which participants are asked to remember lists of words (for a review see Yonelinas, 2002). Participants might remember that a certain word was on the list (noetic consciousness or familiarity) or they might remember the experience of reading the word on the list (autonoetic consciousness or recollection). This is known as the remember-know paradigm (Tulving, 1985). As we will discuss later, we found some evidence for this distinction with respect to incidental vocabulary acquisition in the data we present here.

3 According to Tulving (2002), episodic memory is what allows us to travel back in time in our minds and re-experience past events.

Operationalization of attention and awareness in the present study

Attention

We operationalized attention in the present study as the time a participant spent fixating a novel word, when other word properties (e.g., length, predictability, part of speech) were held constant. The relationship between eye gaze data and attention is well established in reading research (e.g., Rayner, 2009). More recently, attempts have been made to relate the noticing construct to eye gaze during reading (Godfroid et al., 2010; Godfroid et al., in press; Godfroid & Uggen, 2013; Smith, 2010; Winke, 2013).

Although they did not frame their research in terms of attention, two studies measuring reading times have reported an association between eye fixation durations and vocabulary learning. Williams and Morris (2004) did an eye-tracking study on the effects of word familiarity on word-based reading times. Their second experiment included one novel-word condition in which the participants, English native speakers, read sentences that contained an English-like pseudoword followed by a highly constraining context: for example, Jim said the BOSER was killed for its fur (capitalization ours). After the reading task, an unannounced vocabulary posttest assessed the participants’ receptive knowledge of the novel word meanings by means of a two-option synonym test. Williams and Morris (2004) found that readers tended to spend less initial processing time (shorter gaze duration) on novel words whose meaning they later identified correctly but more time rereading them (as indexed by second pass time).

Brusnighan and Folk (2012) examined the roles of contextual and morphological information in native English speakers’ compound processing. Their second experiment


was a self-paced reading task in which participants read novel compounds embedded in informative sentence frames. Similar to Williams and Morris’s (2004) study, participants subsequently took a receptive knowledge test on the novel word meanings with two answer choices per item. Mean accuracy was over 90%, suggesting the participants may have become aware of the presence of the novel words in the sentences and, hence, of the purpose of the experiment. (They were informed at the beginning that there would be two parts to the experiment, only not what the second part would be.) Nonetheless, Brusnighan and Folk (2012) demonstrated that sentences whose targets were correctly defined on the posttest were read significantly more slowly than sentences with target words whose meaning later was not identified correctly.

Awareness

We originally operationalized awareness in the present study as a participant’s ability to remember reading a particular novel word as reported in a post-task interview. In the interview, which followed the vocabulary posttest, the first author showed the items from the posttest one by one and asked the participant if she remembered what she had answered and whether she remembered reading the word. A trial was initially coded as [+aware] if a participant indicated that she remembered reading a word and as [–aware] in all other cases. This procedure resulted in 75% agreement between two raters, the second author and a research assistant. We noticed, however, that many of the disputed cases were trials for which a participant indicated that she had seen the word before but was unable to remember the context in which the word had appeared. This finding reminded us of the literature on episodic memory (see the discussion of Tulving, 2002, in the previous section) and, in particular, of the distinction between remembering and knowing. As a result, we opted for a more detailed coding scheme of the verbal reports that comprised three categories:

1. The participant does not consciously remember the target word, which results in a coding of [–awareness].⁴ In this case the participant might attribute a correct choice on the posttest to intuition, feel, or guessing.

2. The participant remembers that the word was somewhere in the texts she read (sense of familiarity or noetic awareness).

3. The participant remembers reading the word in a particular sentence (recollection or autonoetic awareness).
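As a rough illustration only, the three-category scheme and the simple percent agreement between two raters could be sketched in Python as follows. The names `Awareness`, `code_report`, and `percent_agreement` are hypothetical, and reducing an interview answer to two yes/no judgments is a simplifying assumption; in the study itself the codes were assigned by human raters reading the interview transcripts.

```python
from enum import Enum
from typing import Sequence


class Awareness(Enum):
    """Three coding categories (labels follow the text; the class is illustrative)."""
    NONE = "no awareness"        # no verbal report of remembering the word
    NOETIC = "noetic"            # word felt familiar, context not recalled
    AUTONOETIC = "autonoetic"    # word recalled in a particular sentence


def code_report(remembers_word: bool, remembers_sentence: bool) -> Awareness:
    """Map two simplified yes/no judgments onto one coding category.

    Assumption: recalling the sentence implies recalling the word,
    so the sentence judgment is checked first.
    """
    if remembers_sentence:
        return Awareness.AUTONOETIC
    if remembers_word:
        return Awareness.NOETIC
    return Awareness.NONE


def percent_agreement(rater_a: Sequence[Awareness],
                      rater_b: Sequence[Awareness]) -> float:
    """Simple percent agreement: share of trials given identical codes."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must code the same, non-empty set of trials.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


if __name__ == "__main__":
    # Made-up codes for five trials, not data from the study.
    a = [Awareness.NONE, Awareness.NOETIC, Awareness.AUTONOETIC,
         Awareness.NOETIC, Awareness.NONE]
    b = [Awareness.NONE, Awareness.AUTONOETIC, Awareness.AUTONOETIC,
         Awareness.NOETIC, Awareness.NONE]
    print(percent_agreement(a, b))  # 80.0
```

Simple percent agreement does not correct for chance agreement; it is used in this sketch only because it is the statistic reported in the text.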

Note that we use the term recall in this study to refer to participants’ recollection of seeing or processing the target word in the readings (i.e., to operationalize awareness) whereas recognition is used to denote participants’ correct identification of the target word on the vocabulary posttest (i.e., it is our measure of receptive word learning). The operationalizations of attention and awareness are independent of one another in this study although it is reasonable to assume that in practice the two mechanisms will often co-occur; that is, participants who can recall the experience of reading a certain word likely also paid more attention to that particular word during reading. Our research questions then were as follows:

4 We use the labels no awareness and unaware as shorthand for “no verbal report of awareness.” We acknowledge that the absence of a verbal signaling of awareness does not necessarily imply that there was no awareness (see section Awareness in SLA).


1 RQ1 Do advanced EFL learners’ eye fixation durations on novel words during 2 reading predict their recognition of these words in an unannounced, immediate 3 vocabulary posttest? 4 RQ2 Does advanced EFL learners’ recall of reading a novel word in a text predict their 5 subsequent recognition of that word? 6 RQ3 Do advanced EFL learners look longer at those words for which they subsequently 7 report awareness? 8 9 RQ4 Are eye fixation durations and retrospective verbal reports equally good predictors 10 of subsequent word recognition? 11 12 Methods 13 Participants 14 Twenty-nine female, advanced EFL learners (L1: Dutch) participated in this study. All of 15 them were second or third-year English language majors (age range: 19–28) at the same 16 Belgian university and were proficient in English. They had started to learn English as 17 a foreign language from age 13 onwards in secondary school in Belgium. Their English 18 proficiency level was at the B2 (upper intermediate) or C1 (lower advanced) levels of the 19 Council of Europe’s (2001) Common European Framework (CEF). All participants had either 20 normal or corrected-to-normal eyesight. 21 Materials 22 23 Participants read 20 English paragraphs, of which 12 were critical to the experiment. All 24 the paragraphs were pretested on a group of first-year English language majors at the same 25 university to ensure that most words would be known to the present, more advanced sample 26 of students. Thus, the participants encountered the novel words in contexts with mostly 27 familiar words that could help them infer the meaning of the novel targets. Each participant 28 read the experimental paragraphs in one of four conditions that were identical except for the 29 target word. 
For example, for the paragraph with the target word “average,” a participant would read either: (i) “average” (control condition); (ii) “canimat”; (iii) “canimat or average”; or (iv) “average or canimat” (see Appendix A for the full sample paragraph).⁵ Condition was varied within subjects, such that each participant read three different paragraphs per condition, for a total of 12 critical paragraphs. The assignment of a given paragraph to a given condition was counterbalanced between subjects according to a Latin square design.

⁵ For an analysis of the effects of contextual cues on attention and vocabulary recognition, see Godfroid et al. (in press).

Target-word retention was measured through an unannounced vocabulary posttest. On this test, participants saw the original sentence in which the target had appeared, along with 18 possible answers for the missing word. Appendix B contains an example. The majority of the distractors were novel words that participants had read in a different paragraph. Some others were novel words that had not appeared before, and two or three items were known English words. As there were 18 answer options, the probability of guessing the correct word by chance was low (1 in 18, or about 5.6%).

Procedure

All participants were tested individually. They were instructed to read the 20 paragraphs for meaning while an eye tracker, the EyeLink II, recorded their eye movements. Next, they took the unannounced vocabulary posttest. Finally, they took part in an interview with the researcher in Dutch. In the interview the researcher showed the participant the vocabulary posttest items again, one by one, and asked the participant whether she remembered what she had answered. If she did not, the correct response was provided. Then the researcher asked the participant if she could remember reading the word (see Figure 1). If she did, the researcher asked her what she had thought. The participants were not informed that most of the target words were artificial until a debriefing session that took place a few months after the experiment. To judge by their interview protocols, none of the participants debunked the pseudowords in the texts.

Figure 1. The interview procedure (with the original, binary coding scheme for awareness)

Analysis

All retrospective verbal reports were transcribed verbatim and translated into English. Then two raters independently coded each experimental item as +awareness or –awareness based on whether the participant indicated that she could remember reading the word or not. This procedure resulted in 75% simple percent agreement. Because many cases that resulted in disagreement could not easily be classified as one or the other, we decided that dividing the +awareness category into two distinct categories might capture the data better (see Operationalization of Awareness above). After recoding the data with the three-tiered system, interrater reliability rose to 86%, which appears to confirm our decision to opt for a more fine-grained coding scheme. The remaining cases of disagreement were discussed until we reached consensus. This resulted in the 2 (± recognition) by 3 (autonoetic/noetic/no awareness) cross-tabulation of possible outcomes shown in Table 1.
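The percentages in Table 1 can be reproduced from the cell counts. The short Python sketch below is an illustration only and not part of the original analysis. It assumes that the denominator is 29 participants × 9 pseudowords = 261 observations, a figure inferred from the design and from the four missing values rather than given as a formula in the text. The names `counts`, `pct`, and `build_table` are hypothetical.

```python
# Cell counts copied from Table 1: (recognized, not recognized) per awareness level.
counts = {
    "autonoetic awareness": (21, 11),
    "noetic awareness": (15, 34),
    "no awareness": (18, 158),
}

# Assumption (inferred, not stated as a formula in the chapter):
# 29 participants x 9 pseudowords each = 261 observations in total,
# of which 4 were coded as missing with regard to awareness.
N_PARTICIPANTS = 29
PSEUDOWORDS_PER_PARTICIPANT = 9
TOTAL_ITEMS = N_PARTICIPANTS * PSEUDOWORDS_PER_PARTICIPANT


def pct(n: int) -> float:
    """Share of all pseudoword observations, in percent, to one decimal."""
    return round(100 * n / TOTAL_ITEMS, 1)


def build_table(cells):
    """Return row -> (pct recognized, pct not recognized, pct row total)."""
    rows = {}
    for level, (hit, miss) in cells.items():
        rows[level] = (pct(hit), pct(miss), pct(hit + miss))
    col_hit = sum(hit for hit, _ in cells.values())
    col_miss = sum(miss for _, miss in cells.values())
    rows["total"] = (pct(col_hit), pct(col_miss), pct(col_hit + col_miss))
    return rows


table = build_table(counts)
for label, (hit, miss, total) in table.items():
    print(f"{label:22s} {hit:5.1f}% {miss:5.1f}% {total:5.1f}%")
```

Under this assumption the grand total falls short of 100% because the four items with missing awareness codings are excluded from the cells but remain in the denominator.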


Table 1. Percentage of words that participants recognized and did not recognize on the posttest as a function of awareness

                        recognition on posttest   no recognition on posttest   total
autonoetic awareness    8.0% (21)                 4.2% (11)                    12.3% (32)
noetic awareness        5.7% (15)                 13.0% (34)                   18.8% (49)
no awareness            6.9% (18)                 60.5% (158)                  67.4% (176)
total                   20.7% (54)                77.8% (203)                  98.5% (257)

Note. Percentages refer to the total number of pseudowords. Four words were coded as missing values with regard to awareness.

The following interview excerpts, translated from Dutch, illustrate the six possible classifications. Note that recognition refers to recognition of the correct word on the posttest. “P” is participant and “R” is researcher.

[+recognition, autonoetic awareness]: target is lurgled

P5 Yes, “lu-, lu-, lu-”.
R “Lurgled,” yes.
P5 I do not know. That word stuck because it was a word that I also found funny.
R Yes.
P5 I do not know why, but that’s... I paid special attention to it when I read it, like “oh, such a cool word.”

[–recognition, autonoetic awareness]: target is evidoses

R Sentence 4?
P5 I think I picked “eliphor” or “evidoses,” it was with an “e.” It was one of the two.
R Yes.
P5 I do not know which one.
R What do you think?
P5 I do not remember what I picked, but it was one of those two. I think I picked “evidoses.”
R You picked “eliphor.”
P5 Ah “eliphor” yes. [Laughs]
(…)
R Do you remember--
P5 --I remember, I remember that it was with an “e,” I paid attention to that. I do not know exactly why I now [inaudible] but…I know it was something with an “e.”

[+recognition, noetic awareness]: target is staveners

R Sentence 20?
P7 Yes, I did not remember this one well, either... I picked “staveners.”
R Hm.


P7 Yes... [hesitant] [laughs]. I do not know. I could not specifically recall it, but... yes, somehow I was thinking “it could be.”

[–recognition, noetic awareness]: target is staveners

R (…) Here it’s “staveners.”
P14 Yes.
R Does it ring a bell?
P14 Yes, it looks familiar, but [inaudible], well many of those words I do not know huh, so...

[+recognition, no awareness]: target is canimat

R Sentence 9?
P8 3, “wricety?” No, cannot be. No...
R You also answered this one correctly (…). 18, “canimat.”
P8 [Laughs] That was a guess.
R A guess? Based on how the word looks or, or...?
P8 Maybe unconsciously that I could remember, I do not know. [Laughs]

[–recognition, no awareness]: target is redaster

R And when I say this, does it, does it ring a bell, do you say “Ah yes, that’s true?”
P1 No, no.

To measure attention, we used the total amount of time that a participant fixated a pseudoword during reading, which consists of the duration of the first eye fixation plus any subsequent fixations made on the target word. We addressed the research questions by means of mixed-model regression analyses with random intercepts for subjects and words (Baayen, Davidson, & Bates, 2008). Including a random intercept for subjects allows for individual differences in reading times, while the hierarchical structure of the model pulls outlying observations towards the grand mean. Overall model fit was assessed by means of marginal and conditional R² (see Nakagawa & Schielzeth, 2012). Marginal R² shows the variance explained by the fixed factors (i.e., fixation time and/or awareness level) whereas conditional R² measures the variance explained by both the fixed and the random factors (i.e., subjects and words).

Results

RQ1: Attention predicts word recognition

In general, the vocabulary posttest proved a difficult task. Participants recognized an average of 2.1 out of nine words (SD = 1.42). To see whether more attention paid to a word increased the probability of recognizing that word on the posttest, we ran a logistic mixed model with (posttest) recognition as binary outcome variable and fixation times as a continuous predictor variable. We centered fixation times around the y-axis to improve model fit. Two outlying observations (