Computer Assisted Language Learning Vol. 19, No. 1, February 2006, pp. 15–45

Computer Assisted Vocabulary Learning: Design and evaluation

Qing Ma (a)* and Peter Kelly (b)

(a) University of Louvain, Belgium; (b) Three Gorges University, China

This paper focuses on the design and evaluation of the computer-assisted vocabulary learning (CAVL) software WUFUN. It draws on the current research findings of vocabulary acquisition and CALL, aiming to help Chinese university students to improve their learning of English vocabulary, particularly that with which they experience most difficulty. It is argued that vocabulary should be learned explicitly as well as implicitly; learners need to be trained to become good learners, e.g., by being instructed in useful learning strategies, to enable them to learn vocabulary more efficiently and effectively. A design model of CALL efficacy is constructed to ensure the quality of vocabulary learning in CALL programs; it is employed in the design of the software WUFUN. Finally, the preliminary results of the software evaluation are reported and discussed.

Introduction

Vocabulary learning has always been a popular subject in CALL programs, especially in the early stages of CALL (1980s) when technology was relatively simple and it was thought that vocabulary learning could be easily integrated into CALL programs. The earlier programs typically included a single type of language learning activity, such as text reconstruction, gap-filling, speed-reading, simulation, and vocabulary games (Levy, 1997). The range was narrow, probably because previously computers were less powerful and language teachers did not have sufficient knowledge of programming (Goodfellow, 1995). It might also have been due to the limited number of vocabulary learning theories at a time when vocabulary learning was just starting to attract people’s attention.

*Corresponding author. Rue Grafé 4, App. 206, Namur, 5000, Belgium. Email: [email protected] ISSN 0958-8221 (print)/ISSN 1744-3210 (online)/06/010015–31 © 2006 Taylor & Francis DOI: 10.1080/09588220600803998

Nowadays, vocabulary learning is often viewed as a sub-component of a multimedia package or a CALL program, particularly in commercialised materials. Some researchers have tried to create CALL programs devoted to vocabulary learning (Goodfellow, 1994; Groot, 2000; Boers, Eyckmans, & Stengers, 2004). One common feature is situating vocabulary learning in context instead of treating it as an isolated activity, as was the case before. Another important trend is for learners to be given as much freedom as possible to choose what to learn and how to learn it. However, this could be problematic if learners do not know how to deal with the learning tasks and use the software effectively. Too much freedom will sometimes adversely affect the learning result. A way forward is for learners to be given some help to become ‘good learners’, that is, to acquire sufficient knowledge about language learning and the ability to take charge of their own learning effectively and efficiently [note 1]. They can thus benefit maximally from the freedom of learning.

In this article, we first review the literature on approaches to vocabulary learning and on CALL programs that take vocabulary learning into account. This is followed by an introduction to a CALL efficacy model, which aims to help and guide the learner to the completion of learning tasks as a way to ensure the quality of a CALL program. This CALL efficacy model is used to design the software WUFUN, developed to help Chinese university students learn vocabulary perceived as difficult. A pilot study was carried out to evaluate a prototypical unit of the software as well as to validate the CALL efficacy model empirically in two settings: individual use and classroom use. Results are reported and discussed. Finally, possible improvements regarding both the software design and future research are outlined.
Current Approaches to Vocabulary Learning

Approaches to vocabulary learning can be generally categorized under two broad paradigms: the implicit and the explicit learning paradigm. In this article, the meaning of ‘implicit’ and ‘explicit’ is not restricted to what they mean in ‘implicit learning’ and ‘explicit learning’ [note 2] in cognitive psychology; rather, the literal meanings of the two words are used to refer to the main features associated with the two paradigms. Implicit learning is associated with natural, effortless and meaning-focused learning; explicit learning implies that learning requires more deliberate mental effort than simply engaging in meaning-focused activities and that a link has to be established between meaning and form by various means.

The Implicit Learning Paradigm

The basic assumption of the implicit learning paradigm is that words can be acquired naturally through repeated exposure in various language contexts, with reading as the major source of input, a notion that is strongly supported by findings on L1 vocabulary acquisition. Incidental learning is perhaps the most important feature of this learning paradigm. It can be defined as the process of acquiring vocabulary and grammar through meaning-focused communicative activities such as reading and listening (Hulstijn, 2003, p. 349).

Several studies support the implicit learning paradigm. Krashen’s input hypothesis (1989, 1993) postulates that vocabulary can be acquired by reading as long as the input is comprehensible to the learner. Nagy, Herman, and Anderson (1985) hold the view that children acquire most L1 words through reading and that they do so incidentally. In the same vein, Sternberg (1987, p. 89), relying on studies in L1 acquisition, claims that “most vocabulary is learned from context” by contextual guessing, although whether this process can take place successfully depends on several “moderating variables” (pp. 92–94), such as the density of unknown words; the learner may be overwhelmed by a large number of unknown words, with the result that no learning takes place.

The difficulty of acquiring vocabulary incidentally in L2 acquisition seems to be attributable to three sources. First, incidental learning inevitably involves a great deal of contextual guessing of the unknown words. Context alone does not always facilitate meaning transfer; in some cases even educated adults cannot infer the meaning of L1 words in context (Ames, 1966; Beck, McKeown, & McCaslin, 1983, cited in Duquette, Renee, & Laurier, 1998). Second, as a consequence, the learning rate is very low (see Hulstijn, 1992). According to Nation (1990), 5–16 exposures are needed to fully acquire a word. This is implicitly supported by Nagy et al. (1985), who reported a 5%–15% probability of a word being learned at first exposure; similarly, Knight (1994) demonstrated a learning rate of 5%–21% in her studies, also for one exposure. Third, the vocabulary acquired through incidental learning is mainly for recognition and hardly at all for production (see Paribakht & Wesche, 1997; Wesche & Paribakht, 2000).
This is due to the nature of incidental learning: the main language activity is reading, where the focus is on meaning and content and only limited attention is paid to the lexical and syntactic features of the new words. The quality and quantity of lexical processing in incidental learning are simply insufficient to enable the learner to grasp the precise meanings and correct usage of words that will lead to correct production.

The Explicit Learning Paradigm

Authors who adhere to this paradigm argue that vocabulary and vocabulary learning strategies should be learned or taught explicitly so that learning can be more efficient. They agree with upholders of incidental learning that context is the main source for acquiring vocabulary, but they claim that learners need some extra help to build up an adequate vocabulary and to acquire the strategies necessary to cope with the vast reading context (see Coady, 1997). There are two main approaches within the explicit learning paradigm: explicit instruction and strategy instruction.

Authors who favour explicit instruction argue that learners should be taught vocabulary explicitly by various means, including direct memorization techniques (Coady, 1993; Nation, 1990, 2001). Here the concern is mainly with low level learners who do not have enough vocabulary to read extensively. Nation (2001) suggests that high frequency (2,000 word level) and low frequency vocabulary should be treated differently. High frequency words have a high coverage (80%) of text (p. 11) and should be mastered as soon as possible; this can be achieved by direct teaching (teacher explanation, peer teaching), direct learning (using word cards, consulting dictionaries), incidental learning (contextual guessing, communicative activities) and planned encounters with the words (graded reading, vocabulary exercises) (p. 16). As for the low frequency words, teachers should train learners to use strategies such as contextual guessing, dictionary use, memory techniques and vocabulary cards to cope with these words and to enlarge their vocabulary (p. 20). According to Laufer (1997, p. 23), learners should master a basic vocabulary of 3,000 word families to be able to use the “high level processing strategies” needed to comprehend a general text. The empirical studies of Paribakht and Wesche (1997) and Wesche and Paribakht (2000) show that reading plus explicit vocabulary training enables learners to learn vocabulary both quantitatively and qualitatively better than relying on context alone. Laufer (2001) demonstrated a greater lexical gain when decontextualised word-focused activities were used than when learners were simply engaged in reading comprehension.

The second approach, strategy instruction, emphasizes teaching learners specific learning strategies to make learning more efficient (Cohen, 1998; Cohen, Weaver, & Li, 1996; O’Malley et al., 1985; Oxford & Scarcella, 1994). Researchers of strategy instruction often hold the view that context can provide the essential means for learning vocabulary but that additional support, such as explicit instruction, is also needed (Oxford & Scarcella, 1994). The strategies typically recommended for instruction are word grouping, word association, imagery, mnemonics (for example, the keyword method and the hookword method), and semantic mapping. Traditionally, strategy instruction seems to concern advanced learners rather than low level learners (Coady, 1997).
However, strategy instruction for low level learners can be very useful. For example, strategies such as imagery and mnemonics will be very helpful, since the greatest difficulty in acquiring a word in the initial stages is to link the form and the meaning in memory (Kelly, 1986; Laufer, Elder, Hill, & Congdon, 2004). This is particularly true in respect of an unrelated language and was the initial driving force behind the keyword method (Atkinson & Raugh, 1975).

It would seem that the explicit learning paradigm is best summarized as a “mixed approach”, to use Coady’s words (1993, p. 17). Supporters of this paradigm combine a whole variety of activities, including explicit vocabulary instruction, vocabulary exercises, vocabulary learning strategies, and extensive reading. The strength of the explicit learning paradigm is that implicit learning is not excluded but rather is seen as one of the two complementary learning approaches that are necessary to vocabulary acquisition. The two would work best in combination with each other.
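
The exposure figures cited earlier for incidental learning (Nation, 1990; Nagy et al., 1985; Knight, 1994) can be related to one another with a little arithmetic. The following Python sketch is an illustration only. It assumes, as a simplification that the studies cited do not make, that every exposure offers the same independent probability p of learning the word (a geometric model). The function names are hypothetical.

```python
def expected_exposures(p: float) -> float:
    """Mean number of exposures needed to learn a word, assuming each
    exposure succeeds independently with probability p (an assumption)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return 1.0 / p


def prob_learned_within(p: float, n: int) -> float:
    """Probability that the word has been learned after n exposures,
    under the same independence assumption."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Per-exposure learning rates reported in the text: 5%, 15% and 21%.
    for p in (0.05, 0.15, 0.21):
        print(f"p = {p:.2f}: mean exposures = {expected_exposures(p):.1f}")
```

Under this assumption, per-exposure probabilities of 5% to 21% correspond to a mean of roughly 5 to 20 exposures, which is of the same order as Nation's 5–16. Real exposures are unlikely to be independent or equally effective, so this is a rough consistency check rather than a prediction.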

Review of CALL Programs for Vocabulary Learning

Multimedia Packages with Vocabulary Learning Activities

This is perhaps the most popular type in terms of the number of products that have been sold and their wide use in educational settings. Most are commercialised programs. The criticism is often made that these programs lack a pedagogical basis. The investment in such projects is usually considerable, but this does not necessarily mean that solid research has preceded them. They are particularly vulnerable when it comes to addressing users’ needs. Commercialised programs are often remote from the users; background information about the intended users, such as age, sex, cultural background, knowledge of other foreign languages and computer knowledge, is not specified and can only be guessed (Levy, 1999, 2002). Given their general lack of a research basis as well as the comparatively small amount of time and space devoted to vocabulary learning, the quality of the vocabulary learning resulting from the use of these programs is often disappointing.

Programs Made up of Written Texts with Electronic Glosses

This is probably the most popular type among research-based programs, and is a reflection of the prevailing interest in incidental learning. These programs are written texts with hyperlinks, equipped with an electronic dictionary or glossary. The main emphasis is on reading comprehension, and the acquisition of some new lexical items is a by-product of the reading process. The advantage of providing electronic glosses is that the lexical information can be accessed easily by a click (or by typing the word) with little interruption of the reading process. Moreover, glosses are made much more informative and attractive than traditional lexical entries by utilizing multimedia effects. Chun and Plass (1996a), Laufer and Hill (2000), and De Ridder (2002) have carried out studies that demonstrated how vocabulary can be learned in such a setting, though each of these studies focuses on a different aspect. The main concern of this type of program is with the information that should be included about a word and with the way the information should be presented.
Learning rates are reportedly higher in computer-mediated settings than with paper materials for incidental learning. Chun and Plass (1996a) reported a learning rate of 24.1%–26.6%; Laufer and Hill (2000) reported a learning rate of 33.3%–62%. However, the learning rate in each of the studies is, strictly speaking, only tested at recognition level. It is reasonable to anticipate a lower learning rate at production level due to the nature of the learning task in this type of program. It is productive vocabulary learning that this type of program cannot address adequately.

Programs Dedicated to Vocabulary Learning

Another type of CALL program, which is often based on research, usually takes a different approach. The CALL authors choose a particular theory of language learning and implement it via computer technology. A good example is provided by Groot (2000, p. 64), where the three stages of acquiring a new word in the mental lexicon, “noticing”, “storage” and “consolidation”, are simulated by the CALL program “CAVACO”. The learning process is composed of four stages in sequential order: “deduction”, “usage”, “examples” and “retrieval”. A careful look at the program reveals that learners were encouraged to deduce word meanings and word usage. However, instead of leading to deeper processing, this may risk inducing mere guessing, since learners are prone to take short-cuts and perform activities requiring less mental effort. This may explain why the learning result of the experimental groups was not much higher than that of the control bilingual list groups in Groot’s investigation.

Goodfellow’s ‘Lexica’ (1994) is based on Kukulska-Hulme’s model of the “Journey of a vocabulary item” (1988, p. 164), in which a vocabulary item to be learned goes through the procedure shown in Figure 1. ‘Lexica’ adopted this model and elaborated the ‘written record’; the user of Lexica is asked to group words according to ‘form’, ‘meaning’, and ‘context’ and then find the meanings and usage of the words with the help of lexical tools (for example, dictionary, concordancer). The weakness of such a design, according to Goodfellow (1995, p. 220), is the lack of explicit instruction on how each task should be carried out. Consequently, word grouping was found to be difficult for some learners, and on the whole they tended to adopt a superficial learning approach, such as using L1 translations. The expected learning rate of eight words per hour was achieved by very few subjects.

A Design Model for CALL Efficacy

The model that we suggest is an attempt to provide an alternative way of addressing CALL design, bearing in mind that this is only a starting point and that there still remains plenty of scope for the better integration of computer technology into the design of CALL programs. What is provided here is a simple preliminary model; the major concern is with identifying the most important parameters that determine the efficacy or the quality of a CALL program (see Figure 2).
Figure 1. Journey of a vocabulary item (adapted from Kukulska-Hulme, 1988)

Figure 2. The CALL efficacy model

CALL Efficacy

CALL efficacy can be interpreted as the quality of the CALL program, that is, how effective and helpful it is when used by the learner. It can be assessed both quantitatively and qualitatively. Quantitative data include the performance of the user on the program’s tasks, which can be revealed by the scoring system of the program; they also include the progress (or the regression) that takes place through using the program, which can be assessed by pre-tests and post-tests in an experimental setting. Qualitative data include the recording of the user’s interaction with the program, which could be provided by a profile recording system built into the program. The user’s own evaluation of the program is another important source of qualitative data regarding the efficacy of the program. This can be obtained by a questionnaire and/or an interview on the completion of the tasks.

Theory

First, it is commonly agreed that a sound theoretical underpinning is vital to ensure the quality of a CALL program. It has been demonstrated that the quality of a CALL program is determined by the methodology behind it rather than the computer technology itself. Methodology refers to the overall approach to the design of the program; the underlying theoretical principles constitute a very important component of the methodology. Here theory mainly means language learning theory, which is used as a general term to refer to the program designer’s assumptions about the nature of language, language learning and the process of learning. Which specific language learning theory to choose depends on which aspects of language knowledge or which skills the CALL program is intended to focus on. In CALL programs for vocabulary learning, learning theories or research findings specific to vocabulary learning should be considered first. On the other hand, language is best learned as a whole rather than in separate components. There are thus some general learning aspects shared by CALL programs even though they have different focuses. The selection of a specific or general language learning theory will serve as a guide in the selection of the technologies to be used.

Computer Technology

Traditionally, computer technology is referred to as the means or the medium used to deliver learning materials to learners. Clark (1994) distinguished between ‘methods’ and ‘media’. Media are the means of delivering the methods, which consist in “a number of possible representations of a cognitive process or strategy that is necessary for learning” (p. 26). He claimed that a method can be implemented by many means other than computer technology; thus media (or the computer technology) might “influence the cost or speed efficiency of learning but methods are causal in learning” (p. 26). There seems to be a tradition of dividing CALL into two broad categories: technology-driven and pedagogy-driven projects (Colpaert, 2003; Levy, 1997). Developers in the former category are often accused of producing CALL materials based on their intuition instead of on research in language learning. There is a dividing line in conceptualising CALL design between those who do so according to technologies and those who do so according to methodology; each side focuses on its own aspect and plays down the other. There is therefore on both sides an inclination to view method (methodology) and media (computer technology) as two separate components. Technology alone cannot determine the design, but should it be viewed solely as a means of implementing the materials?

A crucial question arises: Is there a merging point of technology and pedagogical knowledge in conceptualising CALL design? If so, where is it? We argue that computer technology could be thoroughly integrated into the design and become an inseparable part of the methodology; technology can be used to monitor and control user actions so that users can be guided in performing language learning activities and achieve high learning potential.

User Actions

Learner performance, or user actions, is an important source of data for the evaluation of CALL programs.
Chapelle (2001) puts learner performance at the third level of evaluation, after that of the CALL program itself and the teacher’s planned activity. What the learner has actually done and how s/he interacts with the program is a good indicator of the learning outcome. In line with the current emphasis on ‘learner autonomy’ and ‘learner focus’, the trend is for the user to be given as much freedom as possible in the use of the program. Closely associated with these concepts is ‘learner development’, discussed in detail by Wenden (2002, p. 32), who defines it as “a learner-centred innovation in FL/SL instruction that responds to the learner by aiming to improve the language learner’s ability to learn a language”. It can be said that learner development is the process and improved ability in language learning is the objective. This entails the premise that learners initially do not necessarily possess good learning ability and efficient learning strategies for language learning; they need to learn how to become good learners. Obviously, something has to be done to facilitate learner development, and it is very unlikely that learners could simply learn to become good learners by themselves without any help. They have to be equipped with metacognitive knowledge, learning strategies, and skills for self-direction to be able to become good learners.

‘Strategies-based instruction’, reported by Cohen et al. (1996), has two major components: explicit strategy instruction and strategy instruction integration. In the first case, students are explicitly taught how, when, and why strategies can be used to facilitate language learning and language use tasks. In the second case, strategies are integrated into everyday class materials and may be explicitly or implicitly embedded in the language tasks. If we are going to train learners to master good learning strategies in language activities, we have to draw up rules to constrain what they do instead of giving them complete freedom, which would go against the learning goals of the program. We therefore propose that user actions be controlled to some degree and that this be done by integrating computer technology into the overall design. The user is directed to the completion of the learning tasks as well as the embedded learning strategies instruction. S/he is guided and not left to wander at will through the program.

Learner Information

Language learning is also a very idiosyncratic process that is subject to a series of learner characteristics, such as mother tongue, knowledge of other foreign languages, level of proficiency in the target language, learning difficulty, learning style, learning strategy, motivation, age, sex, etc. Obviously, different types of learners have different needs, and these should be taken into account when designing the CALL program. Accordingly, it is suggested that a CALL program should be targeted at a particular group of learners who have a series of characteristics in common. As much information as possible should be obtained about the learner before the software is designed; this information constitutes an important part of ‘analysis’ in Colpaert’s RBRO design model (2004, p.
135). From the CALL efficacy model, it can be seen that learner information influences the other three components by providing background information to inform the choices made in respect of theory, technology, and user actions.

When choosing language learning theories for the program, we should ask ourselves a series of questions in respect of the learner. For example:

1. Will the theory chosen help our learners to acquire language knowledge or skills? And if so, how?
2. Will the theory chosen have the potential to improve our learners’ ability to learn?
3. How can we apply the theory to the learning activities so that learners will enjoy them?

Other questions can be asked, depending on the specific context of learner information.

As for computer technology, learner information can tell us what type of specific technology is favoured or rejected by our users. For example, if users are used to having the right mouse click function to display information for a word, we need to consider developing this type of technology in the software. If they are interested in sound or visual effects, we should develop more audio, video, speech, or animation technologies so that learners’ different sensory learning styles can be accommodated.

Learner information can also provide information on how user actions should be controlled so that learners can be guided in their language learning. It should also be borne in mind that, if complete freedom will harm the learning results, so doubtless will no freedom at all. The degree to which learners’ freedom or control over the program should be restricted largely depends on the learner characteristics: learning style, learning strategy, knowledge about language learning, perceived useful and harmful effects of the guided instruction, etc. All these have to be taken into account in deciding what should be allowed and what should be restricted regarding learner freedom.
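
The idea of controlling user actions to a degree set by learner information can be sketched in code. The Python fragment below is a hypothetical illustration and not WUFUN's implementation: the class `GuidedSession`, the `lookahead` parameter and the activity names are all assumptions introduced here. Activities unlock in order, `lookahead` sets how many steps beyond the first unfinished activity the learner may jump, and every attempt is logged, in the spirit of the profile recording system mentioned under CALL efficacy.

```python
from dataclasses import dataclass, field


@dataclass
class GuidedSession:
    """Hypothetical sketch of guided, partially restricted navigation."""
    activities: list[str]
    lookahead: int = 0  # 0 = strictly sequential; larger values = more freedom
    completed: set[str] = field(default_factory=set)
    log: list[tuple[str, str]] = field(default_factory=list)  # (event, activity)

    def _frontier(self) -> int:
        """Index of the first activity that has not been completed."""
        for i, name in enumerate(self.activities):
            if name not in self.completed:
                return i
        return len(self.activities)

    def available(self) -> list[str]:
        """Activities the learner is currently allowed to open."""
        limit = self._frontier() + self.lookahead + 1
        return self.activities[:limit]

    def complete(self, name: str) -> bool:
        """Record an attempt; refuse activities that are still locked."""
        if name not in self.available():
            self.log.append(("blocked", name))
            return False
        self.completed.add(name)
        self.log.append(("completed", name))
        return True


# Illustrative activity names only; a strictly sequential session.
session = GuidedSession(["preview", "glossary", "reading", "exercises"])
session.complete("reading")   # refused and logged: "preview" comes first
session.complete("preview")   # accepted; "glossary" is now unlocked
```

Choosing `lookahead` per learner group is one simple way to express the point made above that neither complete freedom nor no freedom at all is desirable; the log could later feed the qualitative evaluation of the program.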

Design of WUFUN

Some General Learner Information

The main learning difficulties for vocabulary include fixing the new vocabulary in memory, mastering the meaning(s) of new items, using vocabulary items correctly, and incorporating idiomatic expressions into one’s vocabulary. These are the problems faced by every language learner. Chinese learners of English face other specific problems in acquiring vocabulary due to the huge linguistic distance between English and Chinese and the considerable cultural gap. In particular, it is observed that:

1. The practice of mechanical memorization (rote), which is deep-rooted in Chinese culture, characterizes their learning.
2. Lack of direct contact with western culture makes it extremely difficult to bridge the cultural gap and to use language appropriately.
3. The exam orientation of all language teaching and learning has for so long encouraged rote learning and discouraged a communicative learning approach.

A CAVL program, named WUFUN, is being developed to help Chinese university learners of English to overcome the learning difficulties they face by incorporating learning activities that specifically address their needs. The design of the software is based on the CALL efficacy model presented earlier.

Integrating the Theory into the Design

Vocabulary learning in WUFUN is addressed in a holistic way; learning is situated in context, with particular attention being paid to the items. Following a series of systematic studies on Chinese learners (Kelly, Li, Vanparys, & Zimmer, 1996; Vanparys, Zimmer, Li, & Kelly, 1997; Li et al., 1999), it was decided that the specific elaboration strategies for item learning that will be potentially useful to Chinese learners are imagery, verbal association, programmed rehearsal and oral input. The originality of the approach consists in the integration of a listening approach and memorization techniques and in sensitising the learner to cultural differences.

Mnemonic techniques have been documented since an interest was first shown in language strategies (see Atkinson & Raugh, 1975; Cohen, 1987; Paivio & Desrochers, 1979; Pressley & Levin, 1981) and they have generally been shown to be much more effective than rote. The effectiveness of most mnemonics consists in the interplay between images and verbal representations, formulated by Paivio and Desrochers as the ‘dual-coding theory’ (1980). However, these memory strategies are investigated largely in language laboratory settings (O’Malley & Chamot, 1990), and their potential is rarely exploited in classroom teaching/learning, with the result that they remain largely unknown to most learners. Not only have these mnemonic techniques been demonstrated to be as much as three times more effective than the traditional rote method (Paivio & Desrochers, 1979) but, as so many researchers have pointed out, they transform the learning of vocabulary from what is invariably viewed as a tedious, boring task into one that is enjoyable and even amusing.

Listening is viewed by a number of leading researchers as the basic skill in SLA (see Asher, 1983; Krashen & Terrell, 1983; Nord, 1978; Winitz, 1978). Through the progressive build-up of a language store, involving both hemispheres of the brain, speaking emerges in much the same way as in the acquisition of L1. Many investigations have demonstrated this transfer (see Asher, 1964; Ervin-Tripp, 1974; Postovsky, 1975; Winitz & Reeds, 1973). In addition, the auditory perception of the learner progressively develops and this becomes the basis of a good pronunciation (Gary, 1975; Winitz, 1977).
Furthermore, it has been shown that listening aids in the long-term retention of vocabulary, whether it be for reading or for listening purposes (Gary & Gary, 1982; Kelly, 1992).

Language always mirrors the background culture of whoever is speaking. This is particularly true in respect of vocabulary (see De Saussure, 1974; Miller, 1996). There should, in consequence, be a strong focus on the cultural aspect [note 3] in respect of vocabulary learning. This is done in WUFUN via the images, the stories which introduce the vocabulary, the idioms and proverbs and, in particular, via the humour/true stories. Humour can be a valuable tool for bringing out salient characteristics of a culture, without indulging in negative stereotypes. Differences between different western cultures (their customs, practices, attitudes, behaviour, humour, even political and economic situation) are also brought out.

Once the learning theories have been decided on, the next step is to create the learning content underpinned by these theoretical guidelines. Approximately 300 words were taken from the 4,000 word list that Chinese university students (non-language specialists) are required to master; their selection was based on student and teacher judgement of word learning difficulty (for example, pronunciation, word length, spelling, confusion with similar forms, cultural connotations, and so on), together with other criteria, such as usefulness and relevance (Kelly & Li, 2005). These words were then used to create 20 stories as learning texts combined with other different learning activities, forming 20 units in all. One of the 20 units has been developed into a computer program as the prototypical unit of WUFUN, containing 25 words plus three idioms [note 4] to be studied (see Appendix A). Some of the words may already be known receptively or productively to learners. The results of the pre-vocabulary tests in the pilot study confirmed that this was the case.

The following provides an overview of the sequence of learning activities in WUFUN. First, a preview of the context (overview of the story), serving as an ‘advance organizer’, a device aiming to activate useful background information (see Chun & Plass, 1996b, p. 504), is presented to the learner. The user can view a series of pictures, each of them accompanied by a short spoken sentence (some of the words to be learned will appear in the sentences for the first time; the word meaning can be easily guessed from the pictures). This will give the user a general idea of the story presented later. Then some vocabulary items are presented in the form of a mini dictionary (Word Focus); the glosses include meanings, collocations, example sentences and usage. In addition, the learner can listen to the word, view the picture if available, and ask for a Chinese translation of the word. The user can then read the text, the complete version of the preview, in which glossed words in Word Focus will reappear in context. Then some vocabulary learning strategies are introduced to the learner in Word Memorisation Aids, the main ones relating to verbal association, imagery, rhyming or alliteration (see Boers & Lindstromberg, 2005), etc. The user chooses from a list a word s/he wants to know better, and s/he will be given a useful tip (with the option to display the Chinese translation) on how to memorise the word. For example, for the word acquaintance, a sentence is given: The queen is an acquaintance of mine.
The user can listen to the sentence and is asked to form a mental image of it while listening. Different tips are given to facilitate the learning of the word, depending on whether the word contains affixes or roots, whether it is imageable, whether it can be associated with other known words, and so on. What is central to these tips is the combination of image, sound and verbal information. This combination helps word memorization and accommodates different learning styles.

Next are the exercises, where the words are practised and rehearsed in context. By doing exercises, the learner becomes familiar with the meaning and usage of the words. Exercises include supplying synonymous expressions, finding antonyms, using words in collocations or as they typically occur in contexts, and differentiating words having similar but not identical meanings (for example, ridiculous and funny; see note 5). The whole procedure can be repeated. The vocabulary processing procedure in WUFUN is described in Figure 3.

Figure 3. The vocabulary processing procedure in WUFUN

After the exercises comes the section on idioms, followed by that on humour/true stories. Idioms are usually found to be very difficult to learn as their meaning is not apparent and is often heavily culture bound. In accordance with the dual-coding theory of Paivio and Desrochers (1980), which advocates dual modality input to enhance vocabulary learning, the user clicks on the idiom s/he wants to study and a picture that illustrates the meaning of the idiom pops up on the right of the screen; in the meantime, the user can listen to an explanation of the idiom. The humour/true stories are intended to raise the learner’s awareness of the cultural elements underlying language learning. Thus each story or joke to a certain degree reflects a facet of western culture (though not necessarily the culture of English-speaking countries since the language is spoken by a much larger population). The learner can read and listen to the stories.

Integrating Computer Technology into the Design

The computer technology has a two-fold function. It is used to create the multimedia program and, more importantly, to make the user follow the design model of the program. Users have restricted freedom in using the software. The idea is that they can always go back to previous steps but have to complete some basic requirements before going on to the next step. If the user does not meet these requirements, the forward button on the navigation bar, which leads to the next page, is disabled. Here are a few examples: the user has to have listened to all the short sentences in the overview of the story before being able to go to the Word Focus (WF); s/he has to look up at least one word in WF before reading the story; s/he can only access the correct answers of the exercises in written form after having listened to them first; s/he normally has to finish one exercise before starting the next one, or s/he can go directly to the next exercise but will get a score of ‘0’ for the exercise skipped. Every decision regarding user freedom at each step is thought out so that the user can obtain some benefit from doing the activities without being frustrated to the point that s/he no longer wishes to continue. Technology is employed in such a way as to ensure that each step is completed to a minimum requirement.
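The gating rules just described can be pictured as a small state check that decides whether the forward button is active. The following is only a minimal sketch in Python: the class, field and step names are hypothetical and are not taken from WUFUN, and only three of the rules mentioned above are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Tracks what the learner has done so far (all names are hypothetical)."""
    overview_sentences: int                      # sentences in the story overview
    heard: set = field(default_factory=set)      # indices of sentences listened to
    looked_up: set = field(default_factory=set)  # words opened in Word Focus
    scores: dict = field(default_factory=dict)   # exercise id -> score

    def forward_enabled(self, step: str) -> bool:
        """Return True if the forward button should be active on this step."""
        if step == "overview":
            # every short sentence must have been listened to
            return len(self.heard) == self.overview_sentences
        if step == "word_focus":
            # at least one word must have been looked up
            return len(self.looked_up) >= 1
        return True  # other steps are not gated in this sketch

    def skip_exercise(self, exercise_id: str) -> None:
        """Skipping is allowed, but the skipped exercise is scored 0."""
        self.scores.setdefault(exercise_id, 0)


session = Session(overview_sentences=3)
session.heard.update({0, 1})
print(session.forward_enabled("overview"))  # False: one sentence not yet heard
session.heard.add(2)
print(session.forward_enabled("overview"))  # True
```

Keeping each rule as a condition on recorded state, rather than on the screen the user came from, is one way to let learners move backwards freely while still enforcing the minimum requirement before each forward step.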
Taking into Account User Actions in the Design

In order to induce the learner to follow the design model of our program, a learning metaphor is represented in the menu screen, namely, that learning is a cyclic process and learning tasks are to be finished step by step (see Figure 4). A help system is at hand to show the learner how the software should be used. Each learning activity is accompanied by detailed instructions on how to carry out the task. The interface design on each page of the software is consistent and easy to understand. To monitor users’ performance, some user actions are recorded by the system while the learner is using the software: the total time spent on the software, the number of words viewed in WF and in Word Memorization Aids (WMA; see note 6), the time spent on exercises and the score obtained. These data provide important information for the evaluation of the software.

Figure 4. The main menu screen of WUFUN

A Pilot Study for Software Evaluation

When the prototypical unit was ready we carried out a pilot study in a Chinese university to evaluate the software. The study used a pre-test and post-test design combined with questionnaires and an interview. The evaluation of the software is conducted in terms of: learning outcome, as measured by vocabulary learning rate and the vocabulary learning strategies acquired; learner evaluation, as revealed by degree of satisfaction in the use of the software; and restricted freedom impact (on learning outcome and learner evaluation), as measured by the relationship between user actions and learning outcome/learner evaluation. Through the software evaluation, the CALL efficacy model discussed earlier can be empirically validated.
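The user actions listed under ‘Taking into Account User Actions in the Design’ (total time, words viewed in WF and WMA, time on exercises, exercise score) could be held in a per-learner record along the following lines. This is a sketch under stated assumptions: all names are hypothetical, and it assumes that each word is counted once per section however often it is reopened, which the paper does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class UserActionLog:
    """One learner's recorded actions. Field names are hypothetical."""
    wf_words: set = field(default_factory=set)   # words opened in Word Focus
    wma_words: set = field(default_factory=set)  # words opened in Word Memorization Aids
    total_seconds: float = 0.0                   # time on the whole program
    exercise_seconds: float = 0.0                # time on the exercises only
    exercise_score: float = 0.0                  # score out of 100

    def view_word(self, section: str, word: str) -> None:
        """Record a word view (assumption: a word counts once per section)."""
        if section == "WF":
            self.wf_words.add(word)
        elif section == "WMA":
            self.wma_words.add(word)
        else:
            raise ValueError(f"unknown section: {section}")

    def add_time(self, seconds: float, on_exercise: bool = False) -> None:
        """Add elapsed time; exercise time also counts towards the total."""
        self.total_seconds += seconds
        if on_exercise:
            self.exercise_seconds += seconds

    def summary(self) -> dict:
        """Return the five measures, with times converted to minutes."""
        return {
            "total_minutes": self.total_seconds / 60,
            "wf_words": len(self.wf_words),
            "wma_words": len(self.wma_words),
            "exercise_minutes": self.exercise_seconds / 60,
            "exercise_score": self.exercise_score,
        }


log = UserActionLog()
log.view_word("WF", "acquaintance")
log.view_word("WMA", "acquaintance")
log.add_time(1200)
log.add_time(600, on_exercise=True)
log.exercise_score = 70.0
print(log.summary())
```

A summary of this shape gives one row per learner, which is the form needed for correlating user actions with vocabulary gain and learner evaluation.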


Research Questions

Our research questions are the following:

1. What is the learning outcome of WUFUN? More specifically:
   (a) To what extent will WUFUN help Chinese learners to acquire vocabulary perceived as difficult at the receptive and the productive level in two different settings: individual use and classroom use?
   (b) Are learners likely to develop vocabulary learning strategies that will facilitate vocabulary learning in the long run in the two different settings?
2. How do users evaluate WUFUN in the two different settings?
3. How are user actions related to learner evaluation and to the learning results in the two different settings?

The Study

Subjects. Two groups of first-year students at Three Gorges University, Yichang, China, from various study backgrounds (non-language specialists) participated in the study. They were low-intermediate learners with a vocabulary of 2,000–3,000 words. Initially we tried to include more subjects, but owing to some unexpected practical constraints we had only 35 subjects, divided into two groups according to the experimental setting. Group 1 (G1) comprised 17 students who volunteered to participate in the experiment after a brief introduction to WUFUN. They made an appointment with the researcher and completed the experiment on an individual basis. Group 2 (G2) comprised 18 students who did the experiment together in a computer room as a self-learning class. They were required by their teacher to participate in the study.

It should be noted that individual use and classroom use of language learning software are the two most prototypical settings for CALL. When learners volunteer or choose to use a piece of language software, as in the case of G1, it can be assumed that they are displaying an interest in the task. According to the process model of motivation (Dörnyei, 2001), this generates motivation (see note 7) at the start of the learning task.
However, no such assumption can be made in respect of subjects who are coerced into performing the task, which was the case with G2.

Experiment instruments (see examples for each type of instrument in Appendix B).

Pre- and post-vocabulary (receptive/productive) tests. Separate receptive and productive tests were administered before software use to test whether the students knew the new vocabulary items that appeared in WUFUN receptively or productively. Laufer (1998) distinguished three types of vocabulary knowledge, namely passive vocabulary, controlled active vocabulary and free active vocabulary. In a more recent article (2004), she divides knowledge of a word into four degrees of strength: productive recall, receptive recall, productive recognition and receptive recognition, which are ranked hierarchically (from the highest to the lowest) in terms of the strength of the word knowledge. We chose two test formats for the receptive knowledge test: the receptive recognition test (the lowest strength) and the vocabulary level test (Laufer & Nation, 1995). For the productive knowledge test, we used the controlled active vocabulary test (Laufer, 1998), which closely resembles the equivalent of the receptive recall test for the second highest strength of the word knowledge. To avoid the test-wise effect, we used some distracters in both tests. There were 25 words to be marked in the receptive test and 21 words in the productive test. The same two tests were administered again after software use to see whether there were vocabulary gains and what these might be.

Pre- and post-questionnaires. A pre-questionnaire (Q1) was administered before software use to glean information about the students’ vocabulary learning strategies and their expectations of the software (WUFUN) they were going to use. We mainly used multiple-choice questions; both the questions and the choice of answers were carefully designed to ensure that the information given would be as complete as possible and thus give as accurate a picture as possible of the students’ opinions. A post-questionnaire (Q2) was administered after software use. It aimed to find out to what degree the students were satisfied after using WUFUN and to obtain their comments and suggestions. It is divided into 13 sections and made up of 44 questions on a 5-point scale plus a few open questions. Students were asked to give a rating in terms of their satisfaction regarding the various components (see Figure 3 for a brief review) of the program. Questions were also asked on the scoring and checking system (feedback system), interface design, graphic design, sound system, etc. At the end there was an open section for any comments and suggestions regarding the software and to find out whether the students had learned or become aware of the vocabulary learning strategies embedded in the software.

Experiment procedure.
The whole experiment follows an eight-step linear sequence: pre-receptive test, pre-productive test, pre-questionnaire, software use, post-questionnaire, post-receptive test, post-productive test and an interview. The last step, the interview, was limited to G1; it was not used with G2 due to practical constraints. It took about 2–2.5 hours to complete the whole procedure. It should be noted that learners were told beforehand that they would study a piece of vocabulary learning software but they did not know about the detailed procedure involved. Although they were tested before using the software, most students would not have expected a test afterwards.

Data collection and analysis. For each subject in G1 we collected four scores on vocabulary tests, two sets of information from the pre- and post-questionnaires, user actions recorded by the software system, and some follow-up information from the interview. For G2, we have all the information except the follow-up information. We obtained each student’s vocabulary gain at both the receptive and the productive level by subtracting the pre-scores from the post-scores. We performed a t-test to see whether there was a significant difference between the two groups. We calculated all the ratings in all the sections of the post-questionnaire and calculated a mean for each student, on a scale from 1 to 5, as the learner evaluation. A profile recording system built into the software enabled us to examine the user actions during software use. For both groups we performed a correlation test between the user actions and the vocabulary gain and another correlation test between the user actions and the learner evaluation.

Results and Discussion

Pre-questionnaire. From Q1, we get a detailed picture of the students’ profile. In addition, an in-depth study of the quantified results reveals some characteristics of the students’ learning habits and of their perception of CALL program learning. As for vocabulary learning, the most popular memorisation strategy is rote learning accompanied by periodic review. Other more elaborate techniques, such as mnemonics and word grouping, are also reported to have been used, but less frequently. The listening approach is adopted by the fewest students. The students tend to be ready to perform tasks perceived as interesting or less demanding, such as viewing pictures or reading stories, and are more likely to avoid demanding tasks such as doing exercises or learning vocabulary. However, the avoidance could be compensated for by the usefulness they perceive in performing the task. If they received help to make the task easier, they would certainly be more willing to do it.

Gain in receptive and productive vocabulary. Table 1 presents the mean score of both the receptive and the productive tests for both groups. Table 2 presents the means of receptive gain between the pre-test and the post-test for both groups.

Table 1. Mean and standard deviation (SD) for pre-test and post-test

                                 Mean             SD            Minimum         Maximum
                              Pre     Post     Pre    Post     Pre    Post     Pre    Post
Receptive (Full = 25)   G1   15.59   21.88    2.62   1.45     11     19       19     24
                        G2   16.06   20.5     3.11   3.84      8     10       20     24
Productive (Full = 21)  G1   10.91   16.12    2.31   2.64      7     11.5     15     20.5
                        G2    9.64   14.25    4.05   4.41      2      2       15.5   20

Table 2. Mean for receptive gain

      Mean    SD      Learning rate*   Minimum   Maximum
G1    6.29    2.69    40%              3         11
G2    4.44    2.38    28%              1         9

Note: *Learning rate is calculated by dividing the mean gain (the difference between the mean post-test score and the mean pre-test score) by the mean pre-test score; e.g., the receptive learning rate of 40% for G1 is obtained by dividing the gain (6.29) by the mean pre-test score (15.59).
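As a quick arithmetic check, reading the learning rate as the mean gain divided by the mean pre-test score reproduces the percentages in Table 2 from the receptive means in Table 1. The function name below is only illustrative.

```python
def learning_rate(pre_mean: float, post_mean: float) -> float:
    """Mean gain expressed as a fraction of the mean pre-test score."""
    return (post_mean - pre_mean) / pre_mean


# Receptive means from Table 1: (pre-test, post-test)
receptive = {"G1": (15.59, 21.88), "G2": (16.06, 20.50)}

for group, (pre, post) in receptive.items():
    gain = post - pre
    print(f"{group}: gain = {gain:.2f}, learning rate = {learning_rate(pre, post):.0%}")
# G1: gain = 6.29, learning rate = 40%
# G2: gain = 4.44, learning rate = 28%
```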

Table 3 presents the means of productive gain between the pre-test and the post-test for both groups. The mean scores set out in Table 1 reveal that the pre-test scores of the two groups for receptive and productive vocabulary are quite similar (receptive: 15.59 and 16.06 out of 25; productive: 10.91 and 9.64 out of 21); a t-test confirms that there is no significant difference between the two groups (not reported here for the sake of space). It would seem that both groups have a similar starting point in terms of pre-knowledge of the vocabulary items to be studied. However, G2 had a higher SD than G1 for both pre-test and post-test at both vocabulary levels, showing that there was a bigger difference between the subjects within G2 than within G1. The gain for both groups was quite satisfactory considering that there was a high baseline for each group (see Table 1). The figures presented in Table 1 imply that G1 had nine words to learn to a receptive level and 10 words to a productive level; G2 had nine words to learn to a receptive level and 11 words to a productive level.

Our first research question was: To what extent will WUFUN help Chinese learners to acquire vocabulary perceived as difficult at the receptive and the productive level in two different settings: individual use and classroom use? It seems that both groups achieved a considerable learning rate at both the receptive and the productive level. Moreover, both groups have a higher vocabulary learning rate at the productive level than at the receptive level (47% > 40% for G1; 48% > 28% for G2). Initially, it appeared that G1 had gained more vocabulary at both the receptive and productive levels. By performing a t-test to compare the means for both groups we find, however, that G1 did significantly better than G2 at the receptive level but not at the productive level. See Tables 4 and 5 for the results.
Table 4 shows that the difference in receptive gain between G1 and G2 is significant (t Stat 2.16 > t Critical 2.03, df = 33, p < .05); however, the difference in productive gain between the two groups is not significant, as shown in Table 5 (t Stat 0.78 < t Critical 2.03, p > .05).

Table 3. Mean for productive gain

      Mean    SD      Learning rate   Minimum   Maximum
G1    5.32    2.59    47%             0         9.5
G2    4.56    3.17    48%             0         11.5

Table 4. T-test of receptive gain between two groups

T-test   Mean    Variance   Observations   df   T-stat   T-critical   p
G1       6.29    7.22       17             33   2.16     2.03         .038*
G2       4.44    5.67       18

Note: *p < .05 (two-tailed).
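The t statistics reported for the two gain comparisons can be approximately recomputed from the published summary statistics alone. The sketch below assumes the equal-variance (pooled) form of the two-sample t-test, which is consistent with the reported df = 33 but is not stated explicitly in the paper; the function name is illustrative. Inputs are the means, variances and group sizes given for receptive gain (Table 4) and productive gain (Table 5).

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic assuming equal variances, from summary statistics."""
    df = n1 + n2 - 2
    # pooled variance: variances weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Receptive gain: G1 vs G2 (Table 4)
t_receptive, df = pooled_t(6.29, 7.22, 17, 4.44, 5.67, 18)
# Productive gain: G1 vs G2 (Table 5)
t_productive, _ = pooled_t(5.32, 6.69, 17, 4.56, 10.03, 18)

print(f"receptive:  t = {t_receptive:.2f}, df = {df}")
print(f"productive: t = {t_productive:.2f}, df = {df}")
```

Under this assumption the receptive comparison comes out at 2.16, matching the reported value, and the productive comparison at about 0.77, close to the reported 0.78; the small difference is what one would expect from working with rounded means and variances.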


The two findings given above (that the productive learning rates are higher than the receptive learning rates for both groups, and that the difference in vocabulary gain between the two groups is significant at the receptive level but not at the productive level) seem to indicate that WUFUN is slightly more able to help learners to learn vocabulary productively than receptively, whether in individual or classroom use.

Post-questionnaire (learner evaluation). As mentioned earlier, there are two types of questions in Q2: rating scale questions and open questions. We will focus only on the rating scale questions here and leave the open questions to a later stage. See Table 6 for the means of evaluation for both groups. A t-test shows there is no significant difference between the two groups (t Stat 0.93 < t Critical 2.03, df = 33, p > .05). Both groups gave a good evaluation of the program; G1 had a mean of 4 out of 5 and G2 had a mean of 3.83 out of 5. Asked whether they would like to use the software when more units are developed in the future, all the subjects in G1 replied “Yes”. Three out of 18 in G2 replied “No”, which still leaves a positive result given that G2 were forced, as it were, into participating in the experiment. In response to the research question ‘How do users evaluate WUFUN in the two different settings?’, the software evaluation by the learners in both the individual and the classroom setting is satisfactory, with most students from both groups expressing their willingness to continue to use the software in the future.

Of the 13 sections of Q2, the favourite section for G1 is the “scores and checking system”, which has an average rating of 4.53. The same section is the favourite for G2, with an average rating of 4.47.
The lowest-rated section (3.65) for G1 is the “program sequence”, in which students are asked whether they like the sequence of the program and whether they feel they should follow the guidelines of the program instead of doing what they want. The CALL efficacy model described earlier is implemented in the program sequence, as is the restricted user freedom regarding the control of the program. For G2, this section is rated the second lowest (3.47). Nevertheless, the ratings of both groups for this section exceed the middle point of the rating scale. This suggests that subjects in both settings do not particularly like the constraints but find them acceptable.

Table 5. T-test of productive gain between two groups

T-test   Mean    Variance   Observations   df   T-stat   T-critical   p
G1       5.32    6.69       17             33   0.78     2.03         .44
G2       4.56    10.03      18

Table 6. Mean of learner evaluation of the software

      Mean    SD      Minimum   Maximum*
G1    4       0.42    3.21      4.74
G2    3.83    0.63    2.1       4.53

Note: *Full rating = 5.

User actions. Table 7 presents the mean of user actions for both groups: time spent on the program, number of words viewed in WF, number of words viewed in WMA, time spent on the exercises and score obtained for the exercises. A quick look at this table reveals that the two groups are very different in the way they use the software. At first sight, it appears that G2 spent more time on the program than G1, but G2 had a much greater SD (33.5) than G1 (18.72). A careful look at the data shows that three subjects (all females) in G2 spent 141, 150 and 156 minutes on the program. If these three were excluded, the average time for G2 would be about 73 minutes. For G1, the longest time spent was 112 minutes. Thus, in fact, subjects in G2 generally spent less time than those in G1, except for the three female subjects. G2 also spent less time on the exercises and scored much lower than G1.

To answer the research question ‘How are user actions related to learner evaluation and to the learning results in the two different settings?’, we performed multiple correlation tests between each selected user action (listed in Table 7) and the receptive gain, the productive gain (learning results) and the learner evaluation (results of Q2) for both groups. See Table 8 and Table 9 for the results. Note that in both Tables 8 and 9, correlations whose absolute value is smaller than 0.2 are excluded. As revealed in Tables 8 and 9, the situations for the two groups are quite different. For G1, the total time spent on the software has a significant negative correlation (r = -.52, p < .05) with the learner evaluation; that is, the more time spent on the software, the lower the evaluation tends to be.
This is the opposite for G2, for which there is a significant positive correlation (r = .51, p < .05) with the evaluation. The correlations between total time and the receptive and productive gain are weak and not significant for either group. The number of words viewed in WMA has a significant positive correlation with the receptive gain for both groups (r = .61, p < .01 for G1; r = .52, p < .05 for G2). The number of words viewed in WF has little correlation with receptive and productive gain for G1; for G2 it has a significant correlation with the productive gain and a weaker, non-significant correlation with the receptive gain. Time spent on the exercises seems to have little to do with the receptive and productive gain for G1; it has a stronger positive correlation (r = .47, p < .05) with the receptive gain for G2. The score for the exercises has a significant positive correlation (r = .49, p < .05) with the productive gain for G1 and a significant positive correlation (r = .51, p < .05) with the receptive gain for G2.

In both tables we find that three factors (total time spent on the program, words viewed in WMA and score obtained for the exercises) seem to be more closely related to the learning results and learner evaluation for both groups. The way these factors correlate with the learning results and evaluation is, however, quite different for the two groups. For example, it is not clear why the total time spent on the program is correlated with the evaluation in opposite directions for G1 and G2. The only pattern shared by both groups is that WMA has a similar positive correlation with receptive gain. This indicates that WMA, the main section introducing vocabulary learning strategies, is more likely to be helpful for receptive vocabulary gain. But why is it less likely to be helpful for productive vocabulary gain? One assumption might be that a single exposure to vocabulary learning strategies is not enough to help the students to learn the vocabulary to a productive level.

Table 7. Mean of user actions

      Time (minutes)   WMA     WF      Time on ex. (minutes)   Score for ex. (Max. = 100)
G1    80.77            7.94    17.65   22.53                   60.08
G2    85.44            8.56    18.67   17.44                   36.30

Table 8. Correlation between user actions and their learning results and learner evaluation for G1

Pearson r            Time     WMA      WF    Time on ex.   Score for ex.
Learner evaluation   -.52*    -.22
Receptive gain        .25      .61**
Productive gain       .32      .44                          .49*

Note: *p < .05. **p < .01 (two-tailed). One further value, .36, belongs to the last three columns, but its cell cannot be determined from the extracted layout.

Table 9. Correlation between user actions and their learning results and learner evaluation for G2

Pearson r            Time     WMA      WF      Time on ex.   Score for ex.
Learner evaluation    .51*     .35      .44     .35           .33
Receptive gain        .38      .52*     .21     .47*          .51*
Productive gain       .31      .28      .5*     .42           .21

Note: *p < .05 (two-tailed).
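The significance markers attached to the correlations can be checked by converting each Pearson r into a t statistic with n - 2 degrees of freedom and comparing it with the two-tailed critical value. The sketch below is an illustration of that standard conversion, not a description of how the authors ran their tests; the critical values are taken from standard t tables for df = 15 (G1, n = 17) and df = 16 (G2, n = 18), and the function names are hypothetical.

```python
import math

# Two-tailed critical t values from standard tables, keyed by degrees of freedom.
T_CRITICAL = {
    15: {0.05: 2.131, 0.01: 2.947},
    16: {0.05: 2.120, 0.01: 2.921},
}


def r_to_t(r: float, n: int) -> float:
    """t statistic for testing zero correlation, given Pearson r from n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def stars(r: float, n: int) -> str:
    """Return '**' (p < .01), '*' (p < .05) or '' for a two-tailed test."""
    t = abs(r_to_t(r, n))
    critical = T_CRITICAL[n - 2]
    if t > critical[0.01]:
        return "**"
    if t > critical[0.05]:
        return "*"
    return ""


examples = [
    ("G1 time vs. evaluation", -0.52, 17),
    ("G1 WMA vs. receptive gain", 0.61, 17),
    ("G2 WMA vs. receptive gain", 0.52, 18),
    ("G2 WF vs. evaluation", 0.44, 18),
]
for label, r, n in examples:
    print(f"{label}: r = {r:+.2f}, t = {r_to_t(r, n):+.2f} {stars(r, n)}")
```

With these inputs the conversion flags -.52 and .52 at the .05 level, .61 at the .01 level, and leaves .44 unmarked, in agreement with the markers reported for these coefficients.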
To learn a word productively, one needs, in addition to deep mental processing of the lexical information, sufficient familiarity with the word in different contexts. Therefore, the score the subjects obtained for the exercises would be more likely to account for the productive gain. This is the case with G1, for which a significant positive correlation is found between the exercise score and the productive gain. This is not the case with G2, where a significant positive correlation is found only between the exercise score and the receptive gain.

Finally, an extra correlation test was performed between the learner evaluation and the learning outcome. No significant correlation was found for either group or for either type of vocabulary gain. Unlike previous findings, learner attitude toward the learning tasks does not greatly affect the learning results. For example, the subject in G1 who had the lowest evaluation of the software (3.21) turned out to have achieved a high vocabulary gain both receptively (10 words) and productively (eight words). This subject stated frankly in the interview that he did not like the software because the ‘rigid’ order of the program would not allow him to exercise his individuality and creativity. He spent 90 minutes on the software, of which 19 minutes were devoted to the exercises, and obtained a score of 73.9. In addition, he viewed 15 words in WMA and 17 in WF. Note that he spent more time, viewed more words in WMA and did better on the exercises than the average (see Table 7). Although he did not like the restricted freedom regarding the software use, it is this design feature that guided and controlled his actions, which led to his superior learning results over others who gave a higher evaluation of the software but who spent less time and viewed fewer words. This, on the one hand, suggests that the design of WUFUN based on the CALL efficacy model has been preliminarily successful; on the other hand, it indicates that affective factors such as attitudes towards the learning task do not always predict the learning result. What matters is what learners actually do in the learning process.

Subjects in the two different settings provided rather different pictures of how user actions are related to vocabulary gain and learner evaluation. The difference can be attributed to the quantitatively different user behaviour, as shown in Table 7. It is very likely that the two groups differ in several respects; for example, individual users in G1 are doubtless more motivated to use the software than group users in G2, since the former volunteered to participate in the study while the latter were coerced into doing so. One subject in G2 spent only 44 minutes on the software, including two minutes on the exercises, and viewed two words in WMA and one word in WF. His vocabulary gain turned out to be the lowest: two words receptively and zero words productively.
The comments he gave were negative: he considered that the software was boring and that it did not differ much from their textbooks. His insufficient user actions and poor learning outcome are clearly the result of a lack of motivation. In addition, the subjects in G1 might have been more at ease, attentive and relaxed than those in G2 in the experiment due to the different settings. It should be remembered that subjects in G1 completed the experiment individually while all the subjects of G2 were placed together in a computer room.

Other information. This information includes comments and suggestions given by the subjects in the free open section in Q2 and further information obtained from the interview (limited to G1). In addition, the subjects were asked to indicate whether they acquired some useful strategies for learning vocabulary from the software. Table 10 presents quantitative information regarding answers to the two questions. The response rates for these two questions are much higher for G1 than for G2; in addition, the quality of the answers for G1 is definitely better in terms of content and length.

Table 10. Response rate for questions in free section in Q2

              Comments/suggestions   Percentage   Ideas for voc. learning   Percentage
G1 (n = 17)   16                     94%          13                        76%
G2 (n = 18)   15                     83%          5                         27%

We will discuss only the strategies that they acquired from using the software in order to answer the research question: Are learners likely to develop vocabulary learning strategies that will facilitate vocabulary learning in the long run in the two different settings? See Table 11 for the categorization of the strategies both groups claimed to have acquired from the software. We noted two facts. First, most learners mentioned just one or two strategies. Second, the learners tend to adopt the strategies that require less mental effort and show less interest in those requiring more mental effort, such as imagery and practising words in different contexts. This could also be due to the perceived usefulness of each category. Thus our answer to this research question is that the majority of individual users acquired one or two strategies perceived to be useful from using the software, but strategies requiring more mental effort are less likely to be appreciated. In contrast, the embedded vocabulary learning strategies are largely ignored by most group learners. This does not necessarily mean that those strategies were not perceived to be useful, but simply that the strategies have not entered into their metacognitive repertoire. It seems that a single exposure to the software for a short period is not enough to help students to develop vocabulary learning strategies in a systematic way. It may also be due to limited mental processing capacity: when learners attend to both the form and the meaning of the vocabulary items, the cognitive load might be too heavy to allow them to pay more than limited attention to the embedded learning strategies.

Conclusion and Suggestions for the Future Study

Our main objective has been to introduce the CALL efficacy model to ensure the quality of CALL programs. The model is constructed by identifying four main components (theory, computer technology, user actions and learner information) and integrating them into a whole.
They influence and interact upon each other, thus strengthening all the fibres or links of the model. It is these that determine the quality of a CALL program as well as constituting the methodology of a CALL program. It is

Table 11. Learning strategies acquired by G1 and G2

Put words in sentences to memorize them Put words similar in form and meaning together to study Separate roots or affixes from the words Make word associations Practise words (in diversified contexts) Listen to the words in sentences or a text Compare words and group words Image the meaning of words

G1

G2

4 3 2 2 1 1 1 1

1

1 1

38 Q. Ma and P. Kelly shown how the model can be applied to the design of the CAVL software WUFUN for Chinese university students to learn difficult vocabulary items. A pilot study is reported in order to evaluate the software and to validate the model empirically in both individual use and classroom use. From the results of the study, it seems that the CALL efficacy model underpinning WUFUN has been preliminarily proved to be effective in both settings. Due to the complicated experimental procedure we collected a large amount of different data. Data analysis and results reporting were also a painstaking process. It is arguable whether we have chosen the ideal research methodology for such a complicated study. Regarding our first research question, the learning outcome of WUFUN, it is demonstrated that by using the software, learners can acquire vocabulary perceived as difficult both receptively and productively in both settings. Moreover, the productive learning rate is slightly higher for both. Learners have acquired a few vocabulary learning strategies but not in a systematic way that would allow their further independent use, which is probably due to their limited mental processing. For the second research question, learner evaluation in both settings is fairly satisfactory despite the constraints incorporated into the software, and the majority of learners reported that they would like to use the software when more units are developed. There is not yet a satisfactory answer to the third question. Some user actions, such as total time, number of words viewed in WMA and the score obtained for exercises, seem to be closely related to the learning outcome and learner evaluation; however, subjects in different settings, individual use or classroom use, revealed very different pictures. Learner attitudes towards the software do not appear to affect the learning outcome which is more related to what learners actually do in the learning process. 
There are, however, a number of suggestions to be made for the next study. Improvements will be made to the design of the WUFUN software based on the results of the pilot study and the comments and suggestions made by learners (for example, more pictures should be added to the software). More importantly, the following questions will be addressed:

1. The instruction of vocabulary learning strategies will be made more explicit. The first step is to make learners notice the existence of vocabulary learning strategies and convince them of their usefulness. In other words, a (short) strategy training session can be held before the software is used, arousing the learners’ metalinguistic awareness so as to maximize the learning potential of the software.

2. The user data recording system will be elaborated to allow a more detailed recording of the user actions, e.g., what words are viewed in WF or WMA. This can enable us to look at user actions more clearly in relation to other learner information, such as previous vocabulary knowledge. It might also lead to a more satisfactory answer to the question of how user actions are related to learning results.

3. We need to further test the CALL efficacy model. In the present study, user actions and learning results were investigated under the constraints embedded in the software. In the next study we will make a different version of WUFUN with all the constraints removed, where the user is given complete freedom to decide what to do with the software and in what order. We shall compare the user actions and the learning outcome in two conditions: one with constraints and the other constraint-free.

Acknowledgements

We wish to extend our thanks to the following: Sylviane Granger (University of Louvain) for her support and for her constructive comments on earlier versions of this research; Nora Condon (University of Louvain) for her insightful remarks and her participation in our lengthy discussions; Frank Boers (Erasmus College of Brussels and University of Antwerp) for his careful reading of the text, for his many helpful suggestions and for his keen interest in our research; the two anonymous reviewers on whose suggestions we have endeavoured to act.

Notes

1. This independence and know-how are essentially what we mean by ‘good learner’. It was a key feature of the method of language learning for non-language specialists developed by a number of Belgian linguists in the 1980s (Kelly, 1989; Ostyn & Godin, 1985). The learner assumes responsibility for his or her learning, and is given the materials and knowledge needed to progress on their own. It is beyond the scope of this paper to say what that knowledge is, as that would take us into the wide and well-researched world of learning strategies.

2. Implicit learning in cognitive psychology can be defined as “learning without awareness of what is learned” (Dekeyser, 2003, p. 314). Thus explicit learning can be defined as learning with awareness of what is learned.

3. This cultural aspect of vocabulary learning is stressed and discussed at some length in one of the research papers that preceded the development of the software (Vanparys et al., 1997).

4. Idioms are introduced for two purposes: to add them to the learners’ lexicon and to show how idioms in different languages reflect the culture of the language.

5. Ridiculous and funny can both be translated into hao xiao de in Chinese. Thus if a Chinese learner only remembers the translation for the two words, s/he would not know that ridiculous has a negative connotation while funny is always positive.

6. Each time the user goes to WF or WMA to view a word, a count will be recorded. If the data show that a user has viewed 15 words in WF, this only means s/he has referred to words in WF 15 times and does not necessarily mean s/he has looked up 15 different words, because a given word can be viewed several times. Due to some technical constraints, the software programmer was unable to develop the function to record what words were viewed.

7. The source of motivation may be the value learners perceive in using the software, since most Chinese learners are keen to improve their English, both on account of exam requirements and to improve their chances of professional advancement.
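The difference between the view counts described in Note 6 and the more detailed recording proposed in the second suggestion for the next study can be illustrated with a short sketch. It is written in Python purely for illustration; the class `ViewLog`, its method names, and the sample events are hypothetical and are not part of WUFUN. It assumes only that each view can be stored together with the area (WF or WMA) and the word concerned.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ViewLog:
    """Hypothetical per-user log of word views (not WUFUN's actual code)."""
    events: list = field(default_factory=list)  # (area, word) pairs

    def record(self, area: str, word: str) -> None:
        # Store which word was viewed, not just that a view happened.
        if area not in ("WF", "WMA"):
            raise ValueError(f"unknown area: {area}")
        self.events.append((area, word.lower()))

    def total_views(self, area: str) -> int:
        # What a count-only system reports: number of visits.
        return sum(1 for a, _ in self.events if a == area)

    def distinct_words(self, area: str) -> int:
        # What becomes available once word identity is recorded.
        return len({w for a, w in self.events if a == area})

    def views_per_word(self, area: str) -> Counter:
        return Counter(w for a, w in self.events if a == area)


log = ViewLog()
for word in ["shrink", "dump", "shrink", "utterly", "shrink"]:
    log.record("WF", word)
log.record("WMA", "dump")

print(log.total_views("WF"))     # 5 visits to WF
print(log.distinct_words("WF"))  # 3 different words
```

With word identity stored, repeated views of the same item can be separated from views of different items, which is the information a count alone cannot provide.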

Notes on contributors

Qing Ma is currently doing a Ph.D. in applied linguistics at the University of Louvain, Faculty of Arts, Belgium. Her main research interests include second language vocabulary acquisition and CALL.

Peter Kelly is a professor of linguistics at China University of Three Gorges, formerly senior professor at the University of Namur, Belgium, where he directed the School of Modern Languages. His main research interests are in the area of second language acquisition.

References

Ames, W. S. (1966). The development of a classification scheme of contextual aids. Reading Research Quarterly, 11(1), 57 – 82.
Asher, J. J. (1964). Towards a neo-field theory of behaviour. Journal of Humanistic Psychology, 4, 85 – 94.
Asher, J. J. (1983). Learning another language through actions. Los Gatos: California Sky Oaks Productions, Inc.
Atkinson, R. C., & Raugh, M. R. (1975). An application of the mnemonic keyword method to the acquisition of a Russian vocabulary. Journal of Experimental Psychology: Human Learning and Memory, 104(2), 126 – 133.
Beck, I. L., McKeown, M. G., & McCaslin, E. S. (1983). Vocabulary development: all contexts are not created equal. The Elementary School Journal, 83(3), 177 – 181.
Boers, F., Eyckmans, J., & Stengers, H. (2004). Researching mnemonic techniques through CALL: the case of multiword expressions. Proceedings of The Eleventh International CALL Conference (pp. 43 – 48). Antwerp: University of Antwerp.
Boers, F., & Lindstromberg, S. (2005). Finding ways to make phrase-learning feasible: the mnemonic effect of alliteration. System, 33, 225 – 238.
Burt, M., & Dulay, H. (1975). New directions in second language teaching, learning and bilingual education. Washington, DC: TESOL.
Chapelle, C. A. (2001). Computer applications in second language acquisition. Cambridge: Cambridge University Press.
Chun, D. M., & Plass, J. L. (1996a). Effects of multimedia annotations on vocabulary acquisition. The Modern Language Journal, 80(2), 183 – 198.
Chun, D. M., & Plass, J. L. (1996b). Facilitating reading comprehension with multimedia. System, 14(4), 503 – 518.
Clark, R. E. (1994). Media will never influence learning. Educational Technology Research and Development, 42(2), 21 – 29.
Coady, J. (1993). Research on ESL/EFL vocabulary acquisition: putting it in context. In T. Huckin, M. Haynes, & J. Coady (Eds.), Second language reading and vocabulary learning (pp. 3 – 23). Norwood, NJ: Ablex Publishing.
Coady, J. (1997). L2 vocabulary acquisition: a synthesis of research. In J. Coady & T. Huckin (Eds.), Second language vocabulary acquisition (pp. 273 – 290). Cambridge: Cambridge University Press.
Cohen, A. D. (1987). The use of verbal and imagery mnemonics in second-language vocabulary learning. Studies in Second Language Acquisition, 9(1), 43 – 64.
Cohen, A. D. (1998). Strategies in learning and using a second language. Harlow, Essex: Longman.
Cohen, A., Weaver, S. J., & Li, T. Y. (1996). The impact of strategies-based instruction on speaking a foreign language. CARLA Working Paper Series, 4. Retrieved January 10, 2005, from www.carla.umn.edu/about/profiles/CohenPapers/SBIimpact.pdf
Colpaert, J. (2003). Introduction to CALL. Lecture given at the ELSNET Summer School 2003, June, Lille, France.
Colpaert, J. (2004). Design of online interactive language courseware: conceptualisation, specification and prototyping. Research into the impact of linguistic-didactic functionality on software architecture. Unpublished PhD thesis, University of Antwerp, Belgium. Retrieved June 8, 2005, from www.didascalia.be/doc-design.pdf
Dekeyser, R. (2003). Implicit and explicit learning. In J. Doughty & M. L. Long (Eds.), The handbook of second language acquisition (pp. 313 – 348). Oxford: Blackwell.
De Ridder, I. (2002). Visible or invisible links: Does the highlighting of hyperlinks affect incidental vocabulary learning, text comprehension, and the reading process? Language Learning & Technology, 6(1), 123 – 146.
De Saussure, F. (1974). Course in general linguistics. London: Fontana/Collins.
Duquette, L., Renié, D., & Laurier, M. (1998). The evaluation of vocabulary acquisition when learning French as a second language in a multimedia environment. Computer Assisted Language Learning, 11(1), 3 – 34.
Ervin-Tripp, S. (1974). Is second language learning like the first? TESOL Quarterly, 8, 111 – 127.
Gary, J. O. (1975). Delayed oral practice in initial stages of second language learning. In M. Burt & H. Dulay (Eds.), New directions in second language teaching, learning and bilingual education (pp. 89 – 95). Washington, DC: TESOL.
Gary, N., & Gary, J. O. (1982). Packaging comprehension materials: towards effective language instruction in difficult circumstances. System, 10(1), 61 – 69.
Goodfellow, R. (1994). A computer-based strategy for foreign-language vocabulary learning. Unpublished PhD thesis, Open University, UK.
Goodfellow, R. (1995). A review of the types of CALL programmes for vocabulary instruction. Computer Assisted Language Learning, 2 – 3, 205 – 226.
Groot, P. J. M. (2000). Computer assisted second language vocabulary acquisition. Language Learning & Technology, 4(1), 60 – 81.
Hulstijn, J. (1992). Retention of inferred and given word vocabulary learning. In P. J. Arnaud & H. Béjoint (Eds.), Vocabulary and applied linguistics (pp. 113 – 125). London: Macmillan.
Hulstijn, J. (2003). Incidental learning and intentional learning. In J. Doughty & M. L. Long (Eds.), The handbook of second language acquisition (pp. 349 – 381). Oxford: Blackwell Publishing Ltd.
Kelly, P. (1986). Solving the vocabulary retention problem. ITL, 74, 1 – 16.
Kelly, P. (1989). A particular application of the RALEX method of foreign language learning. Le Langage et l’Homme, 24(70), 153 – 160.
Kelly, P. (1992). Does the ear assist the eye in the long-term retention of lexis? International Review of Applied Linguistics, 30(2), 137 – 145.
Kelly, P., & Li, X. (2005). A new approach to learning English vocabulary: more efficient, more effective, and more enjoyable. Beijing: Foreign Language Teaching and Research Press.
Kelly, P., Li, X., Vanparys, J., & Zimmer, C. (1996). A comparison of the perceptions and practices of Chinese and French-speaking Belgian university students in the learning of English: the prelude to an improved programme of lexical expansion. ITL, 113 – 114, 275 – 303.
Knight, S. (1994). Dictionary: The tool of last resort in foreign language reading? A new perspective. The Modern Language Journal, 78, 285 – 299.
Krashen, S. (1989). We acquire vocabulary and spelling by reading: additional evidence for the input hypothesis. The Modern Language Journal, 73(4), 440 – 464.
Krashen, S. (1993). The power of reading. Englewood, Colorado: Libraries Unlimited Inc.
Krashen, S., & Terrell, T. (1983). The natural approach: language acquisition in the classroom. Oxford: Pergamon Press.
Kukulska-Hulme, A. (1988). A computerized interactive vocabulary development system for advanced learners. System, 16(2), 163 – 170.
Laufer, B. (1997). The lexical plight in second language reading: words you don’t know, words you think you know and words you can’t guess. In J. Coady & T. Huckin (Eds.), Second language vocabulary acquisition (pp. 20 – 34). Cambridge: Cambridge University Press.
Laufer, B. (1998). The development of passive and active vocabulary in a second language: Same or different? Applied Linguistics, 19(2), 255 – 271.
Laufer, B. (2001). Reading, word-focused activities and incidental vocabulary acquisition in a second language. Prospect, 16(3), 44 – 54.
Laufer, B., Elder, C., Hill, K., & Congdon, P. (2004). Size and strength: do we need both to measure vocabulary knowledge? Language Testing, 21(2), 202 – 226.
Laufer, B., & Hill, M. (2000). What lexical information do L2 learners select in a CALL dictionary and how does it affect word retention? Language Learning & Technology, 3(2), 58 – 76.
Laufer, B., & Nation, I. S. P. (1995). Vocabulary size and use: lexical richness in L2 written production. Applied Linguistics, 16, 307 – 322.
Levy, M. (1997). Computer assisted language learning. Oxford: Clarendon Press.
Levy, M. (1999). Design processes in CALL: integration theory, research and evaluation. In K. Cameron (Ed.), Computer assisted language learning: media, design and applications (pp. 84 – 107). Lisse: Swets & Zeitlinger.
Levy, M. (2002). CALL by design: discourse, products and process. ReCALL, 14(1), 55 – 84.
Li, X., Song, X., Zimmer, C., Vanparys, J., & Kelly, P. (1999). WUFUN: a new approach to more efficient and effective vocabulary learning. ITL, 125 – 126, 181 – 194.
Miller, G. A. (1996). The science of words. New York: Scientific American Library.
Nagy, W. E., Herman, P. A., & Anderson, R. C. (1985). Learning words from context. Reading Research Quarterly, 20, 233 – 253.
Nation, I. S. P. (1990). Teaching and learning vocabulary. New York: Newbury House Publishers.
Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.
Nord, J. R. (1978). Developing listening fluency before speaking: an alternative paradigm. Paper presented at the 5th World Congress of Applied Linguistics, Montreal, Canada.
O’Malley, J. M., & Chamot, A. U. (1990). Learning strategies in second language acquisition. Cambridge: Cambridge University Press.
O’Malley, J. M., Chamot, A. U., Stewner-Manzanares, G., Russo, R. P., & Kupper, L. (1985). Learning strategy applications with students of English as a second language. TESOL Quarterly, 19, 557 – 584.
Ostyn, P., & Godin, P. (1985). RALEX: An alternative approach to language teaching. The Modern Language Journal, 6(4), 346 – 355.
Oxford, R. L., & Scarcella, R. C. (1994). Second language vocabulary learning among adults: State of the art in vocabulary instruction. System, 22(2), 231 – 243.
Paribakht, T. S., & Wesche, M. B. (1997). Vocabulary enhancement activities and reading for meaning in second language vocabulary acquisition. In J. Coady & T. Huckin (Eds.), Second language vocabulary acquisition (pp. 174 – 200). Cambridge: Cambridge University Press.
Paivio, A., & Desrochers, A. (1979). Effects of an imagery mnemonic on second language recall and comprehension. Canadian Journal of Psychology, 33, 17 – 28.
Paivio, A., & Desrochers, A. (1980). A dual-coding approach to bilingual memory. Canadian Journal of Psychology, 34, 388 – 399.
Postovsky, V. A. (1975). The priority of aural comprehension in the language acquisition process. Paper presented at the 4th AILA World Congress, Stuttgart, Germany.
Pressley, M., & Levin, J. R. (1981). The keyword method and recall of vocabulary words from definitions. Journal of Experimental Psychology: Human Learning, 17(1), 72 – 76.
Sternberg, R. J. (1987). Most vocabulary is learned from context. In M. G. McKeown & M. E. Curtis (Eds.), The nature of vocabulary acquisition (pp. 89 – 105). London: Lawrence Erlbaum Associates.
Vanparys, J., Zimmer, C., Li, X., & Kelly, P. (1997). Some salient and persistent difficulties encountered by Chinese and Francophone students in the learning of English vocabulary. ITL, 115 – 116, 137 – 164.
Wenden, A. L. (2002). Learner development in language learning. Applied Linguistics, 23, 32 – 55.
Wesche, M. B., & Paribakht, T. S. (2000). Reading-based exercises in second language vocabulary learning: an introspective study. The Modern Language Journal, 84(2), 196 – 213.
Winitz, H. (1977). Nonauditory auditory disorders. Otolaryngologic Clinics of North America, 10, 187 – 192.
Winitz, H. (1978). The learnables. Kansas: International Linguistics Corporation.
Winitz, H., & Reeds, J. A. (1973). Rapid acquisition of a foreign language (German) by the avoidance of speaking. International Review of Applied Linguistics, 18(3), 245 – 247.


Appendix A. Vocabulary items to be studied in WUFUN

Words

Acquaintance, available, burst, dam, damage, despair, dump, fail, formal, funny, injury, jump, land, policy, quantity, ridiculous, roof, shallow, shrink, sign, stretch, suit, utterly, wonder, weight.

Idioms

He is in the depths of despair.
I am fit to burst.
I split my sides laughing.
