L@S 2016 · Work in Progress
April 25–26, 2016, Edinburgh, UK
Browser Language Preferences as a Metric for Identifying ESL Speakers in MOOCs

Judith Uchidiuno
HCI Institute
Carnegie Mellon University
Pittsburgh, PA 15213 USA
[email protected]

Evelyn Yarzebinski
HCI Institute
Carnegie Mellon University
Pittsburgh, PA 15213 USA
[email protected]

Amy Ogan
HCI Institute
Carnegie Mellon University
Pittsburgh, PA 15213 USA
[email protected]

Jessica Hammer
HCI Institute
Carnegie Mellon University
Pittsburgh, PA 15213 USA
[email protected]

Kenneth R. Koedinger
HCI Institute
Carnegie Mellon University
Pittsburgh, PA 15213 USA
[email protected]

Abstract
Open access and low cost make Massively Open Online Courses (MOOCs) an attractive learning platform for students all over the world. However, the majority of MOOCs are deployed in English, which can pose an accessibility problem for students with English as a Second Language (ESL). In order to design appropriate interventions for ESL speakers, it is important to correctly identify these students using a method that is scalable to the high number of MOOC enrollees. Our findings suggest that a new metric, browser language preference, may be better than the commonly-used IP address for inferring whether or not a student is ESL.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author. Copyright is held by the owner/author(s). L@S 2016, April 25-26, 2016, Edinburgh, Scotland UK ACM 978-1-4503-3726-7/16/04. http://dx.doi.org/10.1145/2876034.2893433
Author Keywords
Foreign Language Students; MOOC Accessibility
Introduction
Massively Open Online Courses (MOOCs) are revolutionizing higher education by making college-level courses available to anyone in the world with an internet connection. This open access is proving successful: the average MOOC has about 43,000 students enrolled [3], with a large percentage of these enrollees from non-English speaking countries [2]. Given the international audience that MOOCs attract, there is a potential accessibility issue for students with English as a Second Language (ESL), as the majority of MOOCs are offered in the English language¹.
Many interventions are possible to support ESL students in MOOCs, such as contextual dictionary support, closed captioning, and connecting them with language-diverse peers. However, in order to deploy these interventions, we must be able to reliably identify ESL students at scale. Current practice is to assume learners speak the dominant language in the country indicated by the IP address from which they access the MOOC [2,4,6]. While this method can identify many ESL speakers correctly, it will misidentify ESL speakers who currently live in English-speaking countries; in the US alone, over 20% of the population speaks a language other than English at home [5]. This metric also misidentifies English-proficient students who live in non-English speaking countries.

There is another readily available metric to infer a student’s language abilities: all popular web browsers allow individuals to set the languages in which they prefer to view their web pages. This metric is based on students’ personal preferences rather than simply their location, yet does not require self-report or access to survey responses which learners may not return. To evaluate the potential of this metric, we analyzed video interaction logs of students enrolled in a MOOC deployed on the Coursera platform. Based on prior literature showing ESL students’ relative difficulty comprehending spoken English [1], we expect that these students will interact differently with MOOC videos compared to native English speakers; we expect that they will experience more difficulty with videos, as demonstrated by increased pausing and replaying. We analyze the video interaction behavior for native English-speaking and ESL students as identified by IP address, and again for native English-speaking and ESL students as identified by browser language preference. We then evaluate which metric better captures the expected differences in video interaction behavior between English-native and ESL learners.

Empirical Evaluation and Results
We analyzed video interaction logs of 905 students who enrolled in and completed a 12-week Psychology MOOC. Across all videos in the course, we analyzed the number of times they pressed the play and pause buttons, the number of times they changed the speed of the video (“rate change”), and the number of times their video stalled as a result of buffering and bandwidth issues (“stall”). Similar to [4], we categorized students as English speakers (by IP) if their IP address was associated with a country where English is spoken². Otherwise, the student was categorized as ESL. To determine students’ ESL status by browser language, students were categorized as English speakers if their most-preferred browser language was English, and as ESL if it was any other language.
We determined the students’ ESL status using each of the two metrics, and evaluated each metric by whether it revealed the expected behavioral differences in video use between English-native and ESL learners. Of the 905 students, 34.97% were flagged as ESL by their browser language preference, while 38.64% were flagged as ESL based on their IP address. The ESL status for 77.5% of students matched across both metrics.
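As a minimal sketch of the two classification rules described above, the following Python assumes each student record carries a raw `Accept-Language` header value and a country code already resolved from the IP address. The record fields, the helper names, and the small `ENGLISH_COUNTRIES` set are hypothetical placeholders for illustration; they are not the data format or the country list used in the study.

```python
from typing import Optional

# Hypothetical, deliberately incomplete set of country codes treated as
# English-speaking. The study points to the CIA World Factbook for its list.
ENGLISH_COUNTRIES = {"US", "GB", "CA", "AU", "NZ", "IE"}


def most_preferred_language(accept_language: str) -> Optional[str]:
    """Return the primary subtag of the highest-weighted language, or None."""
    best_tag, best_q = None, -1.0
    for item in accept_language.split(","):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag or tag == "*":
            continue
        q = 1.0  # a missing q parameter means weight 1
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # unparsable weight: rank it last
        if q > best_q:  # strict comparison keeps the earliest tag on ties
            best_tag, best_q = tag, q
    return best_tag.split("-")[0] if best_tag else None


def esl_by_browser(accept_language: str) -> bool:
    """ESL if the most-preferred browser language is anything but English.

    Assumption of this sketch: an empty header (None) also counts as ESL.
    """
    return most_preferred_language(accept_language) != "en"


def esl_by_ip(country_code: str) -> bool:
    """ESL if the IP-derived country is not in the English-speaking set."""
    return country_code.upper() not in ENGLISH_COUNTRIES


def agreement_rate(students) -> float:
    """Fraction of students whose ESL status matches across both metrics."""
    matches = sum(
        esl_by_browser(s["accept_language"]) == esl_by_ip(s["country"])
        for s in students
    )
    return matches / len(students)


# Made-up records, not data from the study.
students = [
    {"accept_language": "en-US,en;q=0.9", "country": "US"},
    {"accept_language": "zh-CN,zh;q=0.9,en;q=0.8", "country": "US"},
    {"accept_language": "en-GB,en;q=0.9", "country": "DE"},
    {"accept_language": "pt-BR,pt;q=0.9,en;q=0.5", "country": "BR"},
]
print(agreement_rate(students))
```

The second and third records illustrate the two kinds of disagreement discussed in the introduction: an ESL speaker living in an English-speaking country, and an English-preferring student in a non-English speaking country.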
¹ https://www.mooc-list.com/languages
² https://www.cia.gov/library/publications/the-world-factbook/fields/2098.html
Figure 1 shows a comparison of the video feature use (play, pause, rate change, and stall) of English-speaking and ESL students when categorized by their IP address. A one-way ANOVA was run to determine if there were significant differences in the behavior of the students between the two groups.

                    English        ESL
N (share)           473 (67.4%)    228 (33.1%)
Avg. Play Count     151.05         183.55
Avg. Pause Count    134.12         181.57

Table 1. Average play/pause count when ESL status based on IP and Browser match; 77.5% of total N
                    English        ESL
N (share)           120 (59%)      84 (41%)
Avg. Play Count     132.03         253.71
Avg. Pause Count    111.59         236.04

Table 2. Average play/pause count when IP_ESL ≠ Browser_ESL, showing ESL based on browser; 22.5% of total N
Figure 1: Average Number of Occurrences of Video Interaction Features with ESL Metric of IP Address

Our results show that there were no significant differences in the play count (F(1)=0.001, p=0.975), pause count (F(1)=0.127, p=0.722), rate change count (F(1)=0.107, p=0.744), or stall count (F(1)=2.009, p=0.157). In other words, when students’ ESL status was determined based on IP address, there were no significant differences in the video interaction behavior of English-speaking students and foreign language students.

Figure 2 shows the same factors with ESL status determined by browser language preference. Results of a one-way ANOVA show that ESL students are pressing the play button (F(1)=6.406, p=0.0115) and the pause button (F(1)=8.191, p=0.00431) significantly more than English-speaking students, as expected from prior literature. There were no significant differences observed on the rate change count (F(1)=0.089, p=0.765) or the stall count (F(1)=0.773, p=0.379). In other words, when ESL status was determined by the browser language preference, those identified as ESL students are more likely to interact differently with MOOC videos (increased pausing and replaying) than those identified as English-speaking students. It is important to note that since there was no significant difference in stall count, these effects are not a result of bandwidth or buffering issues.

Figure 2: Avg. Number of Occurrences of Video Interaction Features with ESL Metric of Browser Language Preference
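The one-way ANOVA comparisons reported above can be sketched in a few lines of standard-library Python. This is only an illustration of how the F statistic is computed for two groups (hence one between-group degree of freedom, as in the reported F(1) values). The function name is hypothetical and the pause counts are made-up toy numbers, not data from the study.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up per-student pause counts, NOT data from the study.
english_pauses = [100, 120, 110, 130]
esl_pauses = [150, 170, 160, 180]

f_stat, df_b, df_w = one_way_anova_f(english_pauses, esl_pauses)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

Turning the F statistic into a p-value requires the F distribution, which the standard library does not provide; if SciPy is available, `scipy.stats.f_oneway(english_pauses, esl_pauses)` returns both the statistic and the p-value.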
Table 1 shows the average play/pause count for the 77.5% of cases where both metrics (IP and browser) agreed in identifying students as ESL or English-speaking. The pattern is similar to the results shown in Figure 2: ESL students press play and pause more than English-speaking students. Table 2 shows the play/pause count for the 22.5% of cases where the students’ browser language and IP address classifications differ, displayed by browser classification; it, too, is consistent with the results in Figure 2, with ESL students pressing pause and play more. This means that under the inverse classification, by IP address, students identified as English-speaking pause and replay more than students identified as ESL. When browser language and IP address data conflict, inferring students’ language abilities based on their browser preferences appears to identify English-speaking and ESL students better than classifying by IP address.
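The arithmetic behind this comparison can be checked with a short sketch. The counts and averages below are the values as read from Tables 1 and 2; the dictionary layout and function names are hypothetical conveniences for the illustration. Swapping the labels in the conflict subset reproduces the inverse (IP-based) classification described above.

```python
TOTAL_N = 905

# Students whose IP-based and browser-based ESL status match (Table 1).
matched = {
    "English": {"n": 473, "play": 151.05, "pause": 134.12},
    "ESL": {"n": 228, "play": 183.55, "pause": 181.57},
}
# Students whose two statuses differ, labeled by browser language (Table 2).
conflict_by_browser = {
    "English": {"n": 120, "play": 132.03, "pause": 111.59},
    "ESL": {"n": 84, "play": 253.71, "pause": 236.04},
}


def share_of_total(groups) -> float:
    """Percentage of all students that fall into this subset."""
    return 100 * sum(g["n"] for g in groups.values()) / TOTAL_N


def esl_to_english_ratio(groups, feature: str) -> float:
    """How many times more often the ESL group uses a feature."""
    return groups["ESL"][feature] / groups["English"][feature]


def relabel_by_ip(groups):
    """In the conflict subset, the IP label is the opposite of the browser label."""
    return {"English": groups["ESL"], "ESL": groups["English"]}


conflict_by_ip = relabel_by_ip(conflict_by_browser)

print(round(share_of_total(matched), 1), round(share_of_total(conflict_by_browser), 1))
for feature in ("play", "pause"):
    print(
        feature,
        round(esl_to_english_ratio(conflict_by_browser, feature), 2),
        round(esl_to_english_ratio(conflict_by_ip, feature), 2),
    )
```

A ratio above 1 means the group labeled ESL plays or pauses more, which is the direction expected from prior literature; in the conflict subset this holds for the browser labels and reverses for the IP labels.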
Conclusion
In our research study, we assessed the use of two metrics for ESL identification by evaluating video interactions of students in a MOOC. We show that when ESL students are identified using the popular method of IP address tracking, their video interaction behaviors are very similar to those of English-speaking students, which is inconsistent with well-established research findings that foreign language students struggle significantly with comprehending spoken language. Our findings show instead that inferring students’ language abilities from their web browser language preferences more accurately predicts their video usage, a proxy for ESL status. Although some students who are not native speakers may set English as their preferred language, they may already be proficient enough not to need additional language support. We believe this metric can classify ESL and native English speakers more quickly and accurately, and will allow MOOC providers to design and deliver more appropriate interventions for each group.
Acknowledgements
We thank Google for their support in this and related research through a Google Focus Award.
References
1. Anna Ching-Shyang Chang and John Read. 2006. The Effects of Listening Support on the Listening Performance of EFL Learners. TESOL Quarterly 40, 2: 375–397.
2. Jennifer DeBoer, Glenda S. Stump, Daniel Seaton, and Lori Breslow. 2013. Diversity in MOOC students’ backgrounds and behaviors in relationship to performance in 6.002x. Proceedings of the Sixth Learning International Networks Consortium Conference.
3. Gregory Ferenstein. Study: Massive Online Courses Enroll An Average Of 43,000 Students, 10% Completion. TechCrunch. http://social.techcrunch.com/2014/03/03/study-massive-online-courses-enroll-an-average-of-43000-students-10-completion/
4. Philip J. Guo and Katharina Reinecke. Demographic Differences in How Students Navigate Through MOOCs. L@S 2014, March 4–5, 2014.
5. US Census Bureau Public Information Office. New Census Bureau Report Analyzes Nation’s Linguistic Diversity. American Community Survey (ACS) - Newsroom - U.S. Census Bureau. https://www.census.gov/newsroom/releases/archives/american_community_survey_acs/cb10-cn58.html
6. Daniel T. Seaton, Sergiy Nesterko, Tommy Mullaney, Justin Reich, Andrew Ho, and Isaac Chuang. 2014. Characterizing video use in the catalogue of MITx MOOCs. Proceedings of the European MOOC Stakeholders Summit. Lausanne: PAU Education: 140–146.