Brief communication
The effect of word familiarity on actual and perceived text difficulty
Gondy Leroy,1 David Kauchak2
1 Department of Management Information Systems, Eller College of Management, The University of Arizona, Tucson, Arizona, USA
2 Computer Science Department, Middlebury College, Middlebury, Vermont, USA
Correspondence to Dr Gondy Leroy, Department of Management Information Systems, Eller College of Management, The University of Arizona, McClelland Hall, Room 430, PO Box 210108, Tucson, AZ 85721-0108, USA; [email protected]
Received 6 July 2013
Revised 9 August 2013
Accepted 10 August 2013
Published Online First 7 October 2013
ABSTRACT
There is little evidence that readability formula outcomes relate to text understanding. A potential cause is their strong reliance on word and sentence length. We evaluated word familiarity, rather than word length, as a stand-in for word difficulty. Word familiarity represents how well known a word is and is estimated from word frequency in a large text corpus, in this work the Google web corpus. We conducted a study with 239 people, who provided 50 evaluations for each of 275 words. Ours is the first study to focus on actual difficulty, measured with a multiple-choice task, in addition to perceived difficulty, measured with a Likert scale. Actual difficulty was correlated with word familiarity (r=0.219, p