Running Head: TIMED ORAL READINGS AND STATEWIDE TESTS
Using Timed Oral Readings to Predict Students’ Performance on Statewide Achievement Tests
Lindy Crawford, Steve Stieber, and Gerald Tindal
University of Oregon
March 4, 2000
Abstract
In this study, the curriculum-based measure of reading aloud from narrative passages is used to predict performance on a statewide achievement test. CBM performance is moderately correlated with scores on multiple-choice reading and math tests administered during the same year and on tests administered one year later. The results provide support for the use of timed oral readings to predict students’ performance on statewide achievement tests, and are interpreted as a new application of research conducted on CBM during the past two decades.
Introduction

Current educational reform efforts emphasize increases in student performance as demonstrated by scores on criterion-referenced statewide achievement tests. Often, these tests are administered infrequently, providing teachers with limited information about students’ ongoing progress toward mastering academic benchmarks. In spite of this infrequency, high-stakes decisions are made with statewide test data, making it imperative that students’ academic progress be closely monitored through other measurement systems. The widespread adoption of statewide tests, as well as the importance placed on test results, has created a need for teachers to adopt classroom-based measurement systems capable of providing valid progress-monitoring data about the academic gains made by students.

One progress-monitoring system with proven reliability and validity is curriculum-based measurement (CBM; Deno, 1985). Numerous research studies have demonstrated the usefulness of CBM for monitoring the academic progress of students in the basic skill area of oral reading (Fuchs, Deno, & Mirkin, 1984; Marston, Deno, & Tindal, 1983; Marston & Magnusson, 1985). However, relatively little research has explored its use as a measurement tool for predicting students’ progress toward passing criterion-referenced statewide achievement tests (Helwig & Tindal, 1999; Nolet & McLaughlin, 1997). The majority of progress-monitoring studies using curriculum-based measurement collect data during one school year (see the meta-analysis by Fuchs & Fuchs, 1986), and few studies report on the collection of CBM data over multiple years (Dong-il, 1998).

This study generates longitudinal data on a cohort of students as they progress from second to third grade, providing information on the criterion validity, as well as the predictive validity, of curriculum-based measures in reading. Specifically, we analyze the relationship between students’ scores on the
curriculum-based measure of timed oral readings and their scores obtained on statewide reading and math tests. The research question that we propose in this study is, “When given a grade-level passage taken from a basal text, how fluently do students need to read aloud to be able to pass statewide reading and math tests?” Answering this question will assist teachers in identifying students who need more intensive instruction far in advance of their actual participation in statewide tests. Furthermore, establishing a range of reading rates that strongly predict students’ scores on statewide tests will provide teachers with classroom-based information that can be frequently collected and used to adjust instruction. Evidence of a strong relationship between scores on the timed oral readings and scores on the statewide achievement tests will further validate the use of CBM for measuring students’ progress toward achieving preestablished academic benchmarks.

Curriculum-Based Measurement in Reading

The validity of curriculum-based measures in reading has been well established in a multitude of educational contexts using various norm-referenced achievement tests as criterion measures (Shinn, 1998). Initial research studies, conducted at the University of Minnesota during the early 1980s, supported the technical adequacy of CBM in the area of reading. For example, in 1982, Deno, Mirkin, and Chiang established the criterion validity of reading aloud as a measure of general reading ability. In their study, 43 elementary-age students read lists of isolated words, read aloud from a narrative passage, and filled in missing words from a similar passage (cloze test). Students’ scores were correlated with scores on two technically adequate achievement tests. The authors found that reading aloud was highly correlated with students’ test performance (r = .78; r = .80). Marston (1989) lists 14 studies conducted between 1981 and 1988, further validating the
use of timed oral reading as an indicator of overall reading achievement. Correlation coefficients ranged from .63 to .90, with most correlations clustering around .80. Follow-up studies support earlier findings, strengthening the validity of timed oral reading as a measure of reading achievement. For example, Fuchs, Fuchs, and Maxwell (1988) reported that words read correctly in a one-minute sample correlated with the Stanford Achievement Test Reading Comprehension subtest (Gardner, Rudman, Karlsen, & Merwin, 1982) at a level of statistical significance (r = .91; p < .05). Furthermore, words read correctly was also a strong indicator of performance on two other validated curriculum-based measures of reading: cloze comprehension tests and reading recalls. As recently as 1993, Jenkins and Jewell studied the correlation between read-aloud measures and two norm-referenced achievement tests. Results obtained from 335 students in grades 2-6 again supported the validity of reading aloud as a measure of overall reading achievement (r = .80-.88).

Establishing the technical adequacy of any measurement system is an important first step toward its acceptance by researchers and practitioners. The usefulness of any assessment procedure for teachers, however, often is judged by its ability to detect student growth. The CBM practice of timed oral reading has demonstrated its effectiveness in this area. For example, Marston and Magnusson (1985) compared student growth in reading using the procedure of timed oral readings with scores obtained on the reading portion of the Science Research Associates (SRA) Achievement Series: Vocabulary and Comprehension (Naslund, Thorpe, & Lefever, 1978). The authors used correlated t tests to analyze student gains on all three measures.
Over a sixteen-week period, scores obtained on timed oral readings were reported as being the most sensitive to student progress (p < .001), while the SRA Vocabulary subtest also demonstrated the ability to detect student progress (p < .01). The SRA Comprehension test did
not reveal any significant differences. In a similar study, Marston, Deno, and Tindal (1983) found that curriculum-based measures in reading were more sensitive to student growth than were norm-referenced achievement tests. The CBM measure in this study was a grade-level word list read by low-achieving students in grades 3-6 during a one-minute timing. The standardized test was the Stanford Achievement Test (SAT) Reading Comprehension subtest (Madden, Gardner, Rudman, Karlsen, & Merwin, 1978). Students completed both measures at the beginning of the study and completed the same measures 10 weeks later. Using a paired t-test analysis with effect size as the dependent variable, the authors found significant differences between the gains students made on the curriculum-based measures in reading and the gains made on the SAT Reading Comprehension subtest (p < .001).

Curriculum-Based Measurement in Math

A second area of interest to this study is the relationship between oral reading and math achievement. Statewide math tests often consist of multiple-choice questions and/or problem-solving tasks (Council of Chief State School Officers, 1996) that require a certain level of reading proficiency. The importance of proficient reading for successful test performance in math has been well established (Clarkson, 1983; Helwig, Rozek-Tedesco, Heath, & Tindal, 1999; Tindal, Hollenbeck, Heath, & Almond, 1998). One study, conducted by Espin and Deno (1993), reported low to moderate correlations (r = .32 and .36) between timed, text-based reading and math achievement as measured by the Tests of Achievement and Proficiency (Scannell, Haugh, Schild, & Ulmer, 1986). However, reading aloud was more strongly correlated with scores on the achievement tests than were scores on other classroom measures (comprehension questions and student grade point averages). Interestingly, there was a large discrepancy between correlations
obtained with high-performing students (r = .05 and .13, respectively) and those obtained with low-performing students (r = .28 and .31, respectively). These discrepancies support the usefulness of timed oral readings in the classroom for closely monitoring the progress of low-performing students as they work toward meeting statewide benchmarks in math.

Although there is a paucity of studies exploring the relationship between timed oral readings and math achievement, studies have explored the relationship between non-CBM tests of reading achievement and math achievement. For example, in a stepwise regression with 9 independent variables, Roach (1981) found that three variables (mental ability, reading achievement, and family size) accounted for 40% of the variance in the dependent variable of math achievement. Although no simple correlations for the independent variables were reported, reading achievement was a strong predictor of math achievement, loading second only to mental ability in the regression formula. In a longitudinal study spanning six years, McGrew and Pehl (1988) studied the predictive validity of the Woodcock-Johnson Tests of Achievement (Woodcock & Johnson, 1977). In their study, scores obtained by third-grade students on the Reading subtest of the Woodcock-Johnson demonstrated significant correlations with ninth-grade math achievement as measured by the California Achievement Test (CTB/McGraw-Hill, 1977). Interestingly, scores on the Reading subtest of the Woodcock-Johnson were as effective at predicting future math performance (r = .66) as were scores on the Math subtest (r = .68).

Research into the relationship between reading and math achievement has established that the two behaviors are correlated, and research on curriculum-based measures in reading demonstrates a moderate relationship between timed oral readings and performance on norm-referenced achievement tests. Our aim is to expand on this research by exploring the ability of
various reading rates to predict eventual performance on criterion-referenced statewide achievement tests. Establishing a critical range of reading rates will extend the use of CBM as a viable classroom tool for monitoring students’ progress toward meeting statewide benchmarks.

Method

Participants

Participants for this study represented six blended classrooms consisting of second- and third-grade students. Classrooms were located within one rural school district in western Oregon. Seventy-seven second-grade students participated in Year One of the study. Fifty-one of these students also participated as third-graders (Year Two). Our study includes the 51 students who participated in both years. Twenty-nine of these students were female and 22 were male, representing a sample that was predominantly white (94%). The majority of students were in general education (n = 42). Of the 9 students receiving special education services, 4 received assistance in at least one academic area, 3 received speech and/or language services, and 2 received both academic assistance and speech/language services. Students classified as receiving special education received most of their instruction in the general education classroom.

Six general education teachers and two specialists volunteered to participate in the study. Five of the general education teachers had at least 10 years of teaching experience, and the two specialists had an average of 8 years of teaching experience. All of the teachers held elementary teaching certifications with no additional endorsements. One specialist had an M.Ed., and the other had a Bachelor of Arts degree.

Measures

To assess students’ reading rate, we chose three passages for use during each year of the
study. Passages from the Houghton Mifflin basal reading series (1989) were modified to contain approximately 200-250 words with cogent beginnings and endings. Passages used in Year One were randomly selected from the second-grade basal, and passages used in Year Two were randomly selected from the third-grade basal. Students’ oral reading of each passage was scored by calculating correct words read per minute, and totals were averaged across all three timings.

Third-grade students were also tested on the Oregon Statewide Math Assessment and the Oregon Statewide Reading Assessment, both criterion-referenced tests containing multiple-choice questions and performance tasks (Oregon Department of Education, 1999). Results of the multiple-choice section of the statewide reading and math tests are reported as standardized scores on a Rasch scale. Internal consistency coefficients range from .68 to .92 for the reading test and from .73 to .91 for the math test.

We conducted two analyses. The first explored the correlation between the scores obtained on both measures within Year Two (when students were in third grade), and the second used the scores obtained on timed oral readings in Year One (when students were in second grade) to predict scores obtained on the statewide tests during Year Two. Students completed the curriculum-based measures in January of each year of the study. Third-grade students also completed the statewide reading and math assessments during March of the second year.

Results

The means and standard deviations of the study variables are reported in Table 1. Results show that the mean score on the statewide reading assessment meets the state-established criterion for a passing score (set at 201). However, the mean for the statewide math assessment falls short of the established criterion by 2 points (set at 202). Scores ranged from 172 to 235 on
the reading test and 179 to 230 on the math test. Of the 51 scores reported, representing all students in the study, 65% passed the reading assessment and 45% passed the math assessment. The data also show a large increase in the number of correct words read per minute by third-grade students compared to the number read the previous year.

We calculated Pearson correlation coefficients to examine the association between students’ performance on the reading test in third grade and the number of correct words read per minute on timed oral readings (see Table 2). This relationship was moderate, with correlations between second-grade timed oral readings and state scores slightly higher than those obtained in third grade. Correlation coefficients for students’ performance on the math test in third grade and their timed oral reading scores also were moderate, with the Across-Years correlation slightly higher than the Within-Year correlation. We found no significant differences between the Within-Year and Across-Years correlations (Howell, 1987).

To assess the ability of CBM to differentiate between students who passed and those who did not pass the statewide reading test, we constructed a 3 x 4 classification table for the Within-Year scores and one for the Across-Years scores. We relied on previously established normative data to determine the values of each cell. In 1992, Hasbrouck and Tindal collected normative data on 9,164 students across 4 grade levels. Their results are often cited when interpreting results of other CBM studies (Nolet & McLaughlin, 1997) and have been used in published curriculum materials (Read Naturally, 1999). Because of the broad acceptance of the validity of the Hasbrouck and Tindal (1992) norms, and the relatively small sample size in this study, we decided to use their norms to create a context for interpreting our scores. The Within-Year data are reported in Table 3.
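The Pearson coefficients reported in Table 2 can be computed directly from the definition of the product-moment correlation. The sketch below uses hypothetical stand-in scores, not the study’s raw data:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Un-normalized covariance and variances; the 1/n factors cancel in the ratio
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical correct-words-per-minute rates and statewide reading scores
cwpm = [45, 58, 71, 88, 102, 117, 131, 150]
state_scores = [181, 188, 196, 199, 204, 212, 215, 224]
print(round(pearson_r(cwpm, state_scores), 2))
```

The same function applied to Year One rates and Year Two test scores yields the Across-Years coefficient.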
In the norms established by Hasbrouck and Tindal (1992), students reading below the 25th percentile in the winter of third-grade read
between 0-70 words per minute. We used these rates to establish our first cell. The remaining three cells in our study reflect the remaining three quartiles in the Hasbrouck and Tindal study and are represented by the following rates: (a) second cell, 71-92 words per minute, (b) third cell, 93-122 words per minute, and (c) fourth cell, 123 words and above. The Within-Year data highlight a general pattern between students’ reading rates and their scores on the statewide reading test. The strongest finding is that 84% of students reading at the 50th percentile and above passed the statewide assessment. Of the 20 remaining students, 13 did not pass the test. For third-grade students, 117 correct words per minute was the critical rate needed to pass the statewide reading test, evidenced by the fact that 94% of students reading fewer than 117 words per minute did not pass (17 of 18 students).

The Across-Years data are reported in Table 4. In the norms established by Hasbrouck and Tindal (1992), students reading below the 25th percentile in the winter of second grade read between 0-46 words per minute. We used these rates to establish our first cell value. The remaining three cells in our study reflect the remaining three quartiles in the Hasbrouck and Tindal study and are represented by the following rates: (a) second cell, 47-77 words per minute, (b) third cell, 78-105 words per minute, and (c) fourth cell, 106 words and above. In the Across-Years analysis, oral reading rate distinctly parallels the categories represented by the three levels of statewide test scores. Scores for 21 of the 29 students who passed the statewide reading test were within the middle two quartiles, and all of the students who read below the 50th percentile, as established by Hasbrouck and Tindal (1992), also failed the statewide assessment. A critical finding is that 100% of the students reading at least 72 correct words per minute in second grade passed the statewide reading test in third grade.
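The classification-table logic amounts to binning each student’s reading rate into a quartile band and tallying test outcomes per band. A minimal sketch, using the third-grade winter bands from Hasbrouck and Tindal (1992) and hypothetical student records:

```python
# Third-grade winter quartile bands (CWPM) from Hasbrouck and Tindal (1992)
BANDS = [(0, 70), (71, 92), (93, 122), (123, float("inf"))]
OUTCOMES = ("fails", "meets", "exceeds")

def band_index(cwpm):
    """Return the index of the quartile band containing this reading rate."""
    for i, (low, high) in enumerate(BANDS):
        if low <= cwpm <= high:
            return i
    raise ValueError(f"invalid reading rate: {cwpm}")

def classification_table(students):
    """Cross-tabulate band membership against statewide-test outcome.

    `students` is a list of (cwpm, outcome) pairs, where outcome is one of
    OUTCOMES. Returns a 4 x 3 list of counts: rows are bands, columns outcomes.
    """
    table = [[0] * len(OUTCOMES) for _ in BANDS]
    for cwpm, outcome in students:
        table[band_index(cwpm)][OUTCOMES.index(outcome)] += 1
    return table

# Hypothetical records, not the study's raw data
sample = [(55, "fails"), (80, "meets"), (100, "meets"), (130, "exceeds")]
```

Swapping in the second-grade winter bands (0-46, 47-77, 78-105, 106+) produces the Across-Years table in the same way.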
Discussion

Our results demonstrate that rates obtained on timed oral readings are moderately correlated with scores obtained on criterion-referenced reading and math tests. The Within-Year results of this study are similar to results presented by Jenkins and Jewell (1993), who report Within-Year correlations ranging from .60 to .87 between read-aloud measures and scores obtained on the Metropolitan Achievement Test, Sixth Edition (Prescott, Balow, Hogan, & Farr, 1984) for third-grade students. Of particular interest is the strength of the Across-Years correlations. Although differences between the Within-Year and Across-Years correlations are not statistically significant, the Across-Years correlation is stronger for both measures, reading and math.

The correlation between timed oral readings and success on the math test is not surprising. The ability to read proficiently is essential for performing various tasks in math (Aaron, 1968), and proficient reading is necessary to access information presented on math tests containing word problems (Helwig et al., 1999). Because the math portion of any large-scale test consisting of multiple-choice questions requires a certain level of reading skill, it is logical that good readers do well and poor readers do poorly.

Interpretation of these results is tempered by some of the study’s methodological weaknesses. One limitation is that all measures were administered by participating teachers. Although teachers were trained in the administration of reading timings and proficient in the administration of the statewide test, no formal reliability checks were performed to ensure standardization. However, because students received their education in blended classrooms, data were collected by the same teacher for both years of the study. Any bias that may have occurred would have affected the strength of the overall correlations, but not the differences between the
correlations. For example, if teachers were weak in their standard administration of timed oral readings during Year One of the study, they were probably weak during Year Two as well, thus reducing the reliability of the overall correlations but not directly impacting the differences between the Within-Year and Across-Years correlations.

Construction of the classification tables also warrants caution. We assigned cell values based on normative data generated from previous research. However, the analyses generated by our classification tables could have been strengthened either by creating distinct normative data based on the materials used in this study (requiring a larger, randomized sample) or by assessing students on the same materials used in the Hasbrouck and Tindal (1992) study. Either of these steps would have strengthened the validity of our conclusions, and their absence may explain differences between the mean reading rate of students in this study and the median reading rate reported by Hasbrouck and Tindal (1992). For example, the mean reading rate for second-grade students in this study exceeds the median of 78 words per minute established in the Hasbrouck and Tindal study, while the mean reading rate of third-graders in this study is much lower than their previously reported 93 words per minute.

Conclusions

In this study, we extended the conditions of use of CBM into the arena of criterion-referenced statewide achievement tests, further validating its use as a measurement tool capable of providing information about students’ current as well as their future performance. In the clamor for the implementation of statewide tests, it is important that we not lose sight of the benefits derived from standardized, classroom-based assessments. There are obvious benefits to teachers who use curriculum-based measures in reading to monitor students’ progress.
First, measuring students’ oral reading rate allows teachers to predict students’ future performance on
statewide tests. The most important finding of this study is that 100% of the second-grade students who read at least 72 correct words per minute passed the statewide reading test taken the following year. In the third grade, 94% of the students reading fewer than 117 correct words per minute did not pass the statewide reading test taken during the same year. These clear and simple data communicate powerful information to practitioners.

Increasingly, students’ academic skills are being evaluated through statewide tests, and students with a wide diversity of academic skills are being included in these testing programs. Another benefit to teachers is that scores obtained on curriculum-based measures can be used to determine the degree of participation of students with disabilities in statewide tests. For example, further research exploring the relationship between oral reading rate and scores on various statewide achievement tests may provide teachers with empirically sound data for making test inclusion decisions, such as when students need test accommodations or modifications. Further research into the predictive validity of timed oral readings for statewide test performance will extend the conditions of use for CBM, making it possible for all students to meaningfully participate in educational reform efforts.
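In practice, thresholds like those above could drive a simple classroom screen. The sketch below flags students reading below a cutoff; the 72-CWPM default reflects this study’s second-grade Across-Years finding and is sample-specific, not a general norm:

```python
def flag_for_support(students, cutoff=72):
    """Return names of students whose correct words per minute fall below the
    cutoff, i.e., those at risk (per this study's sample) of not passing the
    statewide reading test the following year."""
    return [name for name, cwpm in students if cwpm < cutoff]

# Hypothetical winter screening data for a second-grade classroom
roster = [("Avery", 48), ("Blake", 75), ("Casey", 66), ("Drew", 102)]
print(flag_for_support(roster))  # prints ['Avery', 'Casey']
```

Flagged students would then receive more intensive instruction and more frequent timings well before the statewide test is administered.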
References

Aaron, I.E. (1968). Reading in mathematics. In V.M. Howes & H.F. Darrow (Eds.), Reading and the elementary school child: Selected readings on programs and practices (pp. 70-74). New York: Macmillan.

Clarkson, P. (1983). Types of errors made by Papua New Guinean students. Educational Studies in Mathematics, 14, 355-367.

Council of Chief State School Officers (1996). State mathematics and science standards, frameworks, and student assessments: What is the status of development in the 50 states? [On-line]. Available: http://www.ccsso.org/frmwkweb.html

CTB/McGraw-Hill (1977). The California Achievement Tests. Monterey, CA: Author.

Deno, S.L. (1985). Curriculum-based measurement: The emerging alternative. Exceptional Children, 52, 219-232.

Deno, S.L., Mirkin, P.K., & Chiang, B. (1982). Identifying valid measures of reading. Exceptional Children, 49, 36-45.

Dong-il, K. (1998). Specification of growth model and inter-individual differences for students with severe reading difficulties: A case of CBM. Paper presented at the annual meeting of the Council for Exceptional Children, Minneapolis, MN. (ERIC Document Reproduction Service No. ED 418 553)

Espin, C.A., & Deno, S.L. (1993). Performance in reading from content area text as an indicator of achievement. Remedial and Special Education, 14, 47-59.

Fuchs, L.S., Deno, S.L., & Mirkin, P.K. (1984). The effects of frequent curriculum-based measurement and evaluation on pedagogy, student achievement and student awareness of learning. American Educational Research Journal, 21, 449-460.
Fuchs, L.S., & Fuchs, D. (1986). Effects of systematic formative evaluation: A meta-analysis. Exceptional Children, 53, 199-208.

Fuchs, L.S., Fuchs, D., & Maxwell, L. (1988). The validity of informal reading comprehension measures. Remedial and Special Education, 9, 20-28.

Gardner, E.F., Rudman, H.C., Karlsen, B., & Merwin, J.C. (1982). Stanford Achievement Test. Iowa City: Harcourt Brace Jovanovich.

Hasbrouck, J.E., & Tindal, G. (1992). Curriculum-based oral reading fluency norms for students in grades 2 through 5. Teaching Exceptional Children, 24(3), 41-44.

Helwig, R., Rozek-Tedesco, M.A., Heath, B., & Tindal, G. (1999). Reading as an access to math problem solving on multiple choice tests. Journal of Educational Research, 93, 113-125.

Helwig, R., & Tindal, G. (1999). Modified measures and statewide assessments. Manuscript submitted for publication.

Houghton Mifflin Basal Reading Series (1989). Journeys (Grade 3). Discoveries (Grade 2).

Howell, D.C. (1987). Statistical methods for psychology (2nd ed.). Boston, MA: Duxbury Press.

Jenkins, J.R., & Jewell, M. (1993). Examining the validity of two measures for formative teaching: Read aloud and maze. Exceptional Children, 59, 421-432.

Madden, R., Gardner, E., Rudman, H., Karlsen, B., & Merwin, J. (1978). Stanford Achievement Test. New York: Harcourt Brace Jovanovich.

Marston, D. (1989). Curriculum-based measurement approach to assessing academic performance: What it is and why do it. In M.R. Shinn (Ed.), Curriculum-based measurement: Assessing special children (pp. 18-78). New York: Guilford Press.
Marston, D., Deno, S., & Tindal, G. (1983). A comparison of standardized achievement tests and direct measurement techniques in measuring pupil progress (Research Rep. No. 126). Minneapolis, MN: University of Minnesota, Institute for Research on Learning Disabilities. (ERIC Document Reproduction Service No. ED 236 198)

Marston, D., & Magnusson, D. (1985). Implementing curriculum-based measurement in special and regular education settings. Exceptional Children, 52, 266-276.

McGrew, K.S., & Pehl, J. (1988). Prediction of future achievement by the Woodcock-Johnson Psycho-Educational Battery and the WISC-R. Journal of School Psychology, 26, 275-281.

Naslund, R.A., Thorpe, L.P., & Lefever, D.W. (1978). SRA Achievement Series. Chicago: Science Research Associates.

Nolet, V., & McLaughlin, M. (1997). Using CBM to explore a consequential basis for the validity of a state-wide performance assessment. Diagnostique, 22, 146-163.

Oregon Department of Education. (1999). Assessment homepage. Available: http://www.ode.state.or.us//asmt/index.htm

Prescott, G.A., Balow, I.H., Hogan, T.P., & Farr, R.C. (1984). Metropolitan Achievement Test (MAT-6). San Antonio, TX: The Psychological Corporation.

Read Naturally (1999). Saint Paul, MN: Turman Publishing.

Roach, D.A. (1981). Predictors of mathematics achievement in Jamaican elementary school children. Perceptual and Motor Skills, 52, 785-786.

Scannell, D.P., Haugh, O.M., Schild, A.H., & Ulmer, G. (1986). Tests of Achievement and Proficiency. Chicago, IL: Riverside Publishing.
Shinn, M.R. (Ed.). (1998). Advanced applications of curriculum-based measurement. New York: Guilford Press.

Tindal, G., Heath, B., Hollenbeck, K., Almond, P., & Harniss, M. (1998). Accommodating students with disabilities on large-scale tests: An experimental study. Exceptional Children, 64, 439-450.

Woodcock, R., & Johnson, M. (1977). Woodcock-Johnson Psycho-Educational Battery. Allen, TX: DLM/Teaching Resources.
Table 1
Means and Standard Deviations of Study Variables

Variable                                   Mean    S.D.   Min.   Max.
3rd-Grade Statewide Reading Assessment     202.5   12.4    172    235
3rd-Grade Statewide Math Assessment        200.1    9.9    179    230
2nd-Grade Correct Words per Minute          62.3   32.6      7    140
3rd-Grade Correct Words per Minute         103.8   38.6     15    190

Note. 3rd Grade: n = 51; 2nd Grade: n = 51.
Table 2
Correlations between Correct Words per Minute and Statewide Reading and Math Assessments

                                 Reading Assess.   Grade 3 CWPM   Grade 2 CWPM
Statewide Reading Assessment           ---             .60*           .66*
Statewide Math Assessment              .64*            .46*           .53*

Note. n = 51. *Correlation significantly different from zero.
Table 3
Within-Year Classification Table (3rd-Grade CWPM to 3rd-Grade Statewide Reading Assessment)

Correct Words    Fails to Meet State    Meets State Standard    Exceeds State Standard
per Minute       Standard in Reading    in Reading              in Reading
0 - 70                    5                      2                        0
71 - 92                   8                      5                        0
93 - 122                  4                     12                        0
123 +                     1                     10                        4

Note. n = 51.
Table 4
Across-Years Classification Table (2nd-Grade CWPM to 3rd-Grade Statewide Reading Assessment)

Correct Words    Fails to Meet State    Meets State Standard    Exceeds State Standard
per Minute       Standard in Reading    in Reading              in Reading
0 - 46                   10                      5                        0
47 - 77                   8                     12                        1
78 - 105                  0                      9                        2
106 +                     0                      3                        2

Note. n = 51.