An exploration of rapid-use reading-accuracy tests in an Australian context

Susan Anne Galletly BSpThy UQ, Grad Dip Teach (Primary) NBCAE, MEd USQ

A dissertation submitted in partial fulfilment of the requirements for the award of

Doctor of Philosophy

Central Queensland University Faculty of Arts, Humanities, and Education

March 2007

Copyright 2007 Susan Galletly PhD Thesis


Abstract

When compared to western countries such as the USA, Australia has made relatively little use of reading-accuracy tests at the school level. This is despite the ready availability of rapid-use reading-accuracy tests such as the Dynamic Indicators of Basic Early Literacy Skills (DIBELS, Good & Kaminski, 2002a) and the Test of Word Reading Efficiency (TOWRE, Torgesen, Wagner, & Rashotte, 1999). Current developments, such as the publication of the report of the National Inquiry into the Teaching of Literacy (NITL), have given impetus to addressing this issue. This investigation explores the use of both the DIBELS and TOWRE tests to establish their usefulness in Australian school settings for the following purposes:
• Providing reliable achievement data for monitoring reading-accuracy achievement at school level.
• Providing useful qualitative diagnostic data.
• Building school and teacher understanding of reading-accuracy development, assessment and instruction.
• Building school and teacher effectiveness in instructional decision-making from test data, to improve reading-accuracy instruction and achievement.
The results of the investigation indicate that the DIBELS and TOWRE tests are suitable for the above purposes. The results also suggest a need to establish norms for Australian use of the tests, and value in developing specific additional tests. Based on the findings of this research, a number of recommendations are made regarding Australian use of the DIBELS and TOWRE tests, and a model of reading-accuracy development is presented for use in Australian reading instruction.


Acknowledgements

My foremost thanks go to the many children and families I have worked with down the years, who taught me just how hard it is for many children to learn to read, no matter how gifted they may be in other fields. You have helped me learn so much, and are the impetus for my work, my studies and this research.

My thanks are also extended to Central Queensland University and the Office of Research, for ongoing support and the scholarship which made this research possible. It is a luxury for one’s job to be ‘to learn’, and I have revelled in this opportunity which you provided.

I acknowledge with utmost gratefulness the guidance and expertise provided by my supervisor, Associate Professor Bruce Knight, and my mentor, Emeritus Professor John Dekkers. I am indebted to you both for wise counsel and ongoing active support, which extended, on so many occasions, far above and beyond the call of duty. I appreciate so much your dedication, commitment, scholarship, and expertise. Thank you. This learning journey I have walked with you has been a privilege and great pleasure.

This study would not have been possible without the extensive support of the developers of the DIBELS and TOWRE tests. I thank you sincerely for your tests, advice, and patience, and for the work you have done in developing these powerful assessment tools.

My sincere thanks go to the schools and teachers with whom I have worked in this research. In today’s world, teaching is a mission: long hours and endless pressure make it no job for the fainthearted. I hope you feel immense satisfaction about the difference you make in children’s lives. I appreciate so much your going the extra mile in so many ways in your teaching, and in our work together in Taking the Lead. We have worked hard, been a team, built knowledge, and made a difference for Australian reading instruction.

My very special thanks go to Tracey, who, on a year’s break from school, worked so very hard on Taking the Lead because she considered it so important that Mackay teachers have effective tests as teaching tools. You epitomise our dedicated teachers.

Even with the best of intentions, a doctorate is an extremely selfish venture. With all my heart, I thank my husband, Tim, and children, Lucy, Clare, Laura, and Kieron, for their love, patience, long-suffering and support. This thesis is as much yours as mine.

My parents, Jim and Sheila Skinner, who met while doing doctoral and master’s studies at Manchester University so many years ago, have been a lifelong inspiration to me, and a source of endless comfort, love and support. My sisters and brother, Jane, Helen and John, have similarly and powerfully supported my journey. Thank you.

To my colleagues, thank you for wise words, support, and direction. My particular thanks are extended to Molly de Lemos, Robert Ho, Bruce Hoffman, and Daniel Teighe for advice on research design and statistical procedures.

To my friends, and in particular Chris and Bev, thank you for your patience, prayers, support and understanding while I have been so absorbed in work, and functionally useless elsewhere. Special thanks to Tracey, Eleanor, Vivienne, Bev, and my daughter Laura, who supported me with proofing, and Lynelle, who had me stay (and work). Your support has been precious.

Finally, and most importantly, thank you, Lord God, for all the ways you have guided me and supported me. You are my strength and song.


Declaration

I declare that the work presented in this dissertation is, to the best of my knowledge and belief, original, except as acknowledged in the text, and that the material has not been submitted, either in whole or in part, for a degree at this or any other university. This dissertation is submitted in partial fulfilment of the requirements of the Doctor of Philosophy at Central Queensland University.

Susan Anne Galletly Mackay, Queensland, Australia March 2007

Table of Contents

Chapter 1 Introduction .......... 1-1
1.1 Introduction .......... 1-1
1.2 Background .......... 1-3
1.2.1 A definition of reading .......... 1-3
1.2.2 A definition of reading accuracy .......... 1-4
1.2.3 Constructs of reading accuracy used in the research .......... 1-5
1.2.4 Phases of reading-accuracy development .......... 1-6
1.2.5 Reading-accuracy terminology .......... 1-10
1.3 Rationale and significance .......... 1-12
1.3.1 Rationale .......... 1-12
1.3.2 Significance .......... 1-16
1.4 Aims, research questions and objectives .......... 1-17
1.4.1 Aims .......... 1-17
1.4.2 Research questions .......... 1-18
1.4.3 The research objectives .......... 1-19
1.5 Limitations and the scope of the research .......... 1-21
1.5.1 Limitations of the research .......... 1-21
1.5.2 Scope of the research .......... 1-23
1.6 Organisation of dissertation .......... 1-24
1.7 Conclusion .......... 1-25

Chapter 2 The research context .......... 2-1
2.1 Introduction .......... 2-1
2.2 Premises of reading accuracy .......... 2-2
2.2.1 Premise 1: Aspects of current reading instruction need re-examination .......... 2-3
2.2.2 Premise 2: Reading-accuracy tests can support the development of reading instruction .......... 2-7
2.2.3 Premise 3: Reading accuracy is separate to reading comprehension .......... 2-12
2.2.4 Premise 4: Australian needs for reading-accuracy tests .......... 2-16
2.2.5 Premise 5: Knowledge concerning reading-accuracy development and instruction needs to become more widespread .......... 2-19
2.2.6 Premise 6: Australian teachers need well-specified principles of best-practice reading instruction .......... 2-24
2.2.7 Premise 7: There is currently potential for developing best-practice reading instruction .......... 2-25
2.2.8 Premise 8: The importance of orthographic knowledge .......... 2-28

2.3 Skills for testing in reading-accuracy tests .......... 2-36
2.3.1 Phonological and phonemic awareness .......... 2-37
2.3.2 Whole-word and irregular-word reading .......... 2-39
2.3.3 Orthographic knowledge .......... 2-42
2.3.4 Phonemic recoding .......... 2-43
2.3.5 Phonological recoding beyond phonemic recoding .......... 2-47
2.3.6 Reading accuracy in authentic reading .......... 2-52
2.4 Conclusion .......... 2-52

Chapter 3 Overview of DIBELS and TOWRE tests .......... 3-1
3.1 Introduction .......... 3-1
3.1.1 Choice of DIBELS and TOWRE tests .......... 3-1
3.2 Dynamic Indicators of Basic Early Literacy Skills (DIBELS) .......... 3-3
3.2.1 Origin and availability .......... 3-3
3.2.2 Description of DIBELS tests .......... 3-4
3.2.3 DIBELS norms and test schedules .......... 3-7
3.2.4 Evidence of floor and ceiling effects .......... 3-8
3.2.5 The DIBELS predictive model of reading-accuracy development .......... 3-9
3.2.6 Administration of DIBELS tests .......... 3-14
3.2.7 DIBELS usage in schools and research .......... 3-14
3.2.8 Information from DIBELS test data .......... 3-15
3.2.9 The use of DIBELS with Queensland children .......... 3-22
3.3 The TOWRE tests .......... 3-29
3.3.1 Origin and availability .......... 3-29
3.3.2 Description of the TOWRE tests .......... 3-30
3.3.3 TOWRE norms .......... 3-33
3.3.4 Evidence of floor and ceiling effects .......... 3-34
3.3.5 Administration of TOWRE tests .......... 3-34
3.3.6 TOWRE usage in schools and research .......... 3-35
3.3.7 Information from TOWRE data .......... 3-36
3.3.8 The use of the TOWRE tests with Queensland children .......... 3-38
3.4 Conclusions .......... 3-39
3.4.1 Periods of applicability of the tests .......... 3-39
3.4.2 The potential use and limitations of DIBELS tests .......... 3-40
3.4.3 The potential use and limitations of TOWRE tests .......... 3-41
3.4.4 Needs for additional reading-accuracy tests .......... 3-42
3.4.5 The use of DIBELS and TOWRE tests in this research .......... 3-42
3.4.6 Concluding remarks .......... 3-43

Chapter 4 Methodology .......... 4-1
4.1 Introduction .......... 4-1
4.2 The research paradigm .......... 4-2
4.2.1 Ontology .......... 4-2
4.2.2 Epistemology .......... 4-3
4.2.3 Knowledge accumulation .......... 4-4
4.2.4 Role of the researcher .......... 4-5
4.3 The research design .......... 4-6
4.3.1 Tests used in the research .......... 4-6
4.3.2 Testing by teacher researchers .......... 4-10
4.3.3 The time-sequence of the research .......... 4-12
4.3.4 Ethical clearance for the research .......... 4-15

4.4 The research sample .......... 4-16
4.5 Data collection .......... 4-20
4.5.1 Test materials .......... 4-20
4.5.2 Test conditions .......... 4-20
4.5.3 Test order .......... 4-21
4.5.4 Test administration .......... 4-21
4.5.5 School processing of test-data .......... 4-22
4.5.6 Gathering of other test-data .......... 4-22
4.6 Data analysis procedures .......... 4-24
4.7 Conclusion .......... 4-25

Chapter 5 Exploration of the data .......... 5-1
5.1 Introduction .......... 5-1
5.2 Rationale for choice of statistical procedure .......... 5-2
5.2.1 Demographic variables for insights on reading-accuracy development .......... 5-4
5.2.2 Achievement & norm-lines for insights on reading-accuracy achievement .......... 5-6
5.3 Examination of the research variables .......... 5-13
5.3.1 Screening for inconsistent cases and outliers .......... 5-13
5.3.2 The rigour of modified test variables .......... 5-15
5.3.3 Assumptions of normality of test data .......... 5-21
5.4 Reading-accuracy achievement results .......... 5-24
5.5 Patterns of reading-accuracy development .......... 5-26
5.5.1 Differences between year-levels .......... 5-27
5.5.2 Slowing of progress from Year 1 to Year 3 .......... 5-28
5.5.3 Reading-accuracy development as steady progression .......... 5-29
5.5.4 Range expansion from Year 1 to Year 3 .......... 5-31
5.5.5 Reading-accuracy development of lowest achievers .......... 5-31
5.5.6 Changes from differentiated to nondifferentiated progress .......... 5-33
5.5.7 Development of phonemic recoding .......... 5-35
5.6 Relationships of demographic and test variables .......... 5-37
5.6.1 Ethnicity and gender .......... 5-37
5.6.2 Age in year level .......... 5-39
5.7 Correlational associations between test variables .......... 5-40
5.7.1 Decontextualised reading and authentic reading .......... 5-42
5.7.2 Predictiveness of phonological and phonemic recoding .......... 5-44
5.8 Relationships with systemic measures of reading .......... 5-45
5.8.1 Relationships with Year 2 Net results .......... 5-45
5.8.2 Relationships with Year 3 Test results .......... 5-50
5.9 DIBELS results .......... 5-54
5.9.1 Results using DIBELS norms .......... 5-55
5.9.2 Patterns of reading-accuracy development .......... 5-56
5.9.3 The DIBELS predictive model .......... 5-58
5.9.4 Floor and ceiling effects in DIBELS tests .......... 5-62
5.10 TOWRE results .......... 5-63
5.10.1 TOWRE results previously listed .......... 5-63
5.10.2 Results using TOWRE norms .......... 5-64
5.10.3 Floor and ceiling effects in TOWRE tests .......... 5-68
5.10.4 Rigour of SWE and PDE tests .......... 5-69
5.10.5 Patterns of reading-accuracy development .......... 5-70
5.10.6 The linear spread of achievement of TOWRE test scores .......... 5-73
5.11 Comparison of DIBELS and TOWRE results .......... 5-74
5.11.1 Consideration of high achievers .......... 5-75
5.11.2 Consideration of low and average achievers .......... 5-76
5.11.3 Cohort results as interim norms .......... 5-79
5.11.4 Additional test-points for DIBELS tests .......... 5-81
5.12 Phases of reading-accuracy development .......... 5-82
5.13 Concluding remarks .......... 5-84

Chapter 6 Consideration of the research questions .......... 6-1
6.1 Introduction .......... 6-1
6.2 Findings of the research .......... 6-1
6.2.1 Research Question 1: Cohort achievement .......... 6-1
6.2.2 Research Question 2: Comparison with test norms .......... 6-3
6.2.3 Research Question 3: Relationships with demographic variables .......... 6-9
6.2.4 Research Question 4: Reading-accuracy development .......... 6-12
6.2.5 Research Question 5: Suitable reading-accuracy tests .......... 6-18
6.3 Implications regarding reading-accuracy tests and improving reading-accuracy instruction .......... 6-23
6.4 The applicability of DIBELS and TOWRE tests to the Australian context .......... 6-27
6.4.1 Use of DIBELS and TOWRE tests in Australian schools .......... 6-27
6.4.2 The potential of the DIBELS predictive model .......... 6-29
6.4.3 Limitations of DIBELS and TOWRE .......... 6-30

Chapter 7 Conclusions and recommendations .......... 7-1
7.1 Introduction .......... 7-1
7.2 The results of the research .......... 7-1
7.3 Recommendations .......... 7-3
7.3.1 Rec 1: Establishment & use of a comprehensive series of reading-accuracy tests in Australia .......... 7-4
7.3.2 Rec 2: Establishment of a framework for best practice for reading-accuracy tests & instruction .......... 7-4
7.3.3 Rec 3: The development of Australian norms for DIBELS and TOWRE tests .......... 7-6
7.3.4 Rec 4: Introduction of rapid-use reading-accuracy tests in school practice .......... 7-7
7.3.5 Rec 5: Integrated knowledge building on reading accuracy needed in the Australian context .......... 7-8
7.3.6 Rec 6: Strategies for implementation of use of reading-accuracy tests need to be used .......... 7-10
7.3.7 Rec 7: An audit of reading-accuracy achievement and instruction is needed .......... 7-12
7.3.8 Rec 8: Use of DIBELS tests within systemic testing be introduced .......... 7-12
7.3.9 Rec 9: Use of TOWRE tests within national benchmark testing be introduced .......... 7-13
7.3.10 Rec 10: Monitoring of a representative sample of Australian children is needed .......... 7-14
7.3.11 Rec 11: Mastering phonemic recoding and whole-word reading needs prioritising .......... 7-15
7.3.12 Rec 12: The need for research-based models of reading accuracy for the Australian context .......... 7-16
7.4 The Reading-Accuracy Development model for testing theories of assessment & instruction .......... 7-17
7.4.1 The Reading-Accuracy Development model for building theory on reading accuracy .......... 7-18
7.4.2 Hoover & Gough’s Component model of reading accuracy and comprehension .......... 7-22
7.5 Further research and development .......... 7-26
7.5.1 Issues raised warranting further research .......... 7-26
7.5.2 Researching instruction with use of rapid-use reading-accuracy tests .......... 7-27
7.5.3 Research using rapid-use tests to improve reading-accuracy instruction .......... 7-28
7.6 Concluding Remarks .......... 7-32

List of Appendixes

Appendix A. Test materials used in the research
A1. Mid-Year-1 test-form showing all DIBELS tests used with Year 1
A2. Examples of Grade 2 and 3 Oral Reading Fluency passages used with Years 2 and 3
A3. DIBELS stimulus pages
A4. TOWRE Form-A test-form

Appendix B. Norm-lines for DIBELS and TOWRE tests Table B1. DIBELS norm-lines Table B2. TOWRE norm-lines

Appendix C. Orthographic categories of TOWRE test items Table C1 Orthographic categories of SWE and PDE words Table C2 Proportions of orthographic categories using shading Table C3 Analysis of orthographic units in each word Table C4 Numbers of TOWRE words read at different levels of achievement

Appendix D. Ethics approval and permission letters D1. Notification of ethics approval for the research D2. Letter to families D3. Parent permission form D4. Child permission form

Appendix E. Tables & figures for Chapter 5
The contents of this Appendix are as follows:

Appendix E1 Abbreviations used in the tables and figures in Appendix E .......... E-4
Table E1.1 Abbreviations used in Appendix E .......... E-4
Appendix E2 Oral Reading Fluency (ORF) as a variable .......... E-5
Table E2.1 Descriptive statistics on single and mean ORF-passage raw scores .......... E-5
Table E2.2 Spread of scores on individual Oral Reading Fluency (ORF) passages .......... E-6
Tables E2.3a to c Correlations between Oral Reading Fluency (ORF) test-scores .......... E-7
Table E2.3a Year 1 correlations between Oral Reading Fluency (ORF) test-scores .......... E-7
Table E2.3b Year 2 correlations between ORF test-scores .......... E-7
Table E2.3c Year 3 correlations between ORF test-scores .......... E-7
Appendix E3 Correlations of TOWRE data options .......... E-8
Table E3.1 Correlations of TOWRE data options .......... E-8
Appendix E4 Descriptive statistics on the reading-test variables .......... E-9
Table E4.1a Year 1 Age, DIBELS and TOWRE descriptive statistics .......... E-9
Table E4.1b Year 2 Age, DIBELS and TOWRE descriptive statistics .......... E-10
Table E4.1c Year 3 Age, DIBELS and TOWRE descriptive statistics .......... E-11
Appendix E5 Achievement across year-levels and test-points .......... E-12
Figure E5.1 Means of transformed and actual scores on word-reading tests .......... E-12
Figure E5.2 Spread of achievement using 5° achievement increments .......... E-13
Figure E5.3 Spread of achievement of lowest achievers (≤20°) .......... E-14
Table E5.4a DIBELS reading accuracy pre-skill achievement lines .......... E-15
Table E5.4b DIBELS & TOWRE word-reading test achievement lines .......... E-16
Appendix E6 Relationships with demographic variables: ethnicity, gender, age in year-level .......... E-17
Table E6.1 Means of indigenous & Caucasian cohorts .......... E-17
Table E6.2 Significant and near-significant gender differences .......... E-18
Table E6.3 Gender differences at different levels of achievement .......... E-18
Table E6.4 Differences between oldest and youngest 40% of cohort .......... E-19
Appendix E7 Correlations between DIBELS and TOWRE variables .......... E-20
Table E7.1a Correlations among Year 1 achievement variables .......... E-20
Table E7.1b Correlations among Year 2 achievement variables .......... E-20
Table E7.1c Correlations among Year 3 achievement variables .......... E-20
Appendix E8 Correlations of decontextualised and authentic reading .......... E-21
Table E8.1a Year 1 correlations of decontextualised and authentic reading .......... E-21
Table E8.1b Year 3 correlations of decontextualised and authentic reading .......... E-22
Appendix E9 DIBELS results .......... E-23
Table E9.1 Results on DIBELS Phoneme Segmentation Fluency (PSF) .......... E-23
Table E9.2 Results on DIBELS Letter Naming Fluency (LNF) .......... E-24
Table E9.3 Results on DIBELS Nonsense Word Fluency (NWF) .......... E-25
Table E9.4a DIBELS Oral Reading Fluency (ORF) achievement- and norm-lines .......... E-26
Table E9.4b DIBELS Oral Reading Fluency (ORF) risk categories .......... E-27
Appendix E10 TOWRE results .......... E-28
Table E10.1 TOWRE decile achievement- and norm-lines .......... E-28
Table E10.2 Phonemic Decoding Efficiency (PDE) & Sight Word Efficiency (SWE) .......... E-29
Table E10.3 TOWRE percentile ranks .......... E-30
Appendix E11 Comparison of DIBELS and TOWRE results .......... E-31
Table E11.1 Percents of cases using categories of ±1sd .......... E-31
Table E11.2 DIBELS achievement- and norm-scores for ±1sd .......... E-32
Appendix E12 DIBELS and TOWRE results towards interim norms .......... E-33
Table E12.1 DIBELS cut-points and deciles .......... E-33
Table E12.2 TOWRE cut-points and deciles .......... E-34
Appendix E13 Orthographic units in Oral Reading Fluency (ORF) test items .......... E-35
Table E13.1 Numbers of words read in Oral Reading Fluency (ORF) passages .......... E-35
Table E13.2 Orthographic categories sampled at different levels of achievement .......... E-36
Table E13.3 Orthographic units sampled in ORF words .......... E-37

Appendix F. Features of possible rapid-use reading-accuracy tests recommended for development
F1. A rationale for developing supplementary tests to accompany DIBELS tests
F2. Features of possible rapid-use reading-accuracy tests


List of figures

Figure 1.1 The Reading-Accuracy Development model .......... 1-6
Figure 2.1 Component Model (after Gough & Tunmer, 1986; Nation, 1999) .......... 2-14
Figure 2.2 NITL Recommendations 1, 2, and 9 .......... 2-26
Figure 2.3 The complex relationships of English orthography .......... 2-30
Figure 2.4 Orthographic categories for the research .......... 2-33
Figure 3.1 The predictive model sequence of tests .......... 3-10
Figure 5.1 Phoneme Segmentation Fluency (PSF) achievement .......... 5-7
Figure 5.2 Methods for testing assumptions of normality .......... 5-22
Figure 5.3 Means of variables permitting across-years comparisons .......... 5-27
Figure 5.4 Sight Word Efficiency (SWE) achievement by 5° groups of achievers .......... 5-30
Figure 5.5 Sight Word Efficiency (SWE) achievement of lowest achievers .......... 5-32
Figure 5.6a Sight Word Efficiency (SWE) achievement in Years 1 to 3 .......... 5-34
Figure 5.6b Oral Reading Fluency (G1ORF, G2ORF, G3ORF) achievement in Years 1 to 3 .......... 5-34
Figure 5.6c Nonsense Word Fluency (NWF) achievement in Years 1 to 3 .......... 5-35
Figure 5.7 Nonsense Word Fluency (NWF) by 5° levels of achievement .......... 5-36
Figure 5.8 Phoneme Segmentation Fluency (PSF) achievement- and norm-lines .......... 5-57
Figure 5.9 Year 2 Sight Word Efficiency (SWE) achievement- and norm-lines .......... 5-65
Figure 5.10 Sight Word Efficiency (SWE) achievement using Achievement Categories .......... 5-66
Figure 5.11 Percentages of cases achieving >1sd above the mean on Sight Word Efficiency (SWE) and Phonemic Decoding Efficiency (PDE) .......... 5-67
Figure 5.12 Sight Word Efficiency (SWE) achievement- and norm-lines .......... 5-71
Figure 5.13 Year 2 percentile ranks for Sight Word Efficiency (SWE) & Phonemic Decoding Efficiency (PDE) .......... 5-72
Figure 5.14 Percent of cases scoring >1sd above the mean .......... 5-75
Figure 5.15 Year 2 proportions of cases sorted by ±1sd .......... 5-76
Figure 7.1 Tests for the Australian context .......... 7-17
Figure 7.2 A proposed model of reading-accuracy development .......... 7-19
Figure 7.3 The Reading-Accuracy Development model showing authentic reading .......... 7-21
Figure 7.4 Research topics using the Component model .......... 7-25

List of tables

Table 1.1    Phases and skills of the Reading-Accuracy Development model   1-9
Table 2.1    The contrasting premises of Skills- and Meaning-emphasis paradigms   2-3
Table 2.2    Distributions of early-identified and late-identified weak readers   2-15
Table 2.3    Differences in experimental and applied reading-accuracy research   2-20
Table 2.4    Orthographic knowledge for instructional purposes   2-51
Table 3.1    DIBELS tests used in the research   3-4
Table 3.2    DIBELS test-points and benchmarks   3-10
Table 3.3    Numbers of words read at different levels of achievement   3-21
Table 3.4    Age and curriculum differences between USA and Australia   3-24
Table 3.5    Equivalence of Queensland year-levels and American grade-levels   3-25
Table 3.6    Qld-USA schooling- and instruction-equivalent test-points   3-26
Table 3.7    Standard Score (SS) ranges for TOWRE achievement categories   3-32
Table 3.8    Ages and grades for TOWRE usage   3-33
Table 3.9    Test foci and periods of use   3-39
Table 3.10   Tests used in the research   3-42
Table 4.1    Tests used in the research   4-6
Table 4.2    Test-points used for each year-level   4-10
Table 4.3    The stages of the research   4-12
Table 4.4    The research sample   4-16
Table 4.5    Children tested on test-sets and at test-points (n=398)   4-19
Table 5.1    Statistical analyses used in the study   5-2
Table 5.2    Variables investigated in the research   5-3
Table 5.3    Descriptive statistics for Letter Naming Fluency (LNF) & Letter Sounding Fluency (LSF)   5-17
Table 5.4    Correlations of results on LNF, LSF, and other DIBELS variables   5-18
Table 5.5    Sample numbers using gender & ethnicity   5-37
Table 5.6    Year 2 correlations of mid-year word-reading & end-year authentic reading   5-43
Table 5.7    Year 2 Net achievement by gender and ethnicity using t-tests & SEM plots   5-46
Table 5.8    Correlations of reading-accuracy test and Year 2 Net achievement   5-47
Table 5.9    Cases ranked by Grade 2 Oral Reading Fluency (G2ORF) raw scores   5-48
Table 5.10   Sensitivity and specificity of G2ORF using 10° and 30° cut-points   5-49
Table 5.11   Year 3 Test achievement by gender using t-tests and SEM plots   5-50
Table 5.12   Whole group correlations of Sight Word Efficiency (SWE) with Year 3 Test results   5-51
Table 5.13   Correlations within achievement halves   5-53
Table 5.14a  Year 1 correlations ordered by predictors   5-59
Table 5.14b  Year 2 correlations ordered by predictors   5-59
Table 5.14c  Year 3 correlations ordered by predictors   5-60
Table 6.1    Principles for rapid-use reading-accuracy tests to support teacher learning   6-24
Table 7.1    Premises of the research indicating current Australian needs   7-9
Table 7.2    Recommended topics for researching instructional effectiveness   7-24


List of abbreviations

The abbreviations in this section are in two lists:
• Abbreviations of test names.
• Other abbreviations used in the thesis.

Abbreviations of test names

DIBELS   Dynamic Indicators of Basic Early Literacy Skills
ISF      Initial Sound Fluency: a DIBELS test
LNF      Letter Naming Fluency: a DIBELS test
LSF      Letter Sounding Fluency: a researcher-adapted DIBELS test, using LNF
ORF      Oral Reading Fluency: a DIBELS test
PDE      Phonemic Decoding Efficiency: a TOWRE test
PSF      Phoneme Segmentation Fluency: a DIBELS test
SWE      Sight Word Efficiency: a TOWRE test
TOWRE    Test of Word Reading Efficiency

Other abbreviations used in this thesis

Age-Gap  A researcher-developed measure: Reading Age – Chronological Age.
Beg / B  Beginning-year, referring to beginning-year test-points, e.g., BegK, Bg1.
C        Consonant, used to denote word-forms, e.g., CVC, VC.
End / E  End-year, referring to the end-year test-point, e.g., End-Year-1, Ey1.
G        USA grade-level, e.g., G1: Grade 1.
GPC      Grapheme: phoneme correspondence.
K        Kindergarten, the USA's first schoolyear.
Mid / M  Mid-year, referring to the mid-year test-point, e.g., Mid-Year-2, My2.
NITL     National Inquiry into the Teaching of Literacy (DEST, 2005).
OUPC     Orthographic unit: phoneme/s correspondence.
PISA     Program for International Student Assessment.
V        Vowel, used to denote word-forms, e.g., CVC, VC.
Y        Australian year-level, e.g., Y1: Year 1.


Chapter 1 Introduction

1.1 Introduction

Reading accuracy is the identification of the spoken words that correspond to written single words. It is integrally related to academic learning and progress (Adams, 1990; National Research Council (NRC), 1998), and has been the subject of extensive debate by governments and educators in the English-speaking world (e.g., Hempenstall, 1996, 1997, 2003; Moats, 2000). More recently, there has been a groundswell for increased use of reading-accuracy instruction and assessment, as evidenced in the USA's Reading First initiative (United States Government, 2004) and the UK's Rose Report (Rose, 2006).

The end goal of reading-accuracy instruction is for readers to be able to read year-level text fluently and accurately. Reading-accuracy instruction focuses on reading-accuracy skills as well as authentic reading, because increases in the efficiency of reading-accuracy skills improve authentic reading in two ways (Chard, Vaughn & Tyler, 2002), through enabling:
• Reading of words in the text which previously were too difficult.
• More consideration of the meaning of what is being read, because efficient reading accuracy requires less processing capacity.

Reading-accuracy instruction and assessment have not been emphasised in Australia in recent years, and minimal reading-accuracy achievement data is currently available (de Lemos, 2001a, 2001b). There are indicators that approximately 30% of primary and secondary school students have difficulties in reading comprehension and authentic reading (Education Queensland (EQ), 2000; Masters & Forster, 1996; Program for International Student Assessment (PISA), 2002a, 2002b, 2004).
It is possible that inadequate reading-accuracy achievement in the early years of primary school is a basis of this lower-than-expected reading achievement (Stanovich, 1986):

Because of its pivotal role in effective reading comprehension, reading accuracy is a gateway skill separating those students who succeed in literacy from those who struggle. The divide starts small and widens dramatically, due to vast differences in exposure to text and concepts. (Knight & Galletly, in press)

This dissertation presents the results of an examination of reading-accuracy test-data gathered from testing children in Years 1, 2, and 3 in a Queensland regional area, using two USA-based sets of reading-accuracy tests, namely the:
• Dynamic Indicators of Basic Early Literacy Skills (DIBELS; Good & Kaminski, 2002), and
• Test of Word Reading Efficiency (TOWRE; Torgesen, Wagner & Rashotte, 1999).

This chapter situates the current research by presenting background information on aspects of reading and reading accuracy. It then presents the rationale, significance, aims, research questions and objectives, and limitations of the research. It concludes with a summary of the content of each chapter of this dissertation.


1.2 Background

There are many constructs associated with research on reading and reading accuracy. Because these terms are generally not explicitly defined in the literature, this section provides definitions of reading and reading accuracy, and explains the constructs of reading accuracy used in this research.

1.2.1 A definition of reading

This research is focussed on reading of print, and therefore wider meanings of reading involving materials other than print are not considered. A myriad of definitions for reading of print can be identified in the literature (Rivalland, 2000; Queensland Board of Teacher Registration, 2001). This dissertation uses the term reading in the commonly used sense of Torgesen et al. (1999):

Reading is a complex activity that can be defined in a number of ways, depending upon the particular aspect of the skill or activity one is focusing on. For example, reading can be defined as the construction of meaning from text or as learning from print, if one is focusing on comprehension processes in reading. Reading can also be defined as deciphering print, if one is focusing on the aspect of reading that involves identifying the particular words the author has selected to convey meaning. At the word level, [reading may involve] knowing the meaning of the word that is identified or simply pronouncing it. ... The word 'reading' may be used to stand for quite different constructs…depending upon the particular aspect of reading that is being discussed. (Torgesen, Wagner, & Rashotte, 1999, pp.1-2)

Using this sense of the term, in this dissertation reading is defined as the skill and/or act of identifying the pronunciation and/or meaning of printed text. The term authentic reading is used in the dissertation for reading of meaningful connected text while focussed on reading comprehension.

1.2.2 A definition of reading accuracy

In this dissertation, reading accuracy is defined as the identification of the spoken words and word-parts that correspond to written single words. Reading accuracy involves words being recoded from their written forms to their spoken and/or semantic forms. When reading aloud, this identification involves pronunciation, while when reading silently, it may involve subvocalisation or accessing of a word's semantic form (meaning). It includes the reading of real words and pseudowords; familiar and unfamiliar words; whole words and word parts; single-syllable and multisyllabic words; regular and irregular words; words whose meanings are unfamiliar and familiar; and decontextualised words and words in meaningful text.

Reading accuracy has an orthographic knowledge component, which enables the reading of these different types of words. It also has an efficiency component. Efficiency builds as readers move from early hesitant reading accuracy, usually with adult support, through visually evident processing during self-learning as children encounter new words, to fluent effortless automatic reading of words without conscious thought (Juel, 1991).

The term reading accuracy is not widely used in the reported reading research (Neale, 1999; Galletly, 2004; Knight & Galletly, 2005). The term links directly to the term reading comprehension, which is used widely and unambiguously in the research literature (Adams, 1990; NRC, 1998; Neale, 1999; National Reading Panel (NRP), 2000). It is preferred to word recognition and word identification, terms used widely in the literature (Adams, 1990; de Lemos, 2002; Catts & Hogan, 2003). These terms are potentially confusing: they emphasise word rather than reading, and they apply equally to semantic vocabulary and reading accuracy, e.g., students identify words when asked to "Find three words which mean big".

1.2.3 Constructs of reading accuracy used in the research

Reading accuracy is a dynamic construct, whose dominant characteristics change over time during reading-accuracy development. Many of these dominant characteristics are skills which are developed through instruction and are observable in assessment. Reading-accuracy development is thus intricately related to reading-accuracy assessment and instruction. This section discusses the model of reading-accuracy development used in this research. It details the skills associated with each phase of reading-accuracy development, and indicates those skills which are worthy of assessment. These skills and the model are then used to appraise DIBELS and TOWRE tests in the next chapter, and to consider the research results in later chapters.

1.2.4 Phases of reading-accuracy development

Reading-accuracy development is the development of increasingly sophisticated reading-accuracy skills, as children move from being novice to proficient readers. As alluded to above, reading accuracy is a broad construct with multiple stages and skills, and is dynamic in that its dominant characteristics change across reading-accuracy development (Ehri, 1995, 1996; Frith, 1985). As such, it is possible to consider reading accuracy as occurring through a number of distinguishable phases. The researcher has developed a model of reading-accuracy development based on the literature, for use in the current research. This model is detailed in Figure 1.1.

[Figure 1.1 shows six interacting phases: Continuous Phase 1 (Phonological awareness, including phonemic awareness); Temporary Phase 1 (Whole-word reading, and reading of irregular words); Continuous Phase 2 (Orthographic knowledge, including both common letter-sounds and all other orthographic units, i.e., spelling patterns); Temporary Phase 2 (Phonemic recoding of commonest letter-sounds); Continuous Phase 3 (Phonological recoding beyond phonemic recoding of commonest letter-sounds, i.e., reading words with diverse types of orthographic units); and Continuous Phase 4 (Reading accuracy in authentic reading, i.e., reading of meaningful connected text, while focussed on meaning).]

Figure 1.1 The Reading-Accuracy Development model

This model, which is herein referred to as the Reading-Accuracy Development model, emphasises phases of reading-accuracy development which lend themselves to assessment of reading-accuracy skills. As can be seen in the figure, the model has four continuous phases and two temporary phases. It is likely that these phases impact the beginning reader in the following order:
• Continuous Phase 1: Phonological awareness. This includes phonemic awareness (Adams, 1990; Bryant, Maclean et al., 1990; Byrne, 1992, 1998; Ehri, Nunes, Willows, & Valeska Schuster, 2001; Treiman, 1992).
• Temporary Phase 1: Whole-word reading (Frith, 1985; Goswami, 1992, 2002; Ehri, 1991, 1995, 1996). This includes reading of irregular words whose orthographic units are not yet mastered.
• Continuous Phase 2: Orthographic knowledge. This includes knowledge of both commonest letter-sounds and all orthographic units beyond commonest letter-sounds (Dewey, 1970, 1971, 1978; Fry, 2004; Pollo, Treiman, & Kessler, in press; Stanback, 1992; Treiman & Kessler, 2006; Treiman, Kessler, & Bick, 2002; Treiman, Mullennix, Bijeljac-Babic, & Richmond-Welty, 1995; Treiman, Sotak, & Bowman, 2001).
• Temporary Phase 2: Phonemic recoding (Byrne, 1998; Ehri, 1996; Ehri, Stahl, & Willows, 2001; NRC, 1998; NRP, 2000). In this dissertation, the term phonemic recoding refers to phonemic recoding only of commonest letter-sounds.
• Continuous Phase 3: Phonological recoding. This includes reading of words containing diverse orthographic units. In wider uses of the term, phonological recoding includes phonemic recoding. For purposes of contrast within this thesis, however, the term phonological recoding will refer only to phonological recoding beyond phonemic recoding of commonest letter-sounds.
• Continuous Phase 4: Reading accuracy used in authentic reading (Chard et al., 2002; Kuhn & Stahl, 2003).

The six phases of the Reading-Accuracy Development model interact reciprocally throughout reading-accuracy development (Byrne, 1998; Goswami, 2002; NRC, 1998; NRP, 2000; Treiman & Baron, 1983). As such, they become increasingly sophisticated, while simultaneously supporting increased sophistication of the other phases with which they interact. Reciprocity between the different phases is indicated in the figure through use of bidirectional arrows.
Continuous Phase 3, phonological recoding, is built from the two temporary phases, whole-word reading and phonemic recoding. As proficiency is developed in whole-word reading, early orthographic knowledge (letter-sounds) and phonemic recoding, these temporary phases are subsumed into phonological recoding. Their temporary nature is indicated by the two bold arrows in Figure 1.1, which are unidirectional, with no reciprocity indicated. The phases of the Reading-Accuracy Development model and skills related to these phases are detailed in Table 1.1, below.

Table 1.1 Phases and skills of the Reading-Accuracy Development model

Phase: Phonological awareness
● Includes phonemic awareness.
● Is prerequisite for and accompanies reading-accuracy development.
Skills:
● Phonemic awareness skills for reading: identifying initial, final and medial phonemes in words; blending lists of phonemes to make words; segmenting words into lists of phonemes.
● Phonological awareness skills for reading: rhyme awareness (ability to rhyme and use rhyming to read words which rhyme with known or given words); syllable awareness (ability to consider syllables in words).

Phase: Whole-word reading
● Includes fully-whole-word reading without awareness of letters and sounds, and semi-whole-word reading of words with ≥1 orthographic units not known.
Skills:
● Reading of words learned logographically as whole-words prior to phonemic recoding being mastered.
● Reading of irregular words containing orthographic units which the child has not yet learned.

Phase: Orthographic knowledge
● Includes letter-sound skills, and knowledge of English spelling patterns evidenced in reading-accuracy skills.
Skills: Knowing and using orthographic knowledge, including
● Letters' names, sounds, capitals and lowercase forms.
● Orthographic units of varying frequency and consistency.
● Orthographic contexts, e.g., /ăĕĭŏŭ/ vowels occur only in closed (VC) syllables.

Phase: Phonemic recoding
● Recoding regular words with single letters saying commonest sounds: letters are recoded to letter-sounds, then the list of letter-sounds is recoded to a pronounceable word.
Skill: Reading accuracy of words with closed (VC) syllables, single consonants and /ăĕĭŏŭ/ vowels, e.g., up, zif, bitten.
Preskills:
● Letter-sound knowledge.
● Phonemic awareness: identifying, blending and segmenting sounds.

Phase: Phonological recoding
● Recoding of words containing other orthographic units in addition to commonest letter-sounds.
Skills:
● Orthographic knowledge.
● Phonological awareness beyond phonemic recoding: rhyme and syllable awareness, advanced phonemic awareness.
● Reading accuracy of words containing diverse OUPCs.
Preskills:
● Phonemic recoding.
● Whole-word reading.

Phase: Reading accuracy used in authentic reading
● Fluent reading of meaningful connected text of year-level difficulty.
Skills:
● Reading for comprehension of meaningful connected text of appropriate level for year-level, with minimal language scaffolding.
● Phonological recoding.

It can be seen from Table 1.1 that there are multiple skills involved in reading-accuracy development, and that different skills are associated with each phase of the Reading-Accuracy Development model. Each of the six phases is explored with regard to its assessable skills in the next chapter.
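The phonemic-recoding phase in Table 1.1 is, in effect, a two-step procedure: recode each letter to its commonest sound, then blend the resulting sound sequence into a pronounceable word. As an illustrative sketch only (the letter-sound table below is a small hypothetical fragment, not material drawn from the thesis or from DIBELS/TOWRE), the procedure can be expressed as:

```python
# Sketch of phonemic recoding of regular words: each letter is recoded
# to its commonest sound, then the sounds are blended into a word.
# The letter-sound table is a hypothetical fragment for illustration;
# real instruction covers the commonest sound of all 26 letters.
COMMONEST_SOUNDS = {
    "a": "a", "b": "b", "f": "f", "i": "i", "n": "n",
    "p": "p", "t": "t", "u": "u", "z": "z",
}

def phonemic_recode(word):
    """Recode a regular word letter by letter, then blend.

    Returns the blended form in /slash/ notation, following the
    thesis convention for representing spoken words.
    """
    sounds = [COMMONEST_SOUNDS[letter] for letter in word]  # recode step
    return "/" + "".join(sounds) + "/"                      # blend step

print(phonemic_recode("up"))   # /up/
print(phonemic_recode("zif"))  # /zif/  (pseudoword from Table 1.1)
```

The sketch simply makes the recode-then-blend sequence concrete; it would fail, by design, on irregular words or words containing multi-letter orthographic units, which belong to the whole-word reading and phonological recoding phases respectively.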

1.2.5 Reading-accuracy terminology

This dissertation uses a considerable number of specific terms which refer to specific aspects of reading accuracy. These terms are detailed in this section, and incorporated in discussion in this chapter and subsequent chapters of this dissertation.

As outlined above, reading accuracy is a construct with phases of development. Its word-reading phases are built from two types of words: whole-words, those learned logographically as whole units without consideration of the orthographic units (word parts) they contain, and regular words, which are recoded through considering the phonemes and phoneme sequences of the orthographic units in the words.

The term orthographic units (literally, units of orthography) is used widely in this dissertation to include:
• Graphemes: one or more letters which represent a single sound (phoneme). This dissertation at times refers specifically to graphs, which are single letters equating to single phonemes in words, e.g., [b t], and digraphs, which are two-letter sequences representing single phonemes in words, e.g., [ar ea ch sh ng].
• Multiple-letter graphemes, e.g., [igh ough] in light bought.
• Spelling units of more than one phoneme, e.g., [el -ance -all], and [br spl] (the latter are also termed consonant blends).
• Other word-parts, e.g., syllables [be-gin], rimes [-at, -ook], and part-words [-ook].

Orthographic units are at times termed spelling rules or spelling patterns, depending on the context of discussion.

The term phoneme is used for the single sounds of pronounced words or word-parts. These can be blended together to make the spoken word which a written word represents. A convention is adopted in this dissertation of using square brackets [ ] to represent written words or orthographic units, and diagonal slashes / / to represent the spoken word, phoneme or word-part, e.g., in The letter [b] says /b/.

The relationship of orthographic units to phonemes is in this dissertation termed orthographic unit: phoneme/s correspondence, abbreviated as OUPC. It is used in place of the commonly used term grapheme: phoneme correspondence (GPC). GPC refers only to single graphemes, whereas OUPC refers to single graphemes as well as units involving multiple graphemes, e.g., [-all] and [-tion]. A convention is adopted in this dissertation of using the term GPC with regard to phonemic recoding, and the term OUPC with regard to phonological recoding. This decision is based on phonemic recoding always involving GPCs, and phonological recoding often not involving GPCs, i.e., in every instance where a sequence of phonemes is involved, e.g., [-all, -tion].

Most English orthographic units have multiple OUPCs, e.g., [ch] has three: [ch]: /ch/, [ch]: /sh/, and [ch]: /k/, as in [chin, chef, school]. English orthographic complexity results in most OUPCs being one: many, e.g., [ch]: /ch sh k/, and many: one, e.g., [sh -ti- -ci- -ssi- ch]: /sh/ (as in shop nation gracious passion chef). Readers develop orthographic knowledge, i.e., understanding and use of English orthography, as part of reading-accuracy development. Orthographic knowledge is knowledge of English orthographic units, usually including the OUPCs for each unit. This knowledge may be conscious or subconscious.

Words and orthographic units are at times termed regular and irregular in this dissertation.
Word regularity is built from the frequency and consistency of the orthographic units occurring in the words. The term word-form is used in the dissertation to refer to the orthographic structure of words. It is often expressed using the sequences of vowel-letters (V, [a e i o u]) and consonant-letters (C) listed in the words. In this dissertation a convention is adopted of underlining multiple-letter graphemes, such that word-forms of CVC, CVC, CCV and CCC represent words such as [cat car she try] respectively. Whereas some researchers use a single V to denote vowel graphemes, such that [wait] and [corn] are considered CVC words (Stuart, Dixon, et al., 2003), this is not the case in this dissertation, where every letter in a word is represented by V or C. As such, [wait] and [corn] are CVVC and CVCC words, respectively.
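The word-form convention above is mechanical enough to sketch in code. The function below is illustrative only: it applies the thesis's letter-level convention ([a e i o u] as vowel-letters, every letter coded V or C), and makes no attempt at the underlining of multiple-letter graphemes, which requires orthographic knowledge rather than letter categories.

```python
# Sketch of the letter-level word-form convention: every letter is
# coded V or C, with [a e i o u] as the vowel-letters. Note that [y]
# is a consonant-letter under this convention, so [try] is CCC.
VOWEL_LETTERS = set("aeiou")

def word_form(word):
    """Return the letter-level word-form, e.g. 'wait' -> 'CVVC'.

    Every letter is represented by V or C, so [wait] and [corn] are
    CVVC and CVCC, not CVC as in analyses that collapse vowel
    graphemes to a single V.
    """
    return "".join("V" if ch in VOWEL_LETTERS else "C" for ch in word.lower())

print(word_form("cat"))   # CVC
print(word_form("try"))   # CCC
print(word_form("wait"))  # CVVC
print(word_form("corn"))  # CVCC
```

This letter-level coding is deliberately simple; identifying that [ar] in [car] or [sh] in [she] is a single grapheme (the underlining convention) would need a lookup of known multi-letter orthographic units.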

1.3 Rationale and significance

1.3.1 Rationale

As further considered in Chapter 2, the rationale for this research rests on reading accuracy being an important developmental skill, and on there being relatively limited research on the reading-accuracy achievement levels of Australian children in different year-levels. These aspects are briefly expanded upon below.

1.3.1.1 The importance of reading accuracy

Somewhat akin to handwriting development, reading-accuracy development has finite limits. Both are important in literacy instruction only until sufficient levels of competence and confidence have been reached. In contrast, comprehension, and thus reading comprehension, has no ceiling, and instruction on different aspects of comprehension continues across the schoolyears for children of all levels of reading efficiency. Reading comprehension, rather than reading accuracy, is thus a major focus of reading instruction for all children throughout their schoolyears. Reading accuracy is a major focus for a much shorter time, only until proficiency is reached.

Whilst reading accuracy may seem a basic skill, this is the case only in transparent-orthography nations (Aro, 2004; Seymour, Aro, & Erskine, 2003). In English-text nations, English orthographic complexity makes reading-accuracy development a complex progression of challenging skills, which is likely to take at least 3 years for most healthy-progress readers, i.e., those readers achieving at or above year-level, and many more years for delayed readers (Seymour et al., 2003). Reading accuracy is thus an important primary-school curriculum area.

1.3.1.2 Reading accuracy supports literacy development

Perhaps the most important aspect of reading accuracy is that it is prerequisite for academic development, through its enabling of reading comprehension, reading fluency, quantity of independent reading, vocabulary growth and written expression (NRC, 1998; Stanovich, 1986). While no research was found on the role of reading accuracy in healthy-progress readers' development in these areas, numerous studies show that children with weak reading accuracy make correspondingly poor progress in them (Berninger, Abbott, Abbott, Graham, & Richards, 2002; NRC, 1998).

1.3.1.3 Weak early reading accuracy predicts weak academic progress

Being prerequisite for academic and literacy development, reading accuracy creates compounding disadvantage for children whose reading-accuracy development is delayed (NRC, 1998; Stanovich, 1986). For these children, reading-accuracy development, learning and instruction are arduous and consume considerable instructional time. In addition, curriculum content in their other learning areas does not wait for their reading-accuracy skills to become adequate. These children thus become increasingly disadvantaged, requiring increasingly extensive remediation. Whereas initial remediation needs were just for reading accuracy, over time these needs come to include many other aspects of literacy (e.g., reading comprehension, spelling, written expression), and academic areas using these skills (e.g., science, geography, project and assignment work).

To prevent this compounding disadvantage, English-text nations are increasingly focussed on effective early-years reading-accuracy instruction and early intervention. This is seen in federal governments allocating increased funding to early-years reading-accuracy instruction in both the USA (Carlisle, Schilling, Scott, & Zeng, 2004; Maryland State Department of Education, 2006) and the UK (UK House of Commons Education and Skills Committee, 2006). While Australia also allocates large amounts of funding to early-years literacy instruction and early intervention, to date the programs used do not emphasise reading accuracy (Clay, 1993; Education Victoria, 1999; Western Australia Department of Education (EDWA), 2004).

Torgesen (1998) suggests that one of the most compelling findings from recent reading research is that children who get off to a poor start in reading rarely catch up to healthy-progress readers. Lyon (1998) discusses healthy-progress middle-school readers reading over 10 million words per year, in contrast to weak readers who read fewer than 100,000 words per year.
Researchers in Australia and overseas have established that children classified as poor readers by Year 2 have a 70-90 percent chance of still experiencing literacy failure in later primary school and high school, that high school intervention programs are largely low in effectiveness, and that the consequences of continuing literacy failure are far-reaching and lifelong (Juel, 1988; Lyon, 1998, 2003; NRC, 1998). Additionally, although research on the area is limited, it is possible that reading-accuracy development is also a key factor supporting social-emotional development, including self-esteem, attention and behaviour skills (Kavale & Forness, 1996; K.J. Rowe & Rowe, 2002; K.S. Rowe & Rowe, 2004). The National Research Council (NRC) report comments that:

Academic success, as defined by high school graduation, can be predicted with reasonable accuracy by knowing someone's reading skill at the end of Grade 3 (for reviews, see Slavin et al., 1994). A person who is not at least a modestly skilled reader by the end of third grade is quite unlikely to graduate from high school…. Perhaps not surprisingly, when teachers were asked about the most important goal for education, over half the elementary school teachers chose "building basic literacy skills". (National Research Council, 1998, p.22)

In this context, Lyon (1998) concluded from studies of American children that unless effective intervention is provided before children with reading difficulties reach about nine years of age, approximately 75% continue to have difficulties learning to read throughout high school. Lyon summarises ease of learning for American children:

The easy journey into the world of reading is only available to about 5% of our nation's children… Another 20% to 30% find reading fairly easy to learn during formal instruction, regardless of the instructional emphasis… Unfortunately, it appears that for about 60% of our nation's children, learning to read is a much more formidable challenge, and for at least 20% to 30% of these youngsters, reading is one of the most difficult tasks that they will have to master throughout their schooling. (Lyon, 1998, p.1)

1.3.2 Significance

The significance of the research lies in the need for Australian research establishing the use and usefulness of rapid-use reading-accuracy tests such as DIBELS and TOWRE, which are examined in this dissertation. This research fills a gap by endeavouring to establish baseline data on the reading-accuracy achievement levels of Australian children in Years 1-3 on two sets of reading-accuracy tests. As such, the research can inform Australian decision-making on the role and function of reading-accuracy tests in the early years of primary school. It should be noted that reading-accuracy achievement data also serves useful purposes for monitoring and comparing child progress, efficiency of instruction, and reading-accuracy achievement levels within Australia and internationally.

The research is also focussed on orthographic knowledge development within reading-accuracy development, and builds knowledge in this area. Orthographic knowledge development is children's building of skill with the range of orthographic units encountered in English words. It is established in the next chapter as a pivotal aspect of reading-accuracy development. All the foregoing contributes to the knowledge base on reading-accuracy development, assessment and instruction.


1.4 Aims, research questions and objectives

This section details the aims, research questions and objectives of the study.

1.4.1 Aims

This research explores the reading-accuracy achievement of children in Years 1 to 3 in a regional Queensland town. Its aims are to:
• Explore the achievement data generated by the tests regarding:
  o The applicability of the tests to children of different achievement levels.
  o Comparability of results with DIBELS and TOWRE norms.
  o The goodness-of-fit of the DIBELS predictive model.
  o Relationships between reading-accuracy achievement levels and gender, ethnicity, age-in-year-level, and achievement on state and national assessments of reading.
• Explore the qualitative, diagnostic data generated by the tests regarding:
  o Aspects of orthographic knowledge sampled by the tests.
  o Indicators of children's instructional needs.
• Consider the characteristics of reading-accuracy tests indicated as most appropriate for use in the early years of schooling in Australia.
• Consider reading-accuracy development, and orthographic knowledge within reading-accuracy development.
• Suggest directions for future research on reading-accuracy development, assessment and instruction.

1.4.2 Research questions

The above aims are explored through the following research questions:

Research Question 1: What are the reading-accuracy achievement levels and rates of progress of children in Years 1, 2, and 3, as measured at mid-year and end-year test-points on DIBELS and TOWRE reading-accuracy tests?

Research Question 2: How compatible are the results with the existing DIBELS and TOWRE norms?
Subquestion 2.1: How compatible are the DIBELS results with the DIBELS predictive model of reading development?

Research Question 3: What are the relationships between children's reading-accuracy achievement levels and ethnicity, gender, age in year-level, and achievement on state and national reading assessments?

Research Question 4: To what extent does the test-data provide information on reading-accuracy development and children's instruction needs?

Research Question 5: To what extent are DIBELS and TOWRE tests deemed suitable for use in Australia?

1.4.3 The research objectives

The research questions are addressed through the following objectives:
Objective 1: Undertake a review of literature on reading-accuracy development, assessment, and instruction in Australia and internationally, to establish the importance of this research, through:
• Establishing the importance of reading accuracy in academic progress, and the importance of the reading-accuracy constructs measured in this research (RQ 1 to 5).
• Analysing the research on predictors of reading-accuracy development, to establish the DIBELS predictive model as worthy of investigation (RQ 2.1).
• Analysing reading-accuracy assessment models and tests for classroom use, to establish DIBELS and TOWRE tests as worthy of investigation (RQ 1 to 5).
• Analysing current knowledge on Australian reading-accuracy achievement levels and the tests used in gathering such data, to establish needs for research (RQ 1 to 5).
Objective 2: Generate the data for this research by:
• Assessing children in Years 1-3 at mid-year and end-year test-points using DIBELS and TOWRE (RQ 1 to 5).
• Gathering school-held data on these children’s achievement levels on state and national assessments of reading (RQ 3).
Objective 3: Analyse the data by:
• Establishing the importance of reading accuracy in academic progress, and the importance of the reading-accuracy skills measured in this research (RQ 1 to 4).
• Establishing the reading-accuracy achievement levels of children in Years 1-3 (RQ 1 and 2).
• Comparing the mid-year and end-year results, and rates of progress, to establish achievement trends within and across year-levels (RQ 1).
• Establishing the impact of gender within and across year-levels (RQ 3).
• Establishing the strength of relationship between reading-accuracy achievement levels and achievement levels on state and national reading assessments (RQ 3).
• Exploring the results relative to USA DIBELS and TOWRE norms (RQ 2).
• Exploring the word-forms read by high, medium, and low achievers for each subtest and test-point (RQ 4).
• Investigating the applicability of the DIBELS predictive model, benchmarks and risk categories (RQ 2).
• Exploring DIBELS and TOWRE data for:
  o Indicators of reading-accuracy development (RQ 4).
  o Qualitative diagnostic data indicating children’s instructional needs (RQ 4).
• Establishing the extent to which DIBELS and TOWRE tests are suitable for use in Australia (RQ 5).
• Establishing trends which future research might investigate further (RQ 1 to 5).

Copyright 2007 Susan Galletly PhD Thesis

1.5 Limitations and scope of the research

1.5.1 Limitations of the research

Limitations of the research relate to the sample, the tests, and test administration.

Sample
The sample is limited in that the schools involved are all located in one regional Queensland city. Use of schools across Australia, selected with consideration of demographics, would enable greater generalisation of findings. This limitation is considered acceptable for two reasons. Firstly, this is preliminary research focussed on building baseline data, as a basis for future research in other locations. Secondly, the sample is representative of Queensland schools with respect to achievement on state and national assessments.


Tests
There are two limitations relating to the tests used in the research. The first is that DIBELS and TOWRE tests are not normed for Australian populations. This limitation is overcome to a large extent through the analysis focussing largely on raw scores, rather than standard scores and percentiles, and through the research being preliminary research which may lead to future development of Australian norms for these tests. The second limitation is curriculum differences between the USA and Queensland. USA children begin formal reading-accuracy instruction in Kindergarten, at age 5 years; Queensland children begin reading-accuracy instruction in Year 1, at age 6 years. It was considered that the largest impact of this curriculum mismatch would be on the applicability of DIBELS and TOWRE norms to the research cohort, which is accommodated as discussed above.

Test administration
The research is limited through the administration of several DIBELS tests not conforming to that prescribed by the test-developers. In addition, no checks of test administration were conducted during testing in schools. These limitations are considered acceptable for several reasons:
• The teachers had been trained in administering the tests.
• Other research on these tests conducted in the USA has used test-data gathered by teachers in school conditions.
• Testing by teachers and teacher aides is standard practice for administration of these tests, as per their test manuals.
• It is likely that future Australian studies using these tests will use data generated by teacher-researchers testing children in their own schools.

1.5.2 Scope of the research

In addition to the limitations detailed above, the scope of the research was reduced in two directions. First, the scope was narrowed through some but not all phases of the Reading-Accuracy Development model being tested, because DIBELS and TOWRE tests do not specifically assess the phases of:
• Whole-word and irregular-word reading.
• Phonological recoding, and the orthographic knowledge developed within the phonological recoding phase.
This reduces the scope of the research to the tested skills and phases, rather than the skills of all phases. Second, the scope was narrowed through focussing only on reading accuracy, with no inclusion of tests of reading comprehension and language comprehension. As detailed in the next chapter:
• Reading accuracy and language comprehension are integrally related in reading comprehension and authentic reading.
• Testing to establish students’ particular instructional needs within reading instruction should include separate tests of reading accuracy, language comprehension, and reading comprehension.


1.6 Organisation of dissertation

The organisation of this dissertation is as follows:

Chapter 2 – The research context: This chapter establishes the research context as a framework for the research. It details premises underlying the research, reading-accuracy skills for testing using reading-accuracy tests, characteristics of rapid-use reading-accuracy tests, and Australian needs for research focussed on reading-accuracy achievement and reading-accuracy tests.

Chapter 3 – Overview of DIBELS and TOWRE tests: This chapter details the characteristics of the two testsets used in this research:
• Dynamic Indicators of Basic Early Literacy Skills (DIBELS; Good & Kaminski, 2002).
• Test of Word Reading Efficiency (TOWRE; Torgesen et al., 1999).

Chapter 4 – Methodology: This chapter explains the methodological aspects of the research, including the research paradigm, design, method, sample, data gathering, and ethical considerations.

Chapter 5 – Exploration of the data: This chapter contains the analysis of the reading achievement data, in areas relevant to the research questions.

Chapter 6 – Consideration of the research questions: This chapter details the research’s findings with regard to each of the research questions, and discusses issues related to these findings.

Chapter 7 – Conclusions and recommendations: This chapter formulates the conclusions from the research and presents recommendations for both practice and further research.

1.7 Conclusion

This chapter has introduced the dissertation and the research. It has established definitions of reading and reading accuracy, and the model of reading-accuracy development used in the research. The rationale, aims, significance, research questions and objectives of the research have similarly been established, as has the scope of the research.


Chapter 2 The research context

2.1 Introduction

The previous chapter provided background to the research. It established the importance of reading accuracy in academic and literacy progress, defined the construct of reading accuracy, and detailed the Reading-Accuracy Development model used in this research. This chapter establishes the research context, which serves as a framework for the research. The purposes of this chapter are as follows:
• To outline the premises on which this research investigation is based.
• To establish the suitability of rapid-use reading-accuracy tests.
• To establish Australian needs for research on the reading-accuracy achievement of Australian readers, and on reading-accuracy tests for school-level use in monitoring achievement in Years 1, 2, and 3 in primary schools.
• To clarify orthographic complexity and its role in reading-accuracy development.
• To identify important developmental reading-accuracy skills for assessment in reading-accuracy tests.
The chapter has two parts. The first details eight premises of reading accuracy framing the research. The second explains aspects of reading-accuracy development framing the research and establishes the reading-accuracy skills indicated as important for testing in reading-accuracy tests.

2.2 Premises of reading accuracy

From the literature it is possible to identify eight premises which frame the directions taken in this research. These premises are considered below in the context of this research.

2.2.1 Premise 1: Aspects of current reading instruction need re-examination

Premise 1: Aspects of current reading instruction need re-examination.

There have been opposing viewpoints on reading-accuracy instruction for centuries in English-speaking nations (Hempenstall, 1997, 2003; Pressley, Allington, Wharton-McDonald, Collins Block, & Mandel Morrow, 2001). Prior to the 1970s, these viewpoints focussed on options within reading-accuracy instruction (Chall, 1967), notably on whether reading-accuracy instruction should be based on teaching recoding using letter-sounds (phonics) or teaching words as logographic whole words (look-say). In the 1970s, with the advent of Whole Language philosophy, the focus of controversy shifted from options within reading-accuracy instruction to reading accuracy itself now being an option (Gollasch, 1982). Whereas whole-word instruction had de-emphasised phonics (word parts), it nonetheless emphasised reading accuracy of whole words, as well as systematic reading-accuracy instruction using decontextualised words, and assessment and monitoring of reading-accuracy development. In contrast, Whole Language de-emphasised reading accuracy itself, considering it a marginal aspect of reading development. It also deemed reading-accuracy instruction using decontextualised words, reading-accuracy tests, and experimental research to be invalid and inappropriate practices (Hempenstall, 1996, 2003;

Pressley et al., 2001). Two distinct reading-instruction paradigms can be identified: a skills-emphasis paradigm based on experimental-research findings, and a meaning-emphasis paradigm based on Whole Language philosophy, as shown in Table 2.1.

Table 2.1. The contrasting premises of Skills- and Meaning-emphasis paradigms

Student focus. Skills-emphasis: usually at-risk & delayed readers. Meaning-emphasis: usually healthy-progress readers.
English orthography. Skills-emphasis: makes RA instruction essential. Meaning-emphasis: makes RA instruction inappropriate.
RA in reading development. Skills-emphasis: a vital core skill. Meaning-emphasis: a by-product of reading development.
RA as a natural versus learned skill. Skills-emphasis: RA is a secondary code (speech is primary), so requires careful instruction. Meaning-emphasis: RA develops naturally, similar to speech, thus does not need explicit teaching.
Reading instruction emphases. Skills-emphasis: emphasis on explicit teaching of skills, and reading of meaningful texts. Meaning-emphasis: emphasis on constructivism, reading of meaningful texts, and language scaffolding.
Model of RA development & achievement. Skills-emphasis: RC = RA x LC (Gough & Tunmer, 1986): reading builds from skill in all three. Meaning-emphasis: RC = LC (Gollasch, 1982): reading builds from meaningful reading.
LC scaffolding of reading. Skills-emphasis: as needed, not always, often little. Meaning-emphasis: high prior to and during reading.
Separate consideration of RA. Skills-emphasis: very important. Meaning-emphasis: minimal separate consideration.
Use of decontextualised words. Skills-emphasis: very important. Meaning-emphasis: unimportant, potentially damaging.
Instructional emphases. Skills-emphasis: RA, RC & LC, using meaningful and decontextualised texts. Meaning-emphasis: comprehension, through reading of meaningful texts.
Primary strategy for reading unfamiliar words. Skills-emphasis: RA strategies primary, language strategies as backup. Meaning-emphasis: language strategies primary, RA strategies (often 1st letter) as backup.
Importance of assessment. Skills-emphasis: very important. Meaning-emphasis: very important.
Focus of RA assessment. Skills-emphasis: standardised tests of RA of decontextualised and meaningful texts. Meaning-emphasis: testing of authentic reading using Running Records and Miscue Analyses.
Standardised tests. Skills-emphasis: very important. Meaning-emphasis: rejected as inappropriate and invalid.
Tests of decontextualised words. Skills-emphasis: very important. Meaning-emphasis: rejected as inappropriate and invalid.
Importance of experimental research. Skills-emphasis: very important, highly valid. Meaning-emphasis: rejected as inappropriate and invalid.
Importance of applied research replicating experimental findings. Skills-emphasis: very important. Meaning-emphasis: rejected as inappropriate if on reading subskills or decontextualised words.

[RA: reading accuracy; RC: reading comprehension; LC: language comprehension]

It can be seen from Table 2.1 that skills-emphasis and meaning-emphasis paradigms hold alternative viewpoints on a number of aspects of reading-accuracy development, instruction and assessment. Controversy on reading instruction and reading accuracy due to paradigmatic differences is referred to in the literature as the Reading Wars (EQ, 2000; Louden et al., 2006). Meaning-emphasis reading-instruction practices were accepted into mainstream reading instruction in English-speaking nations in the 1970s (Hempenstall, 1997, 2003). This was despite research studies establishing key meaning-emphasis principles of reading-accuracy development and instruction as unfounded and inappropriate (for example, Baker et al., 2002; Hempenstall, 1996, 1997, 2003). Moats (2000, p.8) comments of this paradigm shift that "between 1975 and 1995, an entire field rushed to embrace a set of unfounded ideas and practices without any evidence that children would learn to read better, earlier". Australian reading instruction has used practices built from Whole Language philosophy from the 1970s to the current time (EQ, 2002; Galletly, 2002; Queensland Studies Authority, 2002). Whole Language reading practices have remained relatively dominant until very recently, when federal governments in the UK and USA have turned to skills-emphasis academics as advisors. This has resulted in a new paradigm shift, with reading-accuracy instruction and assessment now reintroduced through prescribed curriculum (UK DfES, 2006) and federal mandating (US Government, 2004). The latter has generated new forms of criticism of reading-accuracy assessment and instruction (US Office of Inspector General (OIG), 2007; Paris, 2005a, 2005b). The release of the report of Australia's National Inquiry into the Teaching of Literacy (NITL; Department of Education, Science & Training (DEST)) in December 2005 intimates that changes to Australian instruction are in the offing. Current reading-instruction practices include (Baker et al., 2002; Hempenstall, 1996, 1997, 2003):
• Not considering reading accuracy separately from reading comprehension in reading instruction and assessment (Gollasch, 1982; Smith, 1976, 1994).
• Advocating standardised tests of reading as invalid and inappropriate.
• Advocating analytic phonics within reading of meaningful texts as the only appropriate form of phonics instruction, and rejecting synthetic phonics with decontextualised words as inappropriate (Gollasch, 1982; Smith, 1976, 1994). These two forms of phonics instruction differ in the role of the student and in active recoding. Analytic phonics involves consideration of the orthographic units present in a word, usually in teacher discussion with a group of readers. Synthetic phonics involves children synthesising the list of letters in an unfamiliar word into the word's spoken form using phonemic or phonological recoding (Ehri, Stahl, & Willows, 2001).
• Advocating that authentic reading of meaningful text must always be conducted with language scaffolding, including language discussions prior to and during reading of highly predictable text (Clay, 1993; Queensland Department of Education, 1991).
Four issues of paradigmatic difference considered in this research relate to:
• Readers of different progress rates: whether practices used with at-risk and delayed readers are valid practices for healthy-progress readers.
• The validity of reading accuracy: this is viewed in this research as an important construct of reading development, and of reading-accuracy tests.
• Reading accuracy's separateness from reading comprehension.
• The role of language supports in children's reading: this concerns the appropriateness of reading-accuracy instruction and tests using reading of decontextualised words and word-parts.
From the foregoing, it is considered that there is a need to promote research and research knowledge on these four issues, and to promote research which rigorously establishes principles of best-practice reading instruction and reading-accuracy instruction for readers at all levels of achievement. Best-practice instruction is defined for this dissertation as optimally differentiated classroom reading and reading-accuracy instruction, such that students at different levels of reading-accuracy development and success are working at an appropriate level of challenge, with instructional intensity focussed precisely on their specific instructional needs. The focus of this dissertation is not on specific types of reading-accuracy instruction; it is instead on reading-accuracy tests as strategic supports of reading-accuracy instruction. Reading-accuracy tests are considered to be in themselves instruction-neutral: they do not specify precise forms of instruction to be used (Kaminski & Cummings, 2007; Kaminski et al., 2006; Kaminski et al., 2007).

2.2.2 Premise 2: Reading-accuracy tests can support the development of reading instruction

Premise 2: The use of effective reading-accuracy tests can assist reading-accuracy instruction and increase student reading-accuracy achievement.

This section addresses validity issues associated with the use of reading-accuracy tests, and explores the purposes of reading-accuracy tests. It discusses rapid-use reading-accuracy tests as tests able to achieve these purposes. Finally, it discusses the use of test data in achieving increased effectiveness of instruction.

2.2.2.1 Validity of reading-accuracy tests

The focus of this research is on reading-accuracy tests, hence issues of the validity of such tests require consideration. As outlined above, the use of reading-accuracy tests has been the subject of considerable debate. Paris (2005a, 2005b) considers constrained skills to be those which reach ceiling level in the early years of schooling, and unconstrained skills to be those which continue to develop across all school years. He considers that reading-accuracy skills are constrained and therefore questions the use of reading-accuracy tests, proposing that:
• It is inappropriate to test reading-accuracy skills because they are constrained.
• Parametric statistics are not appropriate for reading-accuracy test-data because of floor effects (cases not able to achieve on the test, thus having lowest-possible scores) and ceiling effects (cases with mastery of the tested skill, thus having highest-possible scores), and the non-normal distributions they create.
• Reading-accuracy tests involving efficiency are additionally invalid because high scores reflect the influence of readers' working-memory capacity, rather than reading-accuracy ability.
Paris' concerns that floor and ceiling effects can strongly influence results and findings seem to be those dealt with in usual statistical practice (Cohen & Lea, 2004; Huck, 2004). His emphasis on reading-accuracy skills being constrained through being of brief duration can be challenged, given considerable research evidence that most English-text readers take many years to develop ceiling-level reading accuracy (Hanley, Masterson, Spencer, & Evans, 2004; Jackson & Coltheart, 2001). In addition, the developmental period for many reading-accuracy skills is of considerable duration (Neale, 1999). As such, they can be considered unconstrained during development. On this basis, it is argued that reading-accuracy tests have validity for use in assessing reading-accuracy development.
At the current time, there are perceptions that DIBELS has been inappropriately emphasised as a test-system of choice (Paris, 2005a, 2005b; US Office of Inspector General (OIG), 2007). Paris' writings focus specifically on DIBELS tests, and a website called (NOT the official) DIBELS clearinghouse (Vermont Society for the Study of Education, 2006) seems intent on criticising DIBELS and Reading First testing. As mentioned above, in this research it is considered that reading-accuracy tests are themselves instruction-neutral. As such, the focus of this research is on the tests' characteristics and the usefulness of these characteristics in assessing reading-accuracy skills.

2.2.2.2 The purposes of reading-accuracy tests

Reading-accuracy tests are used to achieve multiple purposes of assessment (Coyne & Harn, 2006; Kame'enui, 2002; Westwood, 2001b). These purposes are twofold: those focussed on individual children, and systemic purposes beyond the level of individual children. Coyne and Harn (2006) and Kame'enui (2002) detail these purposes as follows:
School purposes:
• Screening student achievement at single test-points or spaced intervals, e.g., annually.
• Monitoring progress across reading-accuracy development, through assessing each student at multiple test-points, often quite close together, usually using alternate forms of the same test.
• Providing diagnostic and achievement data for teachers to use in:
  o Forming groups for reading instruction.
  o Planning specific aspects of the next steps of instruction.
  o Evaluating students' response to those specific forms of instruction.
Systemic purposes:
• Monitoring reading-accuracy achievement at different levels, including class, school, district, state, and nation.
• Evaluation of instructional effectiveness at these different levels.
• Allocation of resources according to student-achievement levels.

2.2.2.3 The effectiveness of rapid-use reading-accuracy tests

In contrast to tests of other literacy skills, reading-accuracy tests require adults to work one-to-one with individual children. The recent availability of a new test genre, rigorous rapid-use reading-accuracy tests (Deno, 1992; Fuchs & Deno, 1991, 1994), offers potential for testing early-years children as a useful part of reading instruction. These tests are timed and are of brief duration (45 to 60 seconds). They focus both on sampling skill with the content used in the test, and on measuring student efficiency with the skill. Processing student test results takes minimal time, with results usually expressed as items per minute, e.g., letters per minute, words per minute. Chapter 3 explores the two sets of rapid-use reading-accuracy tests that are used in this research. Westwood (2001b) suggests a crux for test-data's usefulness in guiding instruction, namely how well the test-data answers the question "What does the student need to be taught next in order to make progress?" (Westwood, 2001b, p.4). Curriculum-Based Measurement (CBM) frameworks, as used with DIBELS reading-accuracy tests, emphasise qualitative diagnostic data as the data most useful for this purpose:

For instructional utility... test samples must allow the teacher to quantify a child's overall proficiency, but also to describe the quality of the child's performance. When measurement allows for both quantitative and qualitative descriptions of performance, we can rely on the quantitative assessment to inform decisions about when instructional adjustments are necessary. We can use the qualitative description to determine how to adjust that program, that is, to generate useful hypotheses about strategies that might enhance student outcomes (p.20).
… Teachers design more successful instructional programs for children when, in addition to quantitative indicators of student proficiency, they also have qualitative descriptions of student patterns of performance. Qualitative descriptions allow teachers to formulate more specific plans about how to improve their instructional routines (p.23). (Fuchs & Deno, 1994, pp.20, 23)

A range of rapid-use reading-accuracy tests are available, and have been established as reliable, valid, and useful for efficiently achieving the many purposes for which reading-accuracy tests are used (Compton, Fuchs, Fuchs, & Bryant, 2006; Good & Kaminski, 2002; Hosp & Fuchs, 2005; Torgesen et al., 1999; Wheldall & Madelaine, 2000). Stecker, Fuchs, and Fuchs (2005) review the findings of 30 years of research on the use of CBM as an assessment methodology for enhancing student achievement, and conclude that:
• Rapid-use reading-accuracy tests are established as useful in improving instruction and increasing student achievement when effectively differentiated instruction is achieved.
• The most optimal instruction for a student is instruction built from that student's personal test-data, rather than class data or data from children with seemingly similar needs.
• Effective decision-making is conditional on:
  o Test-data including useful qualitative diagnostic data and achievement data.
  o Test-reports including instructional recommendations and skills-analyses.
  o Teachers receiving guidance from experts in instructional decision-making when using reading-accuracy test-data for instructional decision-making.
From this discussion, it can be concluded that:
• Rigorous rapid-use reading-accuracy tests have potential:
  o To achieve the many purposes for which reading-accuracy test-data is used.
  o To support improved reading-accuracy achievement, through improved instructional decision-making in response to test-data.
• Best-practice reading-accuracy instruction can be matched to students' specific instructional needs.
• Reading-accuracy tests can support instructional decision-making through providing:
  o Rigorous achievement data for teacher and systemic use.
  o Highly specific qualitative data on the content and processes of the next steps of instruction needed for each student.
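The "items per minute" scoring convention used by rapid-use tests can be sketched in a few lines. This is an illustrative sketch only: the function name and the example numbers are assumptions for illustration, not the published scoring rules of DIBELS or TOWRE.

```python
def items_per_minute(items_correct: int, probe_seconds: int) -> float:
    """Pro-rate the raw score of a timed probe to a per-minute rate."""
    return items_correct * 60.0 / probe_seconds

# e.g. a hypothetical child reads 38 words correctly in a 45-second probe:
wcpm = items_per_minute(items_correct=38, probe_seconds=45)
print(round(wcpm, 1))  # 50.7 words correct per minute
```

The pro-rating step is what makes probes of different durations (45 versus 60 seconds) comparable on a single per-minute scale.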

2.2.3 Premise 3: Reading accuracy is separate to reading comprehension

Premise 3: For reading-instruction decision-making, reading accuracy needs to be considered separately to reading comprehension.

As discussed above, the relationship of reading accuracy and reading comprehension has been a source of continuing debate for many decades (Hempenstall, 1996, 1997, 2003). It is established in the research literature that reading accuracy, language comprehension and reading comprehension are intricately related in the reading process (Catts & Hogan, 2003; Gough & Tunmer, 1986; Hoover & Gough, 1990; Leach, Scarborough, & Rescorla, 2003). The relationship between reading accuracy and reading comprehension can be expressed numerically (Gough & Tunmer, 1986; Hoover & Gough, 1990):

Reading comprehension = Reading accuracy x Language comprehension

This model was established as the Simple View of Reading model (Gough & Tunmer, 1986), and is herein termed the Component Model (Joshi & Aaron, 2000). The model emphasises reading accuracy and language comprehension as separate contributors to reading comprehension. It places language as common to, and the basis of, all communication, regardless of the mode of communication, be it print, speech, drama or art. The transfer of meaning (language) is the common factor of communication in these modes. Communication in these different modes is differentiated through the accuracy skills each mode uses to access and express language. Reading accuracy, converting printed words to their spoken or semantic forms, is the code which must be mastered if effective print reading comprehension is to be achieved (Gough & Tunmer, 1986; Nation, 1999). Consideration of children's separate abilities in reading accuracy, language comprehension and reading comprehension builds understanding of children's specific instructional needs, which may be for reading accuracy, language comprehension, reading-comprehension strategies, or combinations thereof.
In the Component Model presented in Figure 2.1, language comprehension and reading accuracy are considered as continua from nonexistent (0) to perfection (1).

[Figure 2.1 is a quadrant diagram with Word Recognition (0 to 1) on one axis and Comprehension (0 to 1) on the other. The quadrants are labelled: Dyslexic Readers (poor reading accuracy), Normal and Precocious Readers, Generally Poor Readers (poor comprehension), and Hyperlexic Readers.]
Figure 2.1 Component Model (after Gough & Tunmer, 1986; Nation, 1999)

The continuum nature of the Component Model implies progress from no skill (0) through to highly efficient levels of effortless processing without conscious thought, i.e., automaticity (1). This is the case for both language comprehension and reading accuracy. When children's levels of language comprehension, reading comprehension and reading accuracy are separately assessed, broad populations of readers show scores scattered throughout the four quadrants of this figure (Catts & Hogan, 2003; Catts, Hogan, & Fey, 2003; Gough & Tunmer, 1986; Hoover & Gough, 1990; Nation, 1999). Using the Component Model, Gough and Tunmer (1986) proposed three categories of reading difficulty, resulting from weakness present in reading accuracy, language comprehension, or both. Each of these categories occupies one quadrant of Figure 2.1 and has its own specific instructional needs, distinctly different to those of the other quadrants. The value of separate consideration and assessment of the three components of the Component Model is evidenced in Leach et al.'s (2003) study of USA children with weak reading comprehension. They used separate tests of reading accuracy, language comprehension, and reading comprehension to establish students' achievement levels in each area. Next, they compared the prevalence of weakness in reading accuracy and language comprehension in early-identified (1st-2nd grade) and late-identified (4th-5th grade) weak reading-comprehenders. A summary of their findings is shown in Table 2.2.

Table 2.2 Distributions of early-identified and late-identified weak readers

                   Reading accuracy   Language comprehension   Reading accuracy & Language comprehension
Early-identified   49%                6%                       46%
Late-identified    35%                32%                      32%

The high levels of reading-accuracy instructional needs in children with weak reading comprehension are clearly evident in this table, with 95% (49% + 46%) of early-identified and 67% (35% + 32%) of late-identified weak reading-comprehenders having reading-accuracy weakness.
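The prevalence arithmetic above can be reproduced directly from Table 2.2. In this small sketch, the dictionary keys are my own labels for the table's rows and columns, not Leach et al.'s terminology:

```python
# Percentages of weak reading-comprehenders from Table 2.2 (Leach et al., 2003).
table_2_2 = {
    "early": {"reading_accuracy": 49, "language_comprehension": 6, "both": 46},
    "late":  {"reading_accuracy": 35, "language_comprehension": 32, "both": 32},
}

def any_ra_weakness(row: dict) -> int:
    # Reading-accuracy weakness appears in both the RA column and the mixed column.
    return row["reading_accuracy"] + row["both"]

print(any_ra_weakness(table_2_2["early"]))  # 95
print(any_ra_weakness(table_2_2["late"]))   # 67
```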

While there are no Australian studies separately assessing all three components of the Component Model, it is possible that Australian prevalence rates are similar to these American figures. This would suggest that the reading-accuracy instructional needs of a large proportion of weak Australian readers may not have been identified or addressed. From this discussion, it can be concluded that:
• Reading accuracy needs to be considered separately to reading comprehension in Australian reading instruction, assessment and curricula.
• Efficient tests of reading comprehension, language comprehension and reading accuracy are needed for elucidation of students' specific instructional needs.
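The quadrant logic of the Component Model discussed in this section can also be sketched in code. The 0.5 cut-point separating "poor" from "adequate" is an illustrative assumption only; the model itself treats both skills as continua from 0 to 1, and any operational cut-point would need to be set against test norms.

```python
def classify_reader(reading_accuracy: float, language_comprehension: float) -> str:
    """Place a reader in one quadrant of the Component Model (cf. Figure 2.1)."""
    weak_ra = reading_accuracy < 0.5       # illustrative cut-point only
    weak_lc = language_comprehension < 0.5
    if weak_ra and weak_lc:
        return "generally poor reader"
    if weak_ra:
        return "dyslexic profile (poor RA, adequate LC)"
    if weak_lc:
        return "hyperlexic profile (adequate RA, poor LC)"
    return "normal/precocious reader"

print(classify_reader(0.3, 0.9))  # dyslexic profile (poor RA, adequate LC)
print(classify_reader(0.9, 0.3))  # hyperlexic profile (adequate RA, poor LC)
```

Separate assessment of the two components is what makes this classification possible; a single reading-comprehension score cannot distinguish the quadrants.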

2.2.4 Premise 4: Australian needs for reading-accuracy tests
Premise 4: In Australia, there is a need for rapid-use reading-accuracy tests for use in schools and research.

As discussed above, Australian reading instruction in schools for the past three decades has focussed on reading comprehension, with limited consideration of reading accuracy as separate to reading comprehension.

2.2.4.1 Australian reading-accuracy achievement data

This section critiques available Australian academic literature on reading-accuracy achievement levels. Limited statistical data has been collected on Australian reading and reading-accuracy achievement levels in recent decades. The only widespread assessment of reading accuracy in recent years was the renorming of the Neale Analysis of Reading Ability (Neale, 1999; de Lemos, 2001), which used just the Neale test and did not include any other reading-accuracy measures for external comparison. In her analysis of the limited available national, state and Neale data from 1993 to 2000, de Lemos (2001) discussed the dearth of reading-accuracy data available, and the need for strategic gathering of data on reading-accuracy achievement.

With regard to state and national achievement levels of primary school, secondary school and adult readers, only reading-comprehension data is available, such that only the interaction of reading accuracy with language-comprehension achievement can be considered. Reports such as Literate Futures (EQ, 2000) and Mapping Literacy Achievement (Masters & Forster, 1996) discuss rates of reading weakness as including 25% to 33% of non-indigenous children, over 40% of children speaking English as a second language, and over 60% of indigenous children. As discussed above, there are indicators that reading-accuracy difficulties underlie reading-comprehension difficulties in approximately 90% of early-years readers and two thirds of readers in middle primary school (Leach et al., 2003). This would suggest that weak reading-accuracy skills are a major factor for many of the Australian children discussed above.

As regards data on the predictiveness of low reading-accuracy achievement, and the effectiveness of instruction for low-progress readers, one study was located (Waring, Prior, et al., 1996).
It reported that 72% of Australian children identified in Year 2 as having significant reading weakness were still not reading in the normal range when assessed in Year 6. Indications of widespread reading difficulties are also seen in an Australian study of referrals for paediatric


consultations at tertiary referral hospitals (K.J. Rowe & Rowe, 2002; K.S. Rowe & Rowe, 2004), which found that 20% of referrals were for learning difficulties, and 50% were for behaviour and attention difficulties. It is possible that many of these latter children were also experiencing learning difficulties.

As regards secondary school reading levels, no studies of reading-accuracy achievement were found. Australia’s PISA reading results from testing of 15-year-old children (PISA, 2002a, 2002b, 2004) in 2000 and 2003 show a consistent pattern of high proportions of very good readers, and approximately 30% of readers having weak reading skills.

Consideration of the primary and secondary school data, above, indicates that:
• There is minimal reported data on Australian reading-accuracy achievement levels.
• Reading-comprehension data indicates that 25-33% of early-years readers, and approximately 30% of 15-year-old readers, have considerable reading difficulties.
• Using Leach et al.’s (2003) findings, discussed above, it is possible to infer that significant reading-accuracy weakness may be present in over 90% of the early-years weak readers, and approximately two thirds of the teenage weak readers.

It follows that there is a need in Australia to give increased importance to the assessment of reading accuracy.

2.2.5 Premise 5: Knowledge concerning reading-accuracy development and instruction needs to become more widespread
Premise 5: Knowledge on reading-accuracy development and instruction is needed in the Australian context.

Amongst skills-emphasis advocates, there has been consensus for many decades on important precepts of reading-accuracy development, developed from research studies (e.g., Adams, 1990; Chall, 1967; Chard et al., 1998; NRP, 2000; NRC, 1998). These precepts are as follows:
• Children build from print awareness, letter-sound knowledge and phonological awareness to develop understanding and skill with the alphabetic principle, i.e., that the sounds of speech map onto letters and letter groups (Byrne, 1998; Juel, 1991; Liberman, Shankweiler, & Liberman, 1989). Knowledge of the alphabetic principle enables a child to develop the efficient word recognition and reading fluency needed for effective reading comprehension.
• Children who make healthy progress in mastering early reading skills find reading relatively effortless, and enjoy reading. As a consequence, they read more, and continue to improve their reading skills (Cunningham & Stanovich, 1997; Nation & Angell, 2006; Nicholson & Tan, 1999; Stanovich, 1986).
• Children at risk of reading weakness have difficulty mastering the alphabetic principle. Acquisition of fluent, context-free word identification skills has been shown to be the major stumbling block of reading-accuracy development (Byrne, Fielding-Barnsley, & Ashley, 2000; Chard et al., 1998). Poor readers usually fall further and further behind in reading. This negatively affects their academic achievement in other areas, as well as their self-esteem and motivation to learn (Stanovich, 1986).
• Effective reading-accuracy instruction supports children to build expanding levels of reading-accuracy competence, which in turn supports development of independent reading and reading-comprehension skills (Ehri, Stahl, et al., 2001; Rayner, Foorman,
Perfetti, Pesetsky, & Seidenberg, 2001).
• Children’s primary strategy for working out unfamiliar words should be reading accuracy, i.e., either recoding the whole written word, or decoding the written word into its parts, such as syllables and sounds, then recoding those parts, blending them together to form the spoken word (Share, 1995; Torgesen, 2002).
• Reading instruction should include extensive reading of meaningful texts and literature to children and by children, in addition to reading-accuracy instruction (NRC, 1998; Pressley, 1998).

The above precepts have been established largely through experimental research studies, i.e., experimental research in controlled circumstances, usually focussed on at-risk and delayed readers. Many of the precepts have not been tested in applied settings (schools and classrooms) and with different groups of achievers, as shown in Table 2.3 (Louden et al., 2006; NRP, 2000; Richardson, 2000; Swanson, Hoskyn, & Lee, 1999). Table 2.3 contrasts research conditions and the rigour of the knowledge base in experimental versus applied research.

Table 2.3 Differences in experimental and applied reading-accuracy research

Research conditions

Experimental research: Experimental conditions are often close to optimal, with:
● School and classroom effects controlled for, so that student and instruction factors contribute strongly to variance in student results.
● Favourable adult:child ratios.
● Favourable group structures, e.g. mixed or same ability.
● Training in specific aspects of instruction.
● Attention focussed on delivery of instruction by the person conducting the instruction.
● Adequate time allocation for length and number of sessions, planning, and data analysis.
● Funding for preferred materials.
● Professional support.

Applied research in school conditions: Usual school conditions are less than optimal, with:
● Strong impact on student results from school and classroom effects, which contribute 50-75% of variance (Louden et al., 2006).
● Less favourable adult:child ratios and group structures.
● Less training in specific aspects of instruction.
● Attention divided during delivery of instruction (due to needs of groups, behaviour, interruptions).
● Less time allocation for length and number of sessions, planning, and data analysis.
● Less funding for preferred materials, and less professional support.

Rigour of knowledge base

Experimental research: Indicators of optimal reading-accuracy instruction are general, not fine-grained, principles, which are:
● Focussed on decontextualised phonics and phonological awareness instruction.
● Focussed negligibly on factors such as embedded phonics instruction, and time spent reading books.
● Established and rigorous for very young and delayed readers.
● Indicating the likelihood that level of success in reading accuracy in the early years predicts level of success in later academic achievement.
● Useful indicators, not yet rigorously established, for older readers and younger healthy-progress readers.
● Highly worthy of replication in school settings.
● Requiring replication in school settings before principles of effective, efficient classroom reading-accuracy instruction, applicable to children with different needs (e.g., different progress rates, achievement levels, year-levels), can be authoritatively stated.

Applied research in school conditions: Little rigour is added, due to:
● Limited reading-accuracy achievement data being available for consideration.
● Little encouragement for replication of experimental studies, due to rejection by schools of reading-accuracy tests, decontextualised reading-accuracy instruction, and postpositivist research.
Applied research studies in school settings include:
● Studies of multi-factor programs of reading instruction (Slavin, 1999), with limited attention to single factors within programs.
● A very small number of studies focussed somewhat on developing specific principles of instruction, e.g. Juel (1988), Seymour & Elder (1986), Sowden & Stevenson (1994), with findings worthy of, and requiring, replication.
● Very limited knowledge gained on principles of effective, efficient classroom reading-accuracy instruction, applicable to readers of different progress rates, achievement levels, and year-levels.


It can be seen from Table 2.3 that there are major differences between experimental research conditions and usual school conditions. Replication in applied settings is thus important, as is replication with different groups of achievers, e.g., low, average and high achievers in different year-levels.

It can also be seen from Table 2.3 that there are minimal applied research findings on reading-accuracy development and instruction. Despite this lack of established principles of reading-accuracy development and instruction, there is nonetheless a widespread sentiment that there is a rigorous knowledge base on principles of reading-accuracy instruction useful in school conditions with readers of different achievement levels and in different year-levels. This sentiment is evidenced internationally (NRC, 1998; NRP, 2000; US Congress, 2002), and in Australia:

There is a solid body of scientific knowledge about how children learn to read, what they should be taught in the course of early reading instruction, different ways in which children find learning to read difficult, and effective methods for helping such children. (Coltheart & Prior, 2007, p.5)

A review of the literature revealed research gaps regarding aspects of reading-accuracy development and instruction which are widely assumed to be established. A myriad of studies were found focussed on development of phonological awareness and the very beginning stages of reading, especially Kindergarten phonological awareness and very early phonemic recoding (Chard et al., 1998; Ehri, Nunes, et al., 2001; Ehri, Stahl, et al., 2001; NRP, 2000; NRC, 1998). Very few studies were found focussed on building understanding of the reading-accuracy development of children beyond this very beginning stage (e.g., Duncan & Seymour, 2003; Wheldall & Beaman, 1999). No studies were found on the reading-accuracy development of healthy-progress readers, or on specific aspects of reading-accuracy development in school conditions, e.g.
development of skill with words with common vowel digraphs, or aspects of efficient transfer of learning from decontextualised instruction to reading of meaningful text. In addition, no studies were found on instructional factors used widely in schools, such as time spent reading books of manageable difficulty, and analytic phonics instruction embedded in reading of books, particularly with healthy-progress readers.

From this discussion, it can be concluded that at the current time, the research knowledge base needs to be extended to consider:
• Resolving issues underlying current paradigmatic differences, including the role of reading accuracy in literacy development, and whether healthy-progress readers have the same instructional requirements as low-progress readers, as discussed above.
• The adoption of principles of reading-accuracy instruction built from research knowledge in the Australian context, for use in classroom settings with different levels of reading achievers.
• A number of research issues concerning reading accuracy and reading-accuracy development.

2.2.6 Premise 6: Australian teachers need well-specified principles of best-practice reading instruction
Premise 6: Australia may need more highly specified principles of reading-accuracy instruction than are needed in the USA.


Marked cultural differences in development of classroom curriculum exist between Australia (Australian Government, 1992; EQ, 2002; van Kraayenoord & Paris, 1994) and the USA (Maryland State Department of Education, 2006; Simmons & Kame'enui, 2003). American classroom reading instruction relies heavily on commercial instruction programs (Hiebert & Taylor, 2000; Simmons & Kame'enui, 2003; US Government, 2004). At the current time, USA reading programs are required to address all five NRP Big Ideas (phonological awareness, phonics, fluency, vocabulary, reading comprehension). While there is criticism of current criteria for designating programs as effective, almost all criticisms relate to options within commercial programs, rather than to the use of commercial programs per se (US OIG, 2007).

In contrast, commercial programs are only one of many options used as the basis of classroom instruction in Australia, where the emphasis is instead on schools and teachers developing their own curricula (Australian Government, 1992; EQ, 2002). Australian curriculum development for classroom instruction at the primary school level generally involves the following stages:
• The syllabi and curricula suggest parameters and important aspects of instruction.
• Schools and teachers build from these documents, drawing on non-prescribed professional resources of their own choosing, to develop their school and year-level aims and emphases.
• Individual teachers or groups of teachers use these year-level aims and emphases, similarly drawing on diverse professional resources, to develop their own lesson sequences, matched to their children’s needs.

Most research literature on reading-accuracy development and instruction is American, and the focus of applied research on classroom reading-accuracy instruction in the USA is on the efficacy of commercial reading programs (Simmons & Kame'enui, 2003).
With Australian teachers developing their own lesson and curriculum sequences, there is a need for rigorous instructional supports built from Australian-based research. From the foregoing, it is concluded that:
• Australian reading-accuracy research needs are different to those of the USA.
• Given that Australian teachers need principles of reading-accuracy instruction, there is a need for Australia to conduct its own research, developing knowledge on principles of best-practice reading-accuracy instruction for readers of different achievement levels and progress rates.

2.2.7 Premise 7: There is currently potential for developing best-practice reading instruction
Premise 7: It is an appropriate time to develop best-practice reading-accuracy instruction for Australian schools.

The release of the report of Australia’s National Inquiry into the Teaching of Literacy (NITL; DEST, 2005a, 2005b) has marked the beginning of a cusp time in Australian reading instruction. The NITL has made recommendations that call for major state reforms, to ensure that research-based reading-accuracy instruction and assessment are implemented by all states. At the current time, the specific requirements and methods of actioning each recommendation are not yet known. This research is closely aligned with three of the twenty NITL recommendations in particular, namely Recommendations 1, 2, and 9. These are detailed in Figure 2.2.


Recommendation 1: That teachers be equipped with teaching strategies based on findings from rigorous, evidence-based research that are shown to be effective in enhancing the literacy development of all children.

Recommendation 2: That teachers provide systematic, direct and explicit phonics instruction so that children master the essential alphabetic code-breaking skills required for foundational reading proficiency. Equally, that teachers provide an integrated approach to reading that supports the development of oral language, vocabulary, grammar, reading fluency, comprehension and the literacies of new technologies. (DEST, 2005a, p.14)

Recommendation 9: That the teaching of literacy throughout schooling be informed by comprehensive, diagnostic, and developmentally appropriate assessments of every child, mapped on common scales. Further, it is recommended that
● Nationally consistent assessments on-entry to school be undertaken for every child, including regular monitoring of decoding skills and word reading accuracy using objective testing of specific skills, and that these link to future assessments.
● Education authorities and schools be responsible for the measurement of individual progress in literacy by regularly monitoring the development of each child and reporting progress twice each year for the first three years of schooling.
● The Years 3, 5, 7 and 9 national literacy testing program be refocused to make available diagnostic information on individual student performance, to assist teachers to plan the most effective teaching strategies. (DEST, 2005a, p.18)

Figure 2.2 NITL Recommendations 1, 2, and 9

It can be seen from Figure 2.2 that the first two of these recommendations are focussed on reading-accuracy instruction, while the third recommendation contains multiple recommendations on reading assessment. Emphases within the first two recommendations are in accordance with premises established above. These emphases include:
• Reading-accuracy instruction being built from rigorous research knowledge.
• Reading-accuracy instruction being considered with regard to the instructional needs of readers at different achievement levels and year-levels.
• The need for instruction focussed separately on reading-accuracy instruction, in addition to an integrated approach to reading.

The emphases in Recommendation 9 are in accordance with aspects established earlier in this chapter, including:
• Assessment of specific reading-accuracy skills.
• Use of reading-accuracy tests to monitor reading-accuracy development across the early years of primary school.
• The important role of diagnostic data in supporting teachers’ instructional decision-making.
• Restructuring of the Year 3 Test in order to provide data to teachers for instructional decision-making.

Rapid-use reading-accuracy tests have potential to play a useful role in realising these recommendations. Further potential for improving Australian reading-accuracy instruction is seen in several recent Australian assessment and instruction initiatives, each of which appropriately considers reading accuracy both as a separate entity to reading comprehension, and as integrated within reading comprehension:
• The recent second edition of First Steps (EDWA, 2004).
• The LLANS (Longitudinal Literacy and Numeracy Study) literacy assessment (Louden et al.,


2006; Meiers et al., 2006) for monitoring early-years literacy development.
• The Classroom Literacy Observation Schedule (CLOS; Louden et al., 2006).

Rapid-use reading-accuracy tests of different reading-accuracy skills have potential for use in conjunction with each of these initiatives.

2.2.8 Premise 8: The importance of orthographic knowledge
Premise 8: English orthographic complexity makes reading-accuracy development complex, through its involving development of extensive orthographic knowledge.

A nation’s orthography is its conventional spelling system. Alphabetic orthographies offer orthographic knowledge as a catalyst for reading-accuracy development (Chard et al., 1998; Juel, 1991). Many European nations have simple, transparent orthographies, with few spelling rules beyond each letter having one sound (Galletly & Knight, 2004; Seymour et al., 2003). Reading accuracy is mastered in Grade 1 by almost all children in these nations, weak readers master reading accuracy at ceiling level with additional support, and reading-accuracy difficulties of a continuing nature are virtually unknown (Aro, 2004; Aro & Wimmer, 2003; Seymour et al., 2003). Many Asian nations use fully-transparent transitional orthographies, which are learned prior to, and ease the transition to, the standard orthography. There is evidence that this builds phonemic awareness to ceiling levels, and has been instrumental in increasing literacy levels (Huang & Hanley, 1997).

Reading-accuracy development in nations with transparent orthographies is fully contained within this research’s phase of phonemic recoding, as shown in Figure 1.1 in Chapter 1 (the Reading-Accuracy Development model, p.1: 6). In contrast, phonemic recoding is only a beginning phase of English-text reading-accuracy development. The largest part of English-text reading-accuracy development takes place in the phase of phonological recoding, and reading-accuracy development is still in progress in later primary school (Hanley et al., 2004; Jackson & Coltheart, 2001).

It can be seen, from the foregoing, that the impact of English orthographic complexity is major.
It is evidenced in:
• Healthy-progress readers taking many years to reach proficient reading accuracy (Hanley et al., 2004; Seymour et al., 2003).
• Reading-accuracy intervention with low-progress readers often being only moderately effective (Vellutino, 2000).
• Many readers having reading-accuracy difficulties as adults (Torgesen, 2000).

English orthographic complexity
The crux of English-text reading-accuracy development is phonological recoding. Within this phase, knowledge of orthographic units, as evidenced in reading accuracy, moves from approximately 26 units, i.e., the commonest letter-sounds, to many hundreds, as indicated in Figure 2.3.

The 45 phonemes of spoken English with common OUPCs


23 consonant phonemes, commonly written using 18 graphs and 4 digraphs, with 219 spellings in total (Dewey, 1971):
[b d f g (for /g/ as in go) h j k l m n p r s t v w y z] [ch sh th(2) ng]

18 vowel phonemes, represented by numerous common and less common vowel graphemes, with 342 spellings in total (Dewey, 1971). Each phoneme is indicated with first its commonest grapheme ([i-e]), then a word with that grapheme for the phoneme (mite), then other common graphemes for that phoneme ([i ie igh y]):
[a] mat; [e] met [ea]; [i] mit; [o] mot; [u] mut; [a-e] mate [a ai ay]; [e-e] mete [e ea ee y]; [i-e] mite [i ie igh y]; [o-e] mote [o oa]; [u-e] mute [u ew ue]; [ar] car [a]; [er] her [ir ur ear]; [or] for [aw au al]; [ow] now [ou]; [oo] foot; [ou] you [oo]; [oi] oil [oy]; schwa (ə) or “neutral” vowel, as in bitter, David

3 less common phonemes of English: air, as in hair (ear); ear, as in year (eer); zh, as in treasure

Orthographic knowledge aspects creating confusion for beginning readers

Confusable consonant graphemes:
● Common graphemes with 2 common phonemes: [g] saying /g/ & /j/; [c] saying /k/ & /s/
● Infrequently used single consonants representing consonant blends: [q] (kw), [x] (ks)
● Single consonants not saying their common sound in digraphs: sh th ch gh ph wh gn kn pn psy tch dge
● Silent consonants, e.g., ghost psyche lamb sick bridge horse gnome knee calm cupboard fetch who
● Consonant [y] used as a vowel, e.g., my party gym
● Consonants [r y w] used as vowel markers, following single vowel letters to form vowel graphemes: [ar er ir or ur ear our ore oor eer ay ey oy uy aw ew ow]

Confusable vowel graphemes (all):
● Many:many correspondences: no vowel grapheme has just one phoneme, e.g. [a] has only one common sound, as in hat, but represents at least 4 phonemes, e.g. /ar a ə a-e/ in Ask Dad again later.
● Schwa (ə) uses many different graphemes, e.g. virus, gracious, began, nation

High ratios of many:one, one:many, and many:many relationships in OUPCs

One phoneme unit: many graphemes/orthographic units:
● 6 graphemes for /sh/, as in shin station mission social chef sure
● 4 graphemes for the phoneme unit /al/: canal little panel special
● 11 graphemes for /or/: for sore pour raw awe taught bought sauce talk roar war

Many phonemes: one grapheme/orthographic unit:
● [p] is used in graphemes for 4 phonemes, as in pin phone psyche pneumatic
● [ough] has 8 phonemes or phoneme units, as in bough cough dough tough through thought thorough hiccough

Many phonemes: many graphemes relationships:
● [c] is used in graphemes for 4 phonemes; each grapheme represents multiple phonemes, and each phoneme is represented by additional graphemes, shown in brackets: cheese hatch / cat kick school (quay) / Cheryl social (nation hession shoe) / rice scent (sent hissing psyche)
● [a] is used in graphemes for 14 phonemes, which are written using many different graphemes and orthographic units, as in at / gate later gauge ray great gaol / any said says dead / sea / aisle aye / was / beau go goal mauve / beauty / ha ask far are half laugh / earn / hear / air (d)are (b)ear / naughty war saw sauce / ago again. (Each of these phonemes and graphemes is also linked to many other graphemes and phonemes; these are not listed, due to being too numerous.)

Figure 2.3 The complex relationships of English orthography

It is seen from Figure 2.3 that English orthographic complexity has many facets. Three facets which particularly impact English-text reading-accuracy development are:
• Orthographic units having various numbers of letters, such that the beginning reader needs to build awareness of the boundaries of each unit, e.g., if not aware that [ough] is one unit, readers may try to read it as o-u-g-h, or ou-gh.
• Many OUPCs being one:many and/or many:one, such that in reading unfamiliar words, there is a need to process multiple options for many OUPCs.
• The very large amount of orthographic knowledge which needs to be built during English-text reading-accuracy development.
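The one:many OUPC relationships just described can be pictured as a lookup from grapheme to candidate phonemes, where each ambiguous unit multiplies the options a reader must weigh when recoding an unfamiliar word. The sketch below is illustrative only: the dictionary contents are drawn loosely from Figure 2.3, and the informal phoneme labels are assumptions, not a standard notation.

```python
# Illustrative grapheme -> candidate-phoneme options, loosely after Figure 2.3.
# The phoneme labels are informal placeholders, not IPA.
oupc_options = {
    "sh":   ["/sh/"],                                # highly consistent
    "a":    ["/a/", "/ar/", "/ae/", "/schwa/"],      # hat, ask, later, again
    "ough": ["/ow/", "/of/", "/oa/", "/uf/",         # bough, cough, dough, tough
             "/oo/", "/aw/", "/schwa/", "/up/"],     # through, thought, thorough, hiccough
}

def candidate_pronunciations(graphemes):
    """Number of pronunciation combinations implied by a grapheme sequence."""
    count = 1
    for unit in graphemes:
        count *= len(oupc_options[unit])
    return count

print(candidate_pronunciations(["ough"]))        # 8
print(candidate_pronunciations(["a", "ough"]))   # 32
```

The multiplication makes the cognitive-load point concrete: a fully consistent unit like [sh] adds no ambiguity, whereas combining two inconsistent units already leaves dozens of candidate pronunciations to resolve against vocabulary knowledge.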


Children’s levels of orthographic knowledge constrain their ability to read unfamiliar words. As more OUPCs become known, children are able to read expanding numbers of unfamiliar words which could not be read previously, due to OUPCs not being known.

Cross-linguistic research has established that the differences in reading-accuracy and spelling development between English-text and transparent-orthography nations are due to differences in the degree of consistency of the OUPCs which are mastered in reading-accuracy development (Caravolas, 2004; Goswami, Gombert, & de Barrera, 1998; Pollo, Kessler, & Treiman, in press; Seymour et al., 2002). This places English orthographic complexity, and the phase of phonological recoding, as the factors impeding reading-accuracy development relative to transparent-orthography reading-accuracy development. It is also likely that cognitive load aspects of children and instruction are pivotal in this phase (Cossu, 1999; Gathercole & Pickering, 2000).

There is minimal research focussed on the role of orthographic knowledge in reading-accuracy development and instruction, notwithstanding the seminal work of Stuart and Treiman and their colleagues (Pollo et al., in press; Stuart, 2006; Stuart et al., 2003), and there are needs for research in this area. While it is generally acknowledged that skilled readers have expert orthographic knowledge, there is a paucity of research knowledge on stages and sequences of orthographic-skill development:

Whereas [reading researchers] have spent considerable effort examining properties of semantic and phonological representations and processes, the nature of orthographic knowledge has not been addressed to the same degree. This asymmetry reflects a broader pattern within the study of reading: Important aspects of orthographic processing have been neglected. (Harm & Seidenberg, 2004, p.
714)

There is additionally limited research on cognitive load aspects of reading-accuracy development (Cossu, 1999; Gathercole & Pickering, 2000). With orthographic complexity likely to be strongly linked to cognitive load aspects of student capacity and instructional content, there are also needs for research in this area.

No research was found on categories of orthographic units for use in reading instruction and assessment. Given the need to consider the orthographic units sampled by reading-accuracy tests, the researcher developed orthographic categories for the dissertation from consideration of English orthography and frequent English words. The categories consider whole words, irregular words and regular words, and were trialled using two sets of the 200 most frequent words of English text (Stuart et al., 2003; Zeno, Ivens, Millard, & Duvvuri, 1995). Figure 2.4 shows the categories and proportions of words in each category.


Words 1-25: the and a to said in he I of it was you they on she is for at his but that with all we can
Words 26-50: are up had my her what there out this have went be like some so not then were go little as no mum one them
Words 51-75: do me down dad big when it's see looked very look don't come will into back from children him mr get just now came oh
Words 76-100: about got their people your put could house old too by day made time I'm if help mrs called here off asked saw make an
Words 101-125: water away good want over how did man going where would or took kipper school think home chip who didn't ran know bear can't again
Words 126-150: cat long things new after wanted eat everyone our two has yes play take thought dog well find more I'll round tree magic shouted us
Words 151-175: other food fox through way been stop must red door right sea these began boy animals never next first work lots need that's baby fish
Words 176-200: gave mouse something bed may still found live say soon night narrator small car couldn't three head king town I've around every garden fast only
Orthographic categories [W = Whole-word, R = Regular words], with number (%) of words# in the 1st and 2nd 100 words of each wordset (Z*, S*):

Category  Description                                                          1st 100 (Z / S)   2nd 100 (Z / S)
W1   Highest-frequency irregular whole-words: the a to said I of was you          8 / 10           3.5 / 5.5
     (words within the 25 most frequent words).
R1   Phonemic recoding using the commonest sound of each letter, i.e.,           17 / 23          14.5 / 19.5
     single consonant and vowel letters, with vowels using their short
     vowel sound.
R2   Highly-frequent, highly-consistent consonant graphemes: s/z final;          12 / 12          10.5 / 10.5
     th sh ch ck; double final consonant.
R3   Common consistent vowel graphemes: CV with V saying its letter name,        25 / 22          23 / 23
     e.g. so he; VCe, e.g. mate; [ar er ir or ee ai ie oa].
W2   Common irregular words within the 100 most frequent words of English,       26 / 25          13.5 / 16
     e.g., one are what there all have some were.
R4   Less frequent and/or less consistent graphemes: ore, qu, j/g, s/c,           8 / 7           10 / 15.5
     no-use final e, schwa, silent ed, oo ow ou ea y oi ir; VCC with [o i]
     with 2 sounds.
R5   Regular multisyllabic words: orthographic units [-le]; vowel graphs          4 / 1            6.5 / 3.5
     with 2 sounds: letter sound (happy) & letter name (paper).
R6   Infrequent, relatively-inconsistent graphemes, e.g., ear, our, ph, wh,       0 / 0           15.5 / 4
     io, schwa (ous, -ain).
W3   Less common irregular sightwords (not in the 100 most frequent words),       0 / 0            3 / 2.5
     e.g. through great come often.

* The two wordsets are labelled Z & S after their developers, Zeno et al. (1995) & Stuart et al. (2003).
# The number of words listed equals the % of words in that category in that group of 100 words.
Figure 2.4 Orthographic categories for the research

It is seen in Figure 2.4 that there are nine orthographic categories:
• Three of the categories relate to the phase of whole-word and irregular-word reading. They are indicated with the letter W for Whole-word.
• Six of the categories relate to the reading of regular words, i.e., words the reader can read through recoding of OUPCs. They are designated with the letter R for Regular word. The first relates to the phase of phonemic recoding (R1), while the remaining five categories (R2 to R6) relate to the phase of phonological recoding.

Copyright 2007 Susan Galletly PhD Thesis


The categories and percents of words per category in the 200 most frequent words of English text shown in Figure 2.4 are used in this dissertation for:
• Considering skills appropriate for testing with rapid-use tests, in this chapter.
• Critiquing the tests used in the research, in Chapter 3.
• Considering the orthographic categories sampled in cohort test-data, in Chapter 5.

From the discussion of this section, and consideration of the patterns of English orthography shown in Figure 2.4, it is considered that:
• The amount of orthographic knowledge which children develop within the phase of phonological recoding is considerable and multifaceted.
• Reading-accuracy tests which provide useful information on children's levels of different types of orthographic knowledge will be useful in guiding classroom instruction building orthographic knowledge within reading accuracy.
• Aspects of orthographic knowledge worthy of sampling include reading of open and closed syllables, single-syllable and multisyllabic words, irregular words of different levels of frequency, and words with vowels of different levels of frequency and consistency.
• With English vowels being the most complex aspect of English-text reading-accuracy development (Dewey, 1970; Fry, 2004), it is considered that reading-accuracy testing should provide relatively specific information on children's instructional needs for different vowel groups. These vowel groups would include [aeiou] vowels in closed syllables; [aeiou] vowels in open syllables; and common consistent vowel digraphs, including r-vowels, final-e vowels, and other highly consistent and frequent vowel digraphs.

Few rapid-use reading-accuracy tests focus on orthographic knowledge beyond commonest letter-sounds, and the reasons for this, based in the literature, are not altogether clear.
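The kind of category-based sampling described above can be illustrated with a toy word classifier. This is a sketch only, not from the dissertation: the W1 word set is taken from Figure 2.4, but the digraph list and decision rules are simplified illustrations.

```python
# Toy sketch (not from the dissertation) of sorting words into simplified
# versions of the Figure 2.4 categories. The W1 word set comes from the
# figure; the digraph list and decision rules are illustrative only.
W1 = {"the", "a", "to", "said", "i", "of", "was", "you"}  # irregular whole-words
DIGRAPHS = {"th", "sh", "ch", "ck", "ar", "er", "ir", "or", "ee", "ai", "ie", "oa"}
SINGLE_LETTERS = set("abcdefghijklmnopqrstuvwxyz")

def category(word: str) -> str:
    w = word.lower()
    if w in W1:
        return "W1 (irregular whole-word)"
    # R1: readable by recoding single letters with their commonest sounds,
    # so no digraphs and no final silent e.
    if any(d in w for d in DIGRAPHS) or (len(w) > 1 and w.endswith("e")):
        return "beyond R1 (phonological recoding or irregular)"
    if all(c in SINGLE_LETTERS for c in w):
        return "R1 (phonemic recoding)"
    return "beyond R1 (phonological recoding or irregular)"

for word in ["said", "cat", "chip", "mate"]:
    print(word, "->", category(word))
```

A real categoriser would need the full grapheme inventory of Figure 2.4; the point of the sketch is only that category membership is decidable from a word's letter patterns, which is what makes category-based test construction feasible.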
CBM reading tests for reading beyond phonemic recoding are focussed almost exclusively on reading of meaningful text (Stecker et al., 2005), and there is minimal CBM literature on monitoring specific aspects of reading-accuracy development and orthographic-knowledge development beyond phonemic recoding. In their review of CBM research, Stecker et al. (2005) comment that although other reading measures have been investigated, 1-minute oral reading samples are established as reliable, and thus are the test commonly used in CBM for reading. CBM tests of mathematics contrast strongly with CBM reading tests, being focussed specifically on different units of mathematical knowledge:

    Because mathematics generally is accepted as more skill specific than reading, content for CBM mathematics tests is derived by determining the grade-level skills deemed important in the student's curriculum. … Consequently, CBM assessments were developed that represented the most critical computational skills at each grade level. (Stecker, Fuchs, & Fuchs, 2005, p.801)

The emphasis of this dissertation is not on specific forms of instruction directly teaching orthographic knowledge. Orthographic knowledge is considered as developing within reading-accuracy development, and the focus of this research is assessment of reading-accuracy skills. Orthographic knowledge also develops within spelling development, thus there are overlaps between spelling and reading-accuracy development and instruction.
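The 1-minute oral reading samples used in CBM are conventionally scored as words correct per minute (WCPM): words read minus errors, scaled to one minute. A minimal sketch of that arithmetic follows; the function name and argument layout are illustrative, not taken from any published scoring manual.

```python
def words_correct_per_minute(words_attempted: int, errors: int, seconds: float) -> float:
    """Score a timed oral reading sample as words correct per minute (WCPM).

    Standard CBM arithmetic: (words attempted - errors), scaled to 60 seconds.
    The record format here is a simplified illustration.
    """
    correct = words_attempted - errors
    return correct * 60.0 / seconds

# A child reads 47 words with 5 errors in a 60-second sample:
print(words_correct_per_minute(47, 5, 60))  # 42.0
```

The same arithmetic applies when a passage is finished early: a child reading 60 words with no errors in 30 seconds scores 120 WCPM.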


2.3 Skills for testing in reading-accuracy tests

In Chapter 1, the construct of reading accuracy was introduced and defined, and six interacting phases of reading-accuracy development were presented within the Reading-Accuracy Development model, detailed in that chapter. This section examines aspects of reading-accuracy development relevant to the research, using the phases of the Reading-Accuracy Development model. In doing so, it generates lists of reading-accuracy skills that can form the basis of rapid-use reading-accuracy tests for monitoring children's reading-accuracy progress.

Phonological and phonemic awareness

Phonological awareness, and its subcategory, phonemic awareness, are established as important aspects of English-text reading-accuracy development. The relationship between phonological awareness and reading-accuracy development is reciprocal, each being active in developing the other, and as such accelerating the development of readers' representations (Goswami, 2002). For the purposes of this dissertation, from consideration of the extensive research literature on phonological and phonemic awareness (e.g., Byrne, 1998; Ehri, Nunes, et al., 2001; Stuart, 2006; Treiman, 1992), it is deemed that phonological awareness, including early and later phonemic awareness, and rhyme and syllable awareness, are important aspects interacting with and progressing reading-accuracy development.

There is a plethora of research establishing the importance, and the sequence of development, of the phonemic awareness skills prerequisite to reading-accuracy development (e.g., Adams, 1990; Bryant, Maclean et al., 1990; Byrne, 1992, 1998; Ehri, Nunes, et al., 2001; Treiman, 1992). The usual order of skill development is considered to be:
• Identifying initial-consonant phonemes of words.
• Blending lists of phonemes to make a syllable.
• Identifying final-consonant phonemes.
• Segmenting single syllables into their sequence of phonemes.

There are also studies establishing that rhyme awareness is prerequisite to the ability to read by analogy (Bryant et al., 1990; Byrne, 1992; Treiman, 1992), in particular the ability to list words which rhyme with a given word. When tested at different stages of reading-accuracy development, each of these skills is established as a predictor of reading-accuracy development (Treiman, 1992; Ehri, Nunes et al., 2001). Blending phonemes is the logical precursor of reading accuracy, while listing phonemes underlies spelling.
Listing is a stronger predictor of reading-accuracy development than are blending and identifying initial sounds (Treiman, 1992). It is also possible to test advanced phonological awareness skills which readers are not able to do prior to mastering phonemic recoding. These skills usually involve manipulating sounds in syllables, e.g., "Say 'slip' without the /p/" (NRC, 1998). All the skills discussed here are assessable using rapid-use tests, each sampling one skill multiple times.

Consideration of the phonological awareness skills likely to be used in the phonological-recoding phase suggests likely skills to be:
• The phonemic awareness skills used in phonemic recoding, discussed above.
• Advanced phonemic awareness skills, including manipulation of phonemes in syllables (NRC, 1998).
• Rhyme skills and identification of rimes (Bryant et al., 1990; Goswami, 1992, 2002; Treiman & Zukowski, 1996).
• Syllable awareness, and identification and manipulation of syllables in words (Goswami, 1992, 2002; Treiman & Zukowski, 1996).
• Vowel awareness, and identification and manipulation of vowels in single- and multisyllabic words (Treiman, Kessler, & Bick, 2002).

There is extensive research on phonological awareness skills for early reading-accuracy development, and particularly on identifying initial phonemes, rhyme, and blending and segmenting phonemes. In contrast, there is minimal literature focussed on phonological awareness skills for the later phonological-recoding phase.

Whole-word and irregular-word reading

As noted in Chapter 1, two types of reading are included in the phase of whole-word reading:
• Fully whole-word reading of words learned as whole-words in the first months of reading-accuracy instruction, when minimal or no letter-sound skills are present (Ehri, 1992, 1995).
• Semi-whole-word reading of words which are learned using partial-logographic strategies, due to their OUPCs not all being known (Frith, 1985).

Fully whole-word reading is equivalent to Frith's (1985) "Logographic" stage and Ehri's (1995) "Pre-Alphabetic" stage. These whole-words are words which children encounter frequently in early reading texts or on flashcards. They include both regular and irregular words. While there is minimal research literature on the area, the crux of fully whole-word reading is likely to be that children can make relatively effective long-term memories of words which they have been taught and have practised (Baddeley, 2002; Ehri & Snowling, 2004). Fully whole-word reading occurs only for a very short time, if letter-sounds are taught from the same time that early whole-words are taught (Frith, 1985). Once phonemic and phonological recoding skills are developing, fully-logographic whole-word reading is discontinued. From this point, it is likely that words are read using one of two methods (Ehri, 1992; Frith, 1985):
• Phonological recoding of all the word's orthographic units.
• Semi-whole-word reading, where words are learned as whole-words, but knowledge of known OUPCs in the words supports this learning. In this way the word [have] may be learned as a whole-word, but the student's knowledge of [h]:/h/ is involved in the learning.

As such, first words such as [it is was cat] are likely to be learned as fully or semi-whole-words by very beginning readers, and developing readers may use semi-whole-word strategies for words such as [yacht choir colonel].
Research on whole-word learning has established that:
• Many children learn to read whole words but fail to master phonological recoding, thus explicit instruction is needed to progress children from this phase (Byrne, 1998; Seymour & Elder, 1986).
• More progress is made when phonemic awareness instruction is integrated with letter-sound instruction (Byrne & Fielding-Barnsley, 1995).
• It can take many exposures to learn whole-words, e.g. Stuart, Masterson, and Dixon (2000) found British children in their first term of Reception could read a mean of only 4.9 of 16 words which had been identified to them 36 times.
• Children learn whole-words more effectively when they are completely decontextualised: flashcards are far more effective than repeated reading of texts with prior presentation of target words, and this, in turn, is far more effective than repeated reading of texts without specific emphasis on target words (Stuart et al., 2000).

The whole-word and irregular-word reading phase used in the Reading-Accuracy Development Model includes highly irregular words and also regular words whose OUPCs are likely to be not yet known at the time the test is used. As such it includes the reading of words which cannot be read using phonological recoding of all OUPCs contained in the words (Frith, 1985). Three of the orthographic categories developed for this research are categories of whole-word reading (see Figure 2.4, above):
• Highest-frequency irregular whole-words: words within the 12 most frequent words which cannot be read using phonemic recoding [the a to said I of was you] or recoding of letter-names in open syllables, e.g., /ī ē/ in [I he].
• High-frequency irregular words: words within the first 100 most-frequent words of English which are unlikely to be phonologically recoded by beginning readers, e.g., [one are what there all have some were].
• Less common irregular words: irregular words not in the first 100 words, e.g., [thought only].

It is seen in Figure 2.4, above, that there are considerable numbers of words of the first two categories in the first 100 most frequent words, with approximately 9% of words in the first category and 25.5% in the second category. In contrast, there are relatively few words of the third type in the second hundred (2.75% of words), once words likely to be read with phonological recoding are considered. In connected text, there would be much higher proportions of the words in all three of these categories (Share & Stanovich, 1995). This makes whole-word reading and reading of irregular words an important aspect of English-text reading-accuracy development.
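The percentages quoted here can be recovered from Figure 2.4 by averaging the Z (Zeno et al., 1995) and S (Stuart et al., 2003) wordset counts; the sketch below simply makes that arithmetic explicit.

```python
# The percentages quoted in the text are means of the Z (Zeno et al., 1995)
# and S (Stuart et al., 2003) wordset counts shown in Figure 2.4.
def mean(z: float, s: float) -> float:
    return (z + s) / 2

print(mean(8, 10))    # 9.0  -> "approximately 9%" (W1, first 100 words)
print(mean(26, 25))   # 25.5 -> "25.5%" (W2, first 100 words)
print(mean(3, 2.5))   # 2.75 -> "2.75%" (W3, second 100 words)
```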

Orthographic knowledge

Orthographic knowledge is knowledge of English OUPCs. As detailed in Chapter 1:
• Orthographic units are letter-units of various sizes which correspond to a single phoneme or sequence of phonemes. They include graphemes, syllables, rimes, and letter sequences.
• The link of letters to sounds is referred to as:
  o Grapheme-phoneme correspondences (GPCs), which is used in this dissertation to refer to commonest letter-sounds in phonemic recoding.
  o Orthographic-unit:phoneme/s correspondences (OUPCs) for phonological recoding.

This section considers these two types of orthographic knowledge, each supporting its phase of reading-accuracy development:
• Orthographic knowledge of commonest letter-sounds and their GPCs, which enables phonemic recoding.
• Orthographic knowledge beyond commonest letter-sounds, including the full range of orthographic units and OUPCs. These enable the reading of words containing orthographic units beyond commonest letter-sounds.

Discussion of these two types of orthographic knowledge will be included within discussion of the word-reading phases which they support.

Phonemic recoding

Like whole-word reading, phonemic recoding is a temporary phase, one which is subsumed into phonological recoding once sufficient orthographic knowledge is built. There is a plethora of research establishing the importance of this phase of reading-accuracy development (e.g., Byrne, 1998; NRC, 1998; Ehri, Stahl, et al., 2001). In phonemic recoding, a written word is transformed to its list of phonemes ([bus] → /b//u//s/), then that list of phonemes is transformed (blended together) to make the spoken word (/b//u//s/ → /bus/). Phonemic recoding is the first phase at which unfamiliar words can be read, thus mastering the phonemic recoding strategy marks a conceptual shift in reading-accuracy development. No longer is each word a whole entity to be learned: words are also the sum of their parts, thus words as wholes and as parts are now considered. At the phonemic recoding stage, children become able to engage in self-learning, i.e., reading unfamiliar words without being dependent on language scaffolding or someone first telling them the word.

Phonemic recoding of commonest letter-sounds is contained within a single category of the orthographic categories used in this research (R1; see Figure 2.4, above). The words in this category are closed syllables, those ending in one or more consonants, not including [r w y], and containing /ă ĕ ĭ ŏ ŭ/ vowels. This is the commonest syllable-type in English text, occurring in approximately 43.3% of syllables, with the vowels making their /ă ĕ ĭ ŏ ŭ/ sounds between 89% (ŭ) and 99% (ĕ) of the time (Stanback, 1992). As shown in Figure 2.4, approximately 20% of the first 100 and 17% of the second 100 words are in this category. In addition to these high-frequency words, children will encounter many less-frequent words formed completely from commonest letter-sounds. Phonemic recoding of commonest letter-sounds is thus an important reading-accuracy skill. In addition, as shown in the Reading-Accuracy Development model, mastery of phonemic recoding is the basis of phonological recoding. Words read using phonemic recoding may be real words or pseudowords, and single- or multiple-syllable words (Torgesen et al., 1999). Phonemic recoding builds in accuracy and efficiency, from early hesitant recoding to automaticity.
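The two-step process described above ([bus] → /b//u//s/ → /bus/) can be sketched in a few lines. The grapheme-phoneme table below is a tiny illustrative fragment, not a full set of English GPCs, and the phoneme spellings are informal.

```python
# Toy sketch of the two-step phonemic recoding described above:
# [bus] -> /b//u//s/ -> /bus/. The grapheme-phoneme table is a tiny
# illustrative fragment, not a full set of English GPCs.
GPC = {"b": "b", "u": "u", "s": "s", "c": "k", "a": "a", "t": "t"}

def phonemic_recode(word: str) -> tuple:
    """Map each letter to its commonest sound, then blend the list."""
    phonemes = [GPC[letter] for letter in word]  # step 1: letters -> phonemes
    blended = "".join(phonemes)                  # step 2: blend to a syllable
    return phonemes, blended

print(phonemic_recode("bus"))  # (['b', 'u', 's'], 'bus')
print(phonemic_recode("cat"))  # (['k', 'a', 't'], 'kat')
```

The one-letter-one-sound mapping is exactly what makes this phase temporary: a word such as [chip] breaks the mapping, since [ch] is a single orthographic unit spanning two letters, and handling it requires the larger OUPC inventory of phonological recoding.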
Consideration of the literature on this phase shows it to be assessable using:
• Reading of familiar and unfamiliar words using closed syllables, single-letter graphs, and commonest GPCs (Ehri & Snowling, 2004; Good & Kaminski, 2002).
• Consideration of successes and errors made in writing words, or in reading visually similar distracters with different vowel letters, e.g., [hut hat hit hot].

The first is assessable using rapid-use tests, while the second is assessable using whole-class writing tests. The crux of this phase is the ability to read unfamiliar words formed completely from single-letter graphs with their most-frequent OUPCs. Consideration of the different methods used to test phonemic recoding (Good & Kaminski, 2002; Liberman et al., 1989) suggests that:
• Use of a range of words using most or all single-letter GPCs is useful for this testing, as errors made provide useful information on GPCs not yet known which will benefit from further instruction.
• Use of simple wordforms such as Vowel-Consonant (VC) and Consonant-Vowel-Consonant (CVC) is useful for beginning readers.
• Words used may be frequent or less frequent real words, or pseudowords.
• If real words are used, there is a need to reduce the likelihood that reading is not phonemic recoding but whole-word reading of known words. Rigour can be increased by including visually similar words or rare words. Pseudowords are the most rigorous test of phonemic recoding, being readable only through effective phonemic recoding.
• It is possible that young children may be more confident reading real words, such that it may be valuable to test real words rather than pseudowords.

Commonest letter-sounds: Orthographic knowledge for phonemic recoding

Commonest letter-sounds are likely to be the first OUPCs learned (Treiman, Mullenix, Bijeljac-Babic, & Richmond-Welty, 1995; Treiman & Rodriguez, 1999). As with whole-word reading, to the beginning reader, every letter is an entity to be learned logographically (Frith, 1985). There are at least four components to knowledge of each letter: the letter's name, sound, and lowercase and capital forms. Research in this area has established that:
• Letter-name and letter-sound knowledge at school entry strongly predict reading at the end of the year (Adams, 1990; Evans, Bell, Shaw, Moretti, & Page, 2006; NRC, 1998). In usual circumstances, children learn letter names earlier than letter sounds (Treiman, Tincoff, Rodriguez, Mouzaki, & Francis, 1998); children who know more letter names also know more letter sounds (Stuart, 2006); and knowledge of letter names facilitates the learning of letter sounds (Treiman et al., 1998). Letter sounds are learned more easily if they comprise the initial sound of the letter name, e.g., B – /be/, T – /te/, than if they comprise the final sound of the letter name, e.g., F – /ef/, S – /es/ (Treiman et al., 1998).
• Children aged 5 years benefit from and enjoy explicit teaching of letter-sound correspondences (Dixon, Stuart, & Masterson, 2002). Awareness of phonemes and letter-sounds is not induced intuitively by most readers, with explicit teaching (explanation and practice) required for this to occur (Byrne, 1998; Seymour & Elder, 1986; Stuart, 2006). Readers vary in the intensity of explanation and practice that they need to make successful progress (Byrne, 1998).
• Both letter names and letter sounds are established predictors of later reading-accuracy achievement (Evans, Bell et al., 2006; Treiman et al., 1998).

As knowledge of letter-sounds, not letter-names, is prerequisite for reading accuracy, it would seem important for letter-sound knowledge to be included in reading-accuracy tests, either as well as or instead of letter-names.
Both letter-sounds and letter-names can be tested using rapid-use test formats. It would also seem valuable to incorporate testing of skill with lowercase and capital forms in these tests, and possibly also letters written using different fonts.

Phonological recoding beyond phonemic recoding

Phonological recoding is the reading of words which have at least one orthographic unit which is not a single-letter graph using its commonest GPC (Stanovich, 1986). In this sense, words read with phonological recoding will have at least one digraph, less-frequent GPC, or larger orthographic unit. The criterion for reading at this phase is that diverse OUPCs become mastered, enabling the reader to read unfamiliar words containing those OUPCs. As discussed above, this is the phase which differentiates English-text reading-accuracy development from transparent-orthography reading-accuracy development, and makes English-text reading-accuracy development difficult for all readers, and particularly for at-risk readers. It is the largest phase of reading-accuracy development, and is characterised by high conceptual load and thus high cognitive load. Whereas fewer than 30 GPCs were mastered in phonemic recoding, hundreds of OUPCs are mastered at this phase of phonological recoding, as discussed above.

Consideration of the orthographic categories with regard to the 200 most-frequent words (Stuart et al., 2003; Zeno et al., 1995), as shown in Figure 2.4, above, shows that five orthographic categories are used for phonological recoding. Together, these categories contain 35.5% of the first 100 words, and well over 60% of the second 100 words. This increase in the second 100 words reinforces the importance of phonological recoding for reading unfamiliar words across reading-accuracy development. The category for common consistent vowel graphemes (R3) involves a very large proportion of words: 23.5% and 23% of the first and second 100 words respectively. It is likely that this high proportion of words in R3 indicates that too many different vowel OUPCs are included in this category, such that further orthographic categories may be needed from assessment perspectives.

Consideration of methods used to teach or assess phonological recoding (Fry, 2000) shows it to be assessable through readers' skills including:
• Reading of familiar and unfamiliar words, particularly pseudowords, containing specific OUPCs.
• Reading of decontextualised orthographic units.
• Having students sound out printed real words and pseudowords, while indicating the graphemes being pronounced.
• Writing of familiar and unfamiliar words, particularly pseudowords, containing specific orthographic units.
• Higher-order tasks involving phonological recoding and orthographic knowledge, e.g. asking students to:
  o Read or write given words "correctly" as many times as possible using legitimate OUPCs.
  o Group together all orthographic units which represent a given phoneme.

These skills are all assessable using rapid reading-accuracy tests or whole-class writing tests. It is likely that various combinations of tests including pseudowords, spelling, and reading of frequent and rare real words, using decontextualised words and authentic reading, will be useful in this phase, particularly for more advanced levels of phonological recoding. There would seem a need to consider the cognitive load of different wordforms used. Simple (CVVC) and complex (e.g., CCCVVCC) word-forms could be used to test accuracy and efficiency in different year-levels.

Orthographic knowledge beyond commonest letter-sounds

The importance of orthographic knowledge becomes apparent when the number of "rare" English words which readers encounter is considered. In English text, there are many words which occur relatively infrequently:

    Whereas just over 100 'heavy duty' words (eg the, in, was, etc.) account for around half of all the letter strings appearing in printed school English, a very large number of words exist which appear very rarely in print (Carroll, Davies & Richman, 1971; Nagy & Anderson, 1984). In fact fully eighty percent of English words occur less than once in a million words of running text (Carroll et al, 1971). (Share & Stanovich, 1995, p.15)

With a myriad of less-frequent words encountered across reading-accuracy development, the ability to read unfamiliar words is an important English reading-accuracy skill. Orthographic knowledge is impacted by word frequency, consistency of OUPCs, and context sensitivity, all of which reduce the orthographic complexity of English words. These factors impact the items which should be sampled in reading-accuracy tests. For instance:
• Orthographic knowledge of words and OUPCs which are more consistent and occur more frequently is acquired more easily than knowledge of words and OUPCs which are less consistent or less frequent (Pollo et al., in press). This would seem an important factor in selecting items for reading-accuracy tests, e.g. words within the 50 most frequent words for beginning readers, and words not in the first 2000 for upper primary school.
• Most highest-frequency words are monosyllabic, and increasing proportions of multisyllabic words occur as word frequency drops (Dewey, 1970; Stanback, 1992). Given the low numbers of highly frequent multisyllabic words, skill in reading multisyllabic words is a potential area of reading weakness.
• English orthography is highly context-specific, and awareness of context and contextual patterns is part of orthographic knowledge. For vowels, it is the letters which follow the vowel letter which contextualise it, often limiting the vowel to a specific OUPC. This is seen when letter [a] is considered in the words [far fare fade fad fall fain fair]. There are 21 possible OUPCs for the single letter [a] (Dewey, 1970). When [a] is considered within each word's rime (the vowel letter plus following vowel and consonant letters), all words except [fare] are regular, with only one possible OUPC. Context-specificity thus reduces orthographic complexity, and is thus an important aspect of orthographic knowledge. As such, rimes are important aspects of orthographic knowledge, particularly for [a e i o u] vowels which would otherwise be relatively irregular.
• Consonant OUPCs are relatively regular, while vowel OUPCs are relatively irregular and are the basis of English orthographic complexity (Dewey, 1970).
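The rime-level context-specificity described above can be sketched as a lookup keyed on rimes rather than on single vowel letters, using the example words from the text. The rime table and sound labels are informal illustrations, not Dewey's (1970) actual OUPC inventory.

```python
# Sketch of rime-level context-specificity for letter [a], using the
# example words from the text. The rime table and sound labels are
# informal illustrations, not Dewey's (1970) actual OUPC inventory.
RIME_SOUND = {
    "ar":  "ar as in far",
    "ade": "long a as in fade",
    "ad":  "short a as in fad",
    "all": "aw as in fall",
    "ain": "long a as in fain",
    "air": "air as in fair",
}

def vowel_sound(word: str) -> str:
    """Return the [a] sound predicted by the longest matching rime."""
    for length in (4, 3, 2):   # try longer rimes first ([ade] before [ad])
        if word[-length:] in RIME_SOUND:
            return RIME_SOUND[word[-length:]]
    return "not predictable from these rimes"

print(vowel_sound("fad"))   # short a as in fad
print(vowel_sound("fade"))  # long a as in fade
print(vowel_sound("fare"))  # not predictable from these rimes
```

Whereas a lookup keyed on the single letter [a] would need to choose among many OUPCs, the rime-keyed lookup is deterministic for every example word except [fare], matching the text's observation that [fare] is the one exception in the set.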

The order in which vowels are taught in reading instruction, and the frequency with which they occur in reading materials used in reading instruction, are also likely to influence ability to read words with specific OUPCs (Fry, 2000). This order is likely to be only partly influenced by frequency and consistency effects. It is also likely to be influenced by vowels being taught not as single abstract units but as members of groups of vowels with common OUPC characteristics (Fry, Kress, & Fountoukidis, 2004; Galletly, 2003), as shown in Table 2.4.

Table 2.4 Orthographic knowledge for instructional purposes

Unit of orthographic knowledge           Examples                           Syllable     Consistency of OUPCs
                                                                            frequency
Commonest letter-sounds                  b t n a e                                       Consonants >90%; Vowels 52%
VC & CVC /ă ĕ ĭ ŏ ŭ/ words               cat Sam vip                        43.3%
Consonant blends & digraphs              shop ring clip thin
Rimes                                    fad fade fall fight                             Vowels 71%
Final-e vowels                           mate hope                          6.7%         76% [e-e] – 100% [a-e]
/ā ē ī ō ū/ vowels in open (CV) sylls    he go paper virus                  28.9%        40% [a] – 99% [u]
R-vowels                                 far her sir for fur                10.2%        70% [ir] – 99% [ur]
W-vowels                                 saw now dew
Vowel graphemes saying long-vowel
  sounds: ai, oa, ea, ee, y(e), y(i), ie wait boat meat meet my happy pie   9.5%
oi oy ai ay                              boil boy bait bay
Ea (2 sounds)                            dead meat
Oo (2 sounds)                            good food
Ou (many sounds, usually [out])          ouch youth double could
Less common vowel digraphs               sue
Regular multisyllabic words              kidnap butter baby
Longer multisyllabic words & schwa       virus different                                 Varying
Less frequent orthographic units         gnome tough nation                              Varying
Consideration of the orthographic characteristics listed in the left-hand column of Table 2.4 shows there to be sets of orthographic units with prescribed OUPCs which are likely to be taught together (Fry, 2000; Fry et al., 2004). Orthographic knowledge of these sets of orthographic units, e.g., r-vowels, final-e vowels, can be tested using rapid-use reading-accuracy tests. From the foregoing, it is argued that orthographic-knowledge development within reading-accuracy development is important.


Reading accuracy in authentic reading

The end goal of reading-accuracy instruction is efficient authentic reading, with readers able to read text fluently and accurately, and with reading-accuracy skills sufficiently automatic that little attention needs to be focussed on reading accuracy (Chard et al., 2002; Kuhn & Stahl, 2003). As such, it is important to assess reading accuracy within authentic reading. Assessment can be conducted through measuring the efficiency of reading accuracy within authentic reading of meaningful text (Good & Kaminski, 2002; Stecker et al., 2005).

2.4 Conclusion

This chapter has established the context for the research. The first part of the chapter considered eight premises which frame the research; the second part considered skills associated with reading-accuracy tests.

Research premises

Conclusions concerning the premises which frame the research are briefly considered below.

Exploration of Premise 1 (Aspects of current reading instruction need re-examination) has established that there is a need to re-examine principles of reading instruction currently used throughout Australia. In addition, it has briefly discussed the alternative views on what constitutes effective reading instruction, and specific issues underlying these views which warrant research. These issues include:
• Whether the instructional needs of healthy-progress readers are the same as those of at-risk readers.
• Whether reading accuracy and decontextualised reading-accuracy instruction are important, particularly for healthy-progress readers.
• Whether synthetic phonics and reading of decontextualised words are appropriate practices.
• Whether reading accuracy should be dealt with separately to reading comprehension in reading-accuracy instruction and assessment.

This latter issue was addressed in exploration of Premise 3 (For reading instruction decision-making, reading accuracy needs to be considered separately to reading comprehension), which argued the case that:
• The Component Model is a useful model for considering reading accuracy, reading comprehension, and language comprehension.
• Reading accuracy needs to be considered separately from reading comprehension and language comprehension in instruction and assessment, in addition to being considered as integrated within reading comprehension.

Exploration of Premise 2 (The use of effective reading-accuracy tests can assist reading-accuracy instruction and increase student reading-accuracy achievement) established the validity and usefulness of rapid-use reading-accuracy tests, their usefulness in achieving school and systemic purposes, and their role in supporting teacher instructional decision-making to achieve improved reading-accuracy instruction and achievement.



Exploration of Premise 5 (Knowledge on reading-accuracy development and instruction is needed for the Australian context) established that there is currently insufficient research knowledge to resolve the issues underlying paradigmatic differences, or to prescribe principles of reading-accuracy instruction for classroom use with all different levels of reading achievers.

Exploration of Premises 4 (In Australia, there is a need for rapid-use reading-accuracy tests for use in schools and research) and 6 (Australia may need more highly specified principles of reading-accuracy instruction than are needed in the USA) established that Australia needs reading-accuracy achievement data and reading-accuracy tests for school-use and research purposes, and that Australia needs to conduct its own research to develop well-specified principles of reading-accuracy instruction applicable to all different levels of reading achievers.

Exploration of Premise 7 (It is an appropriate time to develop best-practice reading-accuracy instruction for Australian schools) established current Australian potential for developing best-practice reading-accuracy instruction. It explored Recommendations 1, 2 and 9 of the NITL report (DEST, 2005) with regard to current Australian needs for reading-accuracy research, and established that rapid-use reading-accuracy tests are likely to prove useful supplements to LLANS assessments, the First Steps (Second Edition) framework, and the Classroom Literacy Observation Schedule (CLOS) (Louden et al., 2006), all of which emphasise consideration of reading accuracy separate from reading comprehension.

Exploration of Premise 8 (English orthographic complexity makes reading-accuracy development complex, through its involving development of extensive orthographic knowledge) established English orthographic complexity as a key factor impacting English-text reading-accuracy development, and a need for research on aspects of English orthography in reading-accuracy development, instruction and assessment.

Use of reading-accuracy tests

In line with the purposes listed at the beginning of this chapter, this chapter has established the overall importance of using rapid-use reading-accuracy tests. Specifically, it has established:
• Developmental skills that need to be assessed in reading-accuracy tests.
• The suitability and potential of rapid-use reading-accuracy tests for such testing.
• Australian needs for:
o Rapid-use reading-accuracy tests.
o Research to establish principles of best-practice reading-accuracy instruction for all different levels of reading achievers.

This chapter has also indicated that there are a considerable number of areas needing school-level research, including many aspects of reading-accuracy development and instruction. Given the constraints of time, this dissertation only explores the appropriateness of DIBELS and TOWRE tests for use in Australia. As such, it has potential to facilitate research needed in these areas.

This chapter has also established that the reading-accuracy tests which will be most useful in different contexts are to a large extent defined by issues within that context, such as curriculum emphases and development. This is particularly the case in the use of DIBELS tests in the Australian context, as detailed in the next chapter. Skills for testing in reading-accuracy tests have been identified from consideration of separate phases of reading-accuracy development.


These will be used in critiquing the tests used in the study in the next chapter and in consideration of the results of the research. With the research context established, Chapter 3 overviews the tests used in the research. Subsequent chapters then detail the methodology, results, discussion and recommendations of the research.


Chapter 3 Overview of DIBELS and TOWRE tests

3.1 Introduction

Previous chapters have established the importance of effective reading-accuracy instruction and reading-accuracy tests. This chapter is an overview and critique of the DIBELS and TOWRE tests used in the present study. The first section of the chapter overviews and critiques the DIBELS tests, and the second does likewise for the TOWRE tests. Parameters of each test-set and issues involved in using the tests in Australian contexts are detailed. The final section of the chapter considers the strengths, weaknesses and specific characteristics of the tests used in the research, to establish their appropriateness for use in the Australian context.

3.2 Choice of DIBELS and TOWRE tests

A range of rapid-use reading-accuracy tests are currently in use, including tests developed by Fuchs, Fuchs and colleagues (Compton et al., 2006; Hosp & Fuchs, 2005); the Australian Making Up for Lost Time in Literacy (Wheldall & Madelaine, 2000), designed for use with children with reading difficulties; and the DIBELS and TOWRE tests used in the current research. DIBELS and TOWRE were selected for use in the current research for several reasons. Existing documentation available on both these tests indicated that:
• The tests are rigorous rapid-use tests established for use both in schools and in experimental research.
• The TOWRE tests are screening tests established for use across both primary and secondary school-years.
• The TOWRE test items sample words with diverse orthographic units. As such, they may have potential for providing qualitative diagnostic data on children’s instructional needs regarding orthographic knowledge and phonological recoding.
• The DIBELS tests assess a range of early-years reading-accuracy skills, and are established for use across the primary-school years.
• The DIBELS tests have a predictive model using data-based decision-rules for supporting instructional decision-making by teachers and schools.

The purpose of this chapter is to provide:
• A case for using DIBELS and TOWRE tests in this research investigation.
• A detailed description of the tests.
• An overview of the tests.

3.3 Dynamic Indicators of Basic Early Literacy Skills (DIBELS)

This section details the characteristics of the DIBELS test system, then discusses schooling and curriculum differences between Queensland (Australia) and the USA which impact the use of DIBELS in a Queensland school context.


Origin and availability

The DIBELS benchmark-assessment and progress-monitoring materials were developed by Good, Kaminski and colleagues at the University of Oregon (Good & Kaminski, 2002a, 2002b; Good, Wallin, Simmons, Kame’enui, & Kaminski, 2002) using a Curriculum-Based Measurement (CBM) framework and principles. The tests are free to use and downloadable from www.dibels.uoregon.edu. Schools are also able, at minimal cost (US$1 per child per year), to use the DIBELS online data-processing system, which rapidly processes the data entered by each school, and generates a range of summaries and reports for use at teacher, school and district level. The data is then stored in the data-processing system. This enables longitudinal tracking by the school, and steady building of sophisticated knowledge on children’s reading-accuracy development and achievement.

Description of DIBELS tests

The DIBELS test-set includes one test of language comprehension, one test of reading comprehension and five reading-accuracy tests. Four of these reading-accuracy tests are used in the current research. These are:
• Phoneme Segmentation Fluency (PSF) test.
• Letter Naming Fluency (LNF) test.
• Nonsense Word Fluency (NWF) test.
• Oral Reading Fluency (ORF) test.

Each reading-accuracy test assesses a specific reading-accuracy skill, and has multiple alternate forms, and as such has potential for usefulness in Curriculum-Based Measurement (CBM) Response-to-instruction frameworks of assessment and instruction (Compton et al., 2006; O’Connor, Harty & Fulmer, 2005). Characteristics of the tests used in the research are detailed in Table 3.1, and examples of test-forms and stimulus materials are included in Appendix A.

Table 3.1 DIBELS tests used in the research

Letter Naming Fluency (LNF) test
Developers: Kaminski & Good (from work by Marston & Magnusson). Skill: Naming letters.
Administration: Alternate forms: Benchmarks. Task: Naming mixed capital & lower-case letters in 60 secs. Test-points: BegK to BegG1. Benchmarks: BegG1: 25, 37.
Reliability & validity: One-month K alternate-form reliability: 0.88. Concurrent criterion-related validity with Woodcock-Johnson (WJ) Readiness Cluster: 0.70. Predictive validity of K LNF with 1) G1 WJ Reading Cluster: 0.65; 2) G1 CBM Oral Reading Fluency: 0.71.

Phoneme Segmentation Fluency (PSF) test
Developers: Good, Kaminski & Smith. NRP Big Idea: Phonological awareness. Skill: Segmenting syllables into phonemes.
Administration: Alternate forms: Benchmarks & 20 additional alternate forms. Task: Listing phonemes of single-syllable words in 60 secs. Test-points: MidK to EndG1. Benchmarks: MidK, BegG1: 10, 35.
Reliability & validity: Two-week alternate-form reliability: 0.88. Concurrent criterion-related validity at EndK with WJ Readiness Cluster: 0.54. Predictive validity of EndK PSF with 1) MidG1 NWF: 0.62; 2) EndG1 WJ Total Reading Cluster: 0.68; 3) EndG1 CBM ORF: 0.62.

Nonsense Word Fluency (NWF) test
Developers: Good & Kaminski. NRP Big Idea: Alphabetic principle. Skill: Phonemic recoding of commonest letter-sounds.
Administration: Alternate forms: Benchmarks & 20 additional alternate forms. Task: Phonemic recoding of VC & CVC [aeiou] words in 60 secs. Test-points: MidK to BegG2. Benchmarks: MidG1: 30, 50.
Reliability & validity: One-month alternate-form reliability: 0.83. Concurrent criterion-related validity with WJ Readiness Cluster: EndG1: 0.36; MidG2: 0.59. Predictive validity of MidG1 NWF with 1) EndG1 CBM ORF: 0.82; 2) EndG2 NWF: 0.60; 3) EndG1 WJ Total Reading Cluster: 0.66.

Oral Reading Fluency (G1ORF; G2ORF; G3ORF) tests
Developers: Good, Kaminski & Dill (from work by Deno). NRP Big Idea: Fluency. Skill: Reading accuracy in authentic reading.
Administration: Alternate forms: Benchmarks & 20 additional alternate forms per grade-level from Grades 1-6. Task: Reading grade-level text for 60 secs. Test-points: MidG1 to EndG6. Benchmarks: For each test-point.
Reliability & validity: Adequacy of G1-6 CBM & DIBELS ORF: Alternate-form reliability of different same-grade texts: 0.89 to 0.94. Criterion-related validity (from 8 separate studies): 0.52 to 0.91. Test-retest reliabilities: 0.92 to 0.97.

[Beg: Beginning-Year; Mid: Mid-Year; End: End-Year; K: Kindergarten; G: Grade; NRP: National Reading Panel]

It can be seen from Table 3.1 that all four tests have multiple alternate forms, and established levels of validity and reliability, with reliability increasing when more than one form is used. Alternate-form reliability for the tests ranges from 0.83 to 0.94, and criterion-related validity levels range from 0.36 to 0.91 (Good & Kaminski, 2002a; Good, Wallin, et al., 2002). All DIBELS test constructs except letter naming are National Reading Panel (NRP) Big Ideas (NRP, 2000). (The areas of instruction considered by the panel to be Big Ideas of reading instruction are phonological awareness, phonics, fluency, vocabulary, and reading comprehension.)

All tests have benchmarks developed from DIBELS norms. DIBELS benchmarks consist of two cut-points for each test and test-point; e.g., the DIBELS benchmarks for Grade 1 Oral Reading Fluency (G1ORF) at the End-Grade-1 test-point are 20 and 40. The benchmarks delineate three tiers of a response-to-instruction framework, for broad instructional decision-making linked to students’ risk status:
• Children achieving below the lower benchmark have At-Risk status.
• Children achieving between the benchmarks have Some-Risk status.
• Children achieving at or above the upper benchmark have Low-Risk status.

Benchmarks closest to the mid-year test-points for the research cohort are indicated in the table. The DIBELS alternate forms for PSF, LNF, and NWF use stimuli of equivalent difficulty. The Oral Reading Fluency (ORF) tests are actually a series of tests, one for each USA grade-level from Grade 1 to Grade 6. In the current research, the Grade 1, Grade 2, and Grade 3 ORF tests (G1ORF, G2ORF, G3ORF) are used. The passages used within each grade-level are of equivalent difficulty, with those of each successive grade-level having successively higher difficulty levels.
The G1ORF passages have a Spache difficulty-level range of 2.0 to 2.3, the G2ORF passages range from 2.4 to 2.7, and the G3ORF passages range from 2.8 to 3.1 (Good & Kaminski, 2002b). The DIBELS tests can be used in multiple ways, from use as single-test-point tests, to use in ongoing integrated whole-school programs of assessment, planning and instruction.
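The two-cut-point decision rule described above, which allocates At-Risk, Some-Risk or Low-Risk status, can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the example scores are the writer’s own, with the End-Grade-1 G1ORF cut-points (20 and 40) taken from the text.

```python
def risk_status(score, lower, upper):
    """Assign DIBELS risk status from a score and a benchmark's two cut-points.

    Below the lower cut-point: At-Risk; from the lower cut-point up to
    (but not including) the upper cut-point: Some-Risk; at or above the
    upper cut-point: Low-Risk.
    """
    if score < lower:
        return "At-Risk"
    if score < upper:
        return "Some-Risk"
    return "Low-Risk"

# End-Grade-1 G1ORF cut-points are 20 and 40; the scores are invented.
print(risk_status(15, 20, 40))  # At-Risk
print(risk_status(30, 20, 40))  # Some-Risk
print(risk_status(45, 20, 40))  # Low-Risk
```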

DIBELS norms and test schedules

Separate system-wide percentiles have been developed for each test at each test-point at which the test is used. Two test-schedules are available, for testing at three test-points or four test-points per year. Most schools in the USA which use DIBELS use the 3-test-point DIBELS schedule, with testing conducted at the beginning, middle and end of the school year. The research reported here also used this schedule, thus the percentiles and benchmarks corresponding to the DIBELS 3-test-points-per-year schedule will be the ones discussed in this dissertation.

In this dissertation, USA year-levels are termed Grade (G) and Kindergarten (K), while Queensland year-levels are termed Year (Y). In discussion of test-points, grade-levels are used when referring to DIBELS norms, and year-levels are used when referring to the test-points used in the research. Test-points are termed Beginning-year (BegG, BegY), Mid-year (MidG and MidY, or Mg and My), and End-year (EndG and EndY, or Eg and Ey). Using this format, test-points in tables and figures are referred to as, e.g., BegG1 and MidY3, and where space is limited as, e.g., Bg1 and Ey3.

The percentiles were developed by Good, Kaminski and colleagues from the test-data entered by all districts, schools and children participating in the DIBELS data-processing service in the 2001-2002 school-year (Good, Wallin, et al., 2002). Subtests were taken at beginning-, mid- and end-year for Years K-3. Districts and children assessed at these test-points varied from 287 districts and 4460 children at End-Kindergarten to 83 districts with 2343 children at Beginning-Grade-3. DIBELS use in America has continued to burgeon, such that the system currently processes data for over 3,000,000 children.

DIBELS benchmarks for each test have been confirmed through longitudinal monitoring of children across reading-accuracy development (Good, Simmons, Kame'enui, Kaminski, & Wallin, 2002), and through comparing DIBELS achievement with achievement levels on independent tests of reading and academic achievement (Buck & Torgesen, 2003; Shaw & Shaw, 2002). The researcher has developed norm-lines for different percentiles of achievers from the available tables of DIBELS percentiles (Good, Wallin, et al., 2002). These are included in Appendix B.

Evidence of floor and ceiling effects

As discussed in Chapter 2, there is currently criticism of reading-accuracy tests as being invalid due to floor effects (many cases at lowest-level scores) and ceiling effects (many cases at highest-level scores) (Paris, 2005a, 2005b). Particular consideration of floor and ceiling effects is thus included in this research. Scrutiny of the DIBELS norm-lines in Appendix B shows no evidence of ceiling effects in any tests except perhaps PSF and LNF, and indicators of mild floor effects at the first test-point for NWF, PSF, LNF, and G1ORF. These will be considered in the analysis of the research data in Chapter 5.

The DIBELS predictive model of reading-accuracy development

DIBELS tests can be used as part of the DIBELS predictive model of reading-accuracy development, which is built from principles established in the research on prediction of reading-accuracy achievement. These principles include:
• Monitoring progress across test-points rather than achievement at a single test-point (Fuchs & Deno, 1991, 1994; Kame’enui, 2002).
• Minimising the influence of inter-child and home-background factors through focussing on response to instruction in addition to achievement at individual test-points (Compton et al., 2006).
• Assessing different predictors at different stages of reading-accuracy development (Fletcher et al., 2002; Roth, Speece, & Cooper, 2002). These include:
o Kindergarten predictors: early phonological awareness and letter knowledge (Catts & Hogan, 2003).
o Early Grade 1 predictors: advanced phonological awareness, and phonemic recoding.
o Mid-Grade-1 to End-Grade-6: reading of connected text (Catts, Gillispie, Leonard, Kail, & Miller, 2002; O’Shaughnessy & Swanson, 2000).
• The predictors used at each test-point being the strongest predictors for that stage of development (Roth et al., 2002).
• Each predictor being curriculum-based, i.e., a teachable skill of reading-accuracy development (Compton et al., 2006; Fuchs & Deno, 1991, 1994).

The DIBELS predictive model uses tests strategically positioned at different periods of reading-accuracy development, as shown in Figure 3.1.

[Figure 3.1 The predictive model sequence of tests: Initial Sound Fluency (ISF) → Phoneme Segmentation Fluency (PSF) → Nonsense Word Fluency (NWF) → Oral Reading Fluency (ORF), with Letter Naming Fluency (LNF) also predicting NWF]

It can be seen from Figure 3.1 that:
• Phonological-awareness tests are used sequentially, with Initial Sound Fluency (ISF) predicting Phoneme Segmentation Fluency (PSF). Only PSF is used in the current research, as ISF is not used after the Mid-Kindergarten test-point.
• PSF and Letter Naming Fluency (LNF) both predict success-levels for phonological recoding of commonest letter-sounds (NWF), but PSF and LNF do not predict success on each other.

Table 3.2 shows the USA time-range and test-points at which each of the DIBELS reading-accuracy tests is used, and the benchmarks for each test and test-point.

Table 3.2 DIBELS test-points and benchmarks

Phoneme Segmentation Fluency (PSF)
Test-points: MidK to EndG1. Benchmarks: MidK: 7, 18; EndK: 10, 35; BegG1: 10, 35; MidG1: 10, 35; EndG1: 10, 35.

Letter Naming Fluency (LNF)
Test-points: BegK to BegG1. Benchmarks: BegK: 2, 8; MidK: 15, 27; EndK: 29, 40; BegG1: 25, 37.

Nonsense Word Fluency (NWF)
Test-points: MidK to BegG2. Benchmarks: MidK: 5, 13; EndK: 15, 25; BegG1: 13, 24; MidG1: 30, 50; EndG1: 30, 50; BegG2: 30, 50.

Oral Reading Fluency (G1ORF, G2ORF, G3ORF)
Test-points: MidG1 to EndG3. Benchmarks: MidG1: 8, 20; EndG1: 20, 40; BegG2: 26, 44; MidG2: 52, 68; EndG2: 70, 90; BegG3: 53, 77; MidG3: 67, 92; EndG3: 80, 110.

[Beg: Beginning-year; Mid: Mid-year; End: End-year; K: Kindergarten; G: Grade]

It can be seen from Table 3.2 that DIBELS benchmarks are comprised of two numerals, these being the upper and lower benchmark cut-points used for assigning risk status, as discussed above. It is also seen from Table 3.2 that the tests have different numbers of test-points, and that benchmarks rise across most, but not all, test-points:
• PSF has 5 test-points and is monitored from Mid-Kindergarten to End-Grade-1. Benchmark values rise from Mid- to End-Kindergarten, then the same benchmarks (10, 35) are used from End-Kindergarten to End-Grade-1.
• ORF is tested from Mid-Grade-1 to End-Grade-3 (for this research, and to End-Grade-6 for general DIBELS use). The increased level of difficulty of ORF materials for each grade-level results in Beginning-year benchmarks not being higher than End-year benchmarks of the previous year: Beginning-Grade-2 benchmarks are very similar to those of End-Grade-1. In like manner, Beginning-Grade-3 benchmarks are lower than those of End-Grade-2.

Benchmarks for each test-point allow children to be allocated to one of three risk-levels: At-Risk (AR), Some-Risk (SR), and Low-Risk (LR). These risk-levels predict children’s likelihood of success at the next test-point, unless instructional changes are made between test-points:
• At-Risk (AR) children
o Are at high risk (≥90%) of continuing low progress, and thus of failing to achieve Low-Risk status on subsequent DIBELS benchmarks and reading success on independent measures.
o Have instructional needs for
 Intensive intervention in addition to normal classroom instruction.
 Very frequent assessment (weekly or fortnightly) to monitor effectiveness of this intervention.
• Some-Risk (SR) children
o Have some likelihood (50%) of continuing low progress, and thus of failing to achieve Low-Risk status on subsequent DIBELS benchmarks and reading success on independent measures.
o Have instructional needs for
 Strategic (mild) additional intervention and/or changes in classroom instruction to ensure reaching the next benchmark.
 More frequent assessment (perhaps monthly) to ensure progress rate is healthy.
• Low-Risk (LR) children
o Have a high likelihood of success (and ≤10% likelihood of lack of success) on subsequent DIBELS benchmarks and independent measures.
o Have instructional needs for
 Continuation of current classroom instruction.
 Assessment 3 (or 4) times per year to monitor continued healthy progress and reaching of DIBELS benchmarks.

The DIBELS risk-categories match the three tiers of a Response-to-instruction framework (O'Connor et al., 2005). The DIBELS benchmarks equate to the data-driven decision-rules considered essential for successful use of test data to improve reading-accuracy instruction and achievement (Stecker et al., 2005). The test-points shown in Table 3.2, above, are used with all readers (At-Risk, Some-Risk, and Low-Risk). Students with Some-Risk or At-Risk status are tested using this schedule and also individual test schedules involving earlier DIBELS tests for which they have not achieved Low-Risk status. In many instances, Some-Risk students are tested monthly rather than 3 times per year, and At-Risk students are tested fortnightly or weekly (Good, Simmons, et al., 2002).
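The testing frequencies just described can be summarised as a simple lookup. This is a sketch of the decision rules named in the text, not part of the DIBELS materials; the dictionary and function names are the writer’s own.

```python
# Typical monitoring frequency for each DIBELS risk status, as described
# in the text (Good, Simmons, et al., 2002). Names here are illustrative.
MONITORING_FREQUENCY = {
    "Low-Risk":  "benchmark testing 3 (or 4) times per year",
    "Some-Risk": "monthly progress monitoring, plus benchmark testing",
    "At-Risk":   "fortnightly or weekly progress monitoring, plus benchmark testing",
}

def monitoring_plan(status):
    """Return the monitoring frequency described for a given risk status."""
    return MONITORING_FREQUENCY[status]

print(monitoring_plan("Some-Risk"))
```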

Administration of DIBELS tests

DIBELS tests can be administered by teachers, other professionals and teacher aides who have received training in administering and scoring the tests. Three sets of DIBELS Benchmark Tests, with individual response booklets for each child, are available for the three test-points (beginning-year, mid-year, and end-year) for each USA grade-level from Kindergarten to Grade 6. There is no prescribed order in which the tests are presented, though it is likely most testers would use the order in which the tests are printed in the response booklet. Each test has its own standard instructions and scoring procedures. In the USA, it is common for teachers to do one to two full days of training prior to using DIBELS tests, with further training sessions available focussed on building expertise in interpreting test-data and reports generated by the DIBELS online data-processing system.

Each test has a basal criterion, such that testing is discontinued if the child does not succeed on a specified number of items at the start of the test. No ceiling criteria are used, and children continue until testing (for ISF) or the one-minute test-period is completed.

DIBELS usage in schools and research

There is considerable evidence of DIBELS tests being used in:
• School monitoring of reading-accuracy development, and response to instruction (e.g., Carlisle et al., 2004; Maryland State Department of Education, 2006). In early 2005, the DIBELS data system was processing the data of over 3,000,000 children from USA schools which had chosen to use DIBELS.
• Experimental and applied research studies using DIBELS ORF as a sole test or one of several reading-accuracy tests (e.g., Chambers, Cheung, Madden, & Slavin, 2004; Chard et al., 2001; Vaughn, Mathes, Linan-Thompson, & Francis, 2005).

All of these studies’ findings reinforce the efficacy and rigour of DIBELS tests.

Information from DIBELS test data

This section briefly considers the characteristics of the achievement and diagnostic data which DIBELS tests can generate.

DIBELS achievement data

DIBELS achievement data is established as being useful for achieving multiple assessment purposes (Coyne & Harn, 2006; Fuchs & Deno, 1994; Kame'enui, 2002). These include:
• Screening children’s achievement on different reading-accuracy skills.
• Monitoring the achievement and progress of individual children and groups of children, from single classes, to schools, districts, and potentially nations.
• Appraising instructional effectiveness at these different levels, and making instructional changes.
• Allocating instructional resources, through using DIBELS achievement data in response-to-intervention frameworks.

DIBELS diagnostic data

Student response-forms for DIBELS tests enable teacher consideration of children’s successes and errors on each test-item. This provides qualitative diagnostic information additional to the general diagnostic information provided by DIBELS achievement data. As regards specific diagnostic data guiding next steps of instruction, there are major differences between the earlier DIBELS tests (LNF, PSF, NWF) and the DIBELS ORF tests.

Sampling intensity refers to the number of opportunities which a student has to demonstrate skill on the construct/s being assessed. The three earlier DIBELS tests use high sampling intensity focused on specific areas of instructional content. These tests thus generate clear, unambiguous diagnostic information on children’s instructional needs for next steps of instruction.

DIBELS ORF achievement data provides highly specific information on children’s current fluency levels. It is possible that the diagnostic information on reading-accuracy instructional needs which ORF tests generate may be limited and open to multiple interpretations. This is due to authentic reading being built from language comprehension and reading accuracy. When both language comprehension and reading accuracy are involved in the reading process, it is more difficult to establish whether successes and errors are due to language comprehension, reading accuracy, or proportions thereof.

Evaluation of phases of reading-accuracy development

The Reading-accuracy development model used in this research has six interacting phases, as presented in Figure 1.1 in Chapter 1 (The Reading-Accuracy Development model, p.1: 6). Chapter 2 established relevant assessable reading-accuracy skills for each of the six phases. DIBELS tests provide information on skills developed during reading-accuracy development. The use of specific DIBELS tests for assessing each of the six phases is briefly discussed here.

Phonological awareness
This phase is assessed by the DIBELS Phoneme Segmentation Fluency (PSF) test. PSF provides clear unambiguous information on phoneme segmentation, i.e., segmenting the phonemes of individual syllables. Phoneme blending (blending a list of phonemes to make a syllable), not phoneme segmentation, is the critical phonemic awareness skill for reading accuracy (Liberman et al., 1989). Phoneme segmentation is critical for spelling, however, and in addition is a more advanced skill than phoneme blending (Bryant et al., 1990). As such, children who achieve Low-Risk status on the PSF test are likely to have also mastered phoneme blending. Phoneme segmentation is additionally preferable to phoneme blending as a screening test, because segmenting takes longer to develop than phoneme blending (Bryant et al., 1990). A test of phoneme blending may thus be inefficient for the full range of achievers, due to floor and ceiling effects.

With phoneme segmentation not being a critical phonemic awareness skill for reading accuracy, however, it is important to consider the role of best-practice reading-accuracy tests in providing clear unambiguous information on children’s instructional needs for the next steps of instruction. From this perspective, it is possible that Australian teachers would benefit from the availability of a rapid-use test of phoneme blending to be used with children achieving At-Risk status on PSF. Future research to establish appropriate CBM tests of phoneme blending for Australian use is thus warranted.

Whole-word and irregular-word reading
DIBELS tests do not assess this phase. Some general indicators of skill in reading irregular words can be gathered from DIBELS Oral Reading Fluency (ORF) tests. The words read by the research cohort in Oral Reading Fluency (ORF) passages will be considered as to the whole-words and irregular words which have been read.

Orthographic knowledge
The first part of the orthographic-knowledge phase is assessed by the DIBELS Letter Naming Fluency (LNF) test. DIBELS tests do not assess later orthographic knowledge.

LNF provides clear information on children’s skills with naming letters. Letter-sounding is the critical skill of phonological and phonemic recoding, however. As such, providing diagnostic information only on letter-naming is potentially ambiguous, and could lead to instruction being focussed on letter-names without attention to letter-sounds. For this reason, the current research includes testing of letter-sound knowledge. It does this through repeating the DIBELS test of letter-naming, Letter Naming Fluency (LNF), with children asked to say letter-sounds instead of letter-names. This adapted test is called Letter-Sounding Fluency (LSF). LSF and LNF results will be compared to establish whether LSF might be a useful supplement or possible alternative to LNF. The current research is a preliminary investigation of this test. If results indicate potential usefulness, there would be needs to establish test reliability and validity.

Phonemic recoding
This phase is assessed by the DIBELS Nonsense Word Fluency (NWF) test. NWF provides clear unambiguous information on children’s skills with phonemic recoding of commonest letter-sounds, which is a phase of reading-accuracy development in this research, and is a critical skill of reading-accuracy development (Liberman et al., 1989). With high sampling intensity focussed on phonemic recoding and orthographic knowledge of commonest letter-sounds, NWF is likely to provide clear unambiguous information on children’s instructional needs for the next steps of instruction. As such, it may be a best-practice test of this area.

Phonological recoding
DIBELS tests do not specifically assess this phase. Some indicators of phonological recoding skill can be gathered from DIBELS Oral Reading Fluency (ORF) tests. The words read by the research cohort in Oral Reading Fluency (ORF) passages will be considered as to the orthographic units they contain, and the likelihood that they would be read using phonological recoding.
Reading accuracy within authentic reading
This phase is assessed by the DIBELS Oral Reading Fluency (ORF) tests. These tests provide clear information on fluency levels. Where analysis of successes and errors shows low ORF scores and few errors, needs for fluency building through practising reading more quickly may be addressed. Where errors are present, the information on next steps of instruction may be relatively ambiguous, due to errors being from language-comprehension weakness, reading-accuracy weakness, or an interaction of these areas. As such, ORF is not able to sufficiently address the phases of whole-word and irregular-word reading, phonological recoding beyond phonemic recoding of commonest letter-sounds, and orthographic knowledge beyond commonest letter-sounds.

Whilst the foregoing indicates the DIBELS tests cover most of the phases of the Reading-accuracy development model, a possible limitation of DIBELS is that not all phases are specifically addressed by DIBELS tests. Research Question 4 of this research is focussed on reading-accuracy development and children’s needs for instruction to develop the phases of the Reading-Accuracy Development model. There are four phases which are specifically addressed by DIBELS tests: Phonological awareness, Orthographic knowledge (letter-knowledge), Phonemic recoding, and Reading accuracy in authentic reading. DIBELS tests of these phases use high sampling intensity. They thus generate both rigorous achievement data and specific diagnostic data to indicate children’s skills for the phase they assess.

It is possible that the words which children read in ORF passages may sufficiently sample the other phases of reading-accuracy development. Table 3.3 shows the numbers of words read by readers at different levels of achievement (Good, Wallin, et al., 2002).


Table 3.3 Numbers of words read at different levels of achievement (percentile: words read)

Percentile:    5    10   20   30   40   50   60   70   80   90   95

G1ORF
MidG1:         3    6    11   16   21   27   34   44   59   83   102
EndG1:         11   17   26   35   45   54   65   77   92   112  129

G2ORF
BegG2:         9    15   26   35   44   55   66   77   90   108  125
MidG2:         16   26   42   56   68   80   90   100  113  133  149
EndG2:         31   48   68   80   91   100  109  119  133  151  166

G3ORF
BegG3:         20   34   53   66   77   87   97   108  121  138  153
MidG3:         29   47   67   82   92   101  110  122  134  151  165
EndG3:         46   66   86   99   110  120  129  139  151  170  183

[Beg: Beginning-year; Mid: Mid-year; End: End-year; G: Grade; ORF: Oral Reading Fluency]

The numbers of words read by children at different deciles of achievement were developed from the DIBELS percentile tables (Good, Wallin, et al., 2002). It can be seen from Table 3.3 that considerable numbers of words are read, even by readers at low levels of achievement. Readers at the 20° correctly read 11 to 26 words in Grade 1, 26 to 68 words in Grade 2, and 53 to 86 words in Grade 3. With test-scores calculated by subtracting the number of errors made from the total number of words read, it is evident that there will also be error data to consider, in addition to children's successes with the numbers of words shown in Table 3.3. As such, it is possible that ORF tests will be sufficient tests of the two phases not specifically addressed by DIBELS tests: Whole-word and irregular-word reading, and Phonological recoding. Towards consideration of these two phases of reading-accuracy development, as part of data analysis for this research, the words read in Oral Reading Fluency (ORF) by each year-level cohort will be analysed as to the types of words and orthographic units which are addressed.
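The percentile°words pairs of Table 3.3 amount to a simple lookup. As an illustrative sketch only (using the Mid-Grade-1 pairs from the table; the function name and approach are the present writer's, not part of DIBELS materials), a words-correct count can be converted to the highest tabled percentile the reader has reached:

```python
# Percentile°words pairs for G1ORF at Mid-Grade-1, from Table 3.3:
# e.g. (50, 27) means a reader at the 50th percentile reads
# 27 words correctly at mid-Grade-1.
MID_G1_NORMS = [(5, 3), (10, 6), (20, 11), (30, 16), (40, 21),
                (50, 27), (60, 34), (70, 44), (80, 59), (90, 83), (95, 102)]

def approximate_percentile(words_correct, norms=MID_G1_NORMS):
    """Return the highest tabled percentile whose words-correct
    criterion the reader has reached, or None if below the 5th."""
    reached = [pct for pct, words in norms if words_correct >= words]
    return reached[-1] if reached else None
```

For example, a mid-Grade-1 reader with 27 words correct sits at the tabled 50th percentile, while a reader with 2 words correct falls below the lowest tabled percentile.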

The use of DIBELS with Queensland children
Aspects needing consideration in using DIBELS in an Australian context include Queensland-USA differences in vocabulary, and curriculum differences in the early years of schooling. These aspects are considered in the following sections.

Test vocabulary
The DIBELS materials use USA spelling and vocabulary (single words and phrases), thus Australian-USA vocabulary differences needed to be considered. To minimise vocabulary issues, the researcher analysed all wordlists and reading passages in K-G3 Benchmark Tests and alternate-form Progress Monitoring materials, and created a list of possible mismatches. This list was then forwarded to DIBELS developer and researcher, Ruth Kaminski (Good & Kaminski, 2002a, 2002b), and discussions on vocabulary differences were held by phone and email.


Kaminski commented that many of the queried terms had also been queried by individual American states, such that these concerns were issues of vocabulary complexity rather than cultural difference, e.g., use of cub and insect rather than the more familiar terms bear and beetle as picture names in the DIBELS ISF test. It was decided that no changes would be made in such instances. Where cultural differences were present, e.g. ISF's use of dime, changes were made using options least likely to reduce the applicability of current DIBELS norms. In some cases single items were changed, while in others, the Benchmark passage for that test-point was omitted and replaced by one of the 20 alternate-form Progress Monitoring measures available for that test at that year-level. Given that Australian children are exposed to large amounts of American television and literature, vocabulary differences where American words were likely to be familiar to Queensland children, e.g. cookie (biscuit), were not changed. Australian spelling was used for all passages, e.g. colour/color, favourite/favorite.

USA-Queensland curriculum differences
Quite major curriculum and schooling differences exist between Queensland and the USA. Queensland, the USA and other Australian states have similar school-commencement ages, with all children starting school at approximately 5 years of age. The USA and NSW (Australia) call the first year Kindergarten, while Queensland terms it Preschool. At the time of this research's implementation, Queensland and the USA had similar attendance requirements in the first school year, both using a 2.5-days-per-week model, with pupils attending either half-days every day or 5 full days per fortnight for the full school year. In this regard, they both differ from New South Wales and Victoria, where attendance is fulltime. In 2007, Queensland changes from Preschool to Prep, wherein the age of school-commencement will rise by 6 months, attendance will be fulltime, and the adult:child ratio for instruction will rise from the current 1:12.5 to 1:25.

Queensland differs from both the USA and some Australian states in the curriculum content of the first year of schooling. Whereas New South Wales, Victorian and USA children commence formal reading-accuracy instruction in their first year of schooling, this does not start until Queensland children's second year of schooling, as detailed in Table 3.4.

Table 3.4 Age and curriculum differences between USA and Australia

                  Age at start   Age at start of       Reading instruction content
                  of school      reading instruction   in first year at school
USA               5 yrs          5 yrs                 Formal teaching of letter-names & sounds & alphabetic principle
Qld               5 yrs          6 yrs                 Much literature read to children; no formal teaching
NSW & Vic         5 yrs          5 yrs                 Formal teaching of letter-names & sounds & alphabetic principle

As shown in Table 3.4, Queensland children have an equivalent length of schooling to children in one USA grade-level, but an equivalent length of formal reading-accuracy instruction to children in the grade-level below. In this dissertation, in order to support easy distinction of potentially confusable concepts, these two equivalents are termed schooling-equivalents and instruction-equivalents. The terms Grade and grade-level are used to refer to USA schooling levels, while the terms Year and year-level refer to Queensland schooling levels, as detailed in Table 3.5.

Copyright 2007 Susan Galletly PhD Thesis


Table 3.5 Equivalence of Queensland year-levels and American grade-levels

Queensland year-level     USA schooling-equivalent       USA instruction-equivalent
                          (same length of schooling)     (same length of formal instruction)
Queensland Year 1 (Y1)    USA Grade 1 (G1)               USA Kindergarten (K)
Queensland Year 2 (Y2)    USA Grade 2 (G2)               USA Grade 1
Queensland Year 3 (Y3)    USA Grade 3 (G3)               USA Grade 2

For the current research, it was decided to use each DIBELS test with children in both its Queensland schooling-equivalent and instruction-equivalent. Table 3.6, below, shows the DIBELS measures used at an extended range of test-points compared to USA test-points.

Table 3.6 Qld-USA schooling- and instruction-equivalent test-points

DIBELS tests                      MidK   EndK   MidG1        EndG1        MidG2        EndG2        MidG3   EndG3
Phoneme Segment. Fl. (PSF)        Y1&&   Y1&&   Y1###        Y1###
Letter-Sounding Fl. (LSF)         Y1&&   Y1&&   Y1###        Y1###
Nonsense Word Fl. (NWF)           Y1&&   Y1&&   Y1###        Y1###
G1 Oral Reading Fl. (G1ORF)       Y1&&   Y1&&   Y1### Y2&&   Y1### Y2&&
G2 Oral Reading Fl. (G2ORF)                     Y2&&         Y2&&         Y2### Y3&&   Y2### Y3&&
G3 Oral Reading Fl. (G3ORF)                                               Y3&&         Y3&&         Y3###   Y3###

[Mid: Mid-year; End: End-year; K: Kindergarten; G: Grade; ###: schooling-equivalent test-point; &&: instruction-equivalent test-point]

Table 3.6 shows the DIBELS tests along with the test-points used in the current research. Hatch marks (###) indicate the research cohorts' schooling-equivalent test-points, and ampersands (&&) their instruction-equivalent test-points. Consideration of this table shows the Queensland children are being tested with more DIBELS tests than is standard practice:
• Consideration of columns shows that at each mid-year and end-year test-point, all year-level cohorts are tested with one more test than is used with USA children.
• Consideration of the hatch marks and ampersands, representing schooling- and instruction-equivalents respectively, shows the research cohort being tested with tests from two USA test-points (Year 1: Mid-Kindergarten and Mid-Grade-1; Year 2: Mid-Grade-1 and Mid-Grade-2; Year 3: Mid-Grade-2 and Mid-Grade-3).

Curriculum-mismatch issues meant the USA DIBELS test-points were not likely to match Queensland test-points. There was no simple formula for deciding which USA test-point norms the Queensland year-levels should be compared against, as the test-points are not evenly spread within each year-level. Given that the USA children have received an additional year's reading-accuracy instruction, it is possible that the Queensland children will not be as advanced as their USA schooling-equivalent grade-level. Equally, given that Queensland children are older than USA children when they start formal reading-accuracy instruction, and have had a year focussed


on language enrichment including much exposure to children's literature which has been read to them, it is likely they will not score as low as children in their USA instruction-equivalent grade-level.

It can be seen in Table 3.6, above, that in many instances it is not possible to compare the cohort results with norms for both their schooling- and instruction-equivalents. This is due to the narrow range of USA test-points for which the norms have been developed. To accommodate the lack of norms and USA test-point equivalents for many of the measures, and the non-comparability of ORF norms in different grade-levels, it was decided to
• Use schooling-equivalent norms as the norms of primary interest, and instruction-equivalent norms as an indicator of possible low achievement.
• Use norms from the closest available USA test-point to schooling- and instruction-equivalents.
• Use G2ORF and G3ORF in two year-levels, for both schooling- and instruction-equivalents.

In standard DIBELS administration, the median of 3 ORF scores is used as the test-point raw score, thus use of G2ORF and G3ORF at two year-levels meant that Year 2 and 3 children would read 6 ORF passages at each test-point. To reduce the number of tests conducted with each child, it was decided to use 2 rather than 3 grade-level ORF passages at each test-point, and use the average of these 2 test-scores as the grade-level ORF raw score. Use of the average of 2 test-scores, rather than the median of 3 test-scores, has precedent elsewhere (Roberts, Good, & Corcoran, 2005).
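The two scoring rules just described (standard DIBELS median-of-3, and this research's average-of-2) can be sketched as follows; the function name and structure are illustrative, not part of DIBELS materials:

```python
from statistics import mean, median

def orf_raw_score(passage_scores, method="median3"):
    """Test-point ORF raw score.
    'median3': standard DIBELS practice - median of 3 passage scores.
    'mean2': this research's adaptation - average of 2 passage scores
    (following Roberts, Good, & Corcoran, 2005)."""
    if method == "median3":
        assert len(passage_scores) == 3
        return median(passage_scores)
    if method == "mean2":
        assert len(passage_scores) == 2
        return mean(passage_scores)
    raise ValueError(method)
```

Under the standard rule a child reading 40, 52 and 47 words correct scores 47; under the research's rule a child reading 40 and 50 words correct scores 45.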

Non-comparability of DIBELS and standardised tests
The norms of standardised tests are indicators of representative performance, i.e., the norms are representative of the population of readers for whom the test is developed. In contrast, DIBELS benchmarks are not representative of national achievement, but of likelihood of future success on DIBELS and other reading measures. This lack of comparability of DIBELS norms with norms of standardised tests reduces the possibilities of developing rigorous Australian benchmarks for DIBELS tests using cross-sectional research designs. Achieving this purpose would require future longitudinal research using DIBELS tests at consecutive test-points across several years, establishing the relationship of children's DIBELS scores to later achievement on reading comprehension and authentic reading measures. The current research is thus preliminary research. It is considered that its findings will provide a basis for decision-making as to future longitudinal research.

3.4

The TOWRE tests

This section details the characteristics of the TOWRE rapid-use reading-accuracy tests.

Origin and availability
The TOWRE was developed by American reading-accuracy researchers Torgesen, Wagner, and Rashotte (1999). It is published by PRO-ED, and is available from PRO-ED distributors. At the current time, a second edition of TOWRE is under development. Whereas the first edition used in the current research has two alternate forms, the second edition is intended to have four alternate forms.


Description of the TOWRE tests
The TOWRE tests measure efficiency of word reading, i.e. the interaction of the speed and accuracy aspects of reading accuracy. There are 2 tests, each of which has 2 alternate forms of equivalent difficulty:
• Sight Word Efficiency (SWE), which measures efficiency of real-word reading using a single list of 104 words of 1 to 4 syllables.
• Phonemic Decoding Efficiency (PDE), which measures efficiency of pseudoword reading using a single list of 63 pseudowords of 1 to 3 syllables.

SWE and PDE use highly similar stimulus materials and administration procedures. Stimulus materials consist of a single A4-sized card containing all the words for that test in columns. This same one-page stimulus list is used for all readers, regardless of the age and skills of the child being tested.

The TOWRE tests sample different aspects of orthographic knowledge. The words used include regular and irregular words, and common and less common vowels, vowel digraphs and orthographic units. The words from both forms of SWE and PDE (Form A and Form B) are shown in Appendix B. For SWE, words have been selected according to frequency of occurrence in printed text from beginning elementary school level, as well as length, and complexity and number of syllables. High-frequency words are positioned at the beginning of the word list. For PDE, a broad range of grapheme-phoneme correspondences is used, with difficulty systematically raised by increasing the number of phonemes, and the complexity and number of syllables. Both forms begin with at least 6 two-phoneme words (CV, VC), then move into at least 6 CVC words, before moving onto words with consonant blends, vowel digraphs, and multisyllabic words.
The TOWRE tests have high reliability (including overall, internal-consistency and stability-over-time reliability), with high coefficients (≥0.94) indicating low test error, such that users can be confident in test results (Torgesen et al., 1999). The tests have strong concurrent validity with other widely used measures of word-level reading skills, including correlations at grade-levels from Grade 1 to Grade 5 of 0.85-0.94 for PDE and the Word Attack test of the Woodcock Reading Mastery Tests-Revised (WRMT-R), and 0.89 for SWE and the Word Identification test of WRMT-R (Woodcock, 1987). The predictive validity of TOWRE tests for complex measures of achievement has been established through multiple studies (Torgesen et al., 1997, 1999), with
• SWE being more strongly related than PDE to later passage reading fluency and accuracy, and reading comprehension.
• SWE being more strongly related than the untimed WRMT-R Word Identification test (Woodcock, 1987) to passage reading rate.
• PDE being more strongly related than the untimed WRMT-R Word Attack test to passage reading accuracy and comprehension.

Timing and frequency of use of TOWRE are left to the discretion of the test user. TOWRE offers multiple data options for reporting and considering test results, including PDE and SWE raw scores, standard scores, percentiles, reading ages, and reading grade-levels. A Total Word Reading Efficiency standard score, developed from SWE and PDE standard scores, is recommended as the test's most reliable score. Using PDE, SWE or Total Word Reading Efficiency standard scores, students can be allocated to one of seven achievement categories, as shown in Table 3.7. The Total Word Reading Efficiency data option is not used in the current research, which is focussed on results on individual reading-accuracy tests rather than test-sets. As such, data analysis is focussed on separate SWE and PDE results, and not on combined measures.
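Concurrent-validity coefficients of the kind quoted above are Pearson correlations between two tests' scores for the same children. As an illustrative sketch only, with hypothetical scores for six children:

```python
def pearson_r(xs, ys):
    """Pearson product-moment correlation of two paired score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical paired raw scores for the same six children on two tests.
towre_pde = [12, 30, 25, 48, 8, 40]
wrmt_word_attack = [10, 28, 27, 45, 6, 37]
r = pearson_r(towre_pde, wrmt_word_attack)
```

A coefficient near 1, as in the published 0.85-0.94 range, indicates children are ranked very similarly by the two tests.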


Table 3.7 Standard Score (SS) ranges used to delineate TOWRE achievement categories

SS range   Achievement level   % of normal distribution
131-165    Very superior       2.34
121-130    Superior            6.87
111-120    Above average       16.12
90-110     Average             49.51
80-89      Below average       16.12
70-79      Poor                6.87
35-69      Very poor           2.34

It is seen from Table 3.7 that in a normally distributed sample, approximately 50% of readers are in the Average achievement category, with standard scores between 90 and 110, and that there are three categories for higher achievers and three for lower achievers. The authors recommend that students achieving below the 30° on both tests, or on PDE alone, receive additional reading support. This support is recommended even when only PDE is low, given that children who have memorised sight words but not mastered phonological recoding may achieve at average levels on SWE. The difference between SWE and PDE standard scores can be evaluated as significant by reference to a differences table. TOWRE does not specify any schedule of testing.
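The Table 3.7 bands lend themselves to a simple lookup; a minimal sketch (the function name is illustrative, not part of TOWRE materials):

```python
# Standard-score bands from Table 3.7 (Torgesen et al., 1999),
# listed from highest band-floor downwards.
BANDS = [(131, "Very superior"), (121, "Superior"), (111, "Above average"),
         (90, "Average"), (80, "Below average"), (70, "Poor"), (35, "Very poor")]

def towre_category(standard_score):
    """Achievement category for a TOWRE standard score (35-165)."""
    for floor, label in BANDS:
        if standard_score >= floor:
            return label
    raise ValueError("standard score below the test's range (35-165)")
```

For example, a standard score of 100 falls in the Average category, and 85 in Below average.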

TOWRE norms
The TOWRE tests are normed for readers aged 6 to 25 years. These norms were developed from testing of 1,507 persons in 30 USA states during Fall 1997 and Spring 1998 (Torgesen et al., 1999). The age groups used in the TOWRE norms are shown in Table 3.8, along with the USA grade-levels and Queensland year-levels to which the norms apply.

Table 3.8 Ages and grades for TOWRE usage

Age-group norms   USA grade-level    Qld year-level
6-0 to 6-5        Mid-G1             Mid-Y1
6-6 to 6-11       End-G1             End-Y1
7-0 to 7-5        Mid-G2             Mid-Y2
7-6 to 7-11       End-G2             End-Y2
8-0 to 8-11       Mid-G3 & End-G3    Mid-Y3 & End-Y3

[Mid: Mid-year; End: End-year; G: Grade; Y: Year]

It can be seen from Table 3.8 that the TOWRE norms are schooling-equivalent norms for the research sample, with Mid-Grade-1 being equivalent to Mid-Year-1. In this dissertation, TOWRE norms will be referred to using their grade-level equivalent, e.g. the norms for ages 7-6 to 7-11 will be termed the End-Grade-2 (EG2) norms.

The TOWRE manual provides age-norms and grade-norms for calculating reading ages, grade-levels and standard scores. These norms use 6-month intervals for readers aged 6 to 8 years, and 12-month intervals for readers aged 8 to 25 years. The researcher developed norm-lines applicable to the test-points used in the research for both SWE and PDE tests from the TOWRE norms. These are included in Appendix B.
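Norm-line development of this kind amounts to interpolating between the tabled age intervals. A hedged sketch follows, with hypothetical norm values standing in for the manual's tables (the actual norm-lines are those in Appendix B):

```python
def interpolated_norm(age_months, norm_points):
    """Linearly interpolate a norm value at an arbitrary age (in months)
    between tabled age-interval midpoints; clamp outside the table."""
    pts = sorted(norm_points)
    if age_months <= pts[0][0]:
        return pts[0][1]
    if age_months >= pts[-1][0]:
        return pts[-1][1]
    for (a0, v0), (a1, v1) in zip(pts, pts[1:]):
        if a0 <= age_months <= a1:
            return v0 + (v1 - v0) * (age_months - a0) / (a1 - a0)

# Hypothetical 50th-percentile raw scores at age-interval midpoints:
# 75 months is the midpoint of the 6-0 to 6-5 interval, and so on.
points = [(75, 20), (81, 30), (87, 40)]
```

Under these hypothetical values, a child aged 6 years 6 months (78 months) would be placed midway between the two adjacent tabled values.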

Evidence of floor and ceiling effects
There are floor effects evident in the norm-line for Mid-Grade-1, with approximately 25% of the norming sample achieving raw scores of 0. The test authors discuss the presence of floor effects at this stage, and indicate that TOWRE is not powerful for the full range of reading achievers until readers achieve a reading age of approximately 6.5 years. It is possible there may be ceiling effects of the type Paris discusses, wherein, in efficiency tests, results of fluent readers may vary less from reading skill than from efficiency of cognitive processing (Paris, 2005a, 2005b). Floor and ceiling effects will be considered in analysis of the research data.

Administration of TOWRE tests
The TOWRE tests are designed for use by teachers, psychologists and other professionals, as well as paraprofessionals trained to administer the tests. Using the test manual, raw scores can quickly be converted into one or more of the range of TOWRE data options, including standard scores (mean: 100; standard deviation: 15), percentiles, age equivalents, and grade equivalents.
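TOWRE standard scores and percentiles are read from the manual's norm tables, not calculated by formula; still, the relation between the standard-score scale (mean 100, SD 15) and percentiles can be illustrated under a normal-distribution assumption:

```python
from statistics import NormalDist

def standard_score_to_percentile(ss, mean=100.0, sd=15.0):
    """Approximate percentile for a standard score on a scale with
    mean 100 and SD 15, assuming a normal distribution."""
    return round(100 * NormalDist(mean, sd).cdf(ss), 1)
```

On this scale, a standard score of 100 corresponds to the 50th percentile, and 115 (one SD above the mean) to approximately the 84th.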

TOWRE usage in schools and research
Consideration of the literature shows many instances of use of TOWRE tests for school and research purposes. For school use, their stated purposes are monitoring growth, forming part of a battery of diagnostic tests, and replacing or supplementing other standard diagnostic tests of context-free word reading ability. TOWRE is listed in Reading First's list of reading-accuracy tests established as rigorous and appropriate for use in USA K-3 schools (Kame'enui, 2002). In these lists, it is included as a test for three of the four Reading First assessment purposes, namely screening, progress monitoring, and outcome assessment. It is not included in the list of tests for diagnostic purposes.

There are indications that TOWRE tests are considered efficacious and a preferred test among experimental researchers, particularly in studies involving testing of very large numbers of children. Some studies used both SWE and PDE tests while others used just SWE. In several instances, one or both TOWRE tests were the only reading-accuracy tests used. These studies span diverse areas of reading research, including genetics and the effectiveness of online tutor systems. TOWRE has been used in a large multinational longitudinal twin study exploring the interrelation of genetics and environment in early literacy (Byrne et al., 2006; Hawke, Wadsworth, & DeFries, 2005). It has additionally been explored in a study of non-conventional procedures for early literacy assessment (Dale, Harlaar, & Plomin, 2005). In that study, 5,544 children were assessed using TOWRE administered by phone, and results were compared with the children's teachers' assessments using UK National Curriculum criteria. It was established that phone administration of TOWRE tests was a valid test procedure, correlating strongly (r=0.69) with teacher assessments, and useful for the full range of achievers.
From the foregoing, it is deemed that the TOWRE tests are well established not only as screening tests of reading-accuracy achievement, but also as rigorous, efficacious research instruments.

Information from TOWRE data
This section discusses the achievement and diagnostic data which can be gathered from use of TOWRE tests. As detailed above, TOWRE tests do not purport to be diagnostic tests, and their stated purposes are screening and measuring of student reading-accuracy achievement levels. With student successes and errors marked on the test-form during testing, in the same manner as is done with CBM tests, it is evident that TOWRE does generate both achievement data and diagnostic data with potential for consideration by teachers in their instructional planning. As such, the tests can be appraised as to their usefulness not just from achievement perspectives but also from diagnostic perspectives.

TOWRE achievement data TOWRE achievement data is rigorous and useful for achieving multiple assessment purposes including screening, outcome measurement, and appraising instructional effectiveness (Kame’enui, 2002; Torgesen et al., 1999).

TOWRE diagnostic data
The words used in the TOWRE tests have potential to offer qualitative diagnostic data on three phases of reading-accuracy development:
• Whole-word and irregular-word reading.
• Phonemic recoding.
• Phonological recoding.
SWE items sample all three of these phases, while PDE items sample the latter two phases. The level of diagnostic information available from TOWRE tests is dependent on the words and orthographic units sampled in SWE and PDE tests. To evaluate the orthographic units and words sampled by the SWE and PDE tests, the words used as test items were considered as to their orthographic characteristics, using the orthographic categories developed for the research and detailed in Figure 2.4 in Chapter 2 (Orthographic categories for the research, p.2:33). Student reading age was used to group items for consideration, and the words read by readers with reading ages from 6.0 to 10.0 years were analysed. Tables showing the orthographic categories of items, proportions of items in each orthographic category, and the OUPCs sampled by each test, are included in Appendix C.

As discussed in Appendix C, it was found that the TOWRE tests offer considerable information on phonemic recoding, but insufficient information on the phonological recoding phase. This is due to
• Insufficient sampling of many categories of orthographic units.
• Forms A and B of SWE being dissimilar to each other from orthographic-category perspectives, such that they could not be used as alternate forms.
In addition, the information provided by TOWRE tests is much less specific than that provided by other word reading tests, e.g., DIBELS Nonsense Word Fluency (NWF). The TOWRE tests show floor effects for much of Grade 1, as discussed above and in Appendix C, and phonemic recoding occurs early in reading-accuracy development, thus is likely to be predominantly a Grade 1 skill.
It is therefore likely that this diagnostic information on phonemic recoding may not be useful for USA Grade 1 and Queensland Year 1 children (as stated by the test authors). As such, the TOWRE's diagnostic usefulness is limited for early-years children.
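Item analysis of the kind used for Appendix C reduces to tallying test items by category. A minimal sketch, with a hypothetical item-to-category mapping (the research's actual categories are those of Figure 2.4, and the real TOWRE items are those in Appendix B):

```python
from collections import Counter

# Hypothetical item-to-category mapping, for illustration only.
ITEM_CATEGORIES = {"sat": "CVC regular", "up": "VC regular",
                   "said": "irregular", "rain": "vowel digraph",
                   "strong": "consonant blend"}

def category_proportions(items):
    """Proportion of test items falling in each orthographic category."""
    counts = Counter(ITEM_CATEGORIES.get(w, "uncategorised") for w in items)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}
```

Low proportions in a category signal insufficient sampling of the corresponding orthographic units, as was found for several phonological-recoding categories.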

The use of the TOWRE tests with Queensland children
No adjustments are needed for use of the TOWRE tests in Queensland contexts. There are no test-administration issues to be resolved, as both tests are used at all year-levels, and each test uses the same test-form for all year-levels. There are also no vocabulary issues to be resolved, as


the words used in SWE are all words used in Australia.

3.5

Conclusions

The current research is focussed on data generated from testing children in Years 1 to 3 in an Australian context, using DIBELS and TOWRE tests. This section considers potential usefulness of DIBELS and TOWRE tests in Australian contexts.

Periods of applicability of the tests
Consideration of the USA grade-levels of test applicability supports use of DIBELS and TOWRE tests in the Australian context. Table 3.9 uses matching of USA grade-levels to Australian year-levels to indicate the way these tests might be used in Australia.

Table 3.9 Test foci and periods of use

Test-set   Test   Focus                       Period of use
DIBELS     PSF    Reading subskills           Early years (K to G/Y2)*
DIBELS     LNF    Reading subskills           Early years (K to G/Y2)*
DIBELS     NWF    Reading subskills           Early years (K to G/Y2)*
DIBELS     ORF    Authentic reading           Mid-G/Y1 to End-G/Y6*
TOWRE      SWE    Decontextualised reading    G/Y1# to adulthood
TOWRE      PDE    Decontextualised reading    G/Y1# to adulthood

[G: Grade; Y: Year; PSF: Phoneme Segmentation Fluency; LNF: Letter Naming Fluency; NWF: Nonsense Word Fluency; ORF: Oral Reading Fluency; SWE: Sight Word Efficiency; PDE: Phonemic Decoding Efficiency]
* Likely ceiling level for most readers
# TOWRE floor effects (in place until reading age 6.25)

It can be seen from Table 3.9 that
• Phoneme Segmentation Fluency (PSF), Letter Naming Fluency (LNF), and Nonsense Word Fluency (NWF) are used only in early-years instruction.
• Oral Reading Fluency (ORF), Sight Word Efficiency (SWE) and Phonemic Decoding Efficiency (PDE) have potential for use across many years of schooling. TOWRE is normed for secondary school years, and it is possible Grade 6 Oral Reading Fluency passages may be found applicable to early secondary school years.
From the foregoing, it can be seen that DIBELS and TOWRE tests have applicability to all primary school years, and TOWRE tests have applicability to all secondary school years. As such, they have potential for measuring achievement and monitoring progress in and across all school years.

The potential use and limitations of DIBELS tests
The discussion of this chapter has established DIBELS tests to be rigorous rapid-use tests of phonemic awareness, letter-knowledge, phonemic recoding, and reading accuracy within authentic reading. As such, they have potential for use in Australian research and school use in primary school. The DIBELS predictive model and use of benchmarks and risk categories also have potential for improving children's reading-accuracy achievement in Australia through best-practice use of DIBELS test data. This potential lies in possibilities for
• Use of DIBELS achievement data within response-to-instruction frameworks, using


DIBELS benchmarks and risk-categories as part of data-based decision rules.
• Use of DIBELS diagnostic data to guide teachers' decision-making as to children's specific instructional needs and next-steps of reading-accuracy instruction.

The DIBELS tests show usefulness for assessing key phases of reading-accuracy development, including phonological awareness, letter knowledge, phonemic recoding, and authentic reading. It is possible that the test-set is limited through not including tests focussed on letter-sounds, whole-word and irregular-word reading, and phonological recoding:
• Assessment of letter-sound knowledge is addressed in this study through a test of Letter Sounding Fluency (LSF).
• The two word-reading phases are addressed to a certain extent through analysis of the words read in Oral Reading Fluency (ORF) passages, as to the phases they are likely to build from.

The potential use and limitations of TOWRE tests
The discussion of this chapter has shown that the TOWRE tests are rigorous rapid-use tests of reading-accuracy achievement. They are normed for readers from ages 6.0 to 25.0 years, and offer multiple data options for expressing student achievement, including raw scores, reading ages, grade-levels, standard scores, and achievement categories. Their particularly rapid administration, and applicability from later Grade 1 to the end of secondary school, indicate their potential for two purposes. The first is potential as rigorous tests of reading-accuracy achievement for Australian research purposes. The second is potential as rigorous rapid-use tests for screening Australian readers across primary and secondary school.

From the perspectives of the current research, a limitation of the TOWRE tests is their being solely tests of achievement. While offering diagnostic information on phonemic recoding, the information they supply on phonological recoding is insufficient. As such, TOWRE tests should be considered more with regard to their role as rigorous rapid-use screening and achievement tests, and not as tests providing diagnostic information.

Needs for additional reading-accuracy tests
The foregoing sections have alluded to the limitations of both the DIBELS and TOWRE tests. Specifically, in the context of reading accuracy, there may be needs for establishing additional rapid-use tests for Australian use. These tests include
• Tests of letter-sounding, to be used in conjunction with or as an alternative to Letter Naming Fluency (LNF).
• Tests of phoneme-blending, to be used in conjunction with Phoneme Segmentation Fluency (PSF).
• Tests of whole-word and irregular-word reading.
• Tests of phonological recoding focussed on logical sets of OUPCs.

The use of DIBELS and TOWRE tests in this research
The foregoing critique has served as a basis for establishing which reading-accuracy tests from the suite of DIBELS and TOWRE tests are used in the research. This research uses the nine tests listed in Table 3.10.



Table 3.10 Tests used in the research

DIBELS tests: Letter Naming Fluency (LNF); Phoneme Segmentation Fluency (PSF); Nonsense Word Fluency (NWF); Oral Reading Fluency (G1ORF, G2ORF, G3ORF)
Adapted tests: Letter Sounding Fluency (LSF)
TOWRE tests: Sight Word Efficiency (SWE); Phonemic Decoding Efficiency (PDE)

As discussed at the start of this chapter, DIBELS and TOWRE were chosen for the research because of their rigour, rapid-use administration, and potential for use in Australian schools for monitoring reading-accuracy development and guiding instructional decision-making. It can be seen from Table 3.10 that the nine tests used in the research are the two TOWRE tests and the six DIBELS tests detailed in this chapter (LNF, PSF, NWF, and the three grade-level ORF tests), together with the additional researcher-adapted DIBELS test, Letter Sounding Fluency (LSF).

Concluding remarks
In line with the purposes listed at its commencement, this chapter has established the appropriateness of using the tests explored in the research. The important characteristics of the tests have been detailed, and the tests have been reviewed as to their appropriateness for monitoring reading-accuracy development and as tests of reading-accuracy achievement. All the DIBELS and TOWRE tests used in the current research are established as rigorous achievement tests. The DIBELS tests are additionally useful in providing diagnostic information on the aspects of reading-accuracy development on which they are focussed. The next chapter details the methodology of this research.



Chapter 4 Methodology

4.1

Introduction

Previous chapters have contextualised the research, and overviewed the tests used in the research. This chapter details the methodological aspects of the research. The first section of the chapter discusses the research paradigm and details the research design used in the research. The chapter then describes the research sample, and provides details on the ways the DIBELS and TOWRE tests have been used in the research. The last part of this chapter explains the data collection and analysis procedures that have been used.

4.2 The research paradigm

This research uses a quantitative approach developed within the postpositivist paradigm (Gall, Gall, & Borg, 2003; Lincoln & Guba, 2000; Punch, 2005). Postpositivism takes a middle path between the absolute perspectives of positivism and the social relativism of qualitative paradigms such as critical theory and constructivism. It accepts the value of, and uses, pragmatic aspects of other paradigms, including positivism, social activism, constructivism, and researcher participation (Lincoln & Guba, 2000).

The pragmatic paradigm (Creswell, 2003; Thomas, 2003) was also considered as a potential paradigm for the research, because it emphasises the commensurability of quantitative (postpositivist) and qualitative (constructivist/participatory) paradigms, and the conduct of this research has collaborative and constructivist aspects. In weighing pragmatism against postpositivism, it was considered that the focus of this research and its research questions was strongly on analysis of the data using postpositivist assumptions, and that the research's pragmatic aspects related to research methods and design rather than to paradigmatic bases. Postpositivism was thus deemed the paradigm for this research.

The assumptions of postpositivism (Lincoln & Guba, 2000; Punch, 2005) which frame this research are briefly considered in the following sections.

Ontology

Postpositivism emphasises critical realism, wherein reality is explored, but those involved are aware that perceptions of reality are influenced by researcher characteristics such as values and beliefs. This reality is informed in large part by the research knowledge base.

It is an accepted basis of the current research that the researcher's perception of reality is influenced by values and beliefs. In this respect, the choice of research area, topic, research questions, reading-accuracy tests, and data-analysis directions has been shaped by the researcher's values, beliefs, and perception of reality. The researcher also considered the influence of values, beliefs and perspectives on the data-analysis procedures selected for use. This is because it is possible to choose statistical procedures and options which produce results and findings which are justifiable, but nonetheless less rigorous than those produced using more stringent options.

Epistemology

Postpositivism uses a modified dualist/objectivist approach, wherein findings are considered likely to be true, rather than absolutely true. The researcher's perspective is that postpositivist research produces findings which are indicators of absolute truth. When findings are repeatedly replicated through studies eliminating the impact of possibly influential factors, that absolute truth is approached. Even in this instance, however, it is important to be open to not-yet-considered factors which may disprove established findings.

The current research is preliminary research, concerned with establishing baseline reading-accuracy achievement data and exploring the usefulness of test data. Its findings are considered indicators for future investigation using similar and alternate designs. This is in keeping with postpositivist epistemological perspectives. From the ontological perspectives discussed above, the focus of the research was to establish results which were rigorous without resorting to justification. Statistical options and procedures chosen to support rigour included:
• Use of established statistical procedures based on assumptions of normality only with data established as meeting these assumptions.
• Conservative use of significance levels of p
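The first of these points — applying normality-dependent procedures only to data established as meeting the normality assumption — can be illustrated in Python. The following is a minimal sketch, not a procedure from the thesis: the `compare_groups` helper, the Shapiro-Wilk check, and the sample scores are all illustrative assumptions introduced here.

```python
# Illustrative sketch: run a parametric test only when both groups
# pass a normality check; otherwise fall back to a distribution-free
# alternative. Hypothetical data; not drawn from the thesis.
from scipy import stats

def compare_groups(a, b, alpha=0.05):
    """Compare two independent samples.

    Uses an independent-samples t-test only if both groups pass a
    Shapiro-Wilk normality check at the given alpha; otherwise uses
    the Mann-Whitney U test, which does not assume normality.
    """
    normal = (stats.shapiro(a).pvalue > alpha and
              stats.shapiro(b).pvalue > alpha)
    if normal:
        name, result = "t-test", stats.ttest_ind(a, b)
    else:
        name, result = "Mann-Whitney U", stats.mannwhitneyu(a, b)
    return name, result.pvalue

# Hypothetical reading-accuracy scores for two groups of students.
group1 = [52, 48, 55, 60, 47, 50, 53, 49, 51, 54]
group2 = [45, 43, 50, 41, 46, 44, 42, 47, 48, 40]
name, p = compare_groups(group1, group2)
print(f"{name}: p = {p:.4f}")
```

The design point is simply that the choice of procedure is made by the assumption check, not by which result looks most favourable — the conservative stance the thesis describes.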