Student achievement and education system performance in a developing country
Jeffery H. Marshall, Ung Chinna, Ung Ngo Hok, Souer Tinon, Meung Veasna & Put Nissay Educational Assessment, Evaluation and Accountability ISSN 1874-8597 Volume 24 Number 2 Educ Asse Eval Acc (2012) 24:113-134 DOI 10.1007/s11092-012-9142-x
Received: 13 December 2011 / Accepted: 9 January 2012 / Published online: 4 February 2012
© Springer Science+Business Media, LLC 2012
Abstract
The global spread of national assessment testing activities, and the growing pressure to move beyond basic measures of participation in educational monitoring, mean that student achievement measures are likely to become increasingly relevant indicators of systemic progress in the developing world. Using data from the CESSP project in Cambodia, this paper incorporates a standard decomposition framework to go beyond simple comparisons of average achievement levels over time in order to better understand the underlying dynamics of change. The results show that recent improvements in student achievement in Cambodia are attributable in part to changes in the composition of student cohorts, although there is some evidence of a tradeoff between increasing participation rates and average achievement. There is also some encouraging evidence that school quality is improving, especially in lower grades where the leveling off of participation is creating a policy window of opportunity. The framework can be easily applied to a growing body of assessment data in the developing world to aid both inter- and intra-national monitoring of education system progress.

Keywords National assessment · Student achievement · International education policy · School quality

J. H. Marshall (*): Instituto de Investigación y Evaluación Educativas y Sociales, Universidad Pedagógica Nacional Francisco Morazán, Tegucigalpa, Honduras
U. Chinna: Education Quality Assurance Department, Ministry of Education, Youth and Sport (MoEYS), Phnom Penh, Cambodia
U. N. Hok: General Secondary Education Department, Ministry of Education, Youth and Sport (MoEYS), Phnom Penh, Cambodia
S. Tinon: Curriculum Development Department, Ministry of Education, Youth and Sport (MoEYS), Phnom Penh, Cambodia
M. Veasna: General Secondary Education Department, Ministry of Education, Youth and Sport (MoEYS), Phnom Penh, Cambodia
P. Nissay: Education Quality Assurance Department, Ministry of Education, Youth and Sport (MoEYS), Phnom Penh, Cambodia
1 Introduction

Discussions of system performance in developing country education policy circles have, historically, emphasized indicators of participation. One reason is the global reach of the Education for All (EFA) target of universal primary access and completion by 2015. In countries where participation rates have been low there is an obvious need to guarantee basic opportunity for all children, and the EFA primary education goal provides a straightforward benchmark that is relatively easy to monitor. Not surprisingly, many of the initiatives funded by international donors and national governments are directly (or indirectly) linked with participation goals. As a result, steady progress is being made towards achieving them (UNESCO 2011a).

This emphasis on participation in the monitoring of systemic progress does not rule out references to school quality. In fact, several Education for All goals are related to learning and quality, although universal primary completion is the only one that is part of the larger United Nations’ Millennium Development Goal framework. However, when policymakers and education officials do refer to progress based on quality, their preferred reference points are related to inputs. A recent review of national plans for Education for All-Fast Track Initiative (EFA-FTI) countries finds that most rely on participation and completion rates augmented with input indicators such as pupil-teacher ratios, the availability of learning materials, and teacher qualifications, among others (Saito and van Cappelle 2009).[1] Measures of actual student achievement levels are rarely referenced.

The general absence of monitoring indicators related to student achievement is explainable by several factors. First, in the poorest countries assessment systems are not always in place, or information on student achievement is only available on an irregular basis.
But even when information is more regularly available (either as part of national initiatives or through participation in international studies), there may be a reluctance to build these kinds of results into planning and monitoring documents (Kamens and McNeely 2010). One concern is that low scores can create political problems for governments when the results are widely publicized and commented on, or lead to unreasonable calls by politicians (and others) for quick improvements. There are also the technical challenges of interpreting the results from student achievement surveys, including the measurement of actual skill and proficiency levels as well as the meaning of changes over time.

[1] There is also the Education for All Development Index (EDI) used to track progress internationally. However, the EDI measure for quality, the survival rate to Grade 5, is more related to participation.

Given the global spread of national assessment and international testing, and the growing belief that these are necessary features of the institutional landscape in education (Baker and LeTendre 2005), these kinds of political and (especially) technical constraints are likely to recede in coming years. Furthermore, the parallel rise in pressures for accountability and more effective monitoring of progress, measured on the basis of both intra- and inter-national comparisons, predicts a sort of convergence between testing and global monitoring à la the EFA framework focusing on participation. In other words, systemic performance discussions are likely to increasingly reference student achievement results, even in the poorest developing countries (Saito and van Cappelle 2009).

This paper uses data from Cambodia to highlight both the challenges and potential rewards of monitoring systemic performance on the basis of student achievement levels. The data were collected by the Ministry of Education, Youth and Sport (MoEYS) national assessment team, with support provided by the World Bank through the Cambodia Education Sector Support Project (CESSP). Standardized tests were administered to samples of students in grades three and nine in different years between 2006 and 2010, and the test scores were equated across years (within grades) to facilitate comparisons. Based on these equated measures the results show that grade three and nine test scores have improved substantially in relatively short periods of time.
However, from a policy standpoint it is important to understand why these apparent improvements in achievement are taking place. To address this question we apply a standard decomposition framework borrowed from labor economics (Oaxaca 1973; Blinder 1973). Our results show that part of the improvement is due to (positive) changes in endowments related to household poverty and school quality components. However, these improvements are offset, to some degree, by changes in the composition of student cohorts and the “returns” to certain features of the schooling environment. To our knowledge this is the first paper that addresses the dynamics of systemic performance over time in a developing country setting using an econometric framework. It is clear that monitoring systemic progress solely on the basis of achievement measures should be done carefully given compositional changes that are on-going in many developing country education systems. However, by measuring system performance on a learning output basis it is possible to provide a more complete picture that combines participation and quality inputs with important outputs like achievement. The paper proceeds as follows. The next section provides a brief review of the Cambodian education sector together with some features that may help explain overall performance. Section 3 introduces the data and the methods. Section 4 summarizes the results, beginning with comparisons of test scores over time. This is followed by multivariate analyses of factors that affect achievement and a decomposition of these influences in different time periods to assess the relative importance
of family background, school contexts and compositional factors as determinants of overall performance. Section 5 concludes.

[Fig. 1 Cambodia total enrollment in Grades 3 and 9, 1999–2009. x-axis: School Year; y-axis: Number of Students (0 to 600,000); series: Grade 3, Grade 9. Source: EMIS data, various years]
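The decomposition framework cited in the Introduction (Oaxaca 1973; Blinder 1973) can be illustrated with a minimal sketch. It splits the change in mean achievement between two test years into a part due to changed average characteristics (endowments), a part due to changed coefficients ("returns"), and an interaction term. The exact variant applied in this paper is not stated in this section, so the threefold form, the function names, and the synthetic data below are all assumptions made for illustration only.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for y = X @ beta; X must already contain a constant column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def oaxaca_blinder(X0, y0, X1, y1):
    """Threefold decomposition of mean(y1) - mean(y0), with period 0 as the reference."""
    b0, b1 = ols(X0, y0), ols(X1, y1)
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    return {
        "gap": y1.mean() - y0.mean(),
        "endowments": (m1 - m0) @ b0,          # changed characteristics at period-0 returns
        "coefficients": m0 @ (b1 - b0),        # changed returns at period-0 characteristics
        "interaction": (m1 - m0) @ (b1 - b0),  # simultaneous change in both
    }

# Synthetic illustration (hypothetical numbers, not the CESSP data):
# one covariate standing in for an SES index, two test years.
rng = np.random.default_rng(0)
n = 2000
ses0 = rng.normal(2.5, 2.0, n)
ses1 = rng.normal(3.5, 2.0, n)
y0 = 500 + 10 * ses0 + rng.normal(0, 90, n)
y1 = 520 + 12 * ses1 + rng.normal(0, 90, n)
X0 = np.column_stack([np.ones(n), ses0])
X1 = np.column_stack([np.ones(n), ses1])

result = oaxaca_blinder(X0, y0, X1, y1)
for name, value in result.items():
    print(f"{name:>12}: {value:8.2f}")
```

Because each regression includes a constant, the fitted line passes through the sample means, so the three components sum to the raw gap exactly (up to floating-point error). With more covariates, each component can also be reported variable by variable by keeping the element-wise products instead of summing them.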
2 An overview of Cambodian basic education

Cambodia has made impressive inroads in rebuilding an education system that was systematically destroyed during the Khmer Rouge period from 1975 to 1979 (Clayton 1998; Marshall et al. 2009). This process has gained momentum since the return of democracy in the 1990s, in part due to a large presence of international aid agencies, local NGOs, and bilateral and multilateral donors. The impacts have been most clearly shown in participation rates for basic education, which includes primary (or elementary) and lower secondary education (grades 1–9). For example, between 1999 and 2007 the gross enrollment rate for primary and secondary education increased from 60 percent to over 80 percent (UNESCO 2011b).

Figure 1 provides a graphical summary of total enrollments in the two grades analyzed in this study for the period 1999–2009. The changes in participation follow a common pattern in the developing world. Grade 3 enrollment growth was roughly 10 percent per year through 2003, which required a massive ramping up of physical capacity and contracting of teachers to keep pace. But this growth eventually leveled off and, beginning with the 2004–05 school year, total grade three enrollment began to decline nationally. However, for grade 9 the trend throughout this period is one of pure enrollment expansion: between 1999 and 2009 overall enrollment increased more than threefold, from 60,000 to over 180,000 students.

These changes in participation have potentially significant consequences for average student achievement levels in Cambodia. A very high percentage of children now reach grade three,[2] and the inflection point in 2004–05 is likely related to some
important compositional changes that are taking place. These include significant reductions in poverty levels nationwide, declining birth rates, and children enrolling in school at an earlier age. The result is a potential policy window of opportunity where resources can be spread among a smaller number of better-prepared children, which in turn predicts rising achievement levels. However, for grade nine the situation is different. The expansion in participation is largely a product of building schools in rural areas and providing more scholarships (Filmer and Schady 2008). This on-going transformation from a largely urban and elite grade nine population to a more inclusive and diverse one bodes well for Cambodia’s economic and human development (UNICEF 2008). But applying the same compositional logic as in grade three leads to a prediction of relatively stagnant, or even decreasing, average achievement levels in this grade.

[2] Household survey data from 2005 show that the net enrollment rate in Cambodia for primary schooling (grades 1–6) was roughly 80 percent (UNICEF 2008). Allowing for improvement in recent years as well as overage enrollment (which is declining) it seems likely that the grade three survival rate for all children in the country is above 90 percent. However, as more and more of the poorest children enter formal schooling there are concerns about increases in grade failure and dropout rates. These issues highlight the challenges facing countries like Cambodia in terms of enrolling 100 percent of school-aged children, even in early grades.

This discussion’s emphasis on participation and achievement does not mean that school quality influences operate solely through mechanisms that are affected by the number of children in the system. School and teacher factors clearly have the potential to influence learning independent of changes in factors like class size. In Cambodia in recent years two broad education policy initiatives stand out. The first is the harmonization of government, donor and NGO activities around the goal of Child Friendly Schooling (or CFS: see MoEYS 2007). A recent national review points to a deepening of the understanding and acceptance of the CFS concept throughout the system, but also highlights challenges in ensuring quality implementation of the basic tenets (Bernard 2008). The second area is related to teacher support and training. Teachers are receiving more materials and training as part of an ongoing professionalization process that also includes significantly higher salaries (Benveniste et al. 2007).
The impact of recent quality improvement policies is very difficult to judge, and it is doubtful that Cambodia is currently experiencing a major quality transformation in basic education. Furthermore, the fairly rapid growth of school places in grade nine (between 2005 and 2010 the number of lower secondary schools more than doubled) is likely to put some pressure on human resources at this level. Because Cambodia is one of the poorest countries in Asia, these built-in challenges are considerable, and funding for quality improvements necessarily competes with participation goals requiring more schools and subsidies for poor families. Nevertheless, there are encouraging signs that quality issues are receiving more and more attention throughout the Ministry of Education, Youth and Sport (MoEYS) structure. Tracking the progress of these efforts, in terms of both participation and quality, is therefore a pressing task.
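The compositional logic discussed in this section can be made concrete with a small sketch. The overall average score is a share-weighted average of group averages, so it can fall when a lower-scoring group becomes a larger share of enrollment, even if every group improves. The two groups, their shares, and their mean scores below are hypothetical values chosen only to illustrate the arithmetic; they are not taken from the Cambodian data.

```python
def overall_mean(shares, group_means):
    """Share-weighted average of group means; shares must sum to 1."""
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(s * m for s, m in zip(shares, group_means))

# Hypothetical cohort before expansion: 80% from a higher-scoring group.
before = overall_mean([0.8, 0.2], [520.0, 460.0])

# Hypothetical cohort after expansion: the lower-scoring group grows to 40%,
# while both groups gain five points.
after = overall_mean([0.6, 0.4], [525.0, 465.0])

print(f"before: {before:.1f}, after: {after:.1f}")
```

In this made-up case each group gains five points, yet the overall mean falls from 508 to 501 because of the shift in shares. This is why changes in average achievement alone can be misleading during periods of rapid enrollment expansion.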
3 Analytical framework

3.1 Sampling

The data come from national samples drawn in 2006 and 2009 for grade three, and 2008 and 2010 for grade nine. Sampling was based on the Effective Sample Size (ESS) framework commonly used in international student achievement studies (see Ross et al. 2001). Minimum cluster sizes (MCS) were set at 35 and 30 students per school in grades three and nine respectively, while the intra-class correlation coefficient (ICC, also known as rho) was estimated using previous test applications in Cambodia.[3] Based on these criteria the international (minimum) standard ESS of 400 students was achievable with samples of about 150 schools and 5,000 students. However, the actual samples were significantly larger, and included roughly 200 schools and between 6,000 and 7,000 students in each grade (by year).

Implementation required a two-stage stratified cluster framework. The sampled population included roughly 5,500 schools in grade three, but only about 900 (in 2008) and 1,100 (in 2010) schools in grade nine. Stratification was based on location (urban and rural), and the schools were chosen using probability proportional to size (PPS) selection. However, in the 2006 grade three sample school size was also used as a stratification variable, and a non-proportional sample was chosen in order to ensure minimum numbers of small schools. Weights are used to address this discrepancy in the grade three sampling strategy over time, as well as to adjust all other samples, since the data available for sampling corresponded to the previous year’s population (the latest data became available after the test applications). Finally, within each school a single section of students was chosen and, if necessary, additional students from one other section were brought in to reach the necessary 30 or 35 students.

3.2 Test design and equating

The assessment team supervised test construction using curriculum specialists from MoEYS departments and in-service teachers. After defining a curriculum blueprint the specialists created item banks of between 80 and 120 multiple-choice items in Khmer language and mathematics. The tests were designed to measure the basic components of the curriculum using official standards, student textbooks and teacher guides. For the second round of testing in 2009 (grade three) and 2010 (grade nine) anchor items were also used to equate the tests across different years.
The anchor items were chosen to cover a range of difficulty levels, and when possible were placed in similar positions in the test booklets. The test equating analyzed the second round of data separately using the item parameters obtained in the first round of testing for reference. The final set of anchors was chosen on the basis of item drift, or stability (Cartwright 2007); these results are available upon request.

This study only uses the inter-year equated scores with means of 500 and standard deviations of 100. This is a somewhat limited information set, for several reasons. First, the assessment exams also included open-ended questions where students were asked to write paragraphs and complete more complex mathematics problems. These results were discussed in internal reports and were shared with the education stakeholder community in Cambodia, but are not included here. The same is true for proficiency levels and actual multiple choice item results. A consistent theme in the internal reports is that student achievement levels in Cambodia are low, especially in grade three and in mathematics in all grade levels. The reliance in this paper on comparisons over time of an indexed score, rather than comparisons of specific skills and proficiency levels, is a significant limitation in terms of situating the findings in a larger discussion about quality and policy in Cambodia. But the main purpose of this study is to apply a framework for monitoring progress to the kinds of data that are increasingly available in developing countries around the globe. The sensitive nature of the findings is not unusual, especially in countries with little previous experience in sample-based assessments. Building in part on these recent experiences, the Cambodian MoEYS is in the process of creating a new department with responsibility for carrying out quality assurance, including periodic assessments of student achievement.

[3] These sources mainly refer to previous MoEYS national assessments augmented with testing information obtained through specific projects. The results consistently demonstrated rhos of 0.20–0.30. Based on this previous information the actual rhos of 0.30–0.35 used in the various samples are somewhat conservative, and are likely to overstate the number of schools and students that are needed.

3.3 Data collection and variables

The data were collected during two-day visits to the schools at the end of the school year (which begins in October and ends in June). Table 1 summarizes all of the variables used in the analysis, which are grouped into five categories. The summaries in Table 1 show clear improvement in student achievement levels between the two testing years in each grade; Fig. 2a and b break down these results into national, urban and rural comparisons. In grade three average achievement levels have improved by about half a standard deviation in only three years. This is a very large improvement in a short period of time, but based on skill levels (not presented) the averages in 2010 remain substantially below expected levels. Also, it is important to note that the improvement in grade three in 2009 is coming off of a very low baseline in 2006.

The different rates of improvement by grade are consistent, to some degree, with different compositional dynamics between grades three and nine. The shrinking population in grade three (see Fig. 1) appears to be a result of increasing efficiency (less repetition) and declining birth rates. The 2009 grade three sample is significantly younger, has repeated fewer times, and has higher levels of SES: these factors would predict better average scores.
In grade nine the steady growth in participation in recent years suggests a different compositional dynamic, although with gross enrollment rates at about 50 percent (UNESCO 2011a,b) these new entrants are not coming from the poorest households in the country. The 2008–2010 improvement in the SES measure (based on household possessions and access to services) is not consistent with a “deteriorating” student body in terms of ability. But that may be somewhat misleading: household survey data comparisons (CDHS 2000–2010) show that the average grade nine student in 2010 was less poor on an absolute basis than the average grade nine student in 2000 using an identical SES index, but was poorer on a relative basis compared with population averages (i.e. they were in a lower income quintile in 2010). In other words, the average student’s SES can improve over time even while more and more relatively poor children are entering the system.

3.4 Empirical methodology

The empirical work begins with statistical analysis of student achievement by subject and year. The multivariate estimating equation takes the form:

$$A_{in} = \beta_X' X_i + \beta_S' S_n + (\varepsilon_i, \tau_n) \tag{1}$$
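As an illustration of how an equation like (1) might be estimated, the sketch below fits ordinary least squares to simulated data and computes standard errors clustered by school. It reads $A_{in}$ as the score of student $i$ in school $n$, $X_i$ as student and family covariates, $S_n$ as school and teacher covariates, and the two error terms as student-level and school-level components. These readings, the single covariate in each block, the simulated values, and the choice of cluster-robust OLS are all assumptions made for illustration; the estimator actually used in the paper is not specified in this section.

```python
import numpy as np

rng = np.random.default_rng(1)
n_schools, n_per_school = 50, 30
n_students = n_schools * n_per_school
school = np.repeat(np.arange(n_schools), n_per_school)  # school id for each student

# Simulated data (hypothetical values, not the CESSP sample).
X_student = rng.normal(size=n_students)               # student-level covariate X_i
S_school = rng.normal(size=n_schools)[school]         # school-level covariate S_n
tau = rng.normal(scale=30.0, size=n_schools)[school]  # school-level error component
eps = rng.normal(scale=80.0, size=n_students)         # student-level error component
A = 500 + 20 * X_student + 15 * S_school + tau + eps

# OLS point estimates for (constant, beta_X, beta_S).
X = np.column_stack([np.ones(n_students), X_student, S_school])
beta = np.linalg.solve(X.T @ X, X.T @ A)
resid = A - X @ beta

# Cluster-robust (sandwich) covariance: students within a school share tau,
# so residuals are summed within schools before forming the middle term.
bread = np.linalg.inv(X.T @ X)
meat = np.zeros((X.shape[1], X.shape[1]))
for g in range(n_schools):
    rows = school == g
    score = X[rows].T @ resid[rows]
    meat += np.outer(score, score)
cluster_se = np.sqrt(np.diag(bread @ meat @ bread))

for name, b, se in zip(["const", "beta_X", "beta_S"], beta, cluster_se):
    print(f"{name:>7}: {b:8.2f} (se {se:.2f})")
```

The sandwich estimator here omits small-sample corrections for simplicity. A multilevel (random intercept) model would be another common way to handle the school-level error term, and sampling weights of the kind described in Sect. 3.1 would normally be applied as well.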
Table 1 Variable definitions, means and standard deviations (when applicable), grades three and nine 2006–2010

Columns: Variable; Definition; means (standard deviations in parentheses) for Grade three (2006, 2009) and Grade nine (2008, 2010)

[Cell values for the four year columns are not reproduced: their row and column alignment did not survive extraction.]

Student Achievement:
  Khmer Language: IRT equated score on Khmer language exam with 32–35 questions (by form)
  Mathematics: IRT equated score on Mathematics exam with 36–39 questions (by form)

Child-Family Background:
  Student is Female: 1 = Student is Female; 0 = Male
  Student age: Student age in complete years
  Number of Siblings: Number of brothers and sisters reported by student
  Number of times repeating: Number of times student has repeated grades 1–3 or 7–9
  Student absences: Total number of absences for student recorded by teacher
  Student has missed school:
    Due to family problem: 1 = Student reports missing class due to family problem; 0 = No
    Due to having to work: 1 = Student reports missing class due to working; 0 = No
    Due to distance: 1 = Student reports missing class due to distance; 0 = No
  Books in Home: Number of books in home (0 = 0, 1 = 1–10, 2 = 11–25, 3 = 25–100, 4 = >100)
  Family SES: Sum of household possessions and access to services like electricity (0–12 range)
  Meets with Tutor: Grade 3: 1 = Student pays for extra time with tutor. Grade 9: Frequency student meets with tutor (Scale 1–4)

Teacher-Classroom Characteristics:
  Teacher is Female: 1 = Teacher is Female; 0 = Male
  Teacher age: Teacher age in years
  Teacher absences: Teacher (self-reported) absences during current academic year
  Class Size: Number of students in classroom according to teacher
  Frequency of Student Fighting: Classroom average for student-reporting fighting (scale: 0 = Never, 1 = Sometimes, 2 = Often)
  Frequency Working at Board: Classroom average for student-reporting frequency they work at chalkboard (scale: 0 = Never, 1 = Sometimes, 2 = Often)
  Frequency Participate in Class: Classroom average for student-reporting frequency they are called on in class (scale: 0 = Never, 1 = Sometimes, 2 = Often)
  Frequency Teacher Praises Students: Classroom average for student-reporting frequency teacher gives them praise (scale: 0 = Never, 1 = Sometimes, 2 = Often)
  Frequency Students Exchange Work: Teacher reported frequency they have students exchange work (scale: 0 = Never, 1 = Sometimes, 2 = Often)

Cohort Indicators:
  Correct Age Grade 1/7 Intake: Percentage of new entrants to school (primary = grade 1, secondary = grade 7) that are correct age
  Ratio of Grade 3/9 to Grade 1/7: Ratio of total grade 3 or 9 enrollment in school to total grade 1 (primary) and 7 (secondary) enrollment

School Characteristics:
  Rural School: 1 = Rural School; 0 = Urban
  School Enrollment: Total school enrollment for current school year
  Director is Female: 1 = School Director is female; 0 = Male
  Sum of Teaching Materials: Sum of teacher-reported teaching materials available in classroom (0–7 scale)

All averages are weighted (see text for details). a Mean difference significant at p