

Educational Evaluation and Policy Analysis Fall 2006, Vol. 28, No. 3, pp. 215–229

Shaping Policies Related to Developmental Education: An Evaluation Using the Regression-Discontinuity Design

Brian G. Moss
Wayne State University and Oakland Community College

William H. Yeaton
University of Michigan Institute for Social Research

Utilizing the regression-discontinuity research design, this article explores the effectiveness of a developmental English program in a large, multicampus community college. Routinely collected data were extracted from existing records of a cohort of first-time college students followed for approximately 6 years (N = 1,473). Results are consistent with a conclusion that students' participation in the program increases English academic achievement to levels similar to those of students not needing developmental coursework. The findings are also consistent with a conclusion that those students in greatest need of developmental English benefit the most from the program. This study provides an inexpensive, inferentially rigorous, program evaluation strategy that can be applied with few additional efforts to assess existing programs and to guide policy decisions.

Keywords: college English, developmental education, policy analysis, program evaluation, regression-discontinuity, research design, student outcomes

HIGHER education policies supporting open access to higher education have created academic challenges for educators and institutions. By implementing policies of inclusion, many institutions have been faced consistently with large proportions of their student body that are academically underprepared for college-level studies (National Center for Education Statistics [NCES], 1996, 2003). Accordingly, institutions have been obliged to adopt educational strategies that address the needs of the underprepared student. One common strategy has been to offer courses and academic programs that increase student capabilities, which will result in student success. The adoption of this type of strategy has come in two forms: remedial and developmental education. As the number of institutions offering programs to underprepared students has grown, so

have questions regarding the programs’ high financial costs and uncertain effectiveness. Moreover, many researchers have questioned the appropriateness of such programs within the existing curricula of colleges and universities (Bastedo & Gumport, 2003; Dougherty, 1997a; Roueche & Roueche, 1999a; Saxon & Boylan, 2001; Shaw, 1997). The saliency of this issue is underscored because institutions have been held to increasingly higher standards of accountability (Ewell, 1991). Thus, the role of developmental education in higher education has been a subject of considerable controversy. Problems of Open Access in Higher Education Problems associated with open access to colleges and universities can be articulated within

The authors thank Kristen Salomonson, Ignacio Cano, and several anonymous reviewers for their helpful comments on earlier versions of this article.



a variety of broader debates. One issue that has received continued attention is related to open admission policies and academic preparedness. Dougherty (1997b) argued that the inclusion of high numbers of underprepared students, as a means of counteracting social inequalities, might undermine the academic fabric of higher education and result in unintended, adverse implications. Ultimately, those critical of open admissions argue that such policies fill higher education with greater proportions of ill-prepared students who will eventually force academic standards to be lowered and drain valuable resources. Some suggest that open admissions policies may have resulted in two parallel, but separate, educational systems that have influenced student outcomes adversely. Several state educational systems have implemented policies that strive to remove underprepared students from 4-year institutions and have mandated that these students complete this coursework exclusively at community colleges. Case studies of policies in Massachusetts and New York suggest that delegating these students to community colleges facilitated a structure of educational stratification by separating the least and most prepared students (Bastedo & Gumport, 2003). This stratified educational system was a potential obstacle to the success of underprepared students. In fact, some have argued (Alba & Lavin, 1981) that simply being identified and tracked as underprepared may have deterred positive educational outcomes by increasing time to degree completion, reducing the number of academic credits earned, and decreasing chances of graduation. Defining Developmental Education As open admission policies began to be implemented and the percentage of underprepared students increased dramatically, institutions reacted initially by providing remedial programs. Guided strongly by principles of learning theory, remediation aimed to reduce or eliminate academic deficiencies (Boylan & Saxon, 2000; Casazza, 1999). In this framework, students were viewed as lacking specific skills and abilities in one or more academic areas that demanded repair or alteration. For several reasons, including the potential stigma of remediation, institutions began shifting to the broader, developmental education approach, in which the life circumstances of students were considered more systematically (Kozeracki, 2005). 216

This paradigm shift to developmental education offered a new model with important conceptual differences from remedial education. Whereas remedial education was seen primarily as a deficit model (Casazza, 1999), the developmental education approach emphasized the need for students to become independent and self-regulated learners (Wambach, Brothen, & Dikel, 2000). The developmental education approach utilized a more multidimensional conceptualization, often implementing remediation as only one facet of assisting students. The developmental education method “provide(s) educational experiences appropriate to each student’s level of ability, ensure(s) standards of academic excellence, and build(s) the academic and personal skills necessary to succeed in subsequent courses or on the job. Developmental programs are comprehensive in that they access and address the variables necessary at each level of the learning continuum. They employ basic skill courses, learning assistance centers, supplemental instruction, paired courses and counseling” (American Association of Community and Junior Colleges [AACJC], 1989, p. 115).

Thus, unlike remediation, developmental education is more philosophically comprehensive, addressing a wide range of needs in each student’s life (Kozeracki, 2005). Theoretical Approaches and Potential Effectiveness Although the general tenets of developmental education are clear, many researchers have argued that developmental education lacks a unifying theoretical framework to guide its practice (e.g., Chung, 2005). As a result, developmental education typically appears in many forms in its various applications. For example, Wambach, Brothen, and Dikel (2000) presented a multifaceted theory for developmental education that incorporated student self-regulation, demandingness, and responsiveness within all developmental coursework (Brothen & Wambach, 2002). These authors argued that students can achieve success best through personal autonomy and by enhancing the numerous internal psychological mechanisms germane to self-regulation. Another theoretical approach to developmental education posited that there is a mismatch between developmental students’ primary “discourse” mechanisms and the important discourses that allow them to function effectively and succeed in higher education. Lundell and Collins


(1999) argued that it is the responsibility of developmental educators to understand and then work to minimize these differentials. Similarly, the work of Astin (1985, 1993) suggested that developmental students can maximize their potential best by seeking academic support from the educational community while simultaneously increasing their individual capacities. Finally, borrowing the common theoretical notion of stages, Casazza (1998) argued that within the process of developmental education, “cognitive development occurs in stages, . . . intelligence is not one generalized factor underlying all learning . . . [and] learning is an active process in which collaboration plays a significant role” (p. 18). Thus, based on its theoretical underpinnings, there is good reason to believe that developmental education offers a viable approach to rectifying problems resulting from more open access to higher education. Although the broad conceptualization of developmental education may spawn different operationalizations in different settings, each perspective outlined above emphasizes a more inclusive view of students, one that incorporates aspects of the more holistic goal of personal growth. However, one is compelled to ask whether the potential of developmental education has, in fact, been realized. More specifically, one can ask if the application of these theories actually demonstrates that underprepared students are better prepared for college-level studies. Various researchers have explored the effectiveness of developmental education within the context of open admission policies. Bettinger and Long (2005) found that, after completing a program or course, there were no significant differences in degree completion rates, number of credits earned, level of educational attrition or probability of transferring to a 4-year institution between developmental and nondevelopmental students. Chaffee (1992) found that underprepared students significantly increased critical thinking when involved in developmental education activities. Other research has found that, despite being underprepared, not all developmental students fared poorly and many succeeded when placed within a less restrictive policy environment that allowed for simultaneous, college-level academic work (Weissman, Silk, & Bulakowski, 1997). Therefore, despite the dire consequences some predicted would occur as a result of open admission policies, multidimensional develop-

mental programs that emphasized student needs, instructor availability, course substance, and educational environment were shown to have the best potential for less prepared students to bridge their achievement gaps (Grubb & Cox, 2005; Higbee, Arendale, & Lundell, 2005). Notwithstanding the evidence demonstrating that developmental education has been effective at resolving the problem of open access and underpreparedness, concerns have been raised about the credibility of these findings. Recent reviews argued that more than half of the research conducted on the effectiveness of developmental education contained significant problems due to weak research methodology (Boylan & Saxon, 2000; O’Hear & MacDonald, 1995). Placing doubt on the integrity of the research demands closer scrutiny of the methods employed by developmental education researchers. Quality of Previous Research on Effectiveness After reviewing previous research on developmental education effectiveness, it is apparent that the literature is deficient in several areas. One area of particular concern is the variety of effectiveness measures used to gauge student success. The typical outcome measures found in developmental education programs utilize general descriptions of student behaviors and do not measure precisely the degree to which students actually are being prepared for college-level studies. These measures of effectiveness include the proportion of students who earn pass–fail grades (C or better) in developmental courses; subsequent percentage of successful or unsuccessful grades in any college-level course, or those courses for which the students are meant to be prepared; and the timely or untimely level of completion or noncompletion of developmental coursework (Casazza & Silverman, 1996; Maxwell, 1997; Weissman, Bulakowski, & Jumisko, 1997). Although these relatively descriptive and dichotomous measures may add to our general understanding of student behavior, these imprecise indicators of effectiveness do not take advantage of the availability of interval-level measures that quantify success in college-level studies more accurately. In addition, many of the program outcome variables lack a conceptual fit to the preprogram assessments often provided to students. For example, many of the evaluations use overall grade 217


point average as an effectiveness measure, which would conceal true program effects because nonprogram courses also are included. Hence, the ability to assess reliably the impact of the program is masked by the inclusion of other subject material not related to program objectives. Another commonly used outcome measure is the proportion of developmental program completers who successfully complete subsequent coursework. Although this approach is helpful in determining a superficial level of program success, this measure does not address the magnitude of program effects. Other measures deficient in their ability to assess the extent of program participation on student ability include graduation rates, persistence rates, and ratio of credits earned to credits attempted. In summary, although these outcomes describe student behavior of general interest, they do not address the content-specific effects of the program on student achievement. Other methodological weaknesses also characterize existing approaches to developmental education effectiveness research. One of the basic tenets of high-quality evaluative research is the ability to assert that it is the program that causes a change in postprogram outcome variables and that alternative explanations of cause have been eliminated (Weiss, 1998). However, the most common research designs used by developmental education researchers have been the One-group PretestPosttest, One-group Posttest-Only, and One-group Pretest-Posttest with Proxy Pretest Measures designs, which are relatively weak in establishing cause because they lack a control group (Boylan & Bonham, 1992; Good, 2000; Hasit & DiObilda, 1996). A less common but still frequent methodological approach in developmental education is to use the Posttest-Only with Nonequivalent Groups research design (Waycaster, 2001). This approach applies preprogram scores for placement decisions and then uses the high scorers who do not need the developmental program as a comparison group. Whereas this strategy almost certainly is facilitated by the long-standing practice of using placement tests in higher education (NCES, 1996, 2003), Cook and Campbell (1979) have pointed out that, without proper analytic modeling, it would be very difficult to determine program effectiveness. More specifically, this initial group division automatically creates a situation in which a threat 218

of selection bias exists (i.e., the comparison group is not equivalent to the treatment group). Furthermore, because the Posttest-Only with Nonequivalent Groups design compares a low pretest, developmental group to a high pretest, nondevelopmental group, it is likely that the approach will yield a “no-impact” effect. Thus, after 3 decades of trying to understand how effective developmental education programs and policies might be in preparing students, we have a collection of studies that provide little clear-cut evidence regarding their effectiveness, primarily because of the relatively weak research designs being used (O’Hear & MacDonald, 1995; Roueche & Roueche, 1999b). Although the role of developmental education in higher education continues to spark controversy, three issues underscore the need to develop methodologically sound research designs. First, both 2- and 4-year institutions continue to face increasing numbers of students who need developmental education (NCES, 1996, 2003). Second, the demand on higher education to provide financial support for developmental program effectiveness is increasing. Third, at the root of all developmental programs is the goal of increasing student skills to the level of nondevelopmental students. Consequently, it is essential to provide a research framework for answering the fundamental question, “Are developmental students being prepared for college-level studies?” The major purposes of this study are to examine the influence of developmental education on student academic achievement and to provide a methodological framework that meets the evaluative demands of college, state, and federal policymakers. By utilizing previously collected data, the approach presented here represents an empirically sound technique for assessing the effectiveness of developmental education without significant added expense. Method Data Collection Student characteristics, enrollment patterns, and educational outcomes were extracted from the student information system at a large, multicampus community college during approximately 6 years or 23 consecutive academic semesters. This information system contained all official information regarding courses attempted and completed, grades attained, and other pertinent data


collected during the application and registration process. The rationale for selecting this time frame was motivated both by college policies and by research logistics, as outlined below. Sample Selection and Characteristics The 1,473 participants in this study originated from a larger cohort of 2,776 first-time college students. Figure 1 outlines the development of the final sample. From an initial cohort, participants were divided into two groups: one group required by the college to take developmental English (N = 1,782) and a second group not required to take developmental English (N = 994). These two groups followed different educational pathways resulting in many students’ completion of an equivalent, college-level English course, which was the final criterion for their inclusion in this sample. Complicating the actual pathway taken was a college policy allowing entering students a 1-semester grace period before taking the assessment placement tool.1 Until the assessment tool was completed students could not enroll in any English courses; however, if students had been identified as needing developmental coursework, there were no restrictions preventing them from taking courses outside the English curriculum. Under these circumstances, there was sometimes a gap of 1 or more terms between completion of both developmental English and college-level English. Students identified as possessing adequate skills to succeed in college-level English followed the most direct path to inclusion in the final sample. Although not required to take any English courses during the 6-year follow-up period of this study, 824 (83%) had completed the first college-level English course. Thus, this group went directly into college-level English without experiencing the developmental English program. Students who were designated as needing additional English training took a less direct path. The college’s intention was that any student placed into this group be prevented from taking any English courses other than those classified as developmental.2 Originally, the emphasis of the program had been to increase English skill levels, primarily through remediation. However, a pedagogical shift occurred prior to our research that incorporated more developmental features and emphasized a more process-oriented approach to

English competencies. During this study, the college catalog described the developmental program as consisting of courses that emphasized the acquisition of academic literacy defined as “applying reading and writing as processes . . . including prewriting, drafting, revision and editing . . . knowledge of the conventions of the English language, developing strategies for locating and correcting their own pattern of error, [and] demonstrating literacy skills appropriate for different audiences and purposes” (Oakland Community College, 1998, p. 171). In addition, the program contained the more traditional elements that emphasized a strong focus on grammar, spelling, and vocabulary skills. During the years in which study participants experienced the program, the emphasis was twofold: First, to provide English remediation (spelling, grammar, sentence structure, etc); and second, to develop simultaneously the student’s capacity to self-evaluate and to understand the process of literacy. Naturally, the relative emphasis of these two approaches was at the discretion of individual instructors, and given the retrospective nature of this research, we did not collect data on the actual practice utilized within individual classrooms. Of the 1,782 students who scored at the developmental level, 1,133 (64%) completed developmental English, and 649 never enrolled in a developmental English course. Of those who completed the developmental English course, 649 (57%) also completed college-level English. It was this group, combined with the 824 nondevelopmental students that constituted our final sample of 1,473. Other than the systematic difference in placement exams scores (mean = 76.5 vs. 92.6), both groups of students in the final sample had similar demographic characteristics. In the initial fall semester, there was no difference between group mean ages; the students in both groups averaged approximately 20 years of age. Gender distributions were also comparable; the nondevelopmental group included 56.2% females, whereas the developmental group had a somewhat greater percentage of females (61.6%). Differences in race were relatively small as White students made up the vast majority of both the developmental and nondevelopmental groups. However, the developmental group did contain 11 percentage points more African American, Asian, Native American, and Hispanic students. The bulk of this difference 219


FIGURE 1. Flow diagram: sample development.



was attributable to the developmental group's higher proportion of African Americans (14.5% vs. 6.4%).

Initial Assessment of English Competencies

Upon entrance to the institution, students were administered the ACT ASSET Basic Skills instrument (ASSET, 1994). The ASSET exam measures three areas of student skill level: writing, reading, and numeric.3 Although there are various applications for the results of the ASSET test, the college used the scores primarily as a course placement tool. The ASSET writing and reading skills tests were combined to assess a student's level of competence on writing "usage and mechanics, sentence structure, rhetorical skills" (ASSET, 1994, p. 5) and "reading referring and reasoning skills" (ASSET, 1994, p. 8). Both components of the ASSET test (writing and reading) were found to have acceptable levels of test–retest reliability, equivalent-forms reliability, and internal consistency (ASSET, 1994). After students completed each test section, raw test scores were converted to nationally normed, scaled scores. The scaled scores ranged from 23 to 53 on the reading skills test and from 23 to 54 on the writing skills test. Totaling the scaled scores for both the reading and writing tests (hereafter referred to as simply the "ASSET score") provided the college with a lowest possible composite English ASSET score of 46 and a highest score of 107. Based on recommendations from ACT, the college used students' ASSET scores to determine their level of English competence. Any student with a combined score equal to or less than 85 was considered unprepared for college-level English. Those students who scored 86 or higher were deemed prepared for college-level English and were allowed to enroll in any English class. Employing this cut-score resulted in 64% (1,782 of 2,776) of the study-eligible students being placed into the developmental group and 36% (994 of 2,776) being placed in the nondevelopmental group.
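To make the placement rule concrete, the short sketch below shows how the composite ASSET score, the placement flag, and the cut-score-centered pretest used in the later analysis could be constructed from the two scaled scores. This is an illustrative sketch only, not the college's software; the data frame and its column names are hypothetical.

```python
# Minimal sketch (not the college's code) of the ASSET-based placement rule and
# the transformed pretest used later in the RD analysis. Column names
# (reading_scaled, writing_scaled) are hypothetical.
import pandas as pd

CUT_SCORE = 85  # composite ASSET cut-score reported in the article

def add_placement_columns(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Composite "ASSET score": sum of the two nationally normed scaled scores
    # (reading 23-53, writing 23-54, so the composite ranges from 46 to 107).
    out["asset_score"] = out["reading_scaled"] + out["writing_scaled"]
    # Placement rule: <= 85 -> developmental group, >= 86 -> nondevelopmental.
    out["developmental"] = (out["asset_score"] <= CUT_SCORE).astype(int)
    # Transformed pretest used in the RD analysis: x~ = ASSET score - 85.
    out["x_tilde"] = out["asset_score"] - CUT_SCORE
    return out

# Example usage with made-up scores:
sample = pd.DataFrame({"reading_scaled": [30, 45], "writing_scaled": [40, 50]})
print(add_placement_columns(sample))
```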

Final Assessment of English Achievement

The final level of English achievement (dependent variable), for both those in the developmental and nondevelopmental groups, was operationalized by using the grade students received in their first college-level English course. Although the college offers several different college-level English courses, this research used any section of the same introductory college-level English class for all students. College-level English grades ranged from 0.0, meaning a failing grade of "F," to 4.0, an "A" grade (including "+" and "−" grades).

Regression-Discontinuity Research Design

This study assessed the effectiveness of developmental education using the regression-discontinuity (RD) design. Technically, the RD design is a pre–post, comparison group design (Cook & Campbell, 1979). It includes many of the major strengths of the classic experimental design but excludes random assignment of subjects to an experimental or control group. Assignment in the RD design is strictly on the basis of a predetermined, rule-based quantitative criterion (here, the ASSET cut-score). The RD design has been used in several other disciplines, including programs in social welfare, business, and health (Shadish, Cook, & Campbell, 2002). Using widely accepted notation, the RD design is illustrated as follows:

O   C   X   O
O   C       O

Here, each row references a different group. The O signifies measurement of the pre- and posttests for each group, and X represents the program that was administered. The C denotes that the groups were assigned by a conditional factor (i.e., participants falling below or at or above a cut-score). The top row indicates the group that received the developmental intervention, and the bottom row shows the group that served as a control. In this study, the program group was composed of those students who scored developmental on the ASSET placement exam (pretest); completed developmental English (the program); and finished a subsequent, college-level English course (posttest). Those in the comparison group were students who scored above the cut-score and completed college-level English. Thus, the conditional assignment forms nonequivalent groups for comparison, as required by the RD design. The major threat to the internal validity (existence of a causal relationship) of this evaluation is selection bias (Cook & Campbell, 1979)—developmental and nondevelopmental students


differ in their initial English ability. In fact, the assignment of participants to groups guarantees this initial difference. But the major strength of the RD design is that it does not require equivalence prior to the program. Rather, the design assumes that "in the absence of the program, the pre–post relationship would be equivalent for the two groups" (Trochim, 1990, p. 122). In the RD design, it is the pattern of the relationship established in the nondevelopmental group against which one compares the actual results in the developmental group. In Figure 2, the bold, solid line indicates the pattern of results in the nondevelopmental group, and the dashed line indicates the expected pattern of college-level English grades for developmental students had they not been exposed to treatment. To the degree that actual data are discrepant from this expected pattern, one can conclude that a treatment effect has occurred. Whereas many patterns are possible (Trochim, 1990), an additive effect (a constant effect c, applicable to all members of the program group, would shift the predicted line by c units) would suggest a "discontinuity." In the event of an interaction, different program effects would occur at different pretest levels of the independent variable, which would result in different slopes of the treatment and no-treatment regression lines. Thus, unlike weaker designs used to evaluate the impact of developmental programs in higher education (e.g., Boylan & Bonham, 1992; Burley, Butner, & Cejda, 2001; Grimes & David, 1999; Morante, 1986; Umoh, Eddy, & Spaulding, 1994; Waycaster, 2001), selection bias is controlled in the RD design. By positing this distinct pattern of results in the absence of the program and by comparing the actual pattern to it, ". . . the RD

[Figure 2 appears here: regression lines for the developmental and nondevelopmental groups, plotting grade in college-level English (y-axis, 0–4) against ASSET score (x-axis, 50–110), with the cut-score at 85 and the nondevelopmental group's projected (dashed) line extended below the cut-score.]

Key: 4 = A, 3 = B, 2 = C, 1 = D, 0 = F.
FIGURE 2. Regression lines of groups by grade in college-level English and ASSET score (cut-score = 85).


design is as strong in internal validity as its randomized experimental alternatives." (Trochim, 1990, p. 125).

Assumptions of the design

Trochim (1984, pp. 55–57) has indicated that the inferential quality of the RD design depends on three assumptions: (a) The assignment of participants to groups at the cut-score has been followed; (b) the pattern of the pretest has been specified correctly by the statistical model used; and (c) there is no coincidental factor at the chosen cut-score that would result in program effects.

Regarding the first assumption, it was found that seven students assigned by the college to take the developmental class (less than 0.7%) actually took college-level and developmental English at the same time. Three students were assigned to and took nondevelopmental English but later returned to take developmental English (less than 0.4%). In keeping with the advice of Shadish, Cook, and Campbell (2002) to maintain the integrity of treatment assignment (the so-called "intention to treat" principle), we counted all participants in the conditions to which they had been initially assigned (p. 219). Thus, in both cases, these small subsets of students (well less than 1%) were counted in their assigned group. We presume that any potential impact of these crossovers in assessing a treatment effect would be negligible.

In the context of the second assumption, one wishes to fit the existing data with a linear regression model but must first rule out the possibility of significant quadratic, cubic, and higher-order terms as well as interactions with these nonlinear terms. Conceptually, one confirms the linearity of the data by first showing that the data are not curvilinear (quadratic, cubic, etc.) and then showing that a linear model fits. As noted by Shadish, Cook, and Campbell (2002), "if interactions or nonlinearities are suspected, the analysis should overfit the model by starting with more polynomial and interaction terms than are probably needed, dropping nonsignificant terms from higher to lower order. When in doubt, keep terms rather than drop them; such overfitting yields unbiased coefficients" (p. 233). Campbell and Russo (1999) have similarly noted, ". . . underestimating (e.g., using a linear model when the true form is quadratic) can

lead to pseudo-effects, whereas the reverse error, overfitting, or using too high a polynomial, should not" (p. 294). While these practices are subject to change, we regard the strategy followed in this article as consistent with current practice. To assess linearity, one regresses the posttest scores (yi) on the modified pretest (x̃i, derived by subtracting the cut-score, 85, from each pretest score), the treatment variable (zi, a dummy variable in which 0 or 1 indicates group membership), and all higher-order terms and interactions (Trochim, 2004). Algebraically, we used the conventional analytic approach in which one specifies an initial model "two orders of polynomial" (Trochim, 2004) higher than indicated by the data. Thus, we posited a linear model but first tested a more expansive model that included quadratic (x̃i²) and cubic (x̃i³) terms as well as their interactions (x̃i²zi and x̃i³zi). The initial model:

yi = β0 + β1x̃i + β2zi + β3x̃izi + β4x̃i² + β5x̃i²zi + β6x̃i³ + β7x̃i³zi + ei

where
yi = outcome for the ith score,
β0 = coefficient for the y-intercept,
β1 = linear coefficient for the transformed pretest,
β2 = coefficient for the mean difference between groups,
β3 = linear interaction coefficient,
β4 = quadratic transformed pretest coefficient,
β5 = quadratic interaction coefficient,
β6 = cubic transformed pretest coefficient,
β7 = cubic interaction coefficient,
x̃i = transformed pretest: xi − 85 (the cut-point),
zi = dummy variable for intervention (0 = comparison group and 1 = treatment group), and
ei = residual for the ith score.

Strategically, to show the applicability of the linear model, we first tested for significant cubic terms in the full model and showed that both cubic terms were nonsignificant. We then tested the reduced model and found that both quadratic terms were nonsignificant. After dropping the quadratic terms we finally examined the linear model

yi = β0 + β1x̃i + β2zi + β3x̃izi + ei.

A significant β2 would indicate that there was a discontinuity at the cut-point—a main effect of developmental education. A significant β3 would suggest the presence of a significant linear interaction term (the slopes of the regression lines in the two groups were significantly different).

Finally, regarding the third assumption, we have no reason to believe that, by itself, the choice of 85 used here as the cut-point contributed in any way to the program effect found.
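As an illustration of this model-fitting strategy (not the authors' original code), the following sketch fits the full cubic model, the reduced quadratic model, and the final linear model by ordinary least squares and reports each model's explanatory power. The data frame and its column names (grade, x_tilde, z) are assumed for the example.

```python
# Minimal sketch of the "overfit, then drop nonsignificant higher-order terms"
# strategy described above, using statsmodels. The data frame `df` and its
# columns are hypothetical: grade (posttest), x_tilde (ASSET score - 85),
# z (1 = developmental/treatment, 0 = comparison).
import statsmodels.formula.api as smf

def fit_rd_models(df):
    # Full model: linear, quadratic, and cubic pretest terms plus their
    # interactions with the treatment dummy ("two orders of polynomial"
    # higher than the posited linear model).
    full = smf.ols(
        "grade ~ x_tilde + z + x_tilde:z"
        " + I(x_tilde**2) + I(x_tilde**2):z"
        " + I(x_tilde**3) + I(x_tilde**3):z",
        data=df,
    ).fit()

    # Reduced model after dropping the (nonsignificant) cubic terms.
    quadratic = smf.ols(
        "grade ~ x_tilde + z + x_tilde:z + I(x_tilde**2) + I(x_tilde**2):z",
        data=df,
    ).fit()

    # Final linear model after dropping the (nonsignificant) quadratic terms.
    linear = smf.ols("grade ~ x_tilde + z + x_tilde:z", data=df).fit()

    # Compare explanatory power across the nested models (R2 and adjusted R2).
    for name, m in [("full", full), ("quadratic", quadratic), ("linear", linear)]:
        print(name, "R2 =", round(m.rsquared, 3), "adj R2 =", round(m.rsquared_adj, 3))

    # In the final linear model, the coefficient on z estimates the
    # discontinuity at the cut-point (main effect) and the coefficient on
    # x_tilde:z estimates the difference in slopes (linear interaction).
    print(linear.params[["z", "x_tilde:z"]])
    print(linear.pvalues[["z", "x_tilde:z"]])
    return linear
```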


Results

Effects of Developmental Education on Level of English Achievement

A scattergram that contained ASSET scores by college-level English grades by groups was constructed. From this scattergram, a visual examination of the relevant regression lines shown in Figure 2 strongly suggested that a significant linear interaction was present. Those with the lowest pretest scores who completed developmental English coursework significantly increased their English achievement over what was predicted by the linear trend in nondevelopmental students (dashed line). This interpretation is confirmed by the significant linear interaction term in the model (β3 = −.22, p = .01). Hence, the treatment was found to have a greater positive influence on English achievement for those who were the most underprepared. As ASSET scores of developmental students moved closer to the cut-score, the treatment effect diminished. Results from the statistical analysis of the final, linear model indicate that the size of the intercept difference (β2) between the regression line for the developmental group and the regression line for the nondevelopmental group (at the cut-score) was

TABLE 1
Unstandardized Coefficients for Regression of English Achievement

Variable                                      B        SE B
β0: Intercept                                 2.78*    0.06
β1: Transformed pretest: x̃i (xi − 85)         0.02*    0.01
β2: Dummy variable: zi                       −0.02     0.09
β3: Linear interaction: x̃izi                 −0.22*    0.01

*p < .05.

not significant (β2 = −.02, p = .82). The scattergram also suggested the absence of a discontinuity in developmental group compared to the nondevelopmental group. (The y-intercept in the developmental group was 2.76; the y-intercept in the nondevelopmental group was 2.78.) Successfully removing nonsignificant cubic and quadratic terms (as well as their interactions) from the initial model resulted in a significant linear regression (F = 11.05, df = 3, p < .01). Beta coefficients, standard errors, and significance levels are presented in Table 1. As seen in Table 2, although the students who received developmental assistance did not receive an excessive proportion of high grades (A, A−, or B+), they did earn a smaller percentage of failing (F) grades (3.2% vs. 4.5%). In contrast, developmental students achieved consistently higher in the midrange (B, B−, C+, and C), and the cumulative percent for the developmental group (7.5%) was comparable to that in the nondevelopmental group (7.9%) for the unsatisfactory (C−, D+, and D) range.

TABLE 2
Grade Distribution by Group for College-Level English (N = 1,473)

                    Developmental group      Nondevelopmental group    Overall
Grade               (ASSET ≤ 85) % (n)       (ASSET ≥ 86) % (n)        % (n)
A     (4.0)         10.6  (69)               19.2  (158)               15.4  (227)
A−    (3.7)          9.2  (60)               14.3  (118)               12.1  (178)
B+    (3.3)         13.4  (87)               14.9  (123)               14.3  (210)
B     (3.0)         20.2  (131)              19.7  (162)               19.9  (293)
B−    (2.7)         12.5  (81)                8.0  (66)                10.0  (147)
C+    (2.3)          9.6  (62)                6.1  (50)                 7.6  (112)
C     (2.0)         11.4  (74)                7.2  (59)                 9.0  (133)
C−    (1.7)          5.5  (36)                2.8  (23)                 4.0  (59)
D+    (1.3)          1.8  (12)                1.6  (13)                 1.7  (25)
D     (1.0)          2.5  (16)                1.8  (15)                 2.1  (31)
F     (0.0)          3.2  (21)                4.5  (37)                 3.9  (58)
Mean  (SD)          2.74  (0.90)             2.96  (0.98)              2.86  (0.95)
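The following minimal sketch uses the coefficient estimates reported in Table 1 to show how the final linear model translates into the two fitted regression lines and why their values at the cut-score are 2.78 and 2.76. It is an illustration of the arithmetic only, not code from the study.

```python
# Minimal sketch (not from the article): plug the published Table 1 estimates
# into the final linear RD model, grade = b0 + b1*x~ + b2*z + b3*(x~*z).
b0, b1, b2, b3 = 2.78, 0.02, -0.02, -0.22  # intercept, pretest, group dummy, interaction

def predicted_grade(x_tilde: float, z: int) -> float:
    """Predicted college-level English grade from the final linear model."""
    return b0 + b1 * x_tilde + b2 * z + b3 * x_tilde * z

# At the cut-score the transformed pretest x~ is 0, so the two intercepts
# differ only by b2, the (nonsignificant) discontinuity estimate:
print(predicted_grade(0, z=0))  # nondevelopmental intercept: 2.78
print(predicted_grade(0, z=1))  # developmental intercept: 2.76

# The negative interaction b3 means the fitted developmental line sits
# increasingly above the projected nondevelopmental line as x~ falls below
# zero (students further below the cut-score), which is the pattern the
# Results section describes.
```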


Threats to Validity of Findings

The following section of the manuscript addresses potential threats to internal validity inherent in the RD design.

Selection by maturation

In this research context, one possible inferential weakness of the design is the fact that students in the developmental group must postpone taking regular English for at least one semester. To the degree that this postponement resulted in enhanced English achievement in the developmental group, one might infer that a possible increase in English achievement due to maturation could increase the posttest grade. No such increase would be expected in the nondevelopmental group because these students may take a regular English class immediately. To test the possible impact of selection by maturation, several supplementary analyses were conducted. In these analyses, two "time" variables were defined: (a) For all study participants, the number of terms until college-level English was completed (T1); and (b) for those in the developmental group, the number of terms between developmental English and completion of college-level English (T2). First, it should be noted that most nondevelopmental students immediately took college-level English (the median number of terms until taking college-level English was 0). In addition, the large majority of developmental students took college-level English soon after completing developmental English (the median was two terms). But more importantly, when the correlation between each of the two time variables and grade in college-level English was calculated, both were nonsignificant (T1 with grade, r = −.02; T2 with grade, r = −.01). Thus, there was no evidence suggesting that students who delayed taking college-level English had systematically higher grades.
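A sketch of how such a maturation check could be computed is shown below; it is illustrative only (not the authors' code), and the data frame and column names are hypothetical.

```python
# Minimal sketch of the selection-by-maturation check: correlate each "time"
# variable with the college-level English grade. Column names are hypothetical.
from scipy.stats import pearsonr

def maturation_check(df):
    # T1: terms until college-level English was completed (all students).
    r1, p1 = pearsonr(df["terms_to_college_english"], df["grade"])
    # T2: terms between developmental English and college-level English
    # (developmental group only).
    dev = df[df["developmental"] == 1]
    r2, p2 = pearsonr(dev["terms_after_developmental"], dev["grade"])
    print(f"T1 vs grade: r = {r1:.2f} (p = {p1:.2f})")
    print(f"T2 vs grade: r = {r2:.2f} (p = {p2:.2f})")
```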

Differential attrition

Another potential threat to the internal validity of this evaluation is posed by what Cook and Campbell (1979) termed differential attrition. It is possible that any advantage due to the intervention is attributable to those students who remained in the evaluation (students who completed the developmental class and completed college-level English, N = 649, when compared with those students who completed the developmental class but did not complete college-level English, N = 484). When these two groups were compared relative to their initial ASSET scores, however, it was found that there was no significant difference (t = −1.2, p = .25). Similarly, ASSET scores of students who scored nondevelopmental and completed college-level English (N = 824) were compared to scores of those who did not complete college-level English (N = 170). Again, there was no significant difference (t = .88, p = .38). We also compared the distribution of important characteristics such as race and ethnicity, gender, and age in the initial sample and in the final sample for both those students who scored developmental and those scoring nondevelopmental. None of the race measures was significantly different in the initial and final sample (and the largest difference was only 1.86%). A significant gender difference of 6.5 percentage points was found in the initial (55.1% females) versus final (61.6% females) sample in the developmental group, although no significant gender difference was found in the nondevelopmental group. Similarly, a small, significant difference in average age (20.4 years in the initial sample versus 19.6 years in the final sample) was found in the developmental group, whereas no significant age difference was found in the nondevelopmental group. Thus, in the few instances in which significant differences were found, the absolute differences were small and inconsistent. Together, these findings lend credence to the conclusion that the attrition from developmental and nondevelopmental groups was not differential and, therefore, did not have a substantial impact on the conclusions made in this study.
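The attrition comparison could be computed along the following lines; again, this is an illustrative sketch with hypothetical column names, not the analysis code used in the study.

```python
# Minimal sketch of the differential-attrition check: within each placement
# group, compare initial ASSET scores of students who did and did not complete
# college-level English. Column names are hypothetical.
from scipy.stats import ttest_ind

def attrition_check(df):
    for group, label in [(1, "developmental"), (0, "nondevelopmental")]:
        sub = df[df["developmental"] == group]
        completed = sub.loc[sub["completed_college_english"] == 1, "asset_score"]
        dropped = sub.loc[sub["completed_college_english"] == 0, "asset_score"]
        t, p = ttest_ind(completed, dropped)
        print(f"{label}: t = {t:.2f}, p = {p:.2f}")
```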


Discussion

This study began by asking the fundamental question, "Are developmental students being prepared for college-level studies?" Our response is a firm but qualified "yes." The findings of this study point to the effectiveness of developmental English courses in higher education, suggesting that those students who are most in need of developmental education received the most benefit. Our evidence, despite the warnings of some researchers, indicates that open admission policies did not lead to an achievement gap for the developmental English students in this research study.

Previous research on the effectiveness of developmental education contains methodological errors primarily related to lack of conceptual fit between pre- and postprogram measures, not adjusting for group differences that exist prior to developmental education, and lack of a comparison group (Cook & Campbell, 1979; Higbee, Arendale, & Lundell, 2005; O'Hear & MacDonald, 1995). The presence of these flaws does not allow policymakers to assess validly the benefits of their programs. Our approach in this study resolves these methodological issues by using the RD design as the context of assessment. The RD design essentially eliminates selection bias in initial English ability, the most significant threat to internal validity hampering the inferences made by earlier research in developmental education.

Further Considerations Regarding Correct Specification of the Statistical Model

Although the findings of this study are compelling, the strongest inferences in the RD design are possible in the context of a main effect rather than an interaction effect (Shadish, Cook, & Campbell, 2002). If the underlying regression surface is curvilinear rather than linear, the presumed presence of an interaction will be suspect (Shadish, Cook, & Campbell, 2002, p. 231). To assess this suspicion better, we provide several additional analyses and explanations. Our regression analyses began with a more complicated model involving both cubic and quadratic terms (and their respective interactions), then eliminated nonsignificant terms until we were left with a linear model. However, because the loss of the nonsignificant, higher-order terms necessarily increased the power of the statistical tests for the reduced, linear model, the process may have resulted artificially in a significant linear interaction. To shed light on this possibility, we examined the percentage of variance explained (R² value) of each model we tested. For the full model, the R² was .025; without the cubic terms, the R² was .024; without cubic and quadratic terms, the R² was .022. The insubstantial change in these three R² values indicated that the terms in the more complicated models added little explanatory power. More importantly, adjusted R² values (.021, .021, and .020, from the most complicated to the simplest model, respectively)—which do not neces-

sarily decrease as terms from the higher-order model are eliminated—remained essentially the same. Taken together, this evidence suggests that higher-order terms were relatively unimportant in increasing the explanatory power of the linear model. Next, we directly tested the departure from linearity of the average grade at each ASSET pretest value for both the nondevelopmental and the developmental group. In the nondevelopmental group, there was a nonsignificant deviation from linearity, F (18, 804) = 1.41, p = .12; similarly, a nonsignificant deviation from linearity was found in the developmental group, F (27, 620) = 0.65, p = .92. These analyses both suggest that the regression surface was not curved. Finally, we note the possibility that a "floor effect" in the developmental group might exist if faculty were inclined to raise the grade of students who earned a D+, D, or F to an inflated C− (a passing grade). Such a tendency would make the scatterplot in the developmental group deviate from linearity, thus contributing to an interaction. Ideally, this possibility could be probed by examining grades and ASSET scores from previous periods when there was no developmental program. Unfortunately, these data were not available because the preprogram measurement did not occur until the program was implemented. In summary, the preponderance of evidence that we have presented is consistent with the viability of our conclusion that the posited linear interaction effect is real. However, in practical terms, the data in this study do not easily lend themselves to a completely unambiguous choice between a linear interaction and a curvilinear regression surface. Thus, we prefer a substantial degree of inferential caution, concluding only that the curvilinear relation is reasonably implausible.

Study Limitations and Policy Assessments

In this study, students who fell below the cut-score but failed to complete the developmental course pose a potential threat to external validity rather than to internal validity. From an educational policy perspective, one can intervene only with those who opt to make themselves available for the program. As researchers have pointed out, there may be numerous variables that influence students' enrollment decisions outside of being classified as needing developmental education (Perna, 2000; Tinto, 1975, 1997). Taking our find-


ings into consideration, concentrated attempts to identify these students and to provide additional resources that maintain enrollment in developmental programs may lead to more successful educational outcomes. This study examined data within a single, large, suburban community college with a relatively small percentage of African American, Asian, Native American, and Hispanic students. Student performance in related disciplines was not measured but may also be beneficially affected, especially if those disciplines require the application of the skills learned in the developmental discipline. Subsequent research will be required to determine if similar results are found in more racially and ethnically diverse populations, in disciplines other than English, in groups for which English is not their primary language, as well as in different types of higher education institutions. Although there has been considerable scholarly debate regarding the appropriateness of developmental education in higher education (Bastedo & Gumport, 2003; Shaw, 1997), it appears to be a practical issue that institutions can no longer ignore. By utilizing more rigorous research designs (Cook, 2002), institutional policymakers can eliminate some of the previous research biases that affect decision-making. More importantly, these findings suggest that readily available data can be used to determine developmental education effectiveness without expanding data collection efforts. The data presented in this study (grades, assessment scores) are routinely available to institutional researchers and administrators, and the analytic approach can be applied to other programs that use a cut-score–based criterion to assign students to groups. An evaluative component that allows educational policymakers to discern truly effective programs at little additional expense will be beneficial to all future generations of students.

Notes

1. Because of this policy, 953 students did not take the placement examination in the original fall semester. Students were also exempted from the placement exam if their composite ACT score exceeded 24 or their SAT composite score was over 950. This restricted the study cohort to only those who took the examination. Boylan, Bliss, and Bonham (1997) posited that this type of policy may underestimate the true demand for developmental education because those students who are least prepared may opt out and abandon their education without being referenced.

2. Although debate exists regarding the precise usage of the terms developmental education and remedial education (Kozeracki, 2005), we use the former description to encapsulate the activities at the college.
3. At that time, the college did not mandate the use of the ASSET numeric score to restrict student access to math courses. Instead, it was used as a self-directive device for students to gauge their own mathematical capabilities. Hence, the numeric score was not considered for this study.

References Alba, R. D., & Lavin, D. E. (1981). Community colleges and tracking in higher education. Sociology of Education, 54, 223–237. American Association of Community and Junior Colleges. (1989). Policy statements of the American Association of Community and Junior Colleges. In American Association of Community and Junior Colleges membership directory. Washington, D.C.: Author. ASSET (1994). ASSET technical manual. Iowa City, IA: The ACT Publications. Astin, A. W. (1985). Achieving educational excellence. San Francisco: Jossey-Bass. Astin, A. W. (1993). What matters in college? San Francisco: Jossey-Bass Bastedo, M. N., & Gumport, P. J. (2003). Access to what? Mission differentiation and academic stratification in U.S. public higher education. Higher Education 46, 341–359. Bettinger, E. P., & Long, B. T. (2005). Remediation at the community college: student participation and outcomes. New Directions for Community Colleges, 129, 17–26. Boylan, H. R., Bliss, L. B., & Bonham, B. S. (1997). Program components and their relationship to student success. Journal of Developmental Education, 20(3), 2–8. Boylan, H. R., & Bonham, B. S. (1992). The impact of developmental education programs. Review of Research in Developmental Education, 9(5), 1–3. Boylan, H. R., & Saxon, D. P. (2000.) What works in remediation: Lessons from 30 years of research. Retrieved June 2, 2006 from http://www.ncde. appstate.edu/reserve_reading/what_works.htm Brothen, T., & Wambach, C. (2002). Developmental theory: The next steps. The Learning Assistance Review, 7(2), 37–44. Burley, H., Butler, B., & Cejda, B. (2001). Dropout and stopout patterns among developmental education students in Texas community colleges. Community College Journal of Research and Practice, 25, 767–782. Campbell, D. T., & Russo, M. (1999). Social experimentation. Thousand Oaks, CA: Sage.



Moss and Yeaton Casazza, M. E. (1998). Strengthening practice with theory. Journal of Developmental Education, 22(2), 14–16, 18, 20, 43. Casazza, M. E. (1999). Who are we and where did we come from? Journal of Developmental Education, 23(1), 2–4, 6–7. Casazza, M. E., & Silverman, S. L. (1996). Learning assistance and developmental education. San Francisco, CA: Jossey-Bass. Chaffee, J. (1992). Critical thinking skills: The cornerstone of developmental education. Journal of Developmental Education, 15(3), 2–8, 39. Chung, C. (2005). Theory, practice, and the future of developmental education. Journal of Developmental Education, 28(3), 2–4, 6, 8, 10, 32–33. Cook, T. D. (2002). Randomized experiments in educational policy research: A critical examination of the reasons the educational evaluation community has offered for not doing them. Educational Evaluation and Policy Analysis, 24(3), 175–199. Cook, T. D., & Campbell, D. T. (1979). Quasiexperimentation: Design and analysis issues for field settings. Boston, MA: Houghton-Mifflin. Dougherty, K. J. (1997a). The community college: Perils and prospects. Community Review, 15 (7), 7–11. Dougherty, K. J. (1997b). Mass higher education: What is its impetus? What is its impact? Teachers College Record, 99 (1), 66–72. Ewell, P. T. (1991). Assessment and public accountability: Back to the future. Change, 23(6), 12–17. Good, J. M. (2000). Evaluating developmental education programs by measuring literacy growth. Journal of Developmental Education, 24(1), 30–38. Grimes, S. K., & David, K. C. (1999). Underprepared community college students: Implications and experiential differences. Community College Review, 27(2), 73–92. Grubb, W. N., & Cox, R. D. (2005). Pedagogical alignment and curricular consistency: The challenges for developmental education. New Directions for Community Colleges, 129, 93–103. Hasit, C., & DiObilda, N. (1996). Portfolio assessment in a college developmental reading class. Journal of Developmental Education, 19(3), 26–28, 30–31. Higbee, J. L., Arendale, D. R., & Lundell, D. B. (2005). Using theory and research to improve access and retention in developmental education. New Directions for Community Colleges, 129, 5–15. Kozeracki, C. A. (2005). Responsive developmental education. New Directions for Community Colleges, 129, 1–4. Lundell, D. B., & Collins, T. (1999). Toward a theory of developmental education: The centrality of “discourse.” In J. L. Higbee & P. L. Dwinell (Eds.), The expanding role of developmental education (pp. 3–20). Morrow, GA: National Association for


Developmental Education. Retrieved January 19, 2006, from http://nade.net/documents/mono99/ mono99.1.pdf Maxwell, M. (1997). Improving student learning skills. Clearwater, FL: H & H. Morante, E. A. (1986). The effectiveness of developmental programs: A two-year follow-up study. Journal of Developmental Education, 9(3), 14–15. National Center for Education Statistics [NCES]. (1996). Remedial education at higher education institutions in fall 1995. Washington, DC: U.S. Department of Education. National Center for Education Statistics [NCES]. (2003). Remedial education at degree-granting postsecondary institutions in fall 2000. Washington, DC: U.S. Department of Education. Oakland Community College [OCC]. (1998). Oakland Community College catalog. San Diego, CA: Career Guidance Foundation. O’Hear, M. F., & MacDonald, R. B. (1995). A critical review of research in developmental education, Part I. Journal of Developmental Education, 19(2), 2–4, 6. Perna, L. W. (2000). Racial and ethnic group differences in college enrollment decisions. New Directions for Institutional Research, 107, 65–83. Roueche, J. E., & Roueche, S. D. (1999a). High stakes, high performance. Making remedial education work. Washington, DC: Community College Press. Roueche, J. E., & Roueche, S. D. (1999b). Keeping the promise: Remedial education revisited. Community College Journal, 69(5), 12–18. Saxon, D. P., & Boylan, H. R. (2001). The cost of remedial education in higher education. Journal of Developmental Education, 25(2), 2–8. Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. New York: Houghton-Mifflin. Shaw, K. M. (1997). Remedial education as ideological battleground: Emerging remedial education policies in the community college. Educational Evaluation and Policy Analysis, 19(3), 284–296. Tinto, V. (1975). Dropout from higher education: A synthesis of recent research. Review of Educational Research, 45(1), 89–125. Tinto, V. (1997). Classroom as communities: Exploring the educational character of student persistence. Research in Higher Education, 68 (6), 599–623. Trochim, W. M. K. (1984). Research design for program evaluation: The regression-discontinuity approach. Beverly Hills, CA: Sage Publications. Trochim, W. M. K. (1990). Regression-discontinuity design in health evaluation. In L. Sechrest, E. Perrin, & J. Bunker (Eds.). Research methodology: Strengthening causal interpretations of nonexperimental data. Washington, DC: U.S. Department of Health and


Human Services, Agency for Health Care Policy and Research. Trochim, W. M. K. (2004). Regression-discontinuity analysis. Retrieved August 2, 2004, from http://www.socialresearchmethods.net/kb/statrd.htm Umoh, U. J., Eddy, J., & Spaulding, D. J. (1994). Factors related to student retention in community college developmental education mathematics. Community College Review, 22(2), 37–47. Wambach, C., Brothen, T., & Dikel, T. (2000). Toward a developmental theory for developmental educators. Journal of Developmental Education, 33(1), 2–4, 6, 8, 10, 29. Waycaster, P. (2001). Factors impacting success in community college developmental mathematics courses and subsequent success. Community College Journal of Research & Practice, 25(5/6), 403–417. Weiss, C. H. (1998). Evaluation (2nd ed.). Upper Saddle River, NJ: Prentice Hall. Weissman, J., Bulakowski, J., & Jumisko, M. K. (1997). Using research to evaluate developmental programs and policies. New Directions for Community Colleges, 100, 73–80.

Weissman, J., Silk, E., & Bulakowski, C. (1997). Assessing developmental education policies. Research in Higher Education, 38 (2), 187–200.

Authors BRIAN G. MOSS is an adjunct faculty member at Wayne State University’s School of Social Work and a sociology faculty member at Oakland Community College, Department of Behavioral Sciences, 7350 Cooley Lake Road, Waterford, MI 48327; [email protected]. His areas of specialization are research design and analysis, evaluation techniques, and educational and health outcomes. WILLIAM H. YEATON is an evaluation consultant who teaches in the Summer Institute at the Institute for Social Research, University of Michigan, 426 Thompson Street, P.O. Box 1248, Ann Arbor, MI 48106; [email protected]. His areas of specialization are evaluation methods and research design. Manuscript received January 4, 2005 Revision received June 12, 2006 Accepted June 21, 2006

