Fidelity of Implementation to Process, Paper #3

Running Head: FIDELITY OF IMPLEMENTATION TO PROCESS

Teacher and Student Fidelity of Implementation to Process: Quality of Delivery and Student Responsiveness and Relationships to Classroom Achievement Carol O’Donnell, Sharon Lynch, William Watson, and Vasuki Rethinam The George Washington University

Prepared for the annual meeting of the American Educational Research Association, April 2007, Chicago, IL

This work was based on work conducted by SCALE-uP, a collaboration between George Washington University and Montgomery County Public Schools (MD); Sharon Lynch, Joel Kuipers, Curtis Pyke, and Michael Szesze serve as principal investigators of SCALE-uP. Funding for SCALE-uP was provided by the National Science Foundation, the U.S. Department of Education, and the National Institutes of Health (REC-0228447). Any opinions, findings, conclusions, or recommendations are those of the author(s) and do not necessarily reflect the position, policy, or endorsement of the funding agencies.

Correspondence to: Carol L. O'Donnell; E-mail: [email protected]; Work Phone: 202-994-4182; FAX: 202-994-0692


Teacher and Student Fidelity of Implementation to Process: Quality of Delivery and Student Responsiveness and Relationships to Classroom Achievement

"The bridge between a promising idea and the impact on students is implementation, but innovations are seldom implemented as intended." (Berman & McLaughlin, 1976, p. 349)

Background

Current emphasis in policy circles on "evidence-based" research and the scale-up of education interventions has created renewed interest for the science education community in the importance of studying fidelity of implementation. To ensure scientific integrity, the Institute of Education Sciences (IES)--the research arm of the U.S. Department of Education (USDOE) established by the Education Sciences Reform Act of 2002--requires intervention researchers to describe how treatment fidelity will be measured, how often it will be assessed, and the degree of acceptable variation during an intervention study (USDOE, 2006). This is critical not only to determine whether the intervention is implemented (as an internal validity check), but also to determine how fidelity of implementation relates to student outcomes. Without reliable and valid measures of levels of fidelity during implementations of an intervention in classrooms, researchers may have insufficient evidence to support internal validity. This paper contends that valid studies of reform-based interventions must pay careful attention to fidelity of implementation and that such attention is necessary for the restructuring of science education research.

Fidelity of implementation is traditionally defined as the determination of how well an intervention is implemented in comparison with the original program design during an efficacy and/or effectiveness study (Mihalic, 2002; see also Berman & McLaughlin, 1976; Biglan & Taylor, 2000; Freeman, 1977; Fullan, 2001; Hord, Rutherford, Huling-Austin, & Hall, 1987;


Lipsey, 1999; National Research Council [NRC], 2004; Patton, 1978; Scheirer & Rezmovic, 1983; USDOE, 2006). More specifically, it is "the extent to which the user's current practice matche[s] the...'ideal'" (Loucks, 1983, p. 4). Fidelity of implementation is a relatively recent construct in K-12 curriculum effectiveness research, but its use in program evaluation dates back 30-35 years (Mowbray, Holter, Teague, & Bybee, 2003; Sechrest, West, Phillips, Redner, & Yeaton, 1979).

Until the 1970s, researchers assumed that fidelity of implementation would be high during program adoption and that implementers would copy or imitate the specified procedures of an innovation exactly as earlier adopters had used it (Rogers, 2003). This view was held because adopters were "considered to be rather passive acceptors of an innovation, rather than active modifiers of a new idea" (Rogers, 2003, p. 180). However, once researchers recognized that fidelity of implementation to a program's specified procedures was not always a given and that adopters would adapt an innovation to suit their local needs, "they began to find that quite a lot of it occurred" (Rogers, 2003, p. 180). Notable researchers began to discuss the tension between fidelity and adaptation, and outlined ways of resolving this tension (Blakely et al., 1987; Cho, 1998; Dusenbury, Brannigan, Falco, & Hansen, 2003; Hall & Loucks, 1978; Hedges, 2004; Lee, 2002; U.S. Dept of Health and Human Services, 2002). Reviewers of the fidelity of implementation literature began to move beyond the traditional view of fidelity as adherence (i.e. whether the core components, protocols, and materials of the program are being delivered as prescribed), and stated that fidelity of implementation also includes quality of delivery (i.e. the manner in which the implementer delivers the program using the techniques, processes, or methods prescribed by the program) and participant responsiveness (i.e. the extent to which participants receive or are engaged by and


involved in the activities or content of the program) (Dane & Schneider, 1998; Dusenbury et al., 2003; Resnick et al., 2005). Researchers (Gersten et al., 2005; Lynch & O'Donnell, 2005; Mowbray et al., 2003) divided these criteria into two groups--Fidelity to Structure (i.e. adherence) and Fidelity to Process (i.e. quality of delivery)--with participant responsiveness taking on characteristics of both (A. Ruiz-Primo, personal communication, March 23, 2005). Hallmark studies in the health field have underscored the importance of measuring both the structure and process of implementation during effectiveness studies (Connell, Turner, & Mason, 1985; Resnicow, Cohn, Reinhardt, Cross, Futterman, Kirschner, Wynder, & Allegrante, 1992; Taggart, Bush, Zuckerman, & Theiss, 1990), and they provide evidence that fidelity to process (including quality of delivery and participant responsiveness) accounts for differential effectiveness in program outcomes (Hansen, Graham, Wolkenstein, & Rohrbach, 1991; see also Hopkins, Mauss, Kearney, & Weisheit, 1988; Mitchel, Hu, McDonnell, & Swisher, 1984; Tortu & Botvin, 1989).

Context

This paper explores teachers' quality of delivery (referred to in this paper as "Teacher Fidelity to Process") and students' responsiveness to implementation (referred to as "Student Fidelity to Process") within the context of a large-scale quasi-experiment called the Scaling up Curriculum for Achievement, Learning, and Equity Project (SCALE-uP), a study at The George Washington University funded by the NSF, the USDOE, and the National Institutes of Health through the Interagency Education Research Initiative (IERI) (Lynch, Kuipers, Pyke, & Szesze, 2002). SCALE-uP reached about 67,000 students over six years (2001-2007).

Theoretical Foundation


The main purpose of SCALE-uP was to use both quantitative and qualitative research methods to examine the effectiveness and scale-up of three middle school science curriculum units rated according to the American Association for the Advancement of Science (AAAS) Project 2061 Curriculum Analysis, which includes both a Content Analysis and an Instructional Analysis (AAAS, 2003; Kesidou & Roseman, 2002). Using this analysis, Project 2061 identified a narrow selection of reform-based science curriculum materials that align with specific key ideas in earth, life, and physical science (as summarized by the Content Analysis) and are rated Satisfactory or Excellent on many of the 22 instructional strategies (as determined by the Instructional Analysis). These instructional strategies fall into seven categories:

I. Providing a sense of lesson and unit purpose
II. Taking account of student ideas and misconceptions
III. Engaging students with relevant phenomena
IV. Developing and using scientific ideas
V. Promoting student thinking about phenomena, experiences, and knowledge
VI. Assessing progress
VII. Enhancing the science learning environment and promoting curiosity for all students

Although the Project 2061 Curriculum Analysis was designed to help evaluators make instructional judgments about written curriculum materials, SCALE-uP used the Curriculum Analysis as a theoretical framework to better understand implementation of the treatment unit in classrooms.

Traditional views of fidelity of implementation focus predominantly on adherence to the structural components of the program (or program "script"); this is important to determine if the


independent variable is implemented as intended (an internal validity check) (cf. Lastica & O'Donnell, 2007). However, SCALE-uP proposed that in order to assess fidelity of implementation to a reform-based curriculum intervention, a researcher must be guided not just by the program's structure, but also by its program theory--or a "plausible and sensible model of how a program is supposed to work" (Bickman, 1987, p. 5). That is, an effectiveness study should also examine whether the processes actually observed in the classroom fit the processes that should be occurring "in theory" (Chatterji, 2004, p. 8; Ochsendorf, Lynch, & Pyke, 2006).

The program theory of the Project 2061 Curriculum Analysis is developed from Science for All Americans (1990) and Benchmarks for Science Literacy (1993) and is based on cognitive and social constructivist learning theories, including knowledge organization, the role of prior knowledge in learning, and conditions that facilitate the transfer of knowledge (Kesidou & Roseman, 2002). The instructional strategies in each of the seven categories of the Instructional Analysis may be considered constructivist in nature and are designed to promote conceptual change, as well as to help students distribute cognition as they work together to solve problems and justify their ideas to others (Kesidou & Roseman, 2002).

Classroom observations are a means to see program theory in action. Therefore, to better understand Fidelity to Process, SCALE-uP conceptualized "Teacher Fidelity to Process" as a teacher's ability to recognize the instructional strategies that were rated Satisfactory or Excellent in the treatment unit and to implement those strategies with high fidelity in the classroom. "Student Fidelity to Process" was conceptualized as a student's ability to perceive the presence of these strategies. It was SCALE-uP's contention that if there were instructional strategies rated Satisfactory or Excellent in the treatment unit, it should be possible to use


classroom observations to measure a teacher's fidelity to those strategies and student questionnaires to better understand students' perceptions of such strategies.

Research Design

Setting and Study Population

SCALE-uP is set in Montgomery County Public Schools (MCPS), Maryland--a large and socio-economically, ethnically, and linguistically diverse school district located in the suburbs of Washington, DC, comprising almost 140,000 students who represent 163 countries and speak 123 languages (MCPS, 2006). SCALE-uP divided MCPS' 37 middle schools into 5 categories based on similar school demographic and achievement profiles, indexing the schools primarily by the percentage of students eligible for Free and Reduced-price Meal Status (FARMS).

SCALE-uP ran seven quasi-experiments in all: two consecutive quasi-experiments for each of the three middle school science treatment units tested between 2001-2005, plus one additional quasi-experiment for the sixth grade unit in 2005-2006. During each set of consecutive quasi-experiments, one school from each demographic category was randomly assigned to the treatment condition and the other to the comparison condition. Middle school science teachers in the treatment schools implemented the treatment unit, while comparison teachers used the "business-as-usual" curriculum materials available in MCPS to teach the same target benchmarks as the treatment. Each SCALE-uP quasi-experiment used student-level data to determine the unit's effectiveness during each iteration, providing sufficient power for analysis when data were disaggregated by demographic subgroup. Because the unit of random assignment (school) was different from the unit of analysis (student), SCALE-uP was considered a quasi-experiment and not a true experiment (USDOE, 2003).

Intervention


This paper explores fidelity of implementation research focused on the sixth grade treatment unit, Exploring Motion and Forces: Speed, Acceleration, and Friction (ARIES: M&F), during its third year of implementation (2005-2006). ARIES: M&F, developed by the Harvard-Smithsonian Center for Astrophysics (2001), is a six-week physical science curriculum unit designed for the ARIES curriculum program for grades 5-8. The curriculum materials are inquiry-centered and activity-based, with an emphasis on students' direct experience with phenomena. A substantial amount of materials is required to implement ARIES: M&F. These materials (sliding disks, ramps, marbles, bells, rolling carts, etc.) are often constructed and then used by the students themselves as they explore physical phenomena.

In 2003, prior to its first quasi-experiment in SCALE-uP, ARIES: M&F received Satisfactory or Excellent ratings on approximately half of the AAAS Project 2061 Curriculum Analysis instructional criteria (cf. Ochsendorf, Deprey, Lopez, Roudebush, Lynch, & Pyke, 2003), and it contained more Satisfactory and Excellent ratings than traditional textbooks rated by the analysis (see Kesidou & Roseman, 2002 for more information on each indicator and the rating scales within Project 2061).

For two consecutive years (2003-2005), ARIES: M&F was studied in MCPS (n ~ 3,000 students each year). As reported in a NARST 2006 symposium, results for ARIES: M&F during SCALE-uP Years 2 and 3 (2003-2005) could be considered at best ambiguous (Ochsendorf, Pyke, Lynch, & Watson, 2006). Overall effect sizes were low, and disaggregated data on interaction effects showed that the intervention might actually be increasing achievement gaps. The present study, however, is based upon a third iteration of a quasi-experiment of ARIES: M&F conducted with approximately 3,000 6th graders in 10 new schools during spring, 2006. This new set of five matched pairs of Treatment and Comparison schools was


demographically and educationally similar to the previous years' schools.1 Contrary to prior results, in this iteration, in which fidelity of implementation was carefully monitored, the overall effect favored ARIES: M&F (effect size of approximately .23). Moreover, when the data were disaggregated by SES, ethnicity, gender, or eligibility for ESOL or special education services, students in the ARIES: M&F condition outscored their peers in the comparison condition, with overall positive effect sizes (as shown in Figure 1); significant interactions suggested achievement gaps were closing (Watson, Pyke, Lynch, & Ochsendorf, 2007).

Research Questions

Given the conceptual framework developed by SCALE-uP to study fidelity of implementation (see Lynch & O'Donnell, 2007 and Figure 2), the collection and analysis of the Year 4 Fidelity to Process data in ARIES: M&F classrooms were guided by the following research questions:

1. To what extent do teachers demonstrate, and students perceive the presence of, Project 2061 instructional strategies in the Treatment classrooms?

2. Is there a relationship between Teacher Fidelity to Process and class mean achievement in the ARIES: Motion and Forces classrooms?

3. Is there a relationship between Student Fidelity to Process and class mean achievement in the ARIES: Motion and Forces classrooms?

4. If statistically significant relationships exist, how much of the variation in class mean achievement is predicted by each Fidelity to Process variable?
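The overall effect size of approximately .23 reported above is a standardized mean difference. As an illustrative sketch only (the paper does not state the exact formula used; the pooled-standard-deviation form of Cohen's d is assumed here, and all numbers below are invented, not the study's actual score distributions):

```python
from math import sqrt

def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference between a treatment and a comparison
    group, divided by the pooled standard deviation of the two groups."""
    pooled_sd = sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                     / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

# Invented values: treatment mean 6.1 (SD 2.6, n 1500)
# vs. comparison mean 5.5 (SD 2.6, n 1500)
print(round(cohens_d(6.1, 2.6, 1500, 5.5, 2.6, 1500), 2))  # 0.23
```

With equal group SDs the pooled SD reduces to that common SD, so a .6-point raw difference on a scale with SD 2.6 yields an effect of about .23.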

1 One treatment school did not complete its assessments; this school and its matched school were not included in the final Year 4 analysis.


Conjectures

The SCALE-uP fidelity team conjectured that curriculum materials containing instructional strategies rated Satisfactory or Excellent by the Project 2061 Instructional Analysis would support teachers' use of these reform-based instructional strategies in the classroom, and that students would perceive the presence of these strategies in the classroom. Moreover, measures of fidelity of implementation to these instructional strategies would be related to outcomes.

Methods

Study Sample

The fidelity of implementation study sample during Year 4 (2005-2006) of SCALE-uP, reported in this paper, consists of 30 Treatment classrooms randomly selected from a total of 44 Treatment classrooms in 5 schools. The sample classrooms were picked randomly using SPSS version 12 so that the results from the fidelity study could be generalized to the full Year 4 study sample in which the effectiveness of ARIES: M&F was being tested. Six classrooms from each school, which had been stratified by socio-economic status and other demographic characteristics, were selected. In order to obtain teacher variability, at least one classroom of each teacher in each school was included in the sample. Demographically, these classrooms included a range of combinations of highly diverse students, collectively representative of all 6th grade classes in MCPS. (Self-contained classrooms of students with disabilities or students learning English were not included in the sample because their teachers were not likely to have attended professional development for ARIES: M&F.)

Instrumentation


Using a combination of classroom observations and student questionnaires, SCALE-uP developed a set of instruments to assess Teacher and Student Fidelity to Process. These instruments included the Instructional Strategies Classroom Observation Protocol (ISCOP), the Lesson Flow Classroom Observation Protocol (LFCOP), and the Student Responsiveness Questionnaire (SRQ). Table 1 summarizes the data collection procedures for these instruments.

Instructional Strategies Classroom Observation Protocol (ISCOP). Working collaboratively with the district's Department of Shared Accountability (DSA), SCALE-uP researchers developed the ISCOP (O'Donnell, Lynch, & Merchlinsky, 2006), a unit-independent measure of Teacher Fidelity to Process. Through the observation of classroom instructional strategies, the ISCOP measures the extent to which the first five Project 2061 instructional characteristics are evident during the enactment of a target idea. Because SCALE-uP believed that only the enactments of the first five categories could be observed during any one lesson, the ISCOP was designed around 20 items that represent Categories I through V of the Project 2061 Instructional Analysis: (I) Identifying sense of purpose (6 items); (II) Taking account of student ideas (5 items); (III) Engaging students with relevant phenomena (2 items); (IV) Developing and using scientific ideas (5 items); and (V) Promoting student thinking about phenomena, experiences, and knowledge (2 items). Category I is scored with a dichotomous rubric (0 = Not Evident; 1 = Evident); Categories II through V use a four-point Likert-like scale (0 = Not Evident; 1 = Evident; 2 = Evident More than One Time; 3 = Evident With Emphasis).

To establish content validity, which Parrott (1991) defines as "the degree to which a particular area of interest is being adequately covered," SCALE-uP aligned the 20 items on the


ISCOP to the Project 2061 variables found in the Instructional Analysis. To attain this alignment, three SCALE-uP researchers not involved with the development of the ISCOP and with experience in the Project 2061 Curriculum Analysis separately examined the 20 items on the ISCOP and aligned them with the 13 criteria in the first five categories of the Project 2061 Instructional Analysis. To achieve one-to-one alignment, inter-item correlations were used to combine some of the 20 ISCOP items into one variable, as shown in Figure 3. The researchers then came together to reconcile alignments. For all criteria, the percentage of agreement between raters was .75 to 1.0 before reconciliation.

Classroom observers, who included MCPS DSA staff, were experienced in educational classroom observations, had a working knowledge of the motion and forces target idea being observed, and could follow discussion of the nature of this science idea as outlined by the Project 2061 instructional criteria. Observers were trained on the Project 2061 Instructional Analysis and the ISCOP protocol by the first author and a DSA representative using video data. Inter-rater reliability (IRR)—the average percentage agreement between two raters—was calculated twice, once after video training and a second time after re-training and reconciliation, in which observers discussed the rating for each variable. IRR for the ISCOP during Year 4 ranged from .30 to .80, with an average agreement score between raters of .60. Despite retraining, one rater with consistently low scores had to be released from the study.

Lesson Flow Classroom Observation Protocol (LFCOP). The LFCOP measures how a classroom teacher organizes class time within three categories—topic, centeredness, and lab. More specifically, two members of the SCALE-uP fidelity team (Lynch & Hanson, 2005)


designed this instrument to enable researchers to look at the percentage of class time spent on activities related to the target ideas (i.e. on topic), activities unrelated to the target ideas (i.e. off topic), the centeredness (teacher, student, or group) of those activities, and whether or not those activities are lab-based. Units rated highly in many Project 2061 categories explicitly direct students to work in groups to explore relevant physical phenomena through data collection, analysis, and reasoning from evidence; therefore, treatment classes were expected to use lab-based activities and to be more student-group-centered than teacher-centered.

Using the same processes as for the ISCOP instrument training, LFCOP observers, who consisted of DSA and GWU researchers, were trained using classroom video. Cohen's kappa coefficient was used to establish inter-rater reliability: the average kappa for the LFCOP topic category was .83; the average kappa for centeredness was .73; and there was perfect agreement within the lab category. Generally, a kappa greater than .7 is considered satisfactory (Howell, 1997).

Student Responsiveness Questionnaire (SRQ). SCALE-uP hypothesized that students' engagement could depend upon the extent to which they participated in activities that are indicative of the presence of Project 2061 instructional strategies in the classroom: the higher a teacher's fidelity to the Project 2061 instructional strategies, the higher the students' perceptions should be that these strategies occurred during the teaching of ARIES: M&F. Therefore, two SCALE-uP fidelity team members developed the SRQ (Lynch & Watson, 2006) to describe students' direct experience with curriculum materials and capture whether students experience the presence of the Project 2061 instructional strategies during the curriculum implementation.
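The two inter-rater indices used for the observation protocols above, average percentage agreement (ISCOP) and Cohen's kappa (LFCOP), can be sketched as follows. This is an illustrative implementation, not the study's actual procedure, and the eight interval codes below are invented:

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of items on which two raters gave the identical code."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement: (Po - Pe) / (1 - Pe), where Pe is the
    agreement expected from each rater's marginal code frequencies."""
    n = len(rater_a)
    p_obs = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_exp = sum((counts_a[c] / n) * (counts_b[c] / n)
                for c in set(counts_a) | set(counts_b))
    return (p_obs - p_exp) / (1 - p_exp)

# Invented LFCOP-style topic codes for eight observation intervals:
a = ["on", "on", "off", "on", "on", "on", "off", "on"]
b = ["on", "on", "off", "on", "off", "on", "off", "on"]
print(percent_agreement(a, b))          # 0.875
print(round(cohens_kappa(a, b), 2))     # 0.71
```

The example illustrates why kappa is preferred for the LFCOP categories: raw agreement of .875 shrinks to a kappa of about .71 once chance agreement on the dominant "on topic" code is removed.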
The 18 items on the SRQ were developed using the language of Categories I through V of the Project 2061 Instructional Analysis. Students scored each item along a 5-point Likert-like


scale, from Not True to Very True. Because the development of this instrument was based on each of the first five categories of the Project 2061 Instructional Analysis, SCALE-uP hypothesized the presence of five factors in the SRQ. To test for these five factors, factor analysis, internal consistency reliability, and face validity were employed.

Using the full set of classrooms whose students completed the SRQ (n = 797 students from both treatment and comparison classrooms), the rotated component matrix for the factor analysis conducted on the 18 items of the SRQ revealed the presence of three factors (shown in Table 2): Sense of Purpose (corresponding to Project 2061 Category I), Thinking about Scientific Phenomena (Categories II and V), and Using and Talking about Scientific Phenomena (Categories III and IV). Together, these three factors accounted for a combined 53.46% of the variance in the data. The factors revealed by the factor analysis were shown to have adequate internal consistency to be used for data analysis: Cronbach's alpha statistics for Factor 1, Factor 2, and Factor 3 were .80, .83, and .77, respectively. Because the factors showed adequate internal consistency, scale scores for each factor were computed by dividing the sum of the scores for each item in the factor by the total number of items in the factor.

Motion and Forces Assessment (MFA). An analysis and development procedure created in collaboration with AAAS Project 2061 (Deboer, 2005; Stern & Ahlgren, 2002) was used to develop the Motion and Forces Assessment (MFA) prior to Year 2 of the SCALE-uP study. The Year 4 MFA consisted of 10 items (6 constructed response and 4 selected response) that presented students with 4 different physical phenomena and asked them to respond to questions about motion and force. Raters coded student responses using a rating guide that categorized student


responses according to their alignment with a scientifically appropriate understanding of the target benchmark (average inter-rater reliability = 0.82).

Data Collection Procedures

ISCOP, LFCOP, SRQ, and MFA data were collected from 30 classrooms of 11 teachers in 5 treatment schools. ISCOP and LFCOP observers were sent into each selected classroom simultaneously, in order to minimize classroom interruption and to assess both components of Teacher Fidelity to Process during the same lesson. DSA personnel made initial contact with the classroom teachers in January 2006 and coordinated all observations (see Merchlinsky & Hansen-Grafton, 2007 for more information). Lesson observations were not conducted at random, but were constrained by the schedules of the observed classroom teachers and the availability of DSA and SCALE-uP staff. Observers were required to capture an entire lesson, and some lessons extended beyond one class period on a given day. Observations for each class period lasted 45-100 minutes, depending on the schedule of “regular” versus block class periods. Classroom observations began February 1, 2006 and were completed at the end of March 2006. (Comparison classrooms were also observed using the same instruments, and teachers in both the treatment and comparison conditions were interviewed for background and experience; these data are beyond the scope of this paper. See Lynch, O’Donnell, Hatchuel, & Rethinam, 2007.)

After each classroom finished implementing the treatment unit, DSA distributed the MFA and SRQ to each school, and each teacher administered them to his/her classroom. The SRQ was given to students after the MFA outcome assessment was completed. GWU researchers then scored and aggregated individual student data for both the SRQ and MFA to the classroom level.
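The internal-consistency and scale-score computations described earlier for the SRQ (Cronbach's alpha for each factor, and a factor's scale score as the sum of item scores divided by the number of items) can be sketched as follows; the response data below are invented for illustration:

```python
def cronbach_alpha(item_columns):
    """item_columns: one list of scores per item, aligned by respondent.
    alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k, n = len(item_columns), len(item_columns[0])

    def variance(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    totals = [sum(col[j] for col in item_columns) for j in range(n)]
    return (k / (k - 1)) * (1 - sum(variance(c) for c in item_columns)
                            / variance(totals))

def scale_score(item_scores):
    """One respondent's factor score: sum of item scores / number of items."""
    return sum(item_scores) / len(item_scores)

# Invented 5-point responses from four students on a three-item factor:
items = [[5, 4, 3, 4], [5, 3, 3, 4], [4, 4, 2, 5]]
student_1 = [col[0] for col in items]  # first student's three item scores
print(round(scale_score(student_1), 2))
print(round(cronbach_alpha(items), 2))
```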


Findings and Analyses

ARIES: M&F Instructional Analysis

The Instructional Analysis for ARIES: M&F revealed relative strengths in the areas of lesson/activity purpose, providing a variety of vivid experiences, helping the teacher to identify students' ideas, developing scientific ideas, and encouraging students to explain their ideas. The process also revealed relative weaknesses in the areas of conveying unit purpose, taking account of student ideas, guiding student thinking, demonstrating use/practice of scientific knowledge, and assessing progress. (Refer back to Figure 2 for a summary of the instructional analysis for ARIES: M&F; for additional information, see Ochsendorf et al., 2003.)

Teacher Fidelity to Process

ISCOP. Classroom observations (n = 30) revealed that in most cases (except for Category V), teachers had higher mean fidelity to the ISCOP items that were rated Satisfactory or Excellent in ARIES: M&F by the Project 2061 Instructional Analysis than to the ISCOP items that were rated Poor (see Figure 4). In addition, 8 of the ISCOP items that aligned with the 13 Project 2061 instructional criteria were significantly and positively correlated with classroom mean achievement, and more than half of these items had been rated as Satisfactory or Excellent in ARIES: M&F on the Project 2061 Instructional Analysis (see Figure 4). (Corresponding measures in the comparison classrooms indicated the opposite pattern: correlations between the observed instructional strategies and outcomes in the comparison condition tended to be non-significant and to trend toward negative relationships. See Lynch, O'Donnell, Hatchuel, & Rethinam, 2007 for more information on results from the comparison condition.)
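The correlational and regression statistics reported in these findings rest on two standard quantities: a correlation between an observed instructional strategy and class mean achievement, and an overall F test for a regression's R-squared. Sketches of both follow, assuming Pearson product-moment correlations and the usual F formula (the paper does not spell either out); the data are illustrative:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two aligned variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def regression_f(r_squared, n, p):
    """Overall F statistic for a regression with p predictors and n cases:
    F = (R^2 / p) / ((1 - R^2) / (n - p - 1)), df = (p, n - p - 1)."""
    df_model, df_resid = p, n - p - 1
    f = (r_squared / df_model) / ((1 - r_squared) / df_resid)
    return f, (df_model, df_resid)

# Invented, perfectly linear ISCOP ratings vs. class mean scores:
print(pearson_r([0, 1, 2, 3], [10, 12, 14, 16]))  # 1.0
```

A rejected F test of this kind is what licenses the claim that the population multiple correlation is not zero.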


Fidelity of Implementation 2 - 17 In an accompanying regression analysis controlling for prior student knowledge using science Grade Point Average, 66.6% of the variance in treatment classroom mean achievement can be accounted for by the linear combination of teachers' fidelity to the 8 ISCOP items rated Satisfactory or Excellent in the treatment unit (p < 0.01). The null hypothesis that the population multiple correlation equals zero was rejected, as tested by the statistically significant F test [F (9, 14) = 5.469, p < .01]. Teachers' fidelity to having Students justify their ideas proved to be a significant unique predictor of students' understanding of motion and forces. Standardized regression coefficients indicated that for every one rating scale unit that teachers' fidelity to having Students justify their ideas increased, the overall MFA classroom mean score was predicted to change .687 standardized rating scale units, holding all other ISCOP variables constant (p = .013). Stepwise regression indicated that teachers' fidelity to having Students justify their ideas and Conveying lesson/activity purpose together explained 55.7% of the variance in achievement (p < .05); encouraging Students to justify their ideas accounted for 48.1% of this variance on its own (p < .01). These findings suggest that teachers' fidelity to the instructional strategies rated Satisfactory or Excellent in the ARIES: M&F unit strongly predicted student outcomes in Motion and Forces classrooms. (Although the Project 2061 instructional strategies were observed to be equally present in comparison classrooms, they did not appear to positively relate to outcomes. See Lynch, O'Donnell, Hatchuel, & Rethinam, 2007 for more on the comparison condition.) LFCOP. Descriptive analysis indicates Treatment teachers were on-topic 98% of the time, 46.39% of which was teacher-centered, 12.99% of which was individual student-centered, and 39.15% of which was student group lab activity. 
While the percent of teacher-centeredness was low (relative to the Comparison condition), it correlated significantly with classroom mean achievement (r = .49, p < .05). In an accompanying regression analysis controlling for prior science GPA, teacher-centeredness explained only 14% of the variance in classroom mean achievement in the treatment condition. The LFCOP indicates the importance of the teacher's instructional moves in the classroom environment and of student group lab activity.

Student Fidelity to Process

When examining the overall means of Student Fidelity to Process in the treatment classrooms, as shown in Table 3, treatment students reported the highest fidelity to Sense of Purpose (M = 4.18; SD = .17), followed by Using and Talking about Scientific Phenomena (M = 3.99; SD = .24). Students in the treatment classrooms reported the lowest fidelity, and showed the highest variability, on Thinking about Scientific Phenomena (M = 3.44; SD = .31). In addition, a regression analysis controlling for prior science GPA indicated that Thinking about Scientific Phenomena was negatively related to outcomes: the higher a class's mean achievement, the lower its aggregate score on the Thinking about Scientific Phenomena scale. Although SRQ results suggest that, from the students' point of view, treatment classrooms experienced thinking about scientific phenomena, regression analysis indicates that greater thinking about phenomena was associated with lower class mean achievement.

When examining the SRQ items individually, rather than as factors, self-report SRQ data aggregated to the class level indicated that most of the SRQ variables tended to be negatively related to outcomes. Only 1 of the 8 SRQ variables that align with the Project 2061 instructional criteria positively correlated with classroom mean achievement in the treatment condition: Conveying lesson/activity purpose (rs = .549, p < .01), which received a Satisfactory rating on the Instructional Analysis. Contrary to classroom observation data, survey data indicated that students'

perceptions of Encouraging students to explain their ideas, which received a Satisfactory rating on the Instructional Analysis, were negatively correlated with classroom mean achievement (rs = -.475, p < .05).

Discussion

Regression results show that Teacher Fidelity to Process as measured by the ISCOP accounts for a large share of the variance in classroom mean achievement, with observation data supporting the importance of teachers' fidelity to the instructional strategies rated Satisfactory or Excellent in the treatment unit, particularly in allowing students opportunities to justify their ideas. In addition, Lesson Flow data revealed that 39% of ARIES: M&F class time had students working in groups on lab activities, with no teacher present in the group. Students in groups could choose to participate in the instructional strategies embedded in the ARIES: M&F lessons, or they could ignore or circumvent them. However, regression analysis indicates that the teacher's instructional moves in the classroom environment, even at a moderate level, were important and predicted student outcomes.

Although the self-report Student Fidelity to Process data collected using the SRQ indicate that students perceived the presence of the Project 2061 instructional strategies during implementation of the treatment unit, students' perceptions of these strategies were not related to outcomes, a result inconsistent with the observed Teacher Fidelity to Process. Prior classroom observations and classroom video data suggested that fidelity of implementation is not solely the teachers' purview. It was therefore expected that students in treatment classrooms would self-report awareness of Project 2061-ness and that these means would correlate with classroom mean achievement, but this did not happen. Although the SRQ seems useful for identifying students' perceptions of Project 2061-ness in their classrooms, the negative

relationship between students' perceptions of thinking about scientific phenomena and student outcomes in the treatment condition is puzzling. It might indicate that thinking deeply about scientific phenomena caused confusion for students that was never resolved. Alternatively, it might mean that students who had less understanding of the target idea before instruction engaged in deeper thought and used more sophisticated learning strategies than peers who had more prior knowledge, yet still scored lower on the posttest. Despite these uncertainties, however, there is merit in including the student in the fidelity model, as the previous paper showed (Lastica, O'Donnell, & Lynch, 2007).

Contributions and General Interest

Science curriculum materials promoted by the National Science Foundation (NSF) and other reformers, such as the American Association for the Advancement of Science (AAAS) and the National Science Resources Center (NSRC), are consistent with the curriculum standards and guidelines published by the National Research Council (NRC, 1996) and AAAS (1993). Each contains reform-based instructional strategies that support teachers' abilities to assess students' prior knowledge, engage students as active participants in their own learning, and promote students' scientific thinking and ideas about phenomena, experiences, and knowledge (Kesidou & Roseman, 2002) in the context of science education content standards (AAAS, 1993; NRC, 1996). Although the study of fidelity as adherence to the "script" of these reform-based programs is necessary, and must be established to control for threats to internal and construct validity (as shown by Lastica et al., 2007), it is insufficient to assess only Fidelity to Structure in today's reform-based constructivist classrooms. Fidelity to Process must also be considered.
To do this, a researcher must establish or identify the criteria for measuring a teacher's fidelity to the program theory of an intervention. A content and instructional analysis should be used to identify the treatment's critical components and program theory a priori. Observations can then occur in classrooms during implementation, coupled with interviews or other self-report data. Finally, all of these data should be analyzed for relationships to outcomes. Moreover, the student, as recipient of the intervention, plays a critical role in its success and must be included in any fidelity of implementation study. Fidelity does not reside in the teacher alone. To address this argument, SCALE-uP conceptualized Teacher and Student Fidelity to Process to signify teachers' fidelity to the instructional practices embedded in the treatment unit and students' perceptions that these strategies are at play during implementation. However, distinguishing between a teacher's good teaching practices and his or her fidelity to the high-quality instructional processes outlined in a reform-based curriculum unit has been described as an endogeneity problem (Gersten, 2007), because good teaching practices may precede implementation of reform-based curriculum materials that contain instructional strategies designed to support teachers. High-quality teaching and fidelity of implementation to the instructional processes rated Satisfactory or Excellent in a reform-based curriculum unit may appear to be one and the same (Gersten, Carnine, & Williams, 1982). To better understand the distinction between such effects, the SCALE-uP fidelity team has attempted to demonstrate that the use of reform-based programs with instructional practices rated Satisfactory or Excellent by the Project 2061 Curriculum Analysis is related to achievement in middle school science classes in MCPS that use reform-based curriculum materials.
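The sequence just described (identify criteria a priori, observe during implementation, collect self-report data, then relate everything to outcomes) can be illustrated for the self-report strand. The sketch below uses made-up student records, not SRQ data: it aggregates item scores to the class level and computes a Spearman rank correlation, implemented directly with tie-averaged ranks, between class-level perceptions and class mean achievement.

```python
import numpy as np

def ranks(a):
    """Average ranks (1-based), with ties assigned their mean rank."""
    a = np.asarray(a, dtype=float)
    r = np.empty(len(a))
    r[a.argsort()] = np.arange(1, len(a) + 1)
    for v in np.unique(a):          # average the ranks of tied values
        mask = a == v
        r[mask] = r[mask].mean()
    return r

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return np.corrcoef(ranks(x), ranks(y))[0, 1]

# Hypothetical student-level survey records: (class_id, item_score on 1-5).
rng = np.random.default_rng(1)
records = [(c, rng.integers(1, 6)) for c in range(6) for _ in range(20)]

# Step 1: aggregate self-report scores to the class level, the SRQ's
# unit of analysis.
class_scores = {}
for c, score in records:
    class_scores.setdefault(c, []).append(score)
srq = np.array([np.mean(v) for _, v in sorted(class_scores.items())])

# Step 2: relate class-level perceptions to (hypothetical) class mean
# achievement on the outcome measure.
achievement = np.array([62.0, 58.5, 71.2, 66.8, 60.1, 69.4])
print(spearman(srq, achievement))
```

Spearman's rho is used here, rather than Pearson's r, because the fidelity ratings in this study are ordinal; ranking first makes the correlation insensitive to unequal spacing between scale points.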
This paper suggests that although changes to teachers' instructional practices may be difficult (Hiebert, 1999; Richardson, 1990), such practices can be attained when coupled with curriculum interventions that support them, and that these changes are related to student achievement. Although materials alone cannot change teacher practice, they can provide scaffolding for teachers trying to create a classroom environment different from that observed in traditional classrooms (Riordan & Noyce, 2001). This paper builds on such literature by suggesting that, when controlling for students' prior knowledge, fidelity of implementation to the instructional processes embedded in reform-based curriculum materials may predict classroom mean achievement in a curriculum effectiveness study. The multi-method model used by SCALE-uP to assess Fidelity to Process in a large-scale study illustrates fidelity of implementation as quality of delivery and student responsiveness (Dane & Schneider, 1998) and operationalizes "process" using the Project 2061 program theory. Although SCALE-uP fidelity researchers recognize the limitations of applying statistical analyses to behavioral ordinal rating scales and of using single-level analyses on multilevel data, the results from this study are promising. In a large quasi-experiment in which the Treatment condition outperformed the Comparison condition overall and with almost all subgroups, the evidence indicates that the intervention was implemented with fidelity (an internal validity check) and that Fidelity to Process is related to student outcomes, in a fashion that appears to be unique among studies of science curriculum material implementation.


References

American Association for the Advancement of Science (AAAS). (1990). Science for all Americans. New York: Oxford University Press.

AAAS. (1993). Benchmarks for science literacy. New York: Oxford University Press.

AAAS. (2003). Project 2061 middle grades science textbooks: A Benchmarks-based evaluation. Retrieved June 1, 2004, from http://www.project2061.org/tools/textbook/mgsci/index.htm

Berman, P., & McLaughlin, M. W. (1976). Implementation of educational innovation. The Educational Forum, 40, 345-370.

Bickman, L. (Ed.). (1987). Using program theory in evaluation. New Directions for Program Evaluation Series (No. 33). San Francisco: Jossey-Bass.

Blakely, C. C., Mayer, J. P., Gottschalk, R. G., Schmitt, N., Davidson, W. S., Roitman, D. B., & Emshoff, J. G. (1987). The fidelity-adaptation debate: Implications for the implementation of public sector social programs. American Journal of Community Psychology, 15(3), 253-268.

Chatterji, M. (2004). Evidence on "what works": An argument for extended-term mixed-method (ETMM) evaluation designs. Educational Researcher, 33(9), 3-13.

Cho, J. (1998, April). Rethinking curriculum implementation: Paradigms, models, and teachers' work. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA.


Connell, D. B., Turner, R. R., & Mason, E. F. (1985). Summary of findings of the School Health Education Evaluation: Health promotion effectiveness, implementation, and costs. Journal of School Health, 55(8), 316-321.

Dane, A. V., & Schneider, B. H. (1998). Program integrity in primary and early secondary prevention: Are implementation effects out of control? Clinical Psychology Review, 18, 23-45.

DeBoer, G. E. (2005). Standard-izing test items. Science Scope, 28(4), 10-11.

Dumas, J., Lynch, A., Laughlin, J., Smith, E., & Prinz, R. (2001). Promoting intervention fidelity: Conceptual issues, methods and preliminary results from the EARLY ALLIANCE prevention trial. American Journal of Preventive Medicine, 20(1S), 38-47.

Dusenbury, L., Brannigan, R., Falco, M., & Hansen, W. B. (2003). A review of research on fidelity of implementation: Implications for drug abuse prevention in school settings. Health Education Research: Theory and Practice, 18(2), 237-256.

Freeman, H. E. (1977). The present status of evaluation research. In M. Guttentag (Ed.), Evaluation studies review annual II. Beverly Hills, CA: Sage Publications.

Fullan, M. (2001). The meaning of educational change. New York: Teachers College Press.

Gersten, R., Carnine, D., & Williams, P. (1982). Measuring implementation of a structured educational model in an urban setting: An observational approach. Educational Evaluation and Policy Analysis, 4(1), 67-79.

Gersten, R., Fuchs, L. S., Compton, D., Coyne, M., Greenwood, C., & Innocenti, M. S. (2005). Quality indicators for group experimental and quasi-experimental research in special education. Exceptional Children, 71(2), 149-164.


Hall, G. E., & Loucks, S. F. (1978). Innovation configurations: Analyzing the adaptations of innovations. Austin: The University of Texas, Research and Development Center for Teacher Education.

Hansen, W. B., Graham, J. W., Wolkenstein, B. H., & Rohrbach, L. A. (1991). Program integrity as moderator of prevention program effectiveness: Results for fifth-grade students in the adolescent alcohol prevention trial. Journal of Studies on Alcohol, 52(6), 568-579.

Harvard-Smithsonian Center for Astrophysics. (2001). ARIES: Exploring Motion and Forces: Speed, acceleration, and friction. Charlesbridge Publishing.

Hedges, L. (2004, April). Designing studies for evidence-based scale up in education. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA.

Hiebert, J. (1999). Relationships between research and the NCTM Standards. Journal for Research in Mathematics Education, 30, 3-19.

Hopkins, R. H., Mauss, A. L., Kearney, K. A., & Weisheit, R. A. (1988). Comprehensive evaluation of a model alcohol education curriculum. Journal of Studies on Alcohol, 49, 38-50.

Hord, S. M., Rutherford, W. L., Huling-Austin, L., & Hall, G. E. (1987). Taking charge of change. Alexandria, VA: Association for Supervision and Curriculum Development.

Howell, D. C. (1997). Statistical methods for psychology (4th ed.). Belmont, CA: Duxbury Press.

Kesidou, S., & Roseman, J. E. (2002). How well do middle school science programs measure up? Findings from Project 2061's curriculum review. Journal of Research in Science Teaching, 39(6), 522-549.


Lastica, J., & O'Donnell, C. (2007, April). Considering the role of fidelity of implementation in science education research: Fidelity as teacher and student adherence to structure. In C. O'Donnell (Chair), Analyzing the relationship between Fidelity of Implementation (FOI) and student outcomes in a quasi-experiment. Symposium conducted at the Annual Meeting of the American Educational Research Association, Chicago, IL.

Lee, O. (2002). Science inquiry for elementary students from diverse backgrounds. In W. G. Secada (Ed.), Review of Research in Education, Vol. 26 (pp. 23-69). Washington, DC: American Educational Research Association.

Lipsey, M. W. (1999). Can rehabilitative programs reduce the recidivism of juvenile offenders? An inquiry into the effectiveness of practical programs. Virginia Journal of Social Policy & the Law, 6, 611-641.

Loucks, S. F. (1983, April). Defining fidelity: A cross-study analysis. Paper presented at the Annual Meeting of the American Educational Research Association, Montreal, Quebec, Canada.

Lynch, S. J., & Hanson, A. (2005). Lesson Flow Classroom Observation Protocol (LFCOP). Unpublished document. Washington, DC: The George Washington University.

Lynch, S. J., & O'Donnell, C. L. (2007, April). The relationship between fidelity of implementation and student outcomes in a quasi-experiment: A conceptual framework. In C. O'Donnell (Chair), Analyzing the relationship between Fidelity of Implementation (FOI) and student outcomes in a quasi-experiment. Symposium conducted at the Annual Meeting of the American Educational Research Association, Chicago, IL.


Lynch, S. J., Kuipers, J. C., Pyke, C., & Szesze, M. (2002). NSF/IERI proposal, Scaling up highly rated science curricula in diverse student populations: Using evidence to close achievement gaps. Washington, DC: The George Washington University.

Lynch, S., & O'Donnell, C. (2005, April). The evolving definition, measurement and conceptualization of fidelity of implementation in scale-up of highly rated science curriculum units in diverse middle schools. In S. Lynch (Chair), The role of fidelity of implementation in quasi-experimental and experimental research designs: Applications in four studies of innovative science curriculum materials and diverse student populations. Symposium conducted at the Annual Meeting of the American Educational Research Association, Montreal, Canada.

Lynch, S., O'Donnell, C., Hatchuel, E., & Rethinam, V. (2007, April). A model predicting student outcomes in middle school science classrooms implementing a highly-rated science curriculum unit. Paper presented at the Annual Meeting of the National Association for Research in Science Teaching, New Orleans, LA.

Lynch, S. J., & Watson, W. A. (2005). Student Responsiveness Questionnaire (SRQ). Unpublished document. Washington, DC: The George Washington University.

Merchlinsky, S., & Hansen-Grafton, B. (2007). Evaluation and science specialists' role in collecting fidelity of implementation data in a large school district. In C. O'Donnell (Chair), Analyzing the relationship between Fidelity of Implementation (FOI) and student outcomes in a quasi-experiment. Symposium conducted at the Annual Meeting of the American Educational Research Association, Chicago, IL.

Mihalic, S. (2002, April). The importance of implementation fidelity. Boulder, CO: Center for the Study and Prevention of Violence.


Mitchel, M. E., Hu, T.-W., McDonnell, N. S., & Swisher, J. D. (1984). Cost-effectiveness analysis of an educational drug abuse prevention program. Journal of Drug Education, 14, 271-291.

Montgomery County Public Schools (MCPS). (2006). MCPS schools at a glance 2005-2006. Unpublished document. Montgomery County, MD: Department of Reporting and Regulatory Accountability, Montgomery County Public Schools.

Mowbray, C., Holter, M. C., Teague, G. B., & Bybee, D. (2003). Fidelity criteria: Development, measurement, and validation. American Journal of Evaluation, 24(3), 315-340.

National Research Council (NRC). (1996). The National Science Education Standards. Washington, DC: The National Academies Press.

NRC. (2004). On evaluating curricular effectiveness: Judging the quality of K-12 mathematics evaluations. Committee for a Review of the Evaluation Data on the Effectiveness of NSF-Supported and Commercially Generated Mathematics Curriculum Materials. Mathematical Sciences Education Board, Center for Education, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press.

Ochsendorf, R., Deprey, N., Lopez, R., Roudebush, D., Lynch, S., & Pyke, C. (2003). Using the Project 2061 Curriculum Analysis to rate a middle school science curriculum unit, ARIES: Exploring Motion and Forces (SCALE-uP Report No. 10). Washington, DC: George Washington University, SCALE-uP.

Ochsendorf, R., Lynch, S., & Pyke, C. (2006). Rating a science curriculum unit: Perspectives on the program theory. Manuscript submitted for publication.

Ochsendorf, R., Pyke, C., Lynch, S., & Watson, W. (2006, April). The impact of a middle school motion and forces curriculum unit on student outcomes: Results from consecutive quasi-experimental studies. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.

O'Donnell, C. L., Lynch, S., & Merchlinsky, S. (2005). Instructional Strategies Classroom Observation Protocol (ISCOP). Unpublished document. Washington, DC: The George Washington University.

Parrott, A. C. (1991). Performance tests in human psychopharmacology (2): Content validity, criterion validity, and face validity. Human Psychopharmacology, 6, 91-98.

Patton, M. Q. (1978). Utilization-focused evaluation. Beverly Hills, CA: Sage Publications.

Resnick, B., Bellg, A. J., Borrelli, B., DeFrancesco, C., Breger, R., Hecht, J., Sharp, D. L., Levesque, C., Orwig, D., Ernst, D., Ogedegbe, G., & Czajkowski, S. (2005). Examples of implementation and evaluation of treatment fidelity in the BCC studies: Where we are and where we need to go. Annals of Behavioral Medicine, 29, 46-54.

Resnicow, K., Cohn, L., Reinhardt, J., Cross, D., Futterman, R., Kirschner, E., Wynder, E. L., & Allegrante, J. P. (1992). A three-year evaluation of the "Know Your Body" program in inner-city schoolchildren. Health Education Quarterly, 19(4), 463-480.

Richardson, V. (1990). Significant and worthwhile change in teaching practice. Educational Researcher, 19, 10-18.

Riordan, J. E., & Noyce, P. E. (2001). The impact of two standards-based mathematics curricula on student achievement in Massachusetts. Journal for Research in Mathematics Education, 32(4), 368-398.

Rogers, E. (2003). Diffusion of innovations. New York: Free Press.

Scheirer, M. A., & Rezmovic, E. L. (1983). Measuring the degree of program implementation: A methodological review. Evaluation Review, 7, 599-633.


Sechrest, L., West, S. G., Phillips, M. A., Redner, R., & Yeaton, W. (1979). Some neglected problems in evaluation research: Strength and integrity of treatments. In L. Sechrest, S. G. West, M. A. Phillips, R. Redner, & W. Yeaton (Eds.), Evaluation studies review annual (Vol. 4, pp. 15-35). Thousand Oaks, CA: Sage.

Stern, L., & Ahlgren, A. (2002). Analysis of students' assessments in middle school curriculum materials: Aiming precisely at benchmarks and standards. Journal of Research in Science Teaching, 39, 889-910.

Taggart, V. S., Bush, P. J., Zuckerman, A. E., & Theiss, P. K. (1990). A process evaluation of the District of Columbia "Know Your Body" project. Journal of School Health, 60(2), 60-66.

Tortu, S., & Botvin, G. J. (1989). School-based smoking prevention: The teacher training process. Preventive Medicine, 18, 280-289.

U.S. Department of Education. (2006). Institute of Education Sciences: Mathematics and science education research current funding opportunity (CFDA Numbers 84.305A and B). Retrieved May 15, 2006, from http://www.ed.gov/about/offices/list/ies/index.html

U.S. Department of Health and Human Services. (2002). Finding the balance: Program fidelity and adaptation in substance abuse prevention, A state-of-the-art review. Washington, DC: Federal Center for Substance Abuse Prevention (CSAP).

Watson, W., Pyke, C., Lynch, S., & Ochsendorf, R. (2007, April). Understanding the effectiveness of curriculum materials through replication. Paper presented at the Annual Meeting of the National Association for Research in Science Teaching, New Orleans, LA.


Table 1
Instruments, Data Collection Procedures, Variables, Scale, and Reliability Employed by SCALE-uP to Assess Teacher and Student Fidelity to Process

Instructional Strategies Classroom Observation Protocol (ISCOP) [Teacher]
  Construct: Fidelity to Project 2061 instructional strategies
  Variables: 20 items aligned with Project 2061 instructional criteria
  Scale: Likert-like; 0 = Not Evident to 3 = Evident with Emphasis
  Reliability: 0.63 to 0.80 (average mean percentage agreement between two raters)
  Unit of analysis: Classroom; n = 24
  Collection timeframe: Observation during implementation

Lesson Flow Classroom Observation Protocol (LFCOP) [Teacher]
  Construct: Fidelity to lesson flow
  Variables: 7 items (on/off topic; teacher/group/student; lab/non-lab)
  Scale: Frequency of item observed every 2 minutes
  Reliability: 0.74 (Cohen's kappa for inter-rater reliability)
  Unit of analysis: Classroom; n = 24
  Collection timeframe: Observation during implementation

Student Responsiveness Questionnaire (SRQ) [Student]
  Construct: Students' perceptions of the presence of Project 2061 instructional strategies
  Variables: 18 items; 3 subscales aligned with Project 2061 Categories I, III/IV, and II/V
  Scale: Likert-like; 1 = Not true, 3 = Somewhat true, 5 = Very true
  Reliability: Subscale 1 = 0.78; Subscale 2 = 0.83; Subscale 3 = 0.81 (Cronbach's alpha for scale reliability)
  Unit of analysis: Classroom (student responses aggregated to class level); n = 24
  Collection timeframe: Questionnaire administered after implementation
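The reliability statistics reported in Table 1 can each be computed with a few lines of code. The functions below are generic textbook formulas applied to made-up ratings, not the SCALE-uP data: percent agreement and Cohen's kappa for two observers coding the same intervals, and Cronbach's alpha for a small multi-item subscale.

```python
import numpy as np

def percent_agreement(r1, r2):
    """Proportion of units on which two raters gave the same code."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    return (r1 == r2).mean()

def cohens_kappa(r1, r2):
    """Chance-corrected inter-rater agreement for categorical codes."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    cats = np.union1d(r1, r2)
    po = (r1 == r2).mean()                                  # observed agreement
    pe = sum((r1 == c).mean() * (r2 == c).mean() for c in cats)  # chance agreement
    return (po - pe) / (1 - pe)

def cronbachs_alpha(items):
    """Internal consistency; items is an (n_respondents, n_items) matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Two hypothetical observers coding the same 12 two-minute intervals:
rater1 = [0, 1, 1, 2, 0, 1, 2, 2, 1, 0, 1, 2]
rater2 = [0, 1, 1, 2, 0, 1, 2, 1, 1, 0, 1, 2]
print(percent_agreement(rater1, rater2))
print(cohens_kappa(rater1, rater2))

# A hypothetical 5-respondent, 4-item subscale on a 1-5 scale:
scores = [[4, 5, 4, 4], [3, 3, 2, 3], [5, 5, 4, 5], [2, 2, 3, 2], [4, 4, 4, 5]]
print(cronbachs_alpha(scores))
```

Kappa is generally preferred over raw percent agreement for interval coding like the LFCOP's, because it discounts the agreement two raters would reach by chance alone.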


Table 2
Proposed Scales and the Highest Actual Factor Loadings for Factor Analysis of 18 Student Responsiveness Questionnaire (SRQ) Items

Proposed scales: Providing a Sense of Purpose; Taking Account of Student Ideas; Engagement with Relevant Phenomena; Developing and Using Scientific Ideas; Promoting Thinking.

SRQ items included in the factor analysis:
- I understood the purpose of the whole unit about motion and forces
- I understood the purpose of each lesson in the unit
- I understood how the lessons connected to one another to teach me about motion and forces
- Things I learned and did before this unit helped me to learn about motion and forces
- I made predictions about what I thought was going to happen in each lesson
- I used new science words when I wrote and talked about motion and forces
- The materials we used in class (like graphs, pictures, and movies) made sense to me
- I practiced what I learned about motion and forces in class
- The work I did to learn about motion and forces got more challenging as I learned more
- I explained my ideas about motion and forces by talking with other students
- I explained my ideas about motion and forces by writing them down
- I did activities to help me understand the difference between my ideas about motion and forces and how scientists think about motion and forces
- I did lots of different kinds of activities in science class to learn about motion and forces
- I was interested in the activities in this unit
- I paid attention during the unit on motion and forces
- I kept track of how my ideas changed from the beginning to the end of the unit on motion and forces
- I compared my ideas with the ideas scientists have about motion and forces

Highest factor loadings for individual items ranged from .475 to .764. Percent of variance explained: Factor 1 = 39.33; Factor 2 = 8.39; Factor 3 = 5.93.
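The "percent of variance explained" figures in a factor analysis like the one summarized in Table 2 are commonly approximated with principal components on the item correlation matrix: each eigenvalue's share of the trace gives a component's percent of variance, and loadings are eigenvectors scaled by the square root of the eigenvalues. The sketch below uses randomly generated item responses, not the SRQ data.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical respondent-by-item matrix (200 students, 18 items); a shared
# latent factor induces correlations among the items.
latent = rng.normal(size=(200, 1))
X = latent @ rng.normal(size=(1, 18)) + rng.normal(size=(200, 18))

# Correlation matrix of the items.
R = np.corrcoef(X, rowvar=False)

# Eigenvalues of R give each component's share of total variance;
# sort them in descending order.
eigvals, eigvecs = np.linalg.eigh(R)
order = eigvals.argsort()[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pct_variance = 100 * eigvals / eigvals.sum()
loadings = eigvecs * np.sqrt(eigvals)   # component loadings per item

print(pct_variance[:3])                 # leading components' % of variance
print(np.abs(loadings[:, 0]).max())     # strongest loading on component 1
```

Because the trace of a correlation matrix equals the number of items, the percentages always sum to 100, which makes figures like Table 2's 39.33 / 8.39 / 5.93 directly interpretable as shares of total item variance.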


Table 3
Means and Standard Deviations for Each SRQ Scale for the Treatment Condition

SRQ Scale                                       Treatment M (SD)
Sense of Purpose                                4.18 (.17)
Using and Talking about Scientific Phenomena    3.99 (.24)
Thinking about Scientific Phenomena             3.44 (.31)

* p < .01


Figure 1. Effect sizes, ARIES: Exploring Motion and Forces, Year 4 (2005-2006)


Figure 2. Fidelity to Process conceptual model


Figure 3. Instructional Analysis ratings for ARIES: M&F* and alignment of the ISCOP and SRQ items to the 16 Project 2061 instructional criteria (Excellent = ●; Satisfactory = ◒; Poor = ○; n/a = not analyzed)

Instructional Categories:
I. Identifying a Sense of Purpose
  a. Conveying unit purpose
  b. Conveying lesson/activity purpose
  c. Justifying lesson/activity sequence
II. Taking Account of Student Ideas
  a. Attending to prerequisite knowledge and skills
  b. Alerting teacher to commonly held student ideas
  c. Assisting teacher in identifying own students' ideas
  d. Addressing commonly held ideas
III. Engaging Students with Relevant Phenomena
  a. Providing variety of phenomena
  b. Providing vivid experiences
IV. Developing and Using Scientific Ideas
  a. Introducing terms meaningfully
  b. Representing ideas effectively
  c. Demonstrating use of knowledge
  d. Providing practice
V. Promoting Student Thinking about Phenomena, Experiences, and Knowledge
  a. Encouraging students to explain their ideas
  b. Guiding student interpretation and reasoning
  c. Encouraging students to think about what they've learned

Combined ISCOP items (see note **): Ib, Ic, If (Cronbach's alpha = .829; inter-item correlation = .590); IIa, IIb (Cronbach's alpha = .786; inter-item correlation = .650); IVa, IVb (Cronbach's alpha = .627; inter-item correlation = .556). Combined SRQ items: 14, 15.

*See Ochsendorf et al., 2003 for a full explanation of how these ratings were derived.
**When more than one ISCOP or SRQ item aligned with one of the Project 2061 instructional criteria, means of the ISCOP or SRQ items were calculated for data analysis.
***Items marked n/a were either not analyzed by the ISCOP or SRQ, or there was no ISCOP or SRQ item aligned with this particular Project 2061 criterion.
Note: Three of the 20 ISCOP items (Ia, Id, and IVd) and three of the SRQ items (1, 9, and 13) are not listed because content validity indicated that they did not align closely with any of the shown Project 2061 criteria.


Figure 4. ISCOP and SRQ Treatment means and their relationship to student outcomes on the Motion and Forces Assessment (MFA) for each instructional criterion rated Satisfactory or Excellent in ARIES: M&F (2005-2006)

For each criterion: ISCOP(a) M (SD); ISCOP/MFA correlation (rs); SRQ(b) M (SD); SRQ/MFA correlation (rs).

Satisfactory or Excellent Ratings
I.b Conveying lesson/activity purpose: ISCOP .63 (.41), rs = .425*; SRQ 4.09 (.945), rs = .135
I.c Justifying lesson/activity sequence: ISCOP .53 (.51), rs = .438*; SRQ 4.28 (.954), rs = -.067
II.c Assisting teacher in identifying own students' ideas: ISCOP 2.18 (.90), rs = .232; SRQ 3.42 (1.363), rs = -.667**
III.a Providing variety of phenomena: ISCOP 1.53 (1.2), rs = .556**; SRQ 4.53 (.864), rs = -.045
III.b Providing vivid experiences: ISCOP 2.77 (.50), rs = .104; SRQ 3.84 (1.231), rs = -.660**
IV.a Introducing terms meaningfully: ISCOP 2.42 (.68), rs = .669**; SRQ 3.87 (1.167), rs = .549**
IV.b Representing ideas effectively: ISCOP 1.50 (1.31), rs = .199; SRQ 4.26 (.969), rs = .246
V.a Encouraging students to explain their ideas: ISCOP 1.97 (1.22), rs = .729**; SRQ 3.53 (1.30), rs = -.475*

Poor Ratings
II.a Attending to prerequisite knowledge and skills: ISCOP 1.29 (1.12), rs = .30; SRQ 3.75 (1.220), rs = -.425*
II.b Alerting teacher to commonly held student ideas: ISCOP .79 (.98), rs = .53**; SRQ n/a
II.d Addressing commonly held ideas: ISCOP .75 (1.15), rs = .44*; SRQ 4.03 (1.127), rs = -.270
IV.c Demonstrating use of knowledge: ISCOP 1.54 (1.10), rs = .46*; SRQ n/a
V.c Encouraging students to think about what they've learned: ISCOP 2.67 (.70), rs = .15; SRQ 3.28 (1.368), rs = -.455*

* p < .05. ** p < .01.
(a) Category I on the ISCOP was dichotomous (0 = Not Evident; 1 = Evident); all other categories were scored on a four-point Likert-like scale.
(b) The SRQ used a 5-point Likert-like scale (1 = Not true; 5 = Very true).
Note. Project 2061 criteria Ia (Conveying unit purpose), IVd (Providing practice), and Vb (Guiding student interpretation and reasoning) were not assessed by the ISCOP.
