
Journal of Research in Special Educational Needs · Volume 14 · Number 3 · 2014 · 164–169
doi: 10.1111/j.1471-3802.2011.01229.x

Evaluating treatment integrity across interventions aimed at social and emotional skill development in learners with emotional and behaviour disorders

John J. Wheeler¹, Michael R. Mayton², Julie Ton⁴ and Joshua E. Reese³
¹East Tennessee State University; ²West Virginia University; ³Tennessee Tech University; ⁴Downey Unified School District, Downey, CA

Key words: Treatment integrity, EBD, single-subject research design.

This study contributes to the existing literature on treatment integrity (TI) by presenting TI findings across interventions aimed at the development of social and emotional skills in learners with emotional and behavioural disorders (E/BD). Social and emotional skills were selected as the target of the investigation because of their significance for the academic and behavioural success of learners and because students with E/BD most often face challenges in these skill areas. The study analysed single-subject experimental studies published from 2000 to 2009 in two leading journals in the field of emotional and behavioural disorders: Behavior Disorders and The Journal of Emotional Behavioral Disorders. The degree to which studies operationally defined independent variables and evaluated and reported measures of TI and associated risk factors is reported. Thirty-three studies met the inclusion criteria, and TI was evaluated across six variables: (1) year published, (2) dependent variable(s), (3) independent variable(s), (4) participant characteristics, (5) treatment agent and (6) assessment of TI. Results indicated that approximately 49% of the studies monitored and reported TI, meaning that they provided a description of the TI procedure and the resultant data. The findings point to the need for attention to TI both in the description of methods used and in the reporting of TI data.

Students with emotional and behaviour disorders (EBD) face a number of social and emotional skill challenges that may impede academic performance (Kauffman, 2001; Locke and Fuchs, 1995), access to social development (Van Acker and Wehby, 2000) and ultimately their quality of life (Zigmond, 2006). Given the global significance of these skills in the development of a child, social and emotional skill development should be targeted for intervention early on, so that later and potentially larger problems can be minimised through the refinement of these formative skills. In-school and post-school data on the behaviour of students identified as EBD have been quite alarming for some time. High rates of in-school suspensions and expulsions are prevalent, especially as these children transition from elementary to middle and high school (Cheney, Flower and Templeton, 2008). Students labelled as EBD also have a significant dropout rate (55%), which is higher than that of other disability categories (Wagner, 2006).

As reported in the literature, learners with EBD have a critical need for effective interventions that focus specifically on social and emotional skill development. The hope is that these interventions will be grounded in empirical evidence, which emphasises the importance of quality-driven research that adheres to a high level of methodological soundness. In a 2005 special issue, the journal Exceptional Children addressed the importance of evidence-based practice (EBP) for students with disabilities. Papers in that special issue defined quality indicators for EBP in special education across quantitative methods (Gersten, Fuchs and Compton et al., 2005) and single-subject methods (Horner, Carr and Halle et al., 2005). Horner et al. (2005) reinforced the importance of treatment integrity (TI) within single-subject experimental research. TI is defined as the degree to which an intervention is applied as it was intended (Peterson, Horner and Wonderlick, 1982).

The importance of TI in treatment research has received more attention within the literature in recent years. Historically, this has not been the case in the field of special education, where there has been a lack of emphasis on the reporting of TI (or treatment fidelity) data. The growing importance of the topic is perhaps in some ways attributable to the emergence of the EBP movement. Reviews of the extant literature offer numerous illustrations.
Gresham, Gansle and Noell (1993) conducted a comprehensive review of treatment studies published in the Journal of Applied Behavior Analysis between 1980 and 1990. Their investigation found that only 16% of the studies reviewed (n = 158) assessed the accuracy of the independent variable, whereas two thirds of the studies did not operationally define the independent variable. A subsequent study by Gresham, MacMillan and Frankenberger et al. (2000) examined the same question for intervention studies in the area of learning disabilities across three major journals spanning a period of 5 years. The results indicated that 18.5% of studies had assessed TI. This trend was also supported by Wheeler, Baggett and Fox et al. (2006), who analysed treatment studies in the area of autism (n = 60) across nine journals from 1993 to 2003. Only 11 of the studies reviewed (18%) had operationally defined the independent variable and assessed TI. McIntyre, Gresham and DiGennaro et al. (2007) reviewed school-based intervention studies (n = 142) conducted with children aged 0 to 18 years and published in The Journal of Applied Behavior Analysis from 1991 to 2005; only 30% of these studies provided data regarding TI. In 2009, Wheeler and colleagues reviewed TI (among other variables) in a sample of 163 behavioural intervention studies across five journals in the area of intellectual disabilities. In this study, only 36% (n = 58) fully reported data on TI.

The assessment of TI within behavioural treatment studies does appear to have changed in a positive direction over time, but the inconsistencies and lack of reporting are still of concern and not acceptable. The rationale for assessing and reporting TI relates to the importance of having an operational definition of the independent variable to ensure reliable outcomes and allow for replicability (Johnston and Pennypacker, 1993). The findings from the reported research investigations are alarming when one considers that the field of applied behaviour analysis has been fervent in promoting adherence to sound methodological practices. In terms of quality assurance, assessing TI represents sound research practice and, in turn, is professionally responsible. The reporting of TI provides a sense of trust that the investigator was closely monitoring the application of the intervention.
Additionally, if and when TI is reported at low levels, the investigator has the responsibility of reassessing the definitions provided for the independent variable(s) and, in turn, retraining interventionists.

As noted by Stichter and Conroy (2004), the external validity of research findings in the field of EBD relies upon sound research practice. This statement holds for effective research practice in general. As it applies to intervention-based research within the field of EBD, researchers, teachers and practitioners require descriptions of interventions that are effective for the EBD population under specified conditions. To achieve replicated outcomes, the field of EBD must conduct thorough evaluations and report TI in addition to experimental results. Given the importance of treatments aimed at social/emotional skill development for children with EBD (Gresham, Cook and Crews et al., 2004), this inquiry is considered most relevant.

This paper contributes to the existing literature on TI by presenting TI findings across interventions aimed at the development of social/emotional skills in learners with EBD. Specifically, the investigators examined published empirical behavioural studies from 2000 to 2009 within two leading journals in the field of EBD in the USA: Behavior Disorders (BD) and The Journal of Emotional Behavioral Disorders (JEBD). This study examined the degree to which studies operationally defined independent variables and evaluated and reported measures of TI and associated risk factors.

Method

Criteria for inclusion

Studies that met the following inclusion criteria were reviewed for this project: published experimental studies involving (1) a single-subject research design; (2) interventions targeting social/emotional skills and/or self-regulation; and (3) participants with EBD or at risk for a behavioural disorder. The papers that met the inclusion criteria were published between January 2000 and June 2009. The time span was selected purposefully, as the authors sought to (1) review only the most current intervention-based research in the area of social/emotional skills among students with E/BD; and (2) perform a 10-year, state-of-the-field review beginning with the new century. The two aforementioned journals are regarded as leading publications in the field of EBD and were thus selected as the best representative sample of specific journals from the EBD literature.

Investigators first completed an electronic search via the PsycLIT and ERIC databases and then completed a year-by-year search of the online archives of BD and JEBD. The electronic search included combinations and variations of the following terms: single-subject, social/emotional skills, self-regulation, emotional behavioural disorder and intervention.
Studies were excluded if they contained (1) mixed methods (i.e., a single-subject research design along with any other type of experimental approach); (2) mixed participant samples (i.e., inclusion of one or more participants with a diagnosis in addition to or other than EBD); and/or (3) participant data that were consolidated/summarised in a way that prevented relevant characteristics from being determined.

Procedure

After papers were identified and the inclusion criteria were satisfied, investigators coded each paper on the following variables: (1) year published, (2) dependent variable(s), (3) independent variable(s), (4) participant characteristics, (5) treatment agent and (6) assessment of TI. Variables 2 through 4 were rated as ‘acceptable’ or ‘not acceptable’ according to the single-subject quality indicator guidelines of Horner et al. (2005). Variables 5 and 6 were evaluated using the treatment agent and TI coding methods of McIntyre et al. (2007). The dependent variable was evaluated by its (1) operational definition, (2) measurement procedure, (3) validity and precision, (4) repeated measurement, (5) inter-observer agreement and (6) social significance. The independent variable was evaluated by its description and systematic manipulation. Participant characteristics were evaluated by the description of participants and the participant selection process.


Treatment agents were described as (1) teacher, (2) professional (non-teacher), (3) paraprofessional, (4) parent or sibling, (5) researcher or research assistant, (6) peer tutors, (7) self, (8) multiple, (9) other or (10) not specified. Treatment agent categories were replicated from McIntyre et al. (2007). TI was evaluated by (1) description, (2) reporting of results and (3) level of risk. Level of risk was determined as no risk, low risk or high risk. Guidelines for level of risk were replicated from McIntyre et al. (2007), with (1) ‘no risk’ used to indicate studies in which TI procedures were reported and/or TI data were provided; (2) ‘low risk’ used to indicate studies in which TI was not monitored and TI data were not reported, yet the type of intervention used was less likely to be subject to inaccurate/inconsistent delivery (e.g., an intervention delivered by mechanical or computerised means); and (3) ‘high risk’ used to indicate studies in which TI procedures and data were not reported, though reporting was deemed necessary due to the nature of the treatment provided.

Inter-observer reliability

Inter-observer reliability (IOR) was assessed by two coders across 27% of the papers, selected at random. IOR was determined by dividing the number of agreements by the number of agreements plus disagreements and multiplying by 100 [Agreements / (Agreements + Disagreements) × 100]. Reliability was 98.7%.

Results

General study characteristics

Thirty-three studies met the inclusion criteria for the present study (BD = 20; JEBD = 13). (A full list of the articles reviewed is available upon request from the first author.) The number of participants per study ranged from 1 to 9 and averaged 3.9 (median = 3; SD = 2.24). By region, 46% of the studies originated in the South, 27% in the Midwest, 15% in the Northeast and 9% in the West (3%, or one study, did not specify region). Generalised category groupings of dependent and independent variables can be seen in Table 1.
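The percent-agreement formula used for inter-observer reliability in the Method can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' procedure: the function name `percent_agreement` and the example ratings are hypothetical, and the sketch assumes that every coded item counts as exactly one agreement or one disagreement.

```python
def percent_agreement(coder_a, coder_b):
    """Return agreements / (agreements + disagreements) * 100 for two coders."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same items")
    if not coder_a:
        raise ValueError("At least one rated item is required")
    # An item is an agreement when both coders assigned the same rating.
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    disagreements = len(coder_a) - agreements
    return agreements / (agreements + disagreements) * 100


# Hypothetical ratings for illustration only (not data from the study).
coder_a = ["acceptable", "acceptable", "not acceptable", "acceptable"]
coder_b = ["acceptable", "acceptable", "acceptable", "acceptable"]
ior = percent_agreement(coder_a, coder_b)
print(f"IOR = {ior:.1f}%")
```

Percent agreement does not correct for chance agreement; it is shown here only because it is the index the Method describes.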

Table 1: Major types of dependent and independent variables within studies

Dependent variable types           Independent variable types
Challenging behaviour (45%)        Self-management (25%)
Engagement / on-task (31%)         Function-based intervention (25%)
Social interaction skills (14%)    Social skills instruction (18%)
Academic behaviour (10%)           Other behaviour strategy (18%)
                                   Curricular / academic intervention (14%)

Figure 1: Treatment integrity by treatment agent

Figure 2: Treatment integrity and risk levels across journals and all studies

Figure 3: Treatment integrity by year. The black line represents the average number of studies reporting treatment integrity per year

Horner et al. criteria

Dependent variable. Across the six evaluation components of the dependent variable(s) within each study (e.g., operational definition), 100% received a rating of ‘acceptable’. Of the 198 separate ratings within this category, only two (both within the same study) were found to be not applicable in regard to the evaluation standards set forth by Horner et al. (2005).

Independent variable. Across the two evaluation components of the independent variable(s) within each study (e.g., systematic manipulation), 100% received a rating of ‘acceptable’. Of the 66 separate ratings within this category, only two (both within the same study) were found to be not applicable in regard to the evaluation standards set forth by Horner et al. (2005).

Participant characteristics. Across the two evaluation components of the participant characteristics within each study (e.g., the participant selection process), 100% received a rating of ‘acceptable’. Of the 66 separate ratings within this category, only three (two within the same study) were found to be not applicable in regard to the evaluation standards set forth by Horner et al. (2005).

McIntyre et al. criteria

Treatment agent. Of the 10 McIntyre et al. (2007) treatment agent designations, the following five were not represented in the findings of the current study: (1) paraprofessional, (2) sibling, (3) self, (4) other and (5) not specified. The vast majority of the designations represented across studies were that of ‘teacher’, comprising 73% of all reported treatment agents. The next most common designation was that of ‘researcher’, which comprised only 6% of the reported treatment agents. See Figure 1 for TI ratings across these designations.

TI assessment. About half (49%) of all studies assessed were found to be within the category of ‘monitored and reported’, indicating that these studies provided both a description of TI procedure and the resulting data. The next largest grouping was categorised as ‘not monitored, not reported’ (36%), a category indicating that the studies within this grouping provided neither a description of TI procedure nor any associated data. Comprising the remaining 15% were studies designated as ‘monitored only’, a grouping indicating that TI procedures were reported but no data were provided.
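The three TI assessment categories, together with the no/low/high risk guidelines given in the Method, can be read as a small decision rule. The sketch below is one possible reading and not the coding instrument used by the authors or by McIntyre et al. (2007): the record type `TICoding`, its field names and the `delivery_robust` flag are hypothetical, and the example records are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TICoding:
    """Hypothetical coding record for one reviewed study."""
    procedure_described: bool  # the paper describes how TI was monitored
    data_reported: bool        # the paper provides the resulting TI data
    delivery_robust: bool      # assumed flag: delivery unlikely to drift (e.g. computerised)


def ti_category(study: TICoding) -> str:
    """Map a coding record to one of the three TI assessment categories."""
    if study.procedure_described and study.data_reported:
        return "monitored and reported"
    if study.procedure_described:
        return "monitored only"
    if not study.data_reported:
        return "not monitored, not reported"
    # Data without a described procedure is not covered by the three categories.
    raise ValueError("TI data reported without a described procedure")


def risk_level(study: TICoding) -> str:
    """Apply the no/low/high risk guidelines as read from the Method section."""
    if study.procedure_described or study.data_reported:
        return "no risk"
    return "low risk" if study.delivery_robust else "high risk"


# Hypothetical studies for illustration only (not data from the review).
studies = [
    TICoding(procedure_described=True, data_reported=True, delivery_robust=False),
    TICoding(procedure_described=True, data_reported=False, delivery_robust=False),
    TICoding(procedure_described=False, data_reported=False, delivery_robust=True),
    TICoding(procedure_described=False, data_reported=False, delivery_robust=False),
]
category_counts = Counter(ti_category(s) for s in studies)
risk_counts = Counter(risk_level(s) for s in studies)
for label, count in risk_counts.items():
    print(f"{label}: {100 * count / len(studies):.0f}%")
```

Keeping the category and the risk level as separate functions mirrors the text, which reports them as two distinct ratings for each study.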

Figure 4: Percentage of studies reporting treatment integrity across investigations. Boxes within bars represent the total number of studies investigated

Risk level assessment revealed a somewhat better picture, with 61% of treatments within studies receiving a rating of ‘no risk’. The remaining 39% was divided approximately equally between treatments designated as ‘low risk’ (18%) and ‘high risk’ (21%). See Figure 2 for TI across journals and all studies, as well as a graphical representation of risk assessment ratings.

TI by year

Except for an unexpected spike in 2007, the rate of failing to report TI across the 10-year review period seems to be neither increasing nor decreasing but holding roughly steady. Of course, this means that the other side of the equation, TI reporting, follows the same flat trend (refer to Figure 3).

Discussion and conclusions

This study sought to investigate the level and quality of TI reporting among intervention studies in the area of social/emotional skill development for students with EBD. As previously noted, the literature has historically been weak in the reporting of TI, though in more recent years the rate of TI reporting has begun to improve. Though based upon a smaller sample than that used by similar studies of TI, the results of the current study suggest a higher frequency of TI reporting in the reviewed EBD intervention literature as compared with previous findings in this area of inquiry (refer to Figure 4). However, when viewed across the 10-year review period of the current study, the rate of TI reporting appears to be somewhat stagnant rather than indicative of the increasing, though sometimes modest, trend found in related TI assessment literature. In regard to indicators of external validity, the sample of EBD literature was exemplary in the quality of components related to dependent and independent variables as well as the reporting of important participant characteristics.


However, external validity relies upon all the pieces of sound research practice being firmly in place, and, in addition to the indicators of external validity discussed above, sound research practice involves measuring TI in order to achieve outcomes that can be reliably replicated by consumers. Fifty-one percent of the studies reviewed did not fully report TI, and 39% of the studies reviewed were found to be at some level of risk (low or high) for inaccurate/incomplete implementation of the treatment as intended. In addition, TI was least likely to be reported in studies in which the reported intervention agents were designated as ‘teachers’ and ‘multiple’, suggesting that the most frequent intervention agents should perhaps be the focus of instruction in monitoring and gathering data on TI.

Factors contributing to inconsistency in reporting TI within studies accepted for publication should be more closely investigated and may include (1) cost and time considerations (e.g., the need for more highly trained personnel or outside observers/researchers); (2) editorial considerations, such as constraints regarding the length of a manuscript or failure by peer reviewers to require measures of TI; and/or (3) training considerations (e.g., a lack of knowledge and skill due to incomplete pre-professional training). Until we know the reasons that TI measures are not being universally investigated, implemented and reported within published research, we cannot begin to systematically intervene with effective and ameliorative intent.

It is important to note two major limitations that should be taken into consideration when evaluating this study. First, the study compared the literature across only two of the most prominent journals in the USA in the area of E/BD. These were selected as a means of purposeful sampling to determine the extent of studies in the area of social and emotional skill development and the degree to which these investigations examined treatment integrity. Second, social and emotional skills were selected as the target of interventions in the studies reviewed, given their significance in the lives of many students identified with E/BD in terms of promoting the academic and behavioural success of these students. Given this purposely limited range of focus, the resulting sample size was n = 33 papers, and the study was restricted in its geographic scope to the USA and in the range of years examined, that is, 2000–2009. Despite these limitations, the present study attempts to provide an examination of where treatment integrity stands in the field of E/BD, as reflected in two of the leading US journals in the area.

As the field moves along the path of refining EBP, it seems imperative that the assessment of TI be a consistent practice among researchers as they seek to discern the efficacy of social/emotional treatments for students with EBD. It would also seem that, as the field seeks to better understand and synthesise effective treatments, the goal of improved research practice can only be achieved if external validity is addressed through pervasive improvements in design and implementation. Such methodological rigour is needed and should be consistently adhered to by researchers and journal editors as a means of fostering a healthier research climate, ultimately allowing for replication and systematic refinement of our approaches for better addressing the social and emotional needs of students with EBD. Along these lines, studies must provide substantial data to allow for replication if we are to refine and advance our understanding of EBP. Clearly, if we are to advance our understanding of EBP in the design and refinement of treatments directed toward the development of social and emotional skills in learners with EBD, we must have some degree of uniformity in conducting and reporting research in this area.

Address for correspondence John J. Wheeler, Center of Excellence in Early Childhood Learning and Development, East Tennessee State University, Claudius G. Clemmer College of Education P319 Warf-Pickel Hall | PO Box 70685, Johnson City, TN 37614, USA. Email: [email protected]

References

Cheney, D., Flower, A. & Templeton, T. (2008) ‘Applying response to intervention metrics in the social domain for students at risk of developing emotional or behavioral disorders.’ The Journal of Special Education, 42, pp. 108–26.

Gersten, R., Fuchs, L., Compton, D., Greenwood, C. & Innocenti, M. (2005) ‘Quality indicators for group experimental and quasi experimental research in special education.’ Exceptional Children, 71, pp. 149–64.

Gresham, F. M., Cook, C. B., Crews, D. S. & Kern, L. (2004) ‘Social skills training for children with emotional and behavioral disorders: validity considerations and future directions.’ Behavioral Disorders, 30, pp. 32–46.

Gresham, F. M., Gansle, K. A. & Noell, G. H. (1993) ‘Treatment integrity in applied behavior analysis with children.’ Journal of Applied Behavior Analysis, 26, pp. 257–63.

Gresham, F. M., MacMillan, D. L., Frankenberger, M. E. & Bocian, K. M. (2000) ‘Treatment integrity in learning disabilities intervention research: do we really know how treatments are implemented?’ Learning Disabilities Research and Practice, 15, pp. 198–205.

Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S. & Wolery, M. (2005) ‘The use of single-subject research to identify evidence-based practices in special education.’ Exceptional Children, 71, pp. 165–79.

Johnston, J. M. & Pennypacker, H. S. (1993) Strategies and Tactics of Behavioral Research. (2nd edn). Hillsdale, NJ: Erlbaum.

Kauffman, J. M. (2001) Characteristics of Emotional and Behavioral Disorders of Children and Youth. (7th edn). Upper Saddle River, NJ: Prentice Hall.

Locke, W. & Fuchs, L. S. (1995) ‘Effects of peer-mediated reading instruction on the on-task behavior and social interaction of children with behavior disorders.’ Journal of Emotional and Behavioral Disorders, 3, pp. 92–9.

McIntyre, L. L., Gresham, F. M., DiGennaro, F. D. & Reed, D. D. (2007) ‘Treatment integrity of school-based interventions with children in the Journal of Applied Behavior Analysis 1991–2005.’ Journal of Applied Behavior Analysis, 40, pp. 659–72.

Peterson, L., Horner, R. H. & Wonderlick, S. (1982) ‘The integrity of independent variables in behavior analysis.’ Journal of Applied Behavior Analysis, 15, pp. 477–92.

Stichter, J. P. & Conroy, M. A. (2004) ‘Measurement, validity, and science: a call for elucidating precision and rigor in EBD research.’ Behavioral Disorders, 30, pp. 5–6.

Van Acker, R. & Wehby, J. (2000) ‘Exploring the social contexts influencing student success or failure: introduction.’ Preventing School Failure, 44, pp. 93–6.

Wagner, M. (2006) ‘The mismatch between the transition goals and school programs of youth with emotional disturbances.’ Journal of Emotional and Behavioral Disorders, 2, pp. 90–112.

Wheeler, J. J., Baggett, B., Fox, J. & Blevins, L. (2006) ‘Treatment integrity: a review of intervention studies conducted with children with autism.’ Focus on Autism and Other Developmental Disabilities, 21, pp. 45–54.

Zigmond, N. (2006) ‘Twenty-four months after high school: paths taken by youth diagnosed with severe emotional behavioral disorders.’ Journal of Emotional and Behavioral Disorders, 14, pp. 99–107.

© 2012 The Authors. Journal of Research in Special Educational Needs © 2012 NASEN
