Confirmation of the Data-Driven Decision-Making Efficacy and Anxiety Inventory’s Score Factor Structure Among Teachers
Journal of Psychoeducational Assessment, 2018, Vol. 36(5), 477–491. © The Author(s) 2016. DOI: 10.1177/0734282916682905
David A. Walker1, Todd D. Reeves1, and Thomas J. Smith1
Abstract
The implementation of data-driven decision-making practices (DDDM) is a key component of contemporary teachers' professional practice. As such, the measurement of DDDM and related constructs is important for multiple purposes in both research and practice (e.g., identifying teacher needs around DDDM, and monitoring teacher change in response to DDDM interventions). With the present study, we examined the score factor structure and reliability of the Data-Driven Decision-Making Efficacy and Anxiety Inventory (3D-MEA), an existing measure of data-driven decision-making–related self-efficacy and anxiety. Prior work with this instrument has provided some internal structure and reliability evidence in the context of teachers from the Pacific Northwest. Confirmatory factor analysis of 3D-MEA scores from a sample of Midwestern teachers replicated the initially hypothesized five-factor internal score structure. Our study also affords evidence of high score reliability within this population. Limitations, implications, and future directions are discussed.

Keywords
factor analysis, measurement, reliability, validity, self-efficacy, personality/individual differences, teachers, participants

Contemporary teachers are inundated with data of various forms, academic and otherwise. Teachers also gather considerable amounts of data in the context of their own classrooms through both formal (e.g., teacher-developed tests) and informal (e.g., observation) means. Given this context, the implementation of data-driven decision-making (DDDM) practices is, not surprisingly, a key component of teachers' professional practice (Mandinach & Gummer, 2013). With DDDM, teachers are required to access such data, analyze and interpret them, and combine their interpretations with other forms of knowledge (e.g., content, pedagogical) to make decisions about their practice (Hamilton et al., 2009; Marsh, 2012). Theoretically, the implementation of DDDM will better target teachers' work to their students, resulting in more effective instruction for students and greater student achievement (e.g., van Geel, Keuning, Visscher, & Fox, 2016).
1Northern Illinois University, DeKalb, IL, USA

Corresponding Author:
David A. Walker, Northern Illinois University, College of Education, 325 Graham, DeKalb, IL 60115, USA.
Email: [email protected]
The efficacious implementation of such DDDM practices, however, is nontrivial. Many teachers are insufficiently prepared vis-à-vis their data use knowledge and skills (Means, Padilla, DeBarger, & Bakia, 2009; Volante & Fazio, 2007). Teachers also often lack self-efficacy in their ability to engage in DDDM practices (Dunn, Airola, Lo, & Garrison, 2013b; U.S. Department of Education, 2008, 2010), which, in accord with self-efficacy theory (Bandura, 1997), can constrain successful implementation of DDDM practices. At the same time, DDDM and the uses of data and statistics can induce stress and anxiety among in-service (i.e., practicing) as well as preservice (i.e., student) teachers (Dunn et al., 2013b; Jimerson, Choate, & Dietz, 2015; Piro, Dunlap, & Shutt, 2014). Teacher anxiety pertaining to DDDM, too, can potentially interfere with the implementation of data use within the K-12 education system.

In light of these challenges, the current study investigates teacher self-efficacy and anxiety related to DDDM. In particular, the present study re-examines the score internal structure and reliability of an existing instrument intended to measure these constructs, the Data-Driven Decision-Making Efficacy and Anxiety Inventory (3D-MEA; Dunn et al., 2013b). Although highly favorable evidence concerning the character of 3D-MEA scores exists in the literature, the populations in which such evidence was gathered were limited to teachers in the Pacific Northwest. Given that education in the United States is constitutionally under the purview of local jurisdictions (states and school boards), cross- and even within-state variation in education policies, practices, norms, and discourses might influence how 3D-MEA test-takers interpret and respond to the instrument's items. In response, this study evaluates the 3D-MEA instrument's psychometric score properties in a sample of teachers from the Midwest. In doing so, we hope to contribute to the literature concerning the measurement of these important constructs (i.e., teacher self-efficacy and anxiety around DDDM) in both practice and research.

It is important to be able to measure status and/or growth in DDDM self-efficacy and anxiety for various purposes. For example, professional development providers might desire to measure those DDDM realms in which teachers feel least self-efficacious or most anxious in order to tailor trainings. In addition, efforts to build teacher capacity for DDDM that target self-efficacy and anxiety may need sound measures of these constructs for pretest–posttest research/evaluation purposes. Researchers similarly might seek to study factors that facilitate or constrain adoption and implementation of DDDM practices (Dunn et al., 2013b). Most broadly, proper measurement of these constructs among key DDDM actors (teachers) can help the field bolster DDDM initiatives.
Theoretical Framework and Literature Review

The Nature of DDDM
The present study is grounded in prior research and theory related to DDDM as well as teacher self-efficacy and anxiety affiliated with this practice. DDDM has been theorized as a complex and multifaceted process (Coburn & Turner, 2011; Hamilton et al., 2009; Means et al., 2009) in which an actor (a) accesses or collects data; (b) filters, organizes, or analyzes data into information; (c) combines information with expertise and understanding to build knowledge; (d) responds, takes action, or adjusts one's practice; and (e) assesses the effectiveness of these actions or the outcomes that result (Marsh, 2012). The underlying theory is that by informing decisions related to (say) instructional goals, methods, and time allocation with data, teachers can better target their instruction to students, ultimately resulting in higher levels of achievement (Carlson, Borman, & Robinson, 2011; van Geel et al., 2016). Specific actions taken on the basis of data include selecting suitable instructional methods, prioritizing which content to teach or emphasize, reteaching, and designing supports for those students in most need of them (Mandinach & Gummer, 2013; Marsh, Pane, & Hamilton, 2006; Reeves & Honig, 2015).
Mandinach and Gummer (2016) recently outlined a hierarchical framework for the knowledge and skills needed during different phases of the data-driven decision-making process (e.g., identifying questions/framing problems, transforming data into information). Containing domain components, subcomponents, elements, and subelements, their framework makes very specific assertions such as the following: (a) to identify and examine data that might help address a particular issue of interest, teachers need to be able to identify possible sources of data, which itself requires an understanding of the purposes of different data; and (b) to transform data into information, teachers need to be able to understand how to interpret data, which in turn requires an understanding of data displays and representations. Relevant to the present study, teachers also need to know how to use technology-based data systems to support DDDM (e.g., Dunn et al., 2013b; Wayman, 2005).

Two other important points about data use are worth noting. First, DDDM by teachers is a process that not only invokes knowledge and skills related to "data" but also relies on teacher expertise in assessment, content, and pedagogy (Mandinach & Gummer, 2016). Second, decision-relevant (and thus potentially usable) data are not limited to data collected via cognitive assessments, whether classroom-based or external (e.g., DeLuca & Bellara, 2013; Mandinach & Gummer, 2013; Reeves & Chiang, 2016). Recent attention has been given to systematic consideration of noncognitive and socioemotional factors within the U.S. educational system as well (e.g., attendance, behavior, grit, and school climate; Duckworth & Yeager, 2015; Mandinach & Gummer, 2016).
DDDM Teacher Self-Efficacy
Teacher self-efficacy and anxiety related to DDDM can influence use of DDDM practices (Reeves, Summers, & Grove, 2016). Self-efficacy is a component of social cognitive theory; Bandura (1997) defined it, in general, as "beliefs in one's capabilities to organize and execute courses of action required to produce given attainments" (p. 3). Crucially, social cognitive theory posits that beliefs about one's ability to do something—such as engage in DDDM—affect one's behavior and effectiveness in performing that behavior (or set of behaviors). Indeed, research has found that measures of teacher self-efficacy are related to both teaching practice and student achievement (Bandura, 1977; Tschannen-Moran, Hoy, & Hoy, 1998). Self-efficacy for DDDM, in particular, has been defined as "teachers' beliefs in their abilities to organize and execute the necessary courses of action to successfully engage in classroom-level DDDM to enhance student performance" (Dunn et al., 2013b, p. 87). Dunn et al. have further theorized that self-efficacy for DDDM is a multidimensional construct—comprising dimensions of self-efficacy for data identification and access, self-efficacy for data technology use, self-efficacy for data analysis and interpretation, and self-efficacy for application of data to instruction—and have provided empirical evidence for this contention. A recent study by Reeves et al. (2016) found that in-service teachers with higher self-efficacy for data application to instruction used data more frequently than their counterparts. Moreover, and importantly, DDDM self-efficacy appears malleable; studies have observed changes in preservice teachers' DDDM self-efficacy in response to intervention (Reeves, 2016; Reeves & Honig, 2015).
DDDM Teacher Anxiety
Another factor that can interfere with DDDM is teacher anxiety about engagement in that process. DDDM anxiety is "trepidation, tension, and apprehension related to [one's] ability to successfully engage in DDDM" (Dunn et al., 2013b, p. 90). Anxiety around data use and statistics is well documented among both in-service and preservice educators, and among those in human service fields generally (Jimerson et al., 2015; Onwuegbuzie, 2004; Piro et al., 2014; Williams, 2010). According to self-efficacy theory, anxiety is an important negative influence on self-efficacy (Bandura, 1977, 1997; Dunn et al., 2013b) and can also inhibit the enactment of desired behaviors (Aydin, Uzuntiryaki, & Demirdogen, 2011; Bandura, 1988). With this in mind, the present study examines both teacher anxiety concerning DDDM and DDDM self-efficacy.
Prior Work With the 3D-MEA
The present study focuses on the psychometric score properties of an extant measure of teacher self-efficacy and anxiety related to data-driven decision making, the 3D-MEA Inventory (Dunn et al., 2013b). In this section, we review 3D-MEA score validity and reliability evidence from prior studies.

Validity. Prior research has provided validity evidence pertaining to 3D-MEA score internal structure (i.e., the number of factors underlying scores and their intercorrelations) as well as 3D-MEA score relations with other variables. Dunn et al. (2013b) rigorously evaluated the 3D-MEA scores' internal structure through exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Participants in their study were 1,728 K-12 teachers from 193 schools participating in a state-level DDDM professional development program, who had varying levels of DDDM training and experience. First, Dunn et al. conducted an EFA with a random half sample (n = 864) to explore the 3D-MEA scores' internal structure. While the 3D-MEA was designed to measure four dimensions, the EFA suggested a five-factor structure. In particular, self-efficacy for data analysis, interpretation, and application as a single latent construct was not supported. Instead, two distinct factors were apparent: one interpreted to represent self-efficacy for data analysis and interpretation, and one interpreted to represent self-efficacy for application of data to instruction. Second, the authors conducted a CFA with a different random subset of participants (n = 864) to cross-validate the score factor structure discovered during the EFA. The CFA was largely indicative of a good fit of the five-factor score structure per standard guidelines such as the comparative fit index (CFI) and the nonnormed fit index (Hu & Bentler, 1999). The correlations among the factors ranged from .37 to .62, suggesting that the factors were related but distinct, as expected. Standardized factor loadings ranged from 0.67 to 0.90. In terms of how 3D-MEA scores correlate with other variables, Dunn, Airola, Lo, and Garrison (2013a) provided discriminant validity evidence via correlations between 3D-MEA subscale scores and the Teachers' Sense of Efficacy Scale (Tschannen-Moran & Hoy, 2001), a general teacher self-efficacy measure; as expected, these correlations ranged from −0.02 to 0.27. In addition, Dunn et al. (2013a) found theory-predicted relationships between 3D-MEA factor scores and teacher concerns about DDDM implementation in the classroom (e.g., concerns about DDDM collaboration).

Reliability. Several prior studies reported evidence of 3D-MEA score reliability. Overall, this evidence suggested acceptably high 3D-MEA score reliability. Dunn et al. (2013b) examined reliability in conjunction with both their EFA and CFA and reported internal consistency reliabilities, as estimated by Cronbach's alpha, ranging from .81 to .92. Furthermore, maximal reliabilities were estimated within a CFA framework and ranged from .74 to .92. In a separate study, Dunn et al. estimated cross-scale internal score consistency at 0.73 and reported internal score consistency estimates that ranged from 0.84 to 0.92 for the subscales. Finally, Dunn, Airola, and Garrison (2013) provided internal consistency estimates for the subscale scores that were similarly high, ranging from 0.84 to 0.92.
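For readers less familiar with the coefficient reported throughout this literature, Cronbach's alpha can be computed directly from an item-by-respondent score matrix. The sketch below is purely illustrative (the data shown are fabricated, not 3D-MEA responses) and simply implements the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the total score).

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x k_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Fabricated example: 5 respondents x 3 items rated on a 1-5 scale
scores = np.array([[4, 5, 4],
                   [2, 2, 3],
                   [5, 4, 5],
                   [3, 3, 3],
                   [4, 4, 5]])
print(round(cronbach_alpha(scores), 2))
```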
The Present Study
The aforementioned body of work has proved valuable in that it both (a) provided rigorous reliability and validity evidence concerning 3D-MEA scores and (b) helped shed light on the nature of DDDM self-efficacy and its facets. However, work with the instrument to date has been limited to research with teachers from the Pacific Northwest. While the existing evidence for 3D-MEA score validity and reliability is quite favorable, the authors (Dunn et al., 2013b) rightly called for further measurement work with different populations (e.g., teachers in different states) to confirm, or disconfirm, the initially established score factor structure and reliability. Our study, in response, sought to address the following research questions concerning the 3D-MEA:

Research Question 1: To what extent does the 3D-MEA's observed five-factor structure, comprising self-efficacy for data identification and access (identification), self-efficacy for data technology use (technology), self-efficacy for data analysis and interpretation (interpretation), self-efficacy for the application of data to instruction (application), and data-driven decision-making anxiety, fit among preK-12 teachers in the Midwest?

Research Question 2: To what extent do the 3D-MEA subscale scores exhibit internal consistency (reliability) among preK-12 teachers in the Midwest?
Method

Participants
This study's participants (n = 365) were preK-12 teachers located in a Midwestern state who served in instructional roles and participated in a larger study concerning data use. The sample was 78.90% female and 97.70% White (2.60% Hispanic/Latino). The participants' mean age was 39.78 years (SD = 11.25), and the mean years of teaching experience was 13.40 (SD = 8.97). The sample represented prekindergarten (1.10%), K-5 elementary (32.0%), Grade 6 to 8 middle (39.90%), and Grade 9 to 12 high school (22.70%) teachers,1 and included teachers of all key subject areas (for nongeneralist teachers) as well as a variety of specialty subjects (e.g., life skills, economics). The sample encompassed teachers from 56 districts and 74 schools, and the schools in which they served varied in terms of geographic locale (7.30% city, 79.90% suburban, 9.0% town, and 3.80% rural).
Instrumentation
The 3D-MEA's (Dunn et al., 2013b) 20 items are intended to assess four dimensions of data-driven decision-making self-efficacy and one dimension of data-driven decision-making anxiety among teachers. The three-item self-efficacy for Data Identification and Access (identification) subscale purportedly assesses "teachers' beliefs in their abilities to identify, access, and gather dependable, high-quality student data" (p. 90). The three-item self-efficacy for Data Technology Use (technology) subscale is intended to assess "teachers' beliefs in their ability to successfully utilize district and state data technology resources" (p. 90). The three-item self-efficacy for Data Analysis and Interpretation (interpretation) subscale assesses teachers' "beliefs in their abilities to successfully analyze and interpret student data" (p. 90). The six-item self-efficacy for Application of Data to Instruction (application) subscale is designed to assess teachers' "beliefs in their abilities to successfully connect or apply their interpretation of data findings to classroom instruction in order to improve student learning" (p. 90). The five-item data-driven decision-making Anxiety subscale is intended to assess "a teacher's self-judgment of his or her sense of trepidation, tension, and apprehension related to their ability to successfully engage in DDDM" (p. 90). While Dunn et al. (2013b) originally designed the 3D-MEA to measure only three self-efficacy dimensions (efficacy for data identification and access; efficacy for data technology use; and efficacy for data analysis, data interpretation, and the application of data to instruction), their data did not bear this out empirically.
The 3D-MEA is a task-specific self-efficacy and anxiety measure focused on DDDM by teachers. The development of the instrument was grounded in theory and prior research related to DDDM (e.g., Volante & Fazio, 2007; Wayman, 2005). In designing the 3D-MEA, its developers began with an initial pool of 35 items and engaged in a substantive analysis to refine their item set based on perceived item content redundancies, terminological issues, and ambiguities. As is common in belief and affective measurement, the response format for the 20 3D-MEA items was a 5-point rating scale: 1 = strongly disagree, 2 = disagree, 3 = neither agree nor disagree, 4 = agree, and 5 = strongly agree.
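To make the instrument's scoring structure concrete, the sketch below shows one way subscale scores could be computed from the 20 Likert-type item responses. The item labels (i1 through i20) and their assignment to subscales are hypothetical placeholders chosen only to match the published item counts (3, 3, 3, 6, and 5); the actual 3D-MEA item wording and ordering are not reproduced in this article.

```python
import pandas as pd

# Hypothetical item-to-subscale map; counts follow the paper (3, 3, 3, 6, 5 items).
SUBSCALES = {
    "identification": ["i1", "i2", "i3"],
    "technology":     ["i4", "i5", "i6"],
    "interpretation": ["i7", "i8", "i9"],
    "application":    ["i10", "i11", "i12", "i13", "i14", "i15"],
    "anxiety":        ["i16", "i17", "i18", "i19", "i20"],
}

def score_3dmea(responses: pd.DataFrame) -> pd.DataFrame:
    """Average the 1-5 ratings within each subscale for every respondent."""
    return pd.DataFrame(
        {name: responses[items].mean(axis=1) for name, items in SUBSCALES.items()}
    )

# Fabricated ratings for two respondents, just to show the output shape
demo = pd.DataFrame([[4] * 20, [2] * 20], columns=[f"i{k}" for k in range(1, 21)])
print(score_3dmea(demo))
```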
Analytic Approach
CFA. We addressed the first research question concerning 3D-MEA internal score structure (validity) within a CFA framework. IBM SPSS AMOS v. 22.0 was used to conduct the CFA. The analysis was confirmatory in the sense that we sought to replicate among Midwestern teachers the specific five-factor 3D-MEA score structure proposed by Dunn et al. (2013b) in their study with Pacific Northwest teachers. A hypothesis was tested to confirm the model's fit as a five-factor structure:

Hypothesis 1: A five-factor 3D-MEA model comprising identification, technology, interpretation, application, and anxiety will show consistent fit between the reproduced covariance matrix and the observed covariance matrix.

CFA fit indices guidelines. To examine CFA model fit, we used the chi-square goodness-of-fit test and other fit indices. Hu and Bentler (1999) suggested that at least two fit indices per functional domain be utilized in evaluating model fit results. In addition, they advocated pairing relative (comparative) fit indices with absolute fit indices. Typically, relative fit indices range in value from 0 to 1.0, with values nearer to 1.0 preferred. The relative fit indices used in the present study were the CFI (Bentler, 1990) and the Tucker–Lewis index (TLI; Tucker & Lewis, 1973). The literature-supported thresholds for "good" model fit are values ≥.90 (Schumacker & Lomax, 1996) or ≥.95 (Hu & Bentler, 1999) for the CFI and ≥.90 for the TLI (Bentler & Bonett, 1980). Residual-based, absolute fit indices were also examined and included the root mean square residual (RMR; Jöreskog & Sörbom, 1981) and the root mean square error of approximation (RMSEA; Steiger & Lind, 1980). For absolute fit indices, smaller values closer to 0 are favored. The goal for both the RMR and RMSEA indices is to obtain values .05).

Given that the data were recognized in the category of MCAR and all of the scale's 20 items had .70 (Hair, Anderson, Tatham, & Black, 1998). Thus, the scores for each construct had high internal consistency.
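The CFA reported above was conducted in IBM SPSS AMOS. As an illustration of what the same five-factor specification looks like in code, the sketch below uses the open-source semopy package (an assumption of this example, not the authors' software), with the same hypothetical i1–i20 item labels introduced earlier; semopy's calc_stats routine returns the chi-square statistic, CFI, TLI, and RMSEA, among other indices discussed in the fit-index guidelines.

```python
import pandas as pd
import semopy

# Five-factor measurement model in lavaan-style syntax.  Item labels i1-i20
# are hypothetical placeholders; the real 3D-MEA item assignments are not
# listed in this article.
MODEL_DESC = """
identification =~ i1 + i2 + i3
technology     =~ i4 + i5 + i6
interpretation =~ i7 + i8 + i9
application    =~ i10 + i11 + i12 + i13 + i14 + i15
anxiety        =~ i16 + i17 + i18 + i19 + i20
"""

def fit_five_factor_cfa(responses: pd.DataFrame) -> pd.DataFrame:
    """Fit the hypothesized CFA model and return a table of fit statistics."""
    model = semopy.Model(MODEL_DESC)
    model.fit(responses)             # maximum likelihood estimation by default
    return semopy.calc_stats(model)  # includes chi-square, CFI, TLI, RMSEA
```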
Discussion
Not only is teacher engagement in the practice of DDDM an expectation in the current educational assessment and accountability climate (e.g., Means et al., 2009), but rigorous evidence suggests that DDDM can actually improve student achievement outcomes (e.g., Carlson et al., 2011; van Geel et al., 2016). However, many teachers find such practices challenging and anxiety-provoking, and lack self-efficacy concerning their DDDM skills (e.g., Dunn et al., 2013b; Means et al., 2009; Volante & Fazio, 2007), which can undermine DDDM efforts. For example, research has shown that classroom teachers' DDDM self-efficacy is related to the frequency with which they use data to ground decisions (Reeves et al., 2016). As teachers are key actors in the implementation of DDDM, understanding teacher-level factors that can facilitate and constrain DDDM—such as self-efficacy and anxiety—is critical. Thus, with this study we sought to (re)examine the 3D-MEA's (Dunn et al., 2013b) score internal structure (validity) and reliability.

In terms of 3D-MEA score validity, the confirmatory factor model's factor loadings were all statistically significant (p < .001) and ≥.50, with 90% of the loadings ≥.70. The vast majority of the standardized residuals were within the range of ±1.96 (i.e., only four out of 210 moments, or 1.90%, were beyond this cut-point), and consequently, the model's fit indices all met the literature-supported thresholds. Consistent with results derived from earlier research by Dunn et al. (2013b), the current study confirms that a five-factor model provides a sound accounting of 3D-MEA scores among preK-12 teachers from the Midwest. At the same time, subscale score reliability estimates were acceptably high for measurement in both research and practice. Crucially, empirical substantiation of both an instrument's hypothesized internal score structure and score reliability is needed to support score interpretation and use (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014).

The present study has implications for the measurement of DDDM self-efficacy and anxiety among teachers. First, our replication of 3D-MEA score reliability and validity in an independent, Midwestern sample of teachers bolsters the case for the instrument's use with U.S. teachers; however, as discussed later, additional work is still required. Second, given that DDDM self-efficacy and anxiety can affect implementation of DDDM practices, proper measurement of these constructs with the 3D-MEA may help support efforts to build educator DDDM capacity in several ways. For example, the 3D-MEA can be used with teachers to identify targets for professional development intervention, evaluate DDDM interventions for teachers, or research processes surrounding the adoption of DDDM practices (e.g., Dunn et al., 2013b). Alternatively, such an instrument can be used to help build theory around DDDM policy implementation. The importance of teacher beliefs in understanding DDDM practices has already been documented; however, beliefs receive relatively little attention in the literature (Datnow & Hubbard, 2015). The 3D-MEA can thus help round out scholarship on (at least) self-efficacy beliefs as well as anxiety in the context of DDDM.
Limitations and Future Directions
The present study had two notable limitations, which should be addressed through future research. One potential limitation pertains to statistical conclusion validity. Specifically, the study's results may be somewhat unstable because of the violation of the normality assumption (i.e., multivariate kurtosis). That is, multivariate normality was absent, and the chi-square value was large and statistically significant (p < .001). While the statistically significant chi-square suggested a potential discrepancy between the two covariance matrices, findings derived from the standardized residuals and the fit indices all provided evidence of a good-fitting model. As noted by McIntosh (2006) and Hooper, Coughlan, and Mullen (2008, p. 54), the chi-square test " . . . assumes multivariate normality and severe deviations from normality may result in model rejections even when the model is properly specified," which is potentially the case with the current model. Thus, the potential variability of our model findings could be attended to in replication studies.

Another limitation relates to the study's sample, which comprised teachers from only one Midwestern state who were relatively homogeneous in terms of race/ethnicity. Future work should examine the 3D-MEA score factor structure and reliability in different U.S. teacher populations. Additional work might also attempt to replicate our findings among more representative samples of Midwestern teachers. Given that validation relies on a variety of sources of evidence (American Educational Research Association et al., 2014), other forms of validity evidence should be gathered as well, such as validity evidence based on 3D-MEA score relations with other variables. Future work should also develop instrumentation to measure other fundamental DDDM constructs (e.g., measures of data use practices).
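Because the multivariate kurtosis noted above is the specific normality violation at issue, a brief sketch of Mardia's (1970) kurtosis coefficient may be helpful. The function below is a generic implementation of the published formula (not the authors' code); values far above the expectation of p(p + 2) indicate the heavy-tailed distributions that can inflate the chi-square statistic.

```python
import numpy as np

def mardia_kurtosis(data: np.ndarray) -> tuple[float, float]:
    """Mardia's (1970) multivariate kurtosis and its expectation under normality.

    Returns (b2p, expected): b2p is the mean of squared Mahalanobis distances,
    and expected = p * (p + 2) is its value under multivariate normality.
    """
    x = np.asarray(data, dtype=float)
    p = x.shape[1]
    centered = x - x.mean(axis=0)
    cov = np.cov(x, rowvar=False, bias=True)        # ML (biased) covariance estimate
    inv_cov = np.linalg.inv(cov)
    d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)  # squared Mahalanobis
    return float(np.mean(d2 ** 2)), float(p * (p + 2))
```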
Conclusion
The present study contributes affirmative validity and reliability evidence concerning the 3D-MEA's scores. As noted, these findings most directly contribute to efforts in both research and practice to measure these teacher-level DDDM constructs. While our study centers chiefly on measurement, it also speaks to broader efforts within the U.S. education context to promote DDDM. On-the-ground use of data by teachers to inform decisions—such as selection of suitable instructional methods, prioritization of which content to teach, and determination of those students in most need of support—is notoriously complex and difficult. Given that DDDM calls upon not only data interpretation and analysis skills but also knowledge of assessment, pedagogy, and student thinking, suboptimal levels of teacher self-efficacy and anxiety are perhaps to be expected. This complexity certainly underscores the importance of DDDM self-efficacy and anxiety as measurement targets. More generally, accounting for constructs such as DDDM self-efficacy and anxiety in research can also help sort out the degree to which difficulties with DDDM implementation are due to these constructs, as opposed to others (e.g., teacher knowledge and skills related to assessment, pedagogy). In doing so, the field should be better positioned to cultivate teachers' strategic use of data and, in turn, achieve aims in policy and practice to use data to improve student outcomes.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Note
1. About 2.90% of the sample worked in multiple school levels, and school level was indeterminable for 1.40% of the sample.
References
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
Aydin, Y. C., Uzuntiryaki, E., & Demirdogen, B. (2011). Interplay of motivational and cognitive strategies in predicting self-efficacy and anxiety. Educational Psychology, 31, 55-66.
Bandura, A. (1977). Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review, 84, 191-215.
Bandura, A. (1988). Self-efficacy conception of anxiety. Anxiety Research, 1, 77-98.
Bandura, A. (1997). Self-efficacy: The exercise of control. New York, NY: Freeman.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238-246.
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.
Bentler, P. M., & Wu, E. J. C. (1993). EQS/Windows user's guide: Version 4. Los Angeles, CA: BMDP Statistical Software.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136-162). Thousand Oaks, CA: Sage.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.
Carlson, D., Borman, G., & Robinson, M. (2011). A multistate district-level cluster randomized trial of the impact of data-driven reform on reading and mathematics achievement. Educational Evaluation and Policy Analysis, 33, 378-398.
Coburn, C. E., & Turner, E. O. (2011). Research on data use: A framework and analysis. Measurement: Interdisciplinary Research & Perspectives, 9, 173-206.
Datnow, A., & Hubbard, L. (2015). Teacher capacity for and beliefs about data-driven decision making: A literature review of international research. Journal of Educational Change, 17, 1-22.
DeLuca, C., & Bellara, A. (2013). The current state of assessment education: Aligning policy, standards, and teacher education curriculum. Journal of Teacher Education, 64, 356-372.
Duckworth, A. L., & Yeager, D. S. (2015). Measurement matters: Assessing personal qualities other than cognitive ability for educational purposes. Educational Researcher, 44, 237-251.
Dunn, K. E., Airola, D. T., & Garrison, M. (2013). Concerns, knowledge, and efficacy: An application of the teacher change model to data driven decision-making professional development. Creative Education, 4, 673-682.
Dunn, K. E., Airola, D. T., Lo, W.-J., & Garrison, M. (2013a). Becoming data driven: The influence of teachers' sense of efficacy on concerns related to data-driven decision making. The Journal of Experimental Education, 81, 222-241.
Dunn, K. E., Airola, D. T., Lo, W.-J., & Garrison, M. (2013b). What teachers think about what they can do with data: Development and validation of the data driven decision-making efficacy and anxiety inventory. Contemporary Educational Psychology, 38, 87-98.
Fan, X., Thompson, B., & Wang, L. (1999). Effects of sample size, estimation methods, and model specification on structural equation modeling fit indexes. Structural Equation Modeling, 6, 56-83.
Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18, 39-50.
Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (1995). Multivariate data analysis with readings (4th ed.). Englewood Cliffs, NJ: Prentice Hall.
Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (1998). Multivariate data analysis with readings (5th ed.). Upper Saddle River, NJ: Prentice Hall.
Hamilton, L., Halverson, R., Jackson, S., Mandinach, E., Supovitz, J., & Wayman, J. (2009). Using student achievement data to support instructional decision making (NCEE 2009-4067). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.
Hooper, D., Coughlan, J., & Mullen, M. R. (2008). Structural equation modelling: Guidelines for determining model fit. The Electronic Journal of Business Research Methods, 6, 53-60.
Hu, L., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3, 424-453.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
Jimerson, J. B., Choate, M. R., & Dietz, L. K. (2015). Supporting data-informed practice among early career teachers: The role of mentors. Leadership and Policy in Schools, 14, 204-232.
Jöreskog, K. G., & Sörbom, D. (1981). LISREL V: Analysis of linear structural relationships by the method of maximum likelihood. Chicago, IL: National Educational Resources.
Kline, R. B. (1998). Principles and practice of structural equation modeling. New York, NY: Guilford Press.
Little, R. J. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association, 83, 1198-1202.
MacCallum, R. C., Widaman, K. F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological Methods, 4, 84-99.
Mandinach, E. B., & Gummer, E. S. (2013). A systemic view of implementing data literacy in educator preparation. Educational Researcher, 42, 30-37.
Mandinach, E. B., & Gummer, E. S. (2016). Data literacy for educators: Making it count in teacher preparation and practice. New York, NY: Teachers College Press.
Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57, 519-530.
Marsh, J. A. (2012). Interventions promoting educators' use of data: Research insights and gaps. Teachers College Record, 114, 1-48.
Marsh, J. A., Pane, J. F., & Hamilton, L. (2006). Making sense of data-driven decision making in education. Santa Monica, CA: RAND Corporation.
McIntosh, C. (2006). Rethinking fit assessment in structural equation modelling: A commentary and elaboration on Barrett (2007). Personality and Individual Differences, 42, 859-867.
Means, B., Padilla, C., DeBarger, A., & Bakia, M. (2009). Implementing data-informed decision making in schools: Teacher access, supports and use. Washington, DC: U.S. Department of Education.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York, NY: McGraw-Hill.
Olsson, U. H., Foss, T., Troye, S. V., & Howell, R. D. (2000). The performance of ML, GLS, and WLS estimation in structural equation modeling under conditions of misspecification and nonnormality. Structural Equation Modeling, 7, 557-595.
Onwuegbuzie, A. J. (2004). Academic procrastination and statistics anxiety. Assessment & Evaluation in Higher Education, 29, 3-19.
Piro, J. S., Dunlap, K., & Shutt, T. (2014). A collaborative data chat: Teaching summative assessment data use in pre-service teacher education. Cogent Education, 1(1).
Reeves, T. D. (2016). Pre-service teachers' data use opportunities during student teaching. Retrieved from https://doi.org/10.1016/j.tate.2017.01.003
Reeves, T. D., & Chiang, J. L. (2016). Standardizing pre-service teacher preparation for instructional data-driven decision making. Unpublished manuscript.
Reeves, T. D., & Honig, S. L. (2015). A classroom assessment data literacy intervention for pre-service teachers. Teaching and Teacher Education, 50, 90-101.
Reeves, T. D., Summers, K. H., & Grove, E. (2016). Examining the landscape of teacher learning for data use: The case of Illinois. Cogent Education, 3(1).
Schafer, J. L. (1999). Multiple imputation: A primer. Statistical Methods in Medical Research, 8, 3-15.
Schumacker, R. E., & Lomax, R. G. (1996). A beginner's guide to structural equation modeling. Mahwah, NJ: Lawrence Erlbaum.
Steiger, J. H., & Lind, J. C. (1980, June). Statistically-based tests for the number of common factors. Paper presented at the Annual Spring Meeting of the Psychometric Society, Iowa City, IA.
Tschannen-Moran, M., & Hoy, A. W. (2001). Teacher efficacy: Capturing an elusive construct. Teaching and Teacher Education, 17, 783-805.
Tschannen-Moran, M., Hoy, A. W., & Hoy, W. K. (1998). Teacher efficacy: Its meaning and measure. Review of Educational Research, 68, 202-248.
Tucker, L. R., & Lewis, C. (1973). The reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1-10.
U.S. Department of Education. (2008). Teachers' use of student data systems to improve instruction: 2005 to 2007. Washington, DC: Office of Planning, Evaluation and Policy Development, U.S. Department of Education.
U.S. Department of Education. (2010). Teachers' ability to use data to inform instruction: Challenges and supports. Washington, DC: Office of Planning, Evaluation and Policy Development, U.S. Department of Education.
van Geel, M., Keuning, T., Visscher, A. J., & Fox, J. P. (2016). Assessing the effects of a school-wide data-based decision-making intervention on student achievement growth in primary schools. American Educational Research Journal, 53, 360-394.
Volante, L., & Fazio, X. (2007). Exploring teacher candidates' assessment literacy: Implications for teacher education reform and professional development. Canadian Journal of Education, 30, 749-770.
Wayman, J. C. (2005). Involving teachers in data-driven decision-making: Using computer data systems to support teacher inquiry and reflection. Journal of Education for Students Placed At Risk, 10, 295-308.
Williams, A. S. (2010). Statistics anxiety and instructor immediacy. Journal of Statistics Education, 18, 1-18.