STRUCTURAL EQUATION MODELING, 10(1), 142–153 Copyright © 2003, Lawrence Erlbaum Associates, Inc.

TEACHER’S CORNER

Consequences of Not Interpreting Structure Coefficients in Published CFA Research: A Reminder

James M. Graham and Abbie C. Guthrie
Texas A&M University

Bruce Thompson
Texas A&M University and Baylor College of Medicine

Confirmatory factor analysis (CFA) is a statistical procedure frequently used to test the fit of data to measurement models. Published CFA studies typically report factor pattern coefficients. Few reports, however, also present factor structure coefficients, which can be essential for the accurate interpretation of CFA results. The interpretation errors that can arise when CFA results are interpreted without considering structure coefficients are described, and some examples from current literature illustrating these errors are also presented.

The close association between factor analysis and measurement has been previously noted (cf. Thompson & Daniel, 1996). Thus Nunnally (1978) long ago suggested that “factor analysis is intimately involved with questions of validity … Factor analysis is at the heart of the measurement of psychological constructs” (pp. 112–113). Both exploratory factor analysis (EFA; cf. Gorsuch, 1983) and confirmatory factor analysis (CFA; cf. Byrne, 1994) are frequently employed in measurement studies.

Because CFA directly tests the fit of theoretically or empirically grounded models to data, these models are especially useful, for at least three reasons. First, CFA allows several rival models to be fit to data, and consequently better honors the role of falsification within scientific inquiry (Popper, 1962). Falsification requires that a measure not be deemed credible until the underlying construct model has survived serious disconfirmation efforts (Moss, 1995).

Second, CFA forces us to be precise in defining our constructs. As Mulaik (1987, p. 301) emphasized, “It is we who create meanings for things in deciding how they are to be used. Thus we should see the folly of supposing that exploratory factor analysis will teach us what intelligence is, or what personality is.” Of course, as Huberty (1994, p. 265) noted, our data can be used to guide our decisions as to what constructs are; that is, theory development and theory testing are “joint bootstrap operations” (Hendrick & Hendrick, 1986, p. 393).

Third, CFA models can be evaluated so as to reward parsimony (cf. Mulaik et al., 1989). The law of parsimony suggests that simpler models are more likely to be true, assuming the fits of rival models are roughly equal. True results are more likely to be replicated. Thus, it seems reasonable to reward parsimony in construct definition, because more parsimonious results are more likely to be replicable. Parsimony-weighted fit indices do just that.

Requests for reprints should be sent to Bruce Thompson, TAMU Department of Educational Psychology, College Station, TX 77843–4225. E-mail: [email protected]

IMPLICATIONS OF THE GENERAL LINEAR MODEL

The purpose of this article is to argue that both factor pattern and factor structure coefficients should be interpreted in most CFA reports involving correlated factors. Throughout the general linear model, structure coefficients are always correlation coefficients between measured variables and composite or latent variables, such as regression Ŷ variables (Courville & Thompson, 2001), descriptive discriminant function scores (Huberty, 1994, p. 206), and canonical function coefficients (Cohen & Cohen, 1983, p. 456; Levine, 1977, p. 20; Meredith, 1964, p. 55). The weights (e.g., regression β weights, factor pattern coefficients) applied to measured variables to obtain composite or latent variable scores (e.g., regression Ŷ scores, factor scores), on the other hand, are not always correlation coefficients (cf. Courville & Thompson, 2001). Thus, in a regression analysis it is possible that none of the β weights will lie in the range –1.0 to +1.0.

For the EFA case, Gorsuch (1983, p. 208) argued that, “Indeed, proper interpretation of a set of factors can probably only occur if at least S [the factor structure coefficient matrix] and P [the factor pattern coefficient matrix] are both examined.” It will be argued here that structure coefficients can also be useful in CFA research. Thompson (1997) made the related argument regarding CFA applications. Regarding other general linear model applications, related arguments have been made regarding regression (Courville & Thompson, 2001; Thompson & Borrello, 1985), descriptive discriminant analysis (Huberty, 1994, p. 263), and canonical correlation analysis (Cohen & Cohen, 1983, p. 456; Levine, 1977, p. 20; Meredith, 1964, p. 55).


For the purposes of this study we investigated all CFA applications in 3 years of articles published in journals indexed by Psychological Abstracts to find illustrations of the misinterpretations that can arise by the failure to interpret both CFA pattern and structure coefficients when factors are correlated. Specifically, we reanalyzed published CFA articles (obtaining covariance matrices from authors where necessary) to compare our interpretations using both sets of coefficients to those made by authors using only a single set of coefficients.

The important conceptual point on which our work is grounded is that EFA and CFA are part of a single general linear model (CFA subsuming EFA as a special case). The same basic mathematics apply. Thus, in both EFA and CFA pattern ($P_{V \times F}$) and structure coefficients ($S_{V \times F}$) are equal if and only if factors are perfectly uncorrelated, given that $P_{V \times F} R_{F \times F} = S_{V \times F}$, where $R_{F \times F}$ is the interfactor correlation matrix. Put differently, as Bentler and Yuan (2000) emphasized, in CFA when factors are correlated, “even if a factor does not influence a variable, that is, its pattern coefficient or weight is zero, the corresponding structure coefficient, representing the correlation or covariance of the variable with that factor, generally will not be zero” (p. 327, emphasis in original).

IMPLICATIONS OF CFA FOR SEM

Although this article focuses on analytic practice within the CFA context, the argument presented here also bears on structural equation modeling (SEM). It has been suggested that even in an SEM context an initial phase of analysis often should include an examination via CFA of the measurement models that will later be examined as reflecting constructs related within the structural model (cf. Thompson, 2000). If the measurement models are all inadequate, the interpretation of the structural model results becomes much less interesting.
This is not to suggest that the field has yet reached a complete consensus about exactly how to evaluate measurement model fit within SEM (cf. Hayduk & Glaser, 2000; Mulaik & Millsap, 2000), but only that measurement models are nevertheless clearly critical within SEM.

RELATIONSHIPS BETWEEN PATTERN AND STRUCTURE COEFFICIENTS

Case No. 1: Variables Each Measure Only One Factor

In CFA models in which each variable measures only one factor, the pattern coefficient between the variable and the factor is also the structure coefficient. But on factors for which the pattern coefficients are constrained to equal zero, the structure coefficients are nevertheless not zero when the factors are correlated. Indeed, as we shall show, measured variables may be more correlated with factors (i.e., have larger absolute structure coefficients) on which pattern coefficients were fixed to zero than with factors on which pattern coefficients were freed to be estimated! When examining only pattern coefficients when variables are each declared to measure only one factor, the researcher is essentially looking at some structure coefficients, though only for certain factor-variable pairs. The other structure coefficients, involving the correlations of variables with correlated factors on which pattern coefficients were constrained to equal zero, are being ignored. This can lead to misinterpretations of dynamics involving single variables or of the factors themselves.

Case No. 2: Variables Measure More Than One Factor

When variables are declared to measure more than one factor, the structure coefficients may not necessarily equal the pattern coefficients even for variable and factor combinations on which pattern coefficients were freed in the model. Here the value of interpreting both pattern and structure coefficients parallels the value of interpreting both beta weights and structure coefficients in multiple regression when predictors are correlated (cf. Courville & Thompson, 2001). If a variable has a relatively large (compared to other variables on the same factor) pattern coefficient with a factor, but a relatively small structure coefficient, that variable is acting as a suppressor (Horst, 1966). That is, the variable does not directly overlap with the construction of a factor; rather it acts on the factor indirectly by suppressing the error of one or more of the other variables measuring that same factor. Alternatively, a variable may have a relatively small pattern coefficient with a factor (even zero), but a relatively large structure coefficient with the same factor.
This is indicative of the fact that a large portion of that variable’s association with the factor is not unique. In other words, a sizable portion of the variance contributed to the factor by the variable is also contributed by another factor.

Heuristic Example

To make this discussion concrete, we analyzed the heuristic data presented in Table 1 using two different models. The first model posited the measurement of two correlated factors, and that each observed variable reflected the influence of a single underlying latent construct. Figure 1 presents the standardized coefficients for this model in a diagram such as many researchers employ to present their results. Table 2 presents both the standardized CFA pattern coefficients and the factor structure coefficients from this model. These structure coefficients can be easily obtained in most software. For example, in AMOS structure coefficients are obtained merely by requesting “all implied moments.” Note that, although the pattern and structure coefficients are equal for “freed” paths in the model in case 1, the pattern coefficients “fixed” to equal zero nonetheless involve non-zero structure coefficients (e.g., variable “A” has a factor pattern coefficient of .0 on Factor II, but nevertheless a factor structure coefficient of .580).

TABLE 1
Heuristic Covariance Matrix

Variable      A        B        C        D        E        F
A           2.497
B           3.007    9.990
C           3.487    6.014    9.990
D           2.537    4.446    4.545    9.990
E           2.188    3.836    3.926    6.873    9.990
F           2.537    4.446    4.545    7.023    6.164    9.990

FIGURE 1
Standardized pattern coefficients for case 1.

TABLE 2
Confirmatory Factor Analysis Coefficients for Case 1

                 Factor I             Factor II
Variable     Pattern     rs       Pattern     rs
A             .849      .849        0        .580
B             .726      .726        0        .495
C             .817      .817        0        .557
D              0        .597       .875      .875
E              0        .528       .774      .774
F              0        .552       .808      .808

Note. Pattern coefficients constrained and not estimated in the model are presented as “0.”
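
The relationship between the two sets of coefficients in Table 2 can be checked numerically. The following is a minimal sketch in Python with NumPy that applies the identity S = P R to the tabled pattern coefficients. The interfactor correlation is not stated in the text (it would have appeared in Figure 1), so the value 0.6825 used here is an assumption back-computed from the tabled structure coefficients, and the function name is illustrative only.

```python
import numpy as np


def structure_coefficients(pattern, factor_corr):
    """Return S = P R for a standardized pattern matrix P (variables x factors)
    and an interfactor correlation matrix R (factors x factors)."""
    pattern = np.asarray(pattern, dtype=float)
    factor_corr = np.asarray(factor_corr, dtype=float)
    return pattern @ factor_corr


# Standardized pattern coefficients for variables A-F, transcribed from Table 2.
P_CASE1 = np.array([
    [0.849, 0.0],
    [0.726, 0.0],
    [0.817, 0.0],
    [0.0, 0.875],
    [0.0, 0.774],
    [0.0, 0.808],
])

# ASSUMPTION: the interfactor correlation is not reported in the text;
# 0.6825 is back-computed from the structure coefficients in Table 2.
R_ASSUMED = 0.6825
R_CASE1 = np.array([[1.0, R_ASSUMED], [R_ASSUMED, 1.0]])

S_CASE1 = structure_coefficients(P_CASE1, R_CASE1)

for name, p_row, s_row in zip("ABCDEF", P_CASE1, S_CASE1):
    print(name, "pattern:", p_row, "structure:", np.round(s_row, 3))
```

Under this assumed correlation, the computed values agree with the tabled structure coefficients to within rounding, including the non-zero entries for paths whose pattern coefficients were fixed to zero.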

As Bentler and Yuan (2000) emphasized, constraining a pattern coefficient to equal zero does not constrain the factor structure coefficients to be zero, if the factors are correlated!

The second model posited the measurement of two correlated factors. However, in this model one measured variable (i.e., variable “C”) was presumed to reflect the influence of both underlying latent constructs. Figure 2 presents the standardized coefficients for this model in a diagram such as many researchers employ to present their results. Table 3 presents both the standardized CFA pattern coefficients and the factor structure coefficients from this model. Now in case 2 the pattern and structure coefficients are not equal even for “freed” paths in the model for variable “C” (i.e., on Factor I .930 ≠ .836; on Factor II –.132 ≠ .529). And again even where the pattern coefficients were “fixed” to equal zero nonetheless structure coefficients were not zero.

POSSIBLE INTERPRETATION PROBLEMS ABSENT CONSIDERING STRUCTURE

In none of the published studies we examined were structure coefficients reported, with one unfortunate exception. Donders (1999) published a table reporting a CFA of the California Verbal Learning Test–Children’s Version (CVLT–C). This table showed the “standardized structural coefficients” (p. 401) of 13 measured variables and five factors. Though the factors were allowed to correlate, the author reported “structural coefficients” of 0 between measured variables and factors on which the variables were constrained to have coefficients of 0. Suffice it to say, the author erroneously referred to pattern coefficients as structure coefficients. The fact that the only CFA study claiming to report structure coefficients actually mislabeled pattern coefficients does not bode well for common practice as regards the use of structure coefficients in interpreting CFAs.


FIGURE 2

Standardized pattern coefficients for case 2.

Due to the different possible relationships between pattern and structure coefficients given the number of factors variables are posited to measure, the two cases have different associated problems. In case 1, when the variables each measure only one factor, only problems 1 and 2 may occur. In case 2, however, all problems can possibly occur. In all of the following examples, tables reproducing all or a relevant portion of the pattern coefficient tables from the various studies are presented. In each case, the structure coefficients we computed between variables and correlated factors are added in parentheses next to the pattern coefficients. In these tables we used the variable labels employed in the original publications.

Problem 1: Misplaced Interpretation of Dynamics of Variables

Deary, Peter, Austin, and Gibson (1998) reported a CFA involving 16 measured variables and four factors. These variables were presumed to measure multiple factors (case 2). A portion of the pattern coefficient table is reproduced in Table 4; the structure coefficients we computed are also presented in the table.

TABLE 3
Confirmatory Factor Analysis Coefficients for Case 2

                 Factor I             Factor II
Variable     Pattern     rs       Pattern     rs
A             .834      .834        0        .593
B             .722      .722        0        .514
C             .930      .836      –.132      .529
D              0        .622       .875      .875
E              0        .550       .774      .774
F              0        .575       .809      .809

Note. Pattern coefficients constrained and not estimated in the model are presented as “0.”
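
For variable C in Table 3, the arithmetic behind the gap between pattern and structure coefficients can be spelled out. With two correlated factors, each structure coefficient is the variable's pattern coefficient on that factor plus its pattern coefficient on the other factor weighted by the interfactor correlation. The sketch below is hedged accordingly: the interfactor correlation of 0.711 is an assumption back-computed from Table 3 rather than a value reported in the text, and the helper name is hypothetical.

```python
def two_factor_structure(p1, p2, r12):
    """Structure coefficients of one variable on two factors correlated r12.

    Each value is the pattern coefficient on that factor plus the part of the
    association that is carried through the other, correlated factor.
    """
    s1 = p1 + p2 * r12
    s2 = p2 + p1 * r12
    return s1, s2


# ASSUMPTION: 0.711 is back-computed from Table 3, not reported in the text.
R12_ASSUMED = 0.711

# Variable C is freed on both factors (pattern .930 and -.132 in Table 3).
s_c1, s_c2 = two_factor_structure(0.930, -0.132, R12_ASSUMED)
print(f"C on Factor I:  pattern  0.930, structure {s_c1:.3f}")
print(f"C on Factor II: pattern -0.132, structure {s_c2:.3f}")

# Variable A is freed on Factor I only, yet still correlates with Factor II.
s_a1, s_a2 = two_factor_structure(0.834, 0.0, R12_ASSUMED)
print(f"A on Factor II: pattern  0.000, structure {s_a2:.3f}")
```

Under the assumed correlation, variable C's small negative pattern coefficient on Factor II is outweighed by the share of its Factor I loading that passes through the interfactor correlation, which is why its structure coefficient on Factor II is positive.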

TABLE 4
Pattern and Structure Coefficients From Deary et al. (1998)

                                            Factor
Variable               I               II               III             IV
EPQ–R Neuroticism     .853 (.792)    –.191 (.130)      0 (–.040)       0 (–.504)
EPQ–R Extraversion     0 (–.041)       0 (.154)       .800 (.785)    –.171 (.071)
EPQ–R Psychoticism    .311 (.171)     .673 (.732)      0 (.122)      –.581 (.440)
EPQ–R Lie              0 (–.227)     –.571 (–.587)     0 (–.115)       0 (.000)

Note. The structure coefficients (rs) we computed are presented in parentheses.

Examining the pattern coefficients between the variable EPQ–R Neuroticism and the factors, it can be seen that this model freed the neuroticism variable to measure Factors I and II (as evidenced by non-zero pattern coefficients), but not Factors III and IV (as evidenced by pattern coefficients constrained or “fixed” to 0). However, the two bolded structure coefficients show that the neuroticism variable was more highly correlated (rS = –.504) with factor IV than with factor II, a factor that the variable ostensibly measured (rS = .130). The same is true for the extraversion variable for which rS = .154 on Factor II where the pattern coefficient was fixed to 0, but rS = .071 for Factor IV where the pattern coefficient was freed to be estimated. In fact, 6 of the 16 measured variables had correlations on factors with fixed zero pattern coefficients higher than the structure coefficients for factors on which the pattern coefficients were freed. If, as apparently was the case, these factors were interpreted with the unchecked presumption that on these factors these 6 variables had both pattern and structure coefficients that were zero, misinterpretation or incomplete interpretation occurred.
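
The comparison described above can be automated for any table of paired coefficients. The following is a minimal sketch using the four variables reproduced in Table 4. It treats a pattern coefficient of exactly 0 as a fixed path, which is an assumption of this illustration (a freed path could in principle be estimated at zero), and the function and variable names are hypothetical.

```python
FACTORS = ("I", "II", "III", "IV")

# (pattern, structure) pairs for Factors I-IV, transcribed from Table 4.
TABLE4 = {
    "EPQ-R Neuroticism": [(0.853, 0.792), (-0.191, 0.130), (0.0, -0.040), (0.0, -0.504)],
    "EPQ-R Extraversion": [(0.0, -0.041), (0.0, 0.154), (0.800, 0.785), (-0.171, 0.071)],
    "EPQ-R Psychoticism": [(0.311, 0.171), (0.673, 0.732), (0.0, 0.122), (-0.581, 0.440)],
    "EPQ-R Lie": [(0.0, -0.227), (-0.571, -0.587), (0.0, -0.115), (0.0, 0.000)],
}


def fixed_outranks_freed(coefs, factors=FACTORS):
    """Return (fixed_factor, freed_factor) pairs for which the absolute structure
    coefficient on a fixed-zero factor exceeds that on a freed factor.

    ASSUMPTION: a pattern coefficient of exactly 0 marks a fixed path.
    """
    fixed = [(f, abs(s)) for f, (p, s) in zip(factors, coefs) if p == 0.0]
    freed = [(f, abs(s)) for f, (p, s) in zip(factors, coefs) if p != 0.0]
    return [(ff, fr) for ff, s_fixed in fixed for fr, s_freed in freed if s_fixed > s_freed]


flags = {name: fixed_outranks_freed(coefs) for name, coefs in TABLE4.items()}

for name, pairs in flags.items():
    for fixed_factor, freed_factor in pairs:
        print(f"{name}: |rs| on fixed Factor {fixed_factor} exceeds |rs| on freed Factor {freed_factor}")
```

Applied to these four rows, the check flags the neuroticism variable (Factor IV over Factor II) and the extraversion variable (Factor II over Factor IV), matching the two cases discussed in the text.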


Problem 2: Misplaced Interpretations of Factors

King, Leskin, King, and Weathers (1998) reported a CFA involving 17 measured variables and four latent factors. These measured variables were presumed to measure only one factor each (case 1). Table 5 presents the pattern coefficients reported by the authors and the structure coefficients that we computed for this model. As reported in Table 5, the highest coefficient between Factor IV and a variable freed on that factor occurred for variable D4 (.585), which is bolded. However, a total of 3 measured variables with paths to Factor IV fixed to zero (B1, B4, B5) were actually more highly correlated with Factor IV than were the five variables ostensibly measuring that factor. In fact, all of the “B” variables were at least as highly correlated with Factor IV as were the variables with “D” prefixes.
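
From the perspective of a single factor, the same check asks which fixed-zero variables correlate more strongly with the factor than its best freed indicator does. The sketch below uses the Factor IV coefficients reported in Table 5; as an assumption of this illustration, a pattern coefficient of exactly 0 is taken to mark a fixed path, and the function name is hypothetical.

```python
# (pattern, structure) on Factor IV, transcribed from Table 5.
TABLE5_FACTOR_IV = {
    "B1": (0.0, 0.616),
    "B2": (0.0, 0.547),
    "B3": (0.0, 0.529),
    "B4": (0.0, 0.629),
    "B5": (0.0, 0.647),
    "D1": (0.55, 0.550),
    "D2": (0.53, 0.527),
    "D3": (0.57, 0.570),
    "D4": (0.59, 0.585),
    "D5": (0.56, 0.561),
}


def fixed_variables_above_best_freed(coefs):
    """Names of fixed-zero variables whose absolute structure coefficient exceeds
    the largest absolute structure coefficient among the freed variables.

    ASSUMPTION: a pattern coefficient of exactly 0 marks a fixed path.
    """
    best_freed = max(abs(s) for p, s in coefs.values() if p != 0.0)
    return sorted(name for name, (p, s) in coefs.items() if p == 0.0 and abs(s) > best_freed)


overlooked = fixed_variables_above_best_freed(TABLE5_FACTOR_IV)
print("Fixed-zero variables exceeding the best freed indicator:", overlooked)
```

A factor label derived only from the freed “D” variables would miss that the variables returned here are at least as strongly related to the factor.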

TABLE 5
Pattern and Structure Coefficients From King et al. (1998)

Variable     Factor IV
B1            0   (.616)
B2            0   (.547)
B3            0   (.529)
B4            0   (.629)
B5            0   (.647)
D1           .55  (.550)
D2           .53  (.527)
D3           .57  (.570)
D4           .59  (.585)
D5           .56  (.561)

Note. The structure coefficients we computed are presented in parentheses.

Problem 3: Failing to Interpret Suppressor Effects

Returning to the Deary et al. (1998) CFA of 16 measured variables and four factors, consider the information presented in Table 6. Again, these variables were declared to measure multiple factors (case 2). A portion of the pattern coefficient table is reproduced in Table 6 along with the structure coefficients that we computed. Of the seven measured variables contributing to Factor IV, the psychoticism variable (structure and pattern coefficients are bolded) had the second highest pattern coefficient. Upon examination of the structure coefficients, however, it can be seen that this variable has only the fifth highest structure coefficient. In short, the psychoticism variable was “getting more credit” than it directly contributed. The psychoticism variable was acting as a suppressor (Horst, 1966), reducing the amount of error variance in one or more of the other variables. Though the psychoticism variable appears to contribute a great deal directly, as reflected in its pattern coefficient, an examination of the structure coefficient reveals that a large portion of psychoticism’s contribution is also a result of indirect contributions via suppression (Henard, 1998; Lancaster, 1999; Stevens, 1996, pp. 106–107). Suppressor effects may go unnoticed unless both pattern and structure coefficients are interpreted, with the consequence that the interpretation of the nature of the suppressor’s contribution will be correspondingly incorrect.

TABLE 6
Pattern and Structure Coefficients From Deary et al. (1998)

Variable                          Factor IV
EPQ–R Extraversion               –.171 (.071)
EPQ–R Psychoticism               –.581 (.440)
SCID–II avoidant                  .225 (–.502)
SCID–II dependent                 .257 (–.461)
SCID–II obsessive–compulsive      .512 (–.519)
SCID–II histrionic                .172 (–.333)
SCID–II narcissistic              .643 (–.586)

Note. The structure coefficients we computed are presented in parentheses.
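
One simple way to screen for the pattern described above is to rank the variables on a factor twice, once by absolute pattern coefficient and once by absolute structure coefficient, and look for large drops in rank. The following sketch does this for the Factor IV coefficients in Table 6. The helper name is hypothetical, and a rank discrepancy is only a screening signal under this illustration, not a formal test for suppression.

```python
# (pattern, structure) on Factor IV, transcribed from Table 6.
TABLE6_FACTOR_IV = {
    "EPQ-R Extraversion": (-0.171, 0.071),
    "EPQ-R Psychoticism": (-0.581, 0.440),
    "SCID-II avoidant": (0.225, -0.502),
    "SCID-II dependent": (0.257, -0.461),
    "SCID-II obsessive-compulsive": (0.512, -0.519),
    "SCID-II histrionic": (0.172, -0.333),
    "SCID-II narcissistic": (0.643, -0.586),
}


def rank_by_magnitude(values):
    """Map each name to its rank by absolute value (1 = largest)."""
    ordered = sorted(values, key=lambda name: abs(values[name]), reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


pattern_rank = rank_by_magnitude({n: p for n, (p, s) in TABLE6_FACTOR_IV.items()})
structure_rank = rank_by_magnitude({n: s for n, (p, s) in TABLE6_FACTOR_IV.items()})

for name in TABLE6_FACTOR_IV:
    print(f"{name}: pattern rank {pattern_rank[name]}, structure rank {structure_rank[name]}")
```

For these data the psychoticism variable ranks second on the pattern coefficients but fifth on the structure coefficients, which is the discrepancy the text attributes to suppression.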

Problem 4: Failing to Interpret Shared Variance

Prieto, Santed, Cobo, and Alonso (1999) reported a CFA involving 6 measured variables and 2 factors. One measured variable, VDI HRQOL, was presumed to measure both factors (case 2). A portion of the pattern coefficient table is reproduced in Table 7 along with the structure coefficients that we computed. As can be seen in this table, the measured variable VDI HRQOL had the lowest pattern coefficient on the physical health factor compared with all the other measured variables presumed to measure that factor. That same measured variable, however, also had the largest correlation with the physical health factor. In this case, a large portion of the variance contributed by the measured variable VDI HRQOL is shared by other variables measuring the same factor. The variable VDI HRQOL might be interpreted as being the “least important” variable contributing to the physical health factor if only the pattern coefficients were examined. However, quite the opposite is true from the perspective of the structure coefficients: If any one variable had to be selected to represent the physical health factor, VDI HRQOL would be the single best estimate of that factor.


TABLE 7
Pattern and Structure Coefficients From Prieto et al. (1999)

Variable          Physical Health     Psychological Health
VDI Symptoms      –.76 (–.757)          0   (.362)
VDI HRQOL          .64 (.836)          .40  (–.706)
Balance Scale      .74 (.738)           0   (–.348)
PCS12              .66 (.659)           0   (–.315)
MCS12               0  (.412)          .86  (–.862)
GHQ12               0  (–.294)        –.62  (.615)

Note. The structure coefficients we computed are presented in parentheses.

DISCUSSION

The problems resulting from the failure to interpret CFA structure coefficients occur only when factors are allowed to correlate with one another. The higher the correlation between factors, the more different will be the pattern and the structure coefficients on a given factor.

Calculating structure coefficients in CFA is a simple matter. For example, in the AMOS software, one of the more popular statistical software packages used in CFA research, one only has to select “standardized estimates” and “all implied moments” as output. The resulting correlation matrix contains the structure coefficients, which are merely the correlations between the measured variables and the latent factors. The belief that, because the path between a measured variable and a factor is set to zero, the variable has no correlation with that factor, is simply erroneous (Bentler & Yuan, 2000).

Our purpose here is not to argue that some authors make mistakes or that the journal review process is fallible; both these conditions are taken as givens. Instead, we hope that these and similar examples provide helpful heuristics to remind researchers that measured variables are correlated with all factors when the factors are correlated, even for variables with CFA pattern parameters fixed to be zeroes. If misinterpretations are to be avoided, this reality must be explicitly recognized as part of the interpretation process with correlated factors.

REFERENCES

Bentler, P. M., & Yuan, K.-H. (2000). On adding a mean structure to a covariance structure model. Educational and Psychological Measurement, 60, 326–339.
Byrne, B. M. (1994). Structural equation modeling with EQS and EQS/Windows. Thousand Oaks, CA: Sage.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.


Courville, T., & Thompson, B. (2001). Use of structure coefficients in published multiple regression articles: β is not enough. Educational and Psychological Measurement, 61, 229–248.
Deary, I. J., Peter, A., Austin, E., & Gibson, G. (1998). Personality traits and personality disorders. British Journal of Psychology, 89, 647–661.
Donders, J. (1999). Structural equation analysis of the California Verbal Learning Test–Children’s version in the standardization sample. Developmental Neuropsychology, 15, 395–406.
Gorsuch, R. L. (1983). Factor analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Hayduk, L. A., & Glaser, D. N. (2000). Jiving the four-step, waltzing around factor analysis, and other serious fun. Structural Equation Modeling, 7, 1–35.
Henard, D. H. (1998, January). Suppressor variable effects: Toward understanding an elusive data dynamic. Paper presented at the annual meeting of the Southwest Educational Research Association, Houston. (ERIC Document Reproduction Service No. ED 416 215)
Hendrick, C., & Hendrick, S. (1986). A theory and method of love. Journal of Personality and Social Psychology, 50, 392–402.
Horst, P. (1966). Psychological measurement and prediction. Belmont, CA: Wadsworth.
Huberty, C. (1994). Applied discriminant analysis. New York: Wiley.
King, D. W., Leskin, G. A., King, L. A., & Weathers, F. W. (1998). Confirmatory factor analysis of the Clinician-Administered PTSD Scale: Evidence for the dimensionality of Posttraumatic Stress Disorder. Psychological Assessment, 10, 90–96.
Lancaster, B. P. (1999). Defining and interpreting suppressor effects: Advantages and limitations. In B. Thompson (Ed.), Advances in social science methodology (Vol. 5, pp. 139–148). Stamford, CT: JAI.
Levine, M. S. (1977). Canonical analysis and factor comparison. Beverly Hills, CA: Sage.
Meredith, W. (1964). Canonical correlations with fallible data. Psychometrika, 29, 55–65.
Moss, P. A. (1995). Themes and variations in validity theory. Educational Measurement: Issues and Practice, 14(2), 5–12.
Mulaik, S. A. (1987). A brief history of the philosophical foundations of exploratory factor analysis. Multivariate Behavioral Research, 22, 267–305.
Mulaik, S. A., James, L. R., van Alstine, J., Bennett, N., Lind, S., & Stilwell, C. D. (1989). Evaluation of goodness-of-fit indices for structural equation models. Psychological Bulletin, 105, 430–445.
Mulaik, S. A., & Millsap, R. E. (2000). Doing the four-step right. Structural Equation Modeling, 7, 36–73.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
Popper, K. R. (1962). Conjectures and refutations: The growth of scientific knowledge. New York: Harper & Row.
Prieto, L., Santed, R., Cobo, E., & Alonso, J. (1999). A new measure for assessing the health-related quality of life of patients with vertigo, dizziness or imbalance: The VDI questionnaire. Quality of Life Research, 8, 131–139.
Stevens, J. (1996). Applied multivariate statistics for the social sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Thompson, B. (1997). The importance of structure coefficients in structural equation modeling confirmatory factor analysis. Educational and Psychological Measurement, 57, 5–19.
Thompson, B. (2000). Ten commandments of structural equation modeling. In L. Grimm & P. Yarnold (Eds.), Reading and understanding more multivariate statistics (pp. 261–284). Washington, DC: American Psychological Association.
Thompson, B., & Borrello, G. M. (1985). The importance of structure coefficients in regression research. Educational and Psychological Measurement, 45, 203–209.
Thompson, B., & Daniel, L. G. (1996). Factor analytic evidence for the construct validity of scores: An historical overview and some guidelines. Educational and Psychological Measurement, 56, 213–224.
