Behavior Research Methods 2005, 37 (1), 48-58
Exploring item and higher order factor structure with the Schmid–Leiman solution: Syntax codes for SPSS and SAS

HANS-GEORG WOLFF and KATJA PREISING
University of Erlangen-Nürnberg, Nürnberg, Germany

To ease the interpretation of higher order factor analysis, the direct relationships between variables and higher order factors may be calculated by the Schmid–Leiman solution (SLS; Schmid & Leiman, 1957). This simple transformation of higher order factor analysis orthogonalizes first-order and higher order factors and thereby allows the interpretation of the relative impact of factor levels on variables. The Schmid–Leiman solution may also be used to facilitate theorizing and scale development. The rationale for the procedure is presented, supplemented by syntax codes for SPSS and SAS, since the transformation is not part of most statistical programs. Syntax codes may also be downloaded from www.psychonomic.org/archive/.
Whenever higher order factor analysis (FA) is conducted, the Schmid–Leiman solution (SLS; Schmid & Leiman, 1957) can be used to gain additional insights into the relationship between variables and factors. The SLS is a convenient tool to obtain the independent influence of first-order and higher order factors on a set of primary variables and will thus ease the interpretation of factors of differing levels. The SLS has typically been used to examine theoretical constructs in a multitude of research areas, including research on intelligence (Carroll, 1995; Colom, Contreras, Botella, & Santacreu, 2001; Luo, Petrill, & Thompson, 1994; Petrill, Luo, Thompson, & Detterman, 1996), personality (Chernyshenko, Stark, & Chan, 2001), as well as clinical psychology (Gullion & Rush, 1998; Steer, Clark, Beck, & Ranieri, 1995, 1999). For example, Steer et al. (1995, 1999) used the SLS to determine the relationship between first-order factors of anxiety and depression and a higher order factor of general distress. Colom et al. (2001) applied the SLS in a study on spatial abilities and showed that only a higher order factor of spatial ability possessed a meaningful interpretation, but that first-order factors were "psychologically meaningless" (p. 910). The SLS is a simple transformation of factor-loading matrices obtained from higher order FA and provides a different, informative glance at the data that supplements the view of higher order FA. Because most statistical packages do not provide routines for the calculation of the SLS, this article presents SPSS and SAS syntax codes for its computation.

The advantages of the SLS rely on two characteristics, the calculation of direct relations between higher order factors and primary variables, and the provision of information about the independent contribution of factors of different levels to variables. The SLS may thus guard against several pitfalls of higher order FA. First, difficulties in interpretation and labeling of factors may be reduced by calculating direct relations between variables and higher order factors (Gorsuch, 1983). This feature also supports the development of scales directly measuring either the broader content of higher order constructs or the specific content of lower order constructs (Loehlin, 1998). The second advantage lies in the separation of the total contribution of factors to variables into nonoverlapping elements. This provides information about the independent contribution of first-order and higher order factors to variables. It facilitates interpretation by clearly showing each factor's unique influence on variables. In addition, the content and importance of higher order constructs is easily identified, which will aid theory building in a field of research. Without the SLS, it may be difficult to determine whether higher order factors resemble an adequate generalization of the relationships between manifest variables or whether the loss of accuracy leads to relatively low contributions of higher order factors to variables.

Since the SLS is not part of most statistical programs, our main objective is to describe the rationale of the SLS and to present SPSS and SAS syntax code for its application to a two-level higher order FA (see Appendixes A and B, and see Archived Materials notice for three-level syntax codes). Because the availability of statistical routines is often a concern in data analysis (Johnson & Johnson, 1995), we thus hope to encourage the use of the SLS.
Our article also expands the collection of statistical
We greatly appreciate the comments of Klaus Moser, Johann Bacher, Nathalie Galais, Anja Göritz, Brian O'Connor, Jonathan Vaughan, and an anonymous reviewer on earlier drafts of this article. Correspondence concerning this article should be addressed to H.-G. Wolff, Lehrstuhl für Psychologie, insb. Wirtschafts- und Sozialpsychologie, Lange Gasse 20, 90403 Nürnberg, Germany (e-mail: [email protected]).
Note—This article was accepted by the previous editor, Jonathan Vaughan.
Copyright 2005 Psychonomic Society, Inc.
routines published in this journal (e.g., Bi, 2002; Chen, 2003; Hayes, 1998; O'Connor, 1999, 2000).

HIGHER ORDER FACTOR ANALYSIS AND THE SCHMID–LEIMAN SOLUTION

The following sections will provide a brief outline of higher order FA and introduce the SLS as a transformation of factor-loading matrices obtained from higher order FA. The two schematic diagrams in Figure 1 illustrate the differences between higher order FA and the SLS. In a subsequent section, the mathematical rationale of the SLS will be delineated. Finally, an example will be presented to illustrate the use of the SLS.

Higher order FA is conducted because first-order factors often represent constructs of narrow scope, whereas higher order factors yield constructs of higher generality (Cattell, 1978; Comrey, 1988; Gorsuch, 1983). For example, the well-known construct of extraversion is a second-order factor in the Sixteen Personality Factor Questionnaire (Conn & Rieke, 1994; Schneewind & Graf, 1998) that is composed of the first-order factors warmth, social boldness, self-reliance, privateness, and liveliness (see also Chernyshenko et al., 2001).

In practice, higher order FA is a factor analysis of factor correlations. When variables are submitted to FA and an oblique rotation procedure is chosen, a factor-correlation matrix results. This correlation matrix may be analyzed by FA again, yielding second-order factors. If second-order factors are rotated obliquely, the resulting correlation matrix may be used to extract third-order factors. This procedure may be continued until multiple uncorrelated factors or a single factor is obtained. Figure 1A shows an example of higher order FA, consisting of nine primary variables, three first-order factors, and a single second-order factor. Variables load on first-order factors, and the first-order factors load on the second-order factor. In the SLS, factor-loading matrices resulting from higher order FA are transformed to provide independent
loadings of variables on factors of all levels. Figure 1B shows the result of the SLS using the example of Figure 1A. The SLS yields direct loadings of first- and second-order factors on variables. In fact, Figure 1B is a bifactor solution (Holzinger & Swineford, 1937). The bifactor solution is a submodel of the Schmid–Leiman model with a single second-order factor (Schmid, 1957; see Yung, Thissen, & McLeod, 1999, for a proof). The terms hierarchical factors (Schmid & Leiman, 1957; Yung et al., 1999) and stratified uncorrelated determiners (SUD; Cattell, 1978) have been used for the SLS, as well.

In comparison with higher order FA, the SLS provides further insights into factor structure, which allows easier interpretation of factors, scale development, and theorizing. This is achieved by two characteristics of the SLS, the calculation of direct relations (i.e., factor loadings) between primary variables and higher order factors, and the independence of these factor loadings of different levels. The following paragraphs will discuss these two characteristics and the resulting advantages over higher order FA.

First, the SLS provides direct relations between primary variables and higher order factors, which facilitate factor interpretation. The direct relations are not immediately evident in higher order FA. Figure 1A shows that in higher order FA, the second-order factor (Factor I) is only indirectly related to the original variables. This carries the threat of misnaming factors, since interpretations of higher order factors are based on interpretations of lower order factors, which may be misleading (Gorsuch, 1983). When a variable is factorially complex—that is, it loads on several factors—problems of interpretation are aggravated. In this case, higher order FA does not yield total effects.
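The path-tracing arithmetic behind such total effects can be sketched in a few lines. The numbers below are hypothetical, chosen only to illustrate the calculation; they are not taken from the article's data:

```python
# Total effect of a second-order factor on a variable in higher order FA:
# sum, over the first-order factors, of (the variable's loading on that
# factor) times (that factor's loading on the second-order factor).
# All values below are hypothetical.
var1_on_first_order = {"F1": 0.5, "F2": 0.4}      # Variable 1's first-order loadings
first_order_on_factor_I = {"F1": 0.6, "F2": 0.7}  # loadings of F1, F2 on Factor I

total_effect = sum(var1_on_first_order[f] * first_order_on_factor_I[f]
                   for f in var1_on_first_order)
print(round(total_effect, 2))  # 0.5*0.6 + 0.4*0.7 = 0.58
```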
For example, if Variable 1 in Figure 1A has substantive loadings on F1 as well as on F2, calculating the total effect of second-order Factor I on Variable 1 must take both indirect paths, from Factor I to Variable 1 via F1 and F2, into account. The SLS provides total effects by calculating direct relations between variables and higher order factors. It is thus a reasonable approach to minimize obstacles and facilitate interpretation.

Figure 1. Exemplary higher order factor analysis (Panel A) and the resulting Schmid–Leiman solution (Panel B). Var, variable.

A second advantage of the SLS is that the direct loadings of variables on factors are transformed to provide the relative and independent impact of both first-order and higher order factors on primary variables (Gorsuch, 1983; Loehlin, 1998). In higher order FA, it remains an open question whether a given variable is related to the more specific content of a first-order factor or if it is more closely related to a higher order factor. Since factors of different levels are correlated in higher order FA, a simple calculation of the total effect of a higher order factor on a variable (e.g., by the Cattell–White formula; see Cattell, 1978) is not sufficient to obtain independent effects of factors of different levels. Similar to the prediction of a criterion by two correlated predictors, the factors of different levels possess overlapping contributions to variables. The SLS yields a solution with independent contributions of factors of different levels, as is illustrated in Figure 1B, where Variable 1 loads on Factor I as well as on F1, although Factor I and F1 are uncorrelated. Therefore, loadings represent independent influences on Variable 1, and their relative strength provides further information about a variable.

The separation of the overlapping contributions is established by calculating direct relations between variables and factors according to the following rationale: Factor loadings are recalculated to maximize the contribution (i.e., loadings) of higher order factors, which assigns contributions of factors to variables of the highest and most general level possible. Lower order factor loadings are transformed to residual loadings that cannot be explained by higher order factors.
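In matrix terms, this residualization takes only two products. The following NumPy sketch is ours (the article itself provides SPSS and SAS syntax in the appendixes); it uses the first- and second-order loadings the article reports later in Tables 2 and 3, and reproduces the Schmid–Leiman loadings of Table 4 up to rounding:

```python
import numpy as np

# First-order pattern matrix (9 items x 3 factors), taken from Table 2.
F1 = np.array([
    [ .099,  .565, -.152],
    [ .012,  .942, -.154],
    [-.150,  .618,  .413],
    [ .744, -.088,  .142],
    [ .624,  .279, -.114],
    [ .869, -.033,  .029],
    [-.015, -.271,  .626],
    [ .091,  .099,  .722],
    [ .150,  .083,  .398],
])
# Loadings of the three first-order factors on the single second-order
# factor G1, taken from Table 3.
F2 = np.array([[.616], [.732], [.695]])

# Direct loadings of G1 on the items: the product of the loading matrices.
F2_sls = F1 @ F2
# Uniqueness of each first-order factor: the square root of the variance
# G1 leaves unexplained (with one second-order factor, R2 is an identity).
U2 = np.diag(np.sqrt(1.0 - (F2 ** 2).ravel()))
# Residualized first-order loadings; each column of F1 is shrunk by a
# factor below one, so no loading can exceed its original value.
F1_sls = F1 @ U2

print(np.round(F2_sls[0, 0], 3))  # Item 1 on G1: ~0.369
print(np.round(F1_sls[0], 3))     # Item 1 residual loadings: ~[.078, .385, -.109]
```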
In the SLS, lower order factor loadings are essentially part correlations between lower order factors and variables, where the influence of higher order factors has been partialed out.

It is of practical and theoretical interest to distinguish the contribution of different factor levels to a variable. Practical considerations are relevant in the construction of specific measurement scales for a certain factor. For example, the construction of a scale measuring F1 in Figure 1 should employ items that are highly and specifically related to this first-order factor but should not include items that are strongly related to second-order Factor I. Conversely, if a direct measurement of higher order Factor I is intended, items should have high loadings on Factor I and minimize the impact of first-order factors F1 to F3. From Figure 1B, it is evident that the SLS provides the necessary information. For example, consider a study by Steer et al. (1999), where depression and anxiety were identified as first-order factors along with a second-order factor of general distress. To obtain a specific measure of depression, items with high and specific loadings on the depression factor should be selected. Conversely, to obtain a measure of general distress, items with high loadings on this second-order factor should be used.

From a theoretical perspective, the contribution of factors to variables yields insights into the meaning of a
factor. The separation into independent contributions provides a clearer picture of the specific elements of factors. If a higher order factor is highly related to a variable, this variable should be used to establish the meaning of this higher order factor. Conversely, if a first-order factor has higher explanatory power than higher order factors, this indicates a specific element of a variable that is only reflected in the lower order factor. Using the example by Steer et al. (1999) again, the SLS yields the relative contribution of depression, anxiety, and general distress factors to a given item, respectively. If items that load on the first-order anxiety factor are more closely related to the higher order factor of general distress than to the first-order factor of anxiety, anxiety may not be the predominant element of this first-order factor. In this (hypothetical) case, the interpretation of the first-order factor is misleading and subject to renewed attempts of labeling this factor. In a similar vein, the SLS can also be used to determine the independent total impact of factors—that is, the variance explained by each factor. In higher order FA, the explanatory power of first-order factors is connected to the intercorrelations of primary variables. The explanatory power of higher level factors refers to the correlation between factors of the adjacent lower level. Thus, first-order factors explain x% of the correlation between variables, and second-order factors explain y% of the correlations between first-order factors.1 In the SLS, the variance explained by different levels is partitioned into nonoverlapping contributions. Thus, each factor in the SLS explains z% of the correlation between variables, regardless of factor level. As mentioned above, the variance explained is attributed to the most general or highest level possible. At the highest level, all the variance that can be explained by the factors on this level is attributed to this level. 
Of the remaining variance, the maximum amount possible will be attributed to the second highest level, and so on. The relative contribution of different levels is of theoretical relevance, since it indicates the tradeoff between accuracy and generality at different levels of analysis. If higher order factors explain a high percentage of variance extracted, lower order factors may be of little interest. In this case, the increase in generality is achieved at little expense in terms of accuracy. Gorsuch (1983) suggests that higher order factors of the SLS are of "definite interest" (p. 253) if they account for 40%–50% of the extracted variance. Conversely, if higher order factors have little explanatory power, lower order factors may be of higher importance. If we consider the Steer et al. (1999) example again, the authors report that general distress accounts for approximately 55% of the variance explained, and the first-order factors, depression and anxiety, account for 24% and 21% of the variance explained, respectively. The authors conclude that depression and anxiety are distinct, but related constructs, where "Unique cognitive-motivational depression and physiological-hyperarousal anxiety factors can also be identified" (Steer et al., 1999, p. 189). Research on intelligence provides another example for the tradeoff between generality and accuracy,
part of the long-lived debate on a single, broad factor of general intelligence versus several factors of specific primary abilities (cf. Carroll, 1995; Petrill et al., 1996). The SLS can be used to determine the explanatory power of these levels.

Two implications with regard to the independence of factor levels in the SLS must be highlighted. First, transforming lower level factor loadings to part correlations implies that loadings on lower order factors cannot exceed their original loadings obtained from higher order FA. The loading depicted by the path from F1 to Variable 1 in Figure 1A would typically be higher than its analog in Figure 1B. However, since higher order factors have been partialed out, loadings from the SLS depict the specific element of a lower order factor, which may give a clearer picture of its meaning. A second implication is connected to the independence of contributions of factors of different levels, which has been termed the orthogonality of factors in the literature (Chernyshenko et al., 2001; Gorsuch, 1983). The term orthogonality describes the independence of factors of differing levels (e.g., F1 and Factor I in Figure 1B). As mentioned above, lower order factor loadings are reduced to part correlations to obtain independent or orthogonal contributions of factors of different levels. However, it is important to note that factors within a single level (e.g., F1 and F2 in Figure 1B) can be, but are not necessarily, correlated. They will be uncorrelated if higher order factors account for all intercorrelations between factors of lower order (Loehlin, 1998). Note that, in contrast to the SLS, factors of differing levels are related in higher order FA. Lower order factors are assumed to capture a specific element of a higher order factor and are therefore related to higher order factors. For example, the path from Factor I to F1 in Figure 1A indicates a relationship between these two factors.
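The variance-explained bookkeeping described above is a small computation on the SLS loading matrix: sum the squared loadings of each factor and standardize by the total. As an illustration (our NumPy sketch, not the article's syntax), applying it to the Schmid–Leiman loadings the article reports later in Table 4 recovers the percentages quoted there:

```python
import numpy as np

# Schmid-Leiman loading matrix from Table 4 (columns: G1, F1, F2, F3).
sls = np.array([
    [ .368,  .078,  .385, -.109],
    [ .590,  .010,  .642, -.110],
    [ .653, -.118,  .421,  .303],
    [ .493,  .586, -.060,  .102],
    [ .510,  .492,  .190, -.082],
    [ .531,  .685, -.023,  .021],
    [ .228, -.012, -.184,  .450],
    [ .518, -.072,  .068,  .519],
    [ .430,  .118,  .057,  .286],
])
# Sum of squared loadings per factor, standardized by the total,
# gives each factor's share of the extracted variance.
ssl = (sls ** 2).sum(axis=0)
pct = 100.0 * ssl / ssl.sum()
print(np.round(pct, 1))  # ~[45.8, 22.8, 17.1, 14.3] for G1, F1, F2, F3
```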
To summarize, the SLS augments information obtained by higher order FA by providing direct relations between higher order factors and primary variables. Furthermore, factor loadings and variance explained are transformed to represent the independent contribution of factor levels. These features provide further insights into the factor structure and support the interpretation of factors, judgments on the contribution of factors to variables, as well as evaluations of the theoretical relevance of factor levels. The following section will present the mathematical rationale of the SLS.

Calculating the Schmid–Leiman Solution

Formulas of higher order analysis and the SLS are well documented in textbooks on factor analysis (e.g., Gorsuch, 1983; Loehlin, 1998; see also Yung et al., 1999). Even though the mathematical rationale is not necessary to run the programs presented in Appendixes A and B, it will be briefly reiterated here in order to give a thorough presentation of the SLS and to provide a basis for implementation in other statistical programs. Since the SLS is based on higher order FA, the calculation of higher
order FA is necessary in a first step. Results from higher order FA will then be used to calculate the SLS in a second step. A higher order FA with two factor levels will be used to illustrate calculations.

Higher order FA requires an obliquely rotated first-order FA of a correlation matrix Rv of primary variables. This correlation matrix is of the order v × v, where v is the number of variables. In first-order FA, this correlation matrix is reproduced by a first-order factor pattern matrix F1 (i.e., rotated loadings), a first-order factor-correlation matrix termed R1, and unique variances U1² of variables:

Rv = F1 R1 F1′ + U1².   (1)
Matrix F1 is of the order v × f1, where f1 is the number of first-order factors. R1 is of the order f1 × f1. U1² is a matrix of the order v × v and consists of measurement error and specific components of variables. Second-order factors can be extracted from the factor correlation matrix R1, yielding a matrix of second-order factor loadings F2, a second-order factor-correlation matrix R2, and the unique variance of this level, U2²:

R1 = F2 R2 F2′ + U2².   (2)
F2 is of the order f1 × f2, where f2 is the number of second-order factors. R2 is of the order f2 × f2. U2² is a matrix of the order f1 × f1. These two analyses comprise a minimal higher order FA consisting of two levels. F1 from Equation 1 provides first-order factor loadings (e.g., paths from F1 to variables in Figure 1A), and F2 yields second-order factor loadings (e.g., the path from Factor I to F1 in Figure 1A). If second-order factors are uncorrelated, matrix R2 is an identity matrix and can be ignored. If second-order factors are correlated, analyses may continue to extract third-order factors.

To obtain the SLS, two further calculations are necessary. In a first step, direct effects of higher order factors on variables are determined by multiplying all factor-loading matrices from first to highest order. In the present case, the direct loadings of second-order factors on variables, termed F2SLS, are obtained by multiplying F1 by F2:

F2SLS = F1 F2,   (3)

where F2SLS is of the order v × f2. To obtain residualized first-order factor loadings F1SLS, factor loadings F1 are multiplied by the uniqueness U2 of higher order factors:

F1SLS = F1 U2,   (4)

where F1SLS is of the order v × f1. The uniqueness U2 is equivalent to the square root of the variances of R1 not explained by higher order factors. It is obtained by subtracting the explained variance of a factor from 1:
U2 = [I − diag(F2 R2 F2′)]^0.5,   (5)
where I is an identity matrix and diag indicates that only the diagonal elements from the second-order factor solution are used. Multiplying the first-order factor-loading
Table 1
Correlation Matrix of 9 Sample Items

1. Due to the flood of information, problems occur when I use the new media.
2. In spite of several attempts to deal with the flood of information, I have not managed to use the new media in a way to facilitate my work.
3. I have no idea what to do to get rid of the problems caused by the new media.
4. Even at home, I have to think about problems caused by the flood of information.
5. When others want to talk to me about the new media, I sometimes react grumpily.
6. Even during my holidays, I sometimes think about problems caused by the new media.
7. When I leave my desk for some time, I will find a flood of e-mails when I return.
8. Problems caused by the flood of information affect my entire workday.
9. Due to the flood of information, I have trouble relaxing after work.

Item      1      2      3      4      5      6      7      8
1      1.000
2       .483  1.000
3       .340   .624  1.000
4       .180   .260   .240  1.000
5       .277   .433   .376   .534  1.000
6       .257   .301   .244   .654   .609  1.000
7      -.074  -.028   .233   .165   .041   .133  1.000
8       .212   .362   .577   .411   .300   .399   .346  1.000
9       .226   .236   .352   .306   .239   .320   .206   .457
Note—N = 193 using listwise deletion. Items taken from Moser et al. (2002).
matrix F1 by the unique variance of the higher order factors U2 transforms first-order factor loadings to part correlations: The impact of higher order factors has been partialed out, because lower order factors (F1SLS) only account for the unique variance of higher order factors. They are thus independent of—or orthogonal to—factors of higher order (F2SLS). Furthermore, the loadings of variables on F1SLS will be less than or equal to the original loadings of the respective level, because the elements of U2 are less than one.

To determine the impact of a factor, the sum of its squared loadings may be calculated from the SLS. To obtain the contribution of all factors of a particular level, these may be summed over all factors of this level. Standardization of these indexes by the total sum of variance explained yields the percentage of variance explained by a factor or factor level, respectively (Gorsuch, 1983).

An Example

This section will present an example of the SLS, using a sample of N = 193 employees who answered a questionnaire on practices and problems with the new media from a study by Moser, Preising, Göritz, and Paul (2002; see study for further information on sample and results). Of the 34 items measuring individual reactions to the new media, only 9 will be used here for reasons of brevity. The syntax in Appendixes A and B contains the data of this example, as well. Item intercorrelations and their approximate translations are depicted in Table 1; correlations are mostly positive, ranging from −.07 to .65.

First-order analysis was carried out by means of principal axis analysis. A parallel analysis (O'Connor, 2000, n.d.) indicated three factors that were rotated by the promax procedure (κ = 4) to an oblique solution. Table 2 shows the factor pattern matrix and factor intercorrelations. The three factors correspond to factors termed psychological stress caused by the new media (F1), problems with the flood of information (F2), and impairment
of work due to the new media (F3) identified by Moser et al. (2002). All items have substantive loadings (>.30) on at least one factor. Simple structure cannot be attained for some items; for example, Item 3 has high loadings on both Factors 2 and 3. Factor intercorrelations are all above .40, indicating that the factors are related. This is confirmed by a second-order principal axis FA, where a single factor (labeled G1 in Table 3) is obtained according to parallel analysis. All three factors have loadings >.60 on this second-order factor.

Table 2
First-Order Factor Analysis (Pattern Matrix) of Correlation Matrix Shown in Table 1

Item          F1      F2      F3
1           .099    .565   -.152
2           .012    .942   -.154
3          -.150    .618    .413
4           .744   -.088    .142
5           .624    .279   -.114
6           .869   -.033    .029
7          -.015   -.271    .626
8           .091    .099    .722
9           .150    .083    .398
Eigenvalue  3.64    1.32    1.24

Factor correlations
F1         1.000
F2          .451   1.000
F3          .427    .509   1.000
Note—Principal axis factor analysis with promax rotation is used.

Table 3
Results of Higher Order Analysis of Factor Correlation Matrix Shown in Table 2

Factor        G1
F1          .616
F2          .732
F3          .695
Eigenvalue  1.93
Note—Principal axis factor analysis with promax rotation is used. G1 = second-order factor; F1 to F3 refer to first-order factors.

Neither the direct relation between variables and the second-order factor nor the relative impact of factor levels is apparent from the results presented in Tables 2 and 3. The SLS yields the pattern depicted in Table 4. Since there is only one second-order factor, the present example is equivalent to a bifactor solution (Holzinger & Swineford, 1937). Loadings of variables on the higher order factor termed G1 and first-order factors (F1 to F3) are depicted, as well as the relative variance explained by the four factors.

Table 4
Results of the Schmid–Leiman Solution

            Second-Order Factor    First-Order Factor
Item                G1              F1      F2      F3
1                 .368            .078    .385   -.109
2                 .590            .010    .642   -.110
3                 .653           -.118    .421    .303
4                 .493            .586   -.060    .102
5                 .510            .492    .190   -.082
6                 .531            .685   -.023    .021
7                 .228           -.012   -.184    .450
8                 .518           -.072    .068    .519
9                 .430            .118    .057    .286
% Variance
explained         45.8            22.8    17.1    14.3

Factor G1 accounts for 45.8% of the variance explained (see note 1). According to Gorsuch (1983), this factor represents an appropriate generalization of the relation between variables. However, in this example, specific elements of first-order factors account for 14%–23% of the variance. Depending on research focus, this would also justify examination of these more accurate measures (cf. the example by Steer et al., 1999, mentioned above).

Regarding the factor loadings of the SLS in Table 4, four things should be pointed out. First, as mentioned above, first-order factor loadings of the SLS are less than the original loadings (see Table 2), because first-order loadings are reduced to part correlations. Second, Table 4 shows that Item 9 does not have a first-order factor loading above .30 in the SLS, although its original loading on F3 in first-order FA is above this criterion (.398, see Table 2). This item seems to reflect G1 to an extent that it should not be considered a "good" measure of its first-order factor, F3. Therefore, interpretations of F3 should rely on Items 7 and 8. A third aspect is that some items have higher loadings on G1 than on first-order factors (Items 3, 5, and 9), because they deviate from simple structure. Apart from a high loading on one factor, these items also possess (mostly small) loadings on additional factors, which increases the total effect of the higher order factor on these items. These items should be considered if it is intended to measure the higher order construct directly with a shortened scale. Fourth, several items (e.g., Item 6) are better measures of their first-order factor than of G1. These items reflect purer
measures of first-order factors and should be included in the measurement of a first-order factor.

CONCLUSION

Whenever higher order FA is calculated, the SLS is a useful procedure to gain additional insights into the factor structure. Since the syntax codes provide a quick and efficient means to calculate the SLS from higher order FA, we suggest reaping the benefits of this procedure whenever possible. The SLS gives further insights into the structural relations between factors as well as relations between items and higher order factors. Inspection of direct loadings of primary variables on higher order factors and the independent contributions of factor levels facilitates interpretation of factors and underlying theoretical constructs by providing an additional perspective on the data. Results of the SLS should also be helpful in the revision of scales. Regardless of factor level, if an accurate or fine-grained measurement of a construct is intended, items should be retained or discarded according to the results of the SLS. Because participants' time restrictions may favor the use of short scales, the SLS may be used to develop a short scale directly measuring the second-order construct. It should thus be possible to obtain a reliable scale by choosing items that are closely related to the second-order factor.

REFERENCES

Bi, J. (2002). Variance of d′ for the same–different method. Behavior Research Methods, Instruments, & Computers, 34, 37-45.
Carroll, J. B. (1995). On methodology in the study of cognitive abilities. Multivariate Behavioral Research, 30, 429-452.
Cattell, R. B. (1978). The scientific use of factor analysis. New York: Plenum.
Chen, R. (2003). An SAS/IML procedure for maximum likelihood factor analysis. Behavior Research Methods, Instruments, & Computers, 35, 310-317.
Chernyshenko, O. S., Stark, S., & Chan, K. Y. (2001).
Investigating the hierarchical factor structure of the fifth edition of the 16PF: An application of the Schmid–Leiman orthogonalization procedure. Educational & Psychological Measurement, 61, 290-302.
Colom, R., Contreras, M. J., Botella, J., & Santacreu, J. (2001). Vehicles of spatial ability. Personality & Individual Differences, 32, 903-912.
Comrey, A. L. (1988). Factor-analytic methods of scale development in personality and clinical psychology. Journal of Consulting & Clinical Psychology, 56, 754-761.
Conn, S. R., & Rieke, M. L. (1994). The 16PF fifth edition technical manual. Champaign, IL: Institute for Personality & Ability Testing.
Gorsuch, R. L. (1983). Factor analysis (2nd ed.). Hillsdale, NJ: Erlbaum.
Gullion, C. M., & Rush, A. J. (1998). Toward a generalizable model of symptoms in major depressive disorder. Biological Psychiatry, 44, 959-972.
Hayes, A. F. (1998). SPSS procedures for approximate randomization tests. Behavior Research Methods, Instruments, & Computers, 30, 536-543.
Holzinger, K. J., & Swineford, F. (1937). The bi-factor method. Psychometrika, 2, 41-54.
Johnson, W. L., & Johnson, A. M. (1995). Using SAS/PC for higher order factoring. Educational & Psychological Measurement, 55, 429-434.
Loehlin, J. C. (1998). Latent variable models: An introduction to factor, path, and structural analysis (3rd ed.). Mahwah, NJ: Erlbaum.
Luo, D., Petrill, S. A., & Thompson, L. A. (1994). An exploration of genetic g: Hierarchical factor analysis of cognitive data from the Western Reserve Twin Project. Intelligence, 18, 335-347.
Moser, K., Preising, K., Göritz, A. S., & Paul, K. (2002). Steigende Informationsflut am Arbeitsplatz: Belastungsgünstiger Umgang mit elektronischen Medien (E-Mail, Internet) [Increasing information load in the workplace: Strain-balanced coping with the electronic media (e-mail, Internet)]. Bremerhaven: Wirtschaftsverlag NW.
O’Connor, B. P. (1999). Simple and flexible SAS and SPSS programs for analyzing lag-sequential categorical data. Behavior Research Methods, Instruments, & Computers, 31, 718-726.
O’Connor, B. P. (2000). SPSS and SAS programs for determining the number of components using parallel analysis and Velicer’s MAP test. Behavior Research Methods, Instruments, & Computers, 32, 396-402.
O’Connor, B. P. (n.d.). SPSS, SAS, and MATLAB programs for determining the number of components and factors using parallel analysis and Velicer’s MAP test. Retrieved December 12, 2003, from http://flash.lakeheadu.ca/~boconno2/nfactors.html
Petrill, S. A., Luo, D., Thompson, L. A., & Detterman, D. K. (1996). The independent prediction of general intelligence by elementary cognitive tasks: Genetic and environmental influences. Behavior Genetics, 26, 135-147.
Schmid, J. (1957). The comparability of the bi-factor and second-order factor patterns. Journal of Experimental Education, 25, 249-253.
Schmid, J., & Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22, 53-61.
Schneewind, K. A., & Graf, J. (1998). Der 16-Persönlichkeits-Faktoren-Test. Revidierte Fassung (16 PF-R) [The 16-Personality-Factor Test. Rev. ed.]. Bern: Huber.
SPSS Inc. (2002). SPSS 11.5 syntax reference guide [Computer version, available with SPSS 11 software]. Chicago: Author.
Steer, R. A., Clark, D. A., Beck, A. T., & Ranieri, W. F. (1995). Common and specific dimensions of self-reported anxiety and depression: A replication. Journal of Abnormal Psychology, 104, 542-545.
Steer, R. A., Clark, D. A., Beck, A. T., & Ranieri, W. F. (1999). Common and specific dimensions of self-reported anxiety and depression: The BDI-II versus the BDI-IA. Behaviour Research & Therapy, 37, 183-190.
Yung, Y.-F., Thissen, D., & McLeod, L. D. (1999). On the relationship between the higher-order factor model and the hierarchical factor model. Psychometrika, 64, 113-128.

NOTE

1. Higher order FA does not result in additional gains in terms of variance explained in primary variables. First-order FA yields the variance explained in primary variables; higher order FA gives the proportion of this variance that may be explained by further generalization.

ARCHIVED MATERIALS

The following materials may be accessed through the Psychonomic Society’s Norms, Stimuli, and Data archive, http://www.psychonomic.org/archive/. To access these files, search the archive for this article using the journal (Behavior Research Methods), the first author’s name (Wolff), and the publication year (2005).

FILE: Wolff-BRM-2005.zip
DESCRIPTION: The compressed archive file contains 10 files:
2_level_sls.sps, containing SPSS syntax code for a two-level SLS
2_level_sls.spo, containing the SPSS output file of a two-level SLS
3_level_sls.sps, containing SPSS syntax code for a three-level SLS
3_level_sls.spo, containing the SPSS output file of a three-level SLS
spss_readme.txt, an ASCII file documenting the SPSS files
2_level_sls.sas, containing SAS syntax code for a two-level SLS
2_level_sls.lst, containing the SAS output file of a two-level SLS
3_level_sls.sas, containing SAS syntax code for a three-level SLS
3_level_sls.lst, containing the SAS output file of a three-level SLS
sas_readme.txt, an ASCII file documenting the SAS files

AUTHOR’S E-MAIL ADDRESS: [email protected]
AUTHOR’S WEB SITE: http://www.wiso-psychologie.uni-erlangen.de/mitarbeiter/wolff.php
APPENDIX A
Syntax Code for Schmid–Leiman Solution with SPSS 11

The syntax code uses the matrix command of SPSS (2002, p. 714) and has been run successfully with SPSS 11. Nine items from a study by Moser et al. (2002) are used in the code presented in Table A1. To adapt the program code to other problems, the user has to change five parts, all typed in bold: Two factor-loading matrices, as well as variable and factor labels, have to be replaced in Lines 5–22. To obtain first-order and second-order factor matrices to be entered into this syntax, first-order and higher order factor analyses have to be calculated using ordinary SPSS commands. The first-order factor pattern matrix from an oblique solution (e.g., promax with κ = 4 has been used here) is entered in Lines 5–13 and termed F1. Factor loadings of a single variable are separated by a comma and end with a semicolon, which indicates the beginning of a new matrix row. In the example, three factors and nine variables are used. In Lines 15 and 16, variable labels can be entered. These labels need to be within quotation marks and separated by a semicolon. Labels of first-order factors can be adapted in Line 18. Line 20 (matrix termed F2) contains the second-order factor loadings, in this case a single factor with loadings of three first-order factors. In the case of more than one second-order factor, loadings of each first-order factor must be separated by a comma and end with a semicolon, indicating a new matrix row. Labels for second-order factors are entered in Line 22. It should be noted that no data need to be loaded into the data window of SPSS. All necessary data are entered directly with the syntax code. Lines 26–31 contain the calculations for the Schmid–Leiman solution. Lines 32–44 compute several sums of squared loadings resulting in different indexes of variance explained in the output. Output computation starts at Line 45.
Here, several compound matrices are defined for the sake of readability of the output. Some lines have been separated due to restricted column space and should be put on a single line (e.g., Lines 50–52).

Table A1
SPSS Input Syntax for Schmid–Leiman Solution

 1  * Schmid-Leiman Solution for 2 level higher order Factor analysis.
 2  Matrix.
 3
 4  * Enter first-order pattern matrix.
 5  compute F1 = {0.099, 0.5647, -0.1521;
 6    0.0124, 0.9419, -0.1535;
 7    -0.1501, 0.6177, 0.4128;
 8    0.7441, -0.0882, 0.1425;
 9    0.6241, 0.2793, -0.1137;
10    0.8693, -0.0331, 0.0289;
11    -0.0154, -0.2706, 0.6262;
12    -0.0914, 0.0995, 0.7216;
13    0.1502, 0.0835, 0.398}.
14  * enter first-order variable names.
15  compute varname = {"v1"; "v2"; "v3"; "v4"; "v5";
16    "v6"; "v7"; "v8"; "v9"}.
17  * enter first-order factor names.
18  compute f1name = {"factor1", "factor2", "factor3"}.
19  * enter second-order factor loadings.
20  compute F2 = {0.6158; 0.7316; 0.6949}.
21  * enter second-order factor names.
22  compute f2name = {"General1"}.
23  print F1 /format "f5.3" /rnames = varname /cnames = f1name.
24  compute C1 = ncol(F1).
25  print F2 /format "f5.3" /rnames = f1name /cnames = f2name.
26  compute C2 = ncol(F2).
27  compute zw1 = 1-rssq(F2).
28  compute Unique = mdiag(zw1).
29  compute zw1 = sqrt(Unique).
30  compute B = {F2, zw1}.
31  compute SLP = F1*B.
32  compute hrtot = rssq(SLP).
33  compute C1end = C1 + C2.
34  compute C1start = C2 + 1.
35  compute zw2 = SLP(:,C1start:C1end).
36  compute HR1st = rssq(zw2).
37  compute zw3 = SLP(:,1:C2).
38  compute HR2nd = rssq(zw3).
39  compute HCtot = cssq(SLP).
40  compute Htot = mssq(SLP).
41  compute Htot100 = HCtot &/ Htot.
42  compute Htotsum = msum(HCtot) / Htot.
43  compute zw4 = Htot100(1:C2).
44  compute zw5 = Htot100(C1start:C1end).
45  compute EXG = rsum(zw4).
46  compute EXF = rsum(zw5).
47  compute results1 = {SLP, hrtot, HR2nd, HR1st}.
48  compute slpname = {f2name, f1name,
49    "h2 total", "h2 G", "h2 1st"}.
50  print results1 /format "f5.3"
51    /title = "factor loadings of Schmid-Leiman Solution and h2"
52    /rnames = varname /cnames = slpname.
53  compute results2 = {HCtot, Htot; Htot100, Htotsum}.
54  compute fixedn2 = {f2name, f1name, "total"}.
55  print results2 /format "f5.3"
56    /title = "sum of squared loadings"
57    /rlabels = "h2" "%" /cnames = fixedn2.
58  print EXG /format "f5.3"
59    /title = "percentage of extracted variance explained by general factors (%)".
60  print EXF /format "f5.3"
61    /title = "percentage of extracted variance explained by first order factors (%)".
62  End Matrix.

Note—Underlined type indicates parts of syntax code that have to be adapted to the user problem. These consist of the first-order factor pattern matrix (Lines 5–13), variable labels (Lines 15 and 16), labels of first-order factors (Line 18), a second-order factor matrix (Line 20), and second-order factor labels (Line 22). v1 to v9 = primary variables, factor1 to factor3 = first-order factors, General1 = second-order factor. Commands end with a period. Some lines have been separated due to restriction of column space and should be put on a single line (e.g., Lines 48–49 are a single command).

The output, as shown in Table A2, displays the two input matrices first, to allow a check for possible typing errors. The factor loadings from the Schmid–Leiman solution follow in the next section of the output. The first g columns contain loadings of the g higher order factors, in this case a single second-order factor. The next f columns contain loadings of the f orthogonalized first-order factors. Three indexes of communalities h2 follow in the last three columns. The first column contains the total amount of variance explained for a variable (termed h2 total). h2 total must be identical to the extracted communalities of first-order FA except for rounding errors and can thus be used to check the correctness of the results. The next column indicates the amount of variance explained in each variable by the higher order factors (termed h2 G). The last column details the amount of variance explained in each variable by the orthogonalized first-order factors (termed h2 1st). Since this is a split of the total communalities, the last two columns must add up to h2 total.

The next section of the output contains indexes of variance explained for each factor. The first row yields the sum of squared loadings for each column (that is, the variance explained by each factor), as well as the total extracted variance in the last column. Row 2 of this section standardizes the extracted variance explained and yields the distribution of extracted variance explained in terms of percentages, adding up to 1 in the column termed total (deviations should be due to rounding errors). The last two sections contain the sums of these percentages for higher order factors and orthogonalized first-order factors, respectively. They can be used to evaluate the contribution of these two factor groups to the variance explained.
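The identity between h2 total and the first-order communalities can be sketched algebraically: the transformation matrix B = [F2 | sqrt(U)] satisfies BB′ = Φ, the factor correlation matrix implied by the second-order model, so squaring and summing the transformed loadings leaves each variable's communality unchanged. The following NumPy illustration (not part of the published syntax; variable names are ours) verifies this with the Table A1 matrices:

```python
import numpy as np

# First-order pattern matrix F1 (9 variables x 3 factors) and
# second-order loadings F2, taken from Table A1.
F1 = np.array([
    [ 0.099,   0.5647, -0.1521],
    [ 0.0124,  0.9419, -0.1535],
    [-0.1501,  0.6177,  0.4128],
    [ 0.7441, -0.0882,  0.1425],
    [ 0.6241,  0.2793, -0.1137],
    [ 0.8693, -0.0331,  0.0289],
    [-0.0154, -0.2706,  0.6262],
    [-0.0914,  0.0995,  0.7216],
    [ 0.1502,  0.0835,  0.398 ],
])
F2 = np.array([[0.6158], [0.7316], [0.6949]])

# Uniqueness of each first-order factor under the second-order model.
U = np.diag(1.0 - (F2 ** 2).sum(axis=1))
# Schmid-Leiman transformation matrix B = [F2 | sqrt(U)].
B = np.hstack([F2, np.sqrt(U)])

# B @ B.T equals the implied factor correlation matrix Phi, so the
# SLS preserves each variable's communality.
Phi = F2 @ F2.T + U
h2_sls = ((F1 @ B) ** 2).sum(axis=1)
h2_first_order = np.diag(F1 @ Phi @ F1.T)
```

Both vectors agree exactly here; in practice, small deviations arise only because the second-order FA reproduces the empirical factor correlations imperfectly.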
Table A2
SPSS Output from Syntax Code for Schmid–Leiman Solution

Run MATRIX procedure:

F1
     factor1  factor2  factor3
v1     .099     .565    -.152
v2     .012     .942    -.154
v3    -.150     .618     .413
v4     .744    -.088     .143
v5     .624     .279    -.114
v6     .869    -.033     .029
v7    -.015    -.271     .626
v8    -.091     .100     .722
v9     .150     .084     .398

F2
         General1
factor1      .616
factor2      .732
factor3      .695

factor loadings of Schmid-Leiman Solution and h2
     General1  factor1  factor2  factor3  h2 total   h2 G  h2 1st
v1      .368     .078     .385    -.109      .302    .136    .166
v2      .590     .010     .642    -.110      .773    .348    .425
v3      .646    -.118     .421     .297      .697    .418    .279
v4      .493     .586    -.060     .102      .601    .243    .358
v5      .510     .492     .190    -.082      .544    .260    .285
v6      .531     .685    -.023     .021      .752    .282    .470
v7      .228    -.012    -.184     .450      .289    .052    .237
v8      .518    -.072     .068     .519      .547    .268    .279
v9      .430     .118     .057     .286      .284    .185    .099

sum of squared loadings
     General1  factor1  factor2  factor3   total
h2      2.191    1.094     .820     .684   4.790
%        .458     .228     .171     .143   1.000

percentage of extracted variance explained by general factors (%)
 .458

percentage of extracted variance explained by first order factors (%)
 .542

------ END MATRIX -----

Note—The following labels are used: F1 = first-order factor-loading matrix, F2 = second-order factor-loading matrix, v1 to v9 = primary variables, factor1 to factor3 = first-order factors, General1 = second-order factor. These labels are used for input as well as output matrices with transformed factors. h2 total = total communality of all factors, h2 G = communality of transformed higher order factors, h2 1st = communality of transformed first-order factors.
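The matrix algebra behind this output can be cross-checked outside SPSS. The following NumPy sketch (our variable names; an illustration of the published transformation, not the syntax itself) reproduces the Schmid–Leiman loadings and the communality split of Table A2:

```python
import numpy as np

# Input matrices from Table A1: first-order pattern matrix and
# second-order loadings.
F1 = np.array([
    [ 0.099,   0.5647, -0.1521],
    [ 0.0124,  0.9419, -0.1535],
    [-0.1501,  0.6177,  0.4128],
    [ 0.7441, -0.0882,  0.1425],
    [ 0.6241,  0.2793, -0.1137],
    [ 0.8693, -0.0331,  0.0289],
    [-0.0154, -0.2706,  0.6262],
    [-0.0914,  0.0995,  0.7216],
    [ 0.1502,  0.0835,  0.398 ],
])
F2 = np.array([[0.6158], [0.7316], [0.6949]])

# Transformation matrix: second-order loadings next to the square roots
# of the first-order factors' uniquenesses.
uniq = 1.0 - (F2 ** 2).sum(axis=1)
B = np.hstack([F2, np.diag(np.sqrt(uniq))])

# Schmid-Leiman loadings: first column General1, then the three
# orthogonalized first-order factors.
SLP = F1 @ B

h2_total = (SLP ** 2).sum(axis=1)        # "h2 total" column of Table A2
h2_G = SLP[:, 0] ** 2                    # "h2 G" (single general factor)
h2_1st = (SLP[:, 1:] ** 2).sum(axis=1)   # "h2 1st"
```

For v1, this yields a General1 loading of about .368 and h2 total of about .302, matching the first row of the loading table above.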
APPENDIX B
Syntax Code for Schmid–Leiman Solution with SAS 8.02

The syntax code presented in Table B1 uses proc IML of SAS 8.02. Nine items from a study by Moser et al. (2002) are used in the example. To adapt the code to other problems, five parts, all underlined in Table B1, need to be changed: Two factor-loading matrices, as well as variable and factor labels for these matrices, have to be replaced. To obtain first-order and second-order factor matrices to be entered into this syntax, first-order and higher order factor analyses have to be calculated using ordinary SAS commands. The first-order factor pattern matrix from an oblique solution (promax with κ = 4 has been used here) is entered in Lines 7–15 and termed F1. In SAS, factor loadings of a single variable are separated by a blank and end with a comma, which indicates the beginning of a new matrix row. In the example, three factors and nine variables are used. In Line 17, variable labels can be entered. These labels need to be within quotation marks, separated by a blank. Labels of first-order factors can be adapted in Line 19. Line 21 (matrix termed F2) contains the second-order factor loadings, in this case a single factor with loadings of three first-order factors. In the case of more than one second-order factor, loadings of each first-order factor must be separated by a blank and end with a comma, indicating a new matrix row. Labels for this second-order factor are entered in Line 23. It should be noted that all necessary data are entered directly with the syntax code. Lines 30–34 contain the calculations for the Schmid–Leiman solution. Lines 35–47 compute several sums of squared loadings resulting in different indexes of variance explained in the output. Output computation starts at Line 48. Here, several compound matrices are defined for the sake of readability of the output.
A sample output has been omitted, since it is identical to the output obtained by the SPSS code (see Table A2).
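The variance-explained indexes that both programs print (sums of squared loadings per factor, their share of the total extracted variance, and the split between general and first-order factors) can likewise be sketched in NumPy. This is a hedged illustration with our own names, not the SAS code itself:

```python
import numpy as np

# Reconstruct the SLS loading matrix from the Table A1/B1 input.
F1 = np.array([
    [ 0.099,   0.5647, -0.1521],
    [ 0.0124,  0.9419, -0.1535],
    [-0.1501,  0.6177,  0.4128],
    [ 0.7441, -0.0882,  0.1425],
    [ 0.6241,  0.2793, -0.1137],
    [ 0.8693, -0.0331,  0.0289],
    [-0.0154, -0.2706,  0.6262],
    [-0.0914,  0.0995,  0.7216],
    [ 0.1502,  0.0835,  0.398 ],
])
F2 = np.array([[0.6158], [0.7316], [0.6949]])
B = np.hstack([F2, np.diag(np.sqrt(1.0 - (F2 ** 2).sum(axis=1)))])
SLP = F1 @ B

# Column sums of squares: variance explained by each factor
# (cf. hctot / HCtot in the SAS and SPSS listings).
ssl_per_factor = (SLP ** 2).sum(axis=0)
total_extracted = (SLP ** 2).sum()           # about 4.790 in the example
share = ssl_per_factor / total_extracted     # the "h2" and "%" output rows

# Share of extracted variance due to the general factor(s) vs. the
# orthogonalized first-order factors (exg / exf in the listings).
n_general = F2.shape[1]
exg = share[:n_general].sum()
exf = share[n_general:].sum()
```

In the example this recovers the printed values of roughly .458 for the general factor and .542 for the first-order factors, which by construction sum to 1.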
Table B1
SAS Input Syntax for Schmid–Leiman Solution

 1  /* Schmid-Leiman Solution for 2 level higher order Factor analysis*/
 2  options nocenter nodate;
 3  proc iml;
 4  reset noname;
 5
 6  /*Enter First-order pattern Matrix*/
 7  F1 = {0.099 0.5647 -0.1521,
 8    0.0124 0.9419 -0.1535,
 9    -0.1501 0.6177 0.4128,
10    0.7441 -0.0882 0.1425,
11    0.6241 0.2793 -0.1137,
12    0.8693 -0.0331 0.0289,
13    -0.0154 -0.2706 0.6262,
14    -0.0914 0.0995 0.7216,
15    0.1502 0.0835 0.398};
16  /* Enter First-order variable names*/
17  varname={"V1" "V2" "V3" "V4" "V5" "V6" "V7" "V8" "V9"};
18  /*enter First-order factor names*/
19  f1name={"Factor1" "Factor2" "Factor3"};
20  /* enter second-order factor loadings */
21  F2 = {0.6158, 0.7316, 0.6949};
22  /*enter second-order factor names*/
23  f2name={"General1"};
24  print 'First-order Pattern Matrix',,
25    F1 [rowname=varname colname=f1name format=5.3];
26  C1=ncol(F1);
27  print "Second-order factor loadings",,
28    F2 [rowname=f1name colname=f2name format=5.3];
29  c2=ncol(F2);
30  zw1 = 1-F2[,##];
31  unique=diag(zw1);
32  zw1 = sqrt(unique);
33  B = F2||zw1;
34  SLP = F1 * B;
35  hrtot = SLP[,##];
36  c1end = C1+C2;
37  C1start = C2+1;
38  zw2 = SLP[,c1start:c1end];
39  hr1st = zw2[,##];
40  zw3 = SLP[,1:C2];
41  hr2nd = zw3[,##];
42  hctot = SLP[##,];
43  htot = ssq(SLP);
44  htot100 = hctot / htot;
45  htotsum = sum(hctot) / htot;
46  zw4 = htot100[,1:c2];
47  zw5 = htot100[,c1start:c1end];
48  exg = zw4[,+];
49  exf = zw5[,+];
50  results1 = SLP || hrtot || hr2nd || hr1st;
51  slpname = f2name || f1name ||
52    {"h2total" "h2G" "h2F"};
53  print "Factor loadings of Schmid-Leiman Solution and h2",,
54    results1 [rowname=varname colname=slpname format=5.3];
55  results2a = HCtot || htot;
56  results2b = htot100 || htotsum;
57  results2 = results2a // results2b;
58  fixedname1={"h2", "%"};
59  fixedname2=f2name||f1name || {"Total"};
60  print "sum of squared loadings",,
61    results2 [rowname=fixedname1 colname=fixedname2 format=5.3];
62  print "percentage of extracted variance explained by general factors:",,
63    exg [format=5.3];
64  print "percentage of extracted variance explained by first-order factors:",,
65    exf [format=5.3];
66  quit;
Note—Underlined type indicates parts of syntax code that have to be adapted to the user problem. These consist of the first-order factor pattern matrix (Lines 7–15), variable labels (Line 17), first-order factor labels (Line 19), the second-order factor matrix (Line 21), and second-order factor labels (Line 23). V1 to V9 = primary variables, Factor1 to Factor3 = first-order factors, General1 = second-order factor. Commands end with a semicolon. Some lines have been separated due to restriction of column space and should be put on a single line (e.g., Lines 60–61 are a single command).

(Manuscript received November 5, 2002; revision accepted for publication May 13, 2004.)