
Factor models for ordinal variables with covariate effects on the manifest and latent variables: A comparison of LISREL and IRT approaches

Irini Moustaki, Statistics Department, Athens University of Economics and Business

Karl G Jöreskog, Department of Information Science, Uppsala University

Dimitris Mavridis, Statistics Department, Athens University of Economics and Business

We consider a general type of model for analyzing ordinal variables with covariate effects and two approaches for analyzing data for such models, the item response theory (IRT) approach and the PRELIS-LISREL (PLA) approach. We compare these two approaches on the basis of two examples, one involving only covariate effects directly on the ordinal variables, and one involving covariate effects on the latent variables in addition.

Requests for reprints should be sent to Irini Moustaki, Athens University of Economics and Business, Department of Statistics, 76 Patission Street, Athens 104 34, Greece. Email: [email protected]

INTRODUCTION

Latent variable models are used in the social sciences for analyzing interrelationships among observed variables. Latent variables are usually constructs that are not directly measured, such as intelligence, emotion, political belief, wealth or stress. These unobserved constructs are assumed to be measured through a set of observed variables, also called indicators. Methodology exists for analyzing discrete, continuous and categorical indicators. The well-known factor analysis model is a special case of a latent variable model with indicators measured on a continuous scale. The main idea behind latent variable models is that the latent variables account for the dependencies among the indicators. The number of latent variables required is much smaller than the number of indicators. In applications, additional covariates or explanatory variables may be required together with the latent variables to account for the associations among the indicators. One might also be interested in measuring the effect of explanatory variables (e.g. demographic variables) on the latent variables identified from the model.

This paper looks at latent variable models for ordinal indicators that allow for covariate effects both on the indicators and on the latent variables. We concentrate on ordinal indicators since they are frequently met in social applications. The paper does not aim to develop new methodology for handling ordinal data but rather to compare existing methodologies in the literature in terms of ease of fitting the models, model parameters and goodness-of-fit. The two approaches are the structural equation modelling (SEM) approach and the item response theory (IRT) approach. Regarding the SEM approach, we concentrate here on the PRELIS-LISREL approach (PLA).

A latent variable model consists of two parts. The first part accommodates the effect of the latent variables and a set of observed covariates on the indicators; it is called here the measurement model with direct effects (to distinguish it from the measurement model that only allows for latent variables). The second part links a set of observed covariates with the latent variables and is called the structural part of the model. Covariates are therefore allowed to affect the indicators either indirectly, through the latent variables, or directly. However, there might be situations where one would like to model the effect of one set of covariates on the latent variables and the effect of a different set of covariates directly on the indicators. In the applications section, we discuss an example in which we are interested in measuring overall satisfaction (latent variable) with the National Health Service in the respondent’s area from five ordinal indicators, controlling for the respondent’s political affiliation (observed covariate). In addition, we allow the covariates age and gender to affect the latent construct satisfaction.

In the literature there are two main approaches for conducting latent variable analysis. One is the underlying variable approach (UVA), developed within the structural equation modelling (SEM) framework, which provides a general model that allows for covariate effects. The UVA is supported by commercial software such as LISREL (Jöreskog & Sörbom, 1999), EQS (Bentler, 1992) and Mplus (Muthén & Muthén, 2000). In this paper, we use LISREL for computations. The other approach is the item response theory (IRT) approach.
Within the IRT approach, latent variable models have recently been extended to allow for covariate effects. Verhelst, Glas, and Verstralen (1994), Zwinderman (1997) and Glas (2001) discussed the Rasch or one-parameter logistic model with covariate effects, Sammel, Ryan, and Legler (1997) discussed a unidimensional latent trait model for binary and normal outcomes that allows for covariate effects, and Moustaki (2003) discussed a multidimensional model for ordinal indicators with covariate effects. The models presented in Moustaki (2003) will be compared here with the SEM approach. A comparison between LISREL type models and IRT models for ordinal indicators without covariate effects can be found in Moustaki (2000) and Jöreskog and Moustaki (2001). Here, we extend that work to more complex models that allow for covariate effects.

In this paper, we present two examples that differ in terms of the covariate effects included. The first example includes only direct effects of covariates on the ordinal indicators and the second example includes both direct and indirect effects of covariates. The LISREL software will be used for the UVA. It has two main components: PRELIS (a preprocessor of LISREL) and LISREL. The covariance or correlation matrix is estimated in PRELIS, and the measurement and structural models are fitted in LISREL. For the item response theory approach we have developed our own software.

Latent variable models with covariate effects contain a large number of parameters and are very complex. An alternative would be to estimate the effect of covariates on latent variables in two stages: the measurement model is fitted first, and factor scores (latent scores) (Moustaki & Knott, 2000) are then computed and used as dependent variables for further analysis. Croon and Bolck (1997) found that regressing factor scores, treated as observed variables, on a set of covariates leads to biased estimates, so that certain corrections need to be applied.

NOTATION

Let $y' = (y_1, y_2, \ldots, y_p)$ be a vector of p ordinal indicators. Small letters are used to denote both the variables and the values that these variables take. Let $c_i$ denote the number of categories for the ith variable. The latent variables are denoted by a $q \times 1$ vector $z' = (z_1, z_2, \ldots, z_q)$ where $q < p$. We will distinguish between two different sets of covariates. Those covariates that affect the latent variables are denoted by $w' = (w_1, w_2, \ldots, w_k)$ and those that affect the indicators directly by $x' = (x_1, x_2, \ldots, x_r)$. Covariates can be of any type, such as metric or categorical (dummy variables).

LATENT VARIABLE MODELS WITH COVARIATE EFFECTS

Here, we briefly discuss the type of models that will be studied. Figure 1 shows the relationships that can be modelled, using an example of three ordinal variables and three covariates. The path diagram shows that the three observed ordinal variables $y' = (y_1, y_2, y_3)$ are indicators of a single latent variable $z_1$. The latent variable $z_1$ and the observed covariate $x_1$ account for the associations among the y variables. The direct arrow from $x_1$ to $y_1$ allows the mean level (here the thresholds) for variable $y_1$ to be different for different values of the $x_1$ variable. Finally, the variables $w' = (w_1, w_2)$ have an effect on the latent variable $z_1$. For example, if $w_1$ is a variable with two categories, then the direct arrow from $w_1$ to $z_1$ indicates that the mean of the latent variable $z_1$ is allowed to be different across the two groups defined by the $w_1$ variable. Note that the variable $x_1$ needs to be different from the variables w for identification reasons. As a result, an arrow cannot be added from $x_1$ to $z_1$ when there are already arrows from $x_1$ to all the y variables. Both $x_1$ and w are considered fixed and they may be correlated. Figure 1 shows all the possible relationships that can be modelled. In certain applications some of those variables might not exist. For example, there might be a case where there are only covariates affecting the latent variables, or only covariates affecting the ordinal indicators.

Figure 1. Path Diagram (nodes: w1, w2, z1, x1, y1, y2, y3)

ITEM RESPONSE THEORY APPROACH

In the item response theory approach the unit of analysis is the whole response pattern of the sample members. Two assumptions are made: conditional independence (responses to the p ordinal indicators are independent conditional on the latent variables z and the set of covariates x) and a multinomial distribution for each ordinal indicator conditional on z and x. The latent variables are taken to be independent with normal distributions. The normal distribution has rotational advantages. Correlated latent variables can also be fitted within the IRT framework, but this is outside the scope of this paper.

Measurement model with direct effects

First, we model the associations among the y variables as explained by the latent variables z and the covariates x. The covariates x and the latent variables z directly affect the ordinal indicators. In addition, we allow the vector of covariates w to affect the vector of latent variables z. The model and its estimation are discussed in detail in Moustaki (2003). The probability of responding in a particular category s is defined as the difference between two cumulative probabilities:

$$\pi_{is}(z, x) = \gamma_{is}(z, x) - \gamma_{i,s-1}(z, x), \quad i = 1, \ldots, p; \; s = 1, \ldots, c_i \tag{1}$$

where $\gamma_{is}(z, x)$ is the cumulative probability of a response in category s or lower of item $y_i$, written as

$$\gamma_{is}(z, x) = \pi_{i1}(z, x) + \pi_{i2}(z, x) + \cdots + \pi_{is}(z, x).$$

The cumulative probability $\gamma_{is}(z, x)$ is modelled as a function of the latent variables z and the observed covariates x:

$$\mathrm{link}[\gamma_{is}(z, x)] = \mathrm{link}\, P(y_i \le s \mid z, x) = \alpha_s^{(i)} - \sum_{j=1}^{q} \alpha_{ij} z_j + \sum_{l=1}^{r} \beta_{il} x_l, \quad i = 1, \ldots, p; \; s = 1, \ldots, c_i \tag{2}$$

where $\alpha_s^{(i)}$, $\alpha_{ij}$ and $\beta_{il}$ are parameters to be estimated. To simplify notation we just write $\gamma_{is}$. The link function can be any monotonically increasing function that maps (0, 1) onto (−∞, ∞), such as the logit, the complementary log-log function, the inverse normal function, the inverse Cauchy, or the log-log function. The parameters $\alpha_s^{(i)}$ are referred to as ‘cut-points’ on the logistic, probit or other scale, where $\alpha_1^{(i)} < \alpha_2^{(i)} < \cdots < \alpha_{c_i}^{(i)}$, $\alpha_0^{(i)} = -\infty$ and $\alpha_{c_i}^{(i)} = +\infty$. The $\alpha_{ij}$ parameters can be considered as discrimination parameters or factor loadings, since they measure the effect of the latent variables z on some function of the cumulative probability of responding up to a category of the ith item, controlling for the effect of the covariates x. In the one latent variable case the negative sign in front of the slope parameter is used to indicate that as z increases the response on the observed item $y_i$ is more likely to fall at the high end of the scale. The $\beta_{il}$ are regression coefficients. Plots of the response probabilities and the cumulative probabilities for different parameter values can be found in Moustaki (2003).

Let $y = (y_1, y_2, \ldots, y_p)$ represent the whole response pattern for a randomly selected individual. The density function $f(y \mid x)$ of the manifest variables y is:

$$f(y \mid x) = \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} g(y \mid z, x)\, h(z \mid w)\, dz \tag{3}$$

where $g(y \mid z, x)$ is the conditional density function of y given z and x, and $h(z \mid w)$ is the density function of z conditional on w. The latent variables are assumed to be independent with normal distributions. The covariates x are assumed to be fixed. The integrals in (3) are approximated using Gauss-Hermite quadrature points. Other methods such as adaptive quadrature, Laplace approximation or Monte Carlo methods can be used. Under the assumption of conditional independence of y on z and x, the vector of latent variables z and the vector of observed covariates x account for the interdependencies among the observed ordinal variables, so that when the latent variables are held fixed the responses to the p observed variables are independent:

$$g(y \mid z, x) = \prod_{i=1}^{p} g(y_i \mid z, x). \tag{4}$$
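Equations (1), (2) and (4) can be illustrated with a short Python sketch. It assumes a logit link, and every parameter value below is hypothetical; the function names `category_probs` and `pattern_prob` are introduced here for illustration only and are not part of the authors' software.

```python
import numpy as np

def category_probs(cutpoints, loadings, betas, z, x):
    """Return pi_is(z, x), s = 1..c_i, for one item under a logit link."""
    # Linear predictor of equation (2) at each finite cut-point.
    eta = np.asarray(cutpoints, float) - np.dot(loadings, z) + np.dot(betas, x)
    # Inverse logit: gamma_is = P(y_i <= s | z, x).
    gamma = 1.0 / (1.0 + np.exp(-eta))
    # gamma_i0 = 0 and gamma_ic = 1 close the scale; equation (1) differences them.
    return np.diff(np.concatenate(([0.0], gamma, [1.0])))

def pattern_prob(y, items, z, x):
    """g(y | z, x) in equation (4): product over items of the observed category's probability."""
    return float(np.prod([category_probs(c, a, b, z, x)[s - 1]
                          for s, (c, a, b) in zip(y, items)]))

# Hypothetical parameters: two items, four categories each, q = 1, r = 1.
# Each item is (cut-points, loadings alpha_ij, regression coefficients beta_il).
items = [([-1.0, 0.5, 2.0], [1.2], [0.4]),
         ([-0.5, 0.8, 1.9], [0.9], [-0.2])]
z, x = np.array([0.3]), np.array([-0.5])

probs_item1 = category_probs(*items[0], z, x)   # four category probabilities
g = pattern_prob([2, 3], items, z, x)           # response pattern (2, 3)
```

Because of the negative sign in front of the loading, increasing z in this sketch moves probability mass towards the highest category, which matches the interpretation given for equation (2).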

For a manifest item $y_i$ the conditional probability of $(y_i \mid z, x)$ is given by:

$$g(y_i \mid z, x) = \prod_{s=1}^{c_i} \pi_{is}(z, x)^{y_{i,s}} = \prod_{s=1}^{c_i} (\gamma_{i,s} - \gamma_{i,s-1})^{y_{i,s}}, \tag{5}$$

where $y_{i,s} = 1$ if the response $y_i$ is in category s and $y_{i,s} = 0$ otherwise.

Structural model

Let us assume that the latent variables z are related to a set of observed covariates w in a simple linear form:

$$z = Dw + \delta \tag{6}$$

where z is a $q \times 1$ vector, D is a $q \times k$ matrix of regression coefficients, w is a $k \times 1$ vector of covariates and δ is a $q \times 1$ vector of independent standard normal variables. It follows that the distribution of the latent variables z conditional on the covariates w is normal with mean Dw and variance one. The covariates w are assumed to be fixed, non-stochastic. For a random sample of size N the log-likelihood to be maximized, taken from (3), is written

$$L = \sum_{m=1}^{N} \log f(y_m \mid x_m), \tag{7}$$

where m refers to the m:th observation in the sample. The log-likelihood in (7) is maximized using an E-M algorithm. In order for the model to be identified, a necessary condition is that the covariates x that have direct effects on the indicators must be different from the covariates w that affect the latent variables.
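The quadrature approximation of (3) and the log-likelihood (7) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes one latent variable (q = 1), a logit link and fixed Gauss-Hermite nodes from NumPy's `hermgauss`, and it evaluates the log-likelihood only (the E-M maximization is not shown). All parameter values and the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def item_probs(cut, alpha, beta, z, x):
    """Category probabilities of one item, equations (1)-(2) with a logit link."""
    eta = np.asarray(cut, float) - alpha * z + np.dot(beta, x)
    gamma = 1.0 / (1.0 + np.exp(-eta))
    return np.diff(np.concatenate(([0.0], gamma, [1.0])))

def marginal_prob(y, x, w, cuts, alphas, betas, d, n_nodes=21):
    """Approximate f(y | x) in (3) for q = 1, with z | w ~ N(d'w, 1) as in (6)."""
    t, wt = hermgauss(n_nodes)                  # nodes/weights for weight exp(-t^2)
    z_nodes = np.dot(d, w) + np.sqrt(2.0) * t   # change of variable to N(d'w, 1)
    f = 0.0
    for zk, wk in zip(z_nodes, wt):
        g = 1.0
        for i, s in enumerate(y):               # conditional independence, (4)-(5)
            g *= item_probs(cuts[i], alphas[i], betas[i], zk, x)[s - 1]
        f += wk * g
    return f / np.sqrt(np.pi)

def log_lik(Y, X, W, cuts, alphas, betas, d):
    """Log-likelihood (7): sum of log f(y_m | x_m) over the sample."""
    return sum(np.log(marginal_prob(y, x, w, cuts, alphas, betas, d))
               for y, x, w in zip(Y, X, W))

# Hypothetical toy data: two items with three categories, one x and one w covariate.
cuts = [[-1.0, 0.5], [0.0, 1.0]]
alphas = [1.2, 0.8]
betas = [[0.4], [0.0]]
d = [0.5]
Y = [(1, 2), (3, 3), (2, 1)]
X = [[-0.3], [0.1], [1.2]]
W = [[0.0], [1.0], [1.0]]
ll = log_lik(Y, X, W, cuts, alphas, betas, d)
```

A useful sanity check on such a sketch is that the approximated probabilities of all possible response patterns, for a fixed x and w, sum to one.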

THE UNDERLYING VARIABLE APPROACH

Jöreskog and Goldberger (1975) discussed a multiple indicators and multiple causes (MIMIC) model for normal indicators with a single latent variable that allows for direct and indirect effects of covariates on the latent and manifest variables respectively. They report that when covariates are included in the model the parameter estimates are more efficient. Muthén (1989) discusses the MIMIC model for other types of observed indicators, such as binary and ordinal, for capturing heterogeneity across groups (groups are defined through the covariates).

The basic element of analysis within the SEM approach is either the covariance or the correlation matrix of the indicators. The LISREL model (measurement and structural model) is fitted to the estimated covariance or correlation matrix. More specifically, for continuous, binary and ordinal indicators the Pearson, tetrachoric and polychoric correlations are estimated, respectively. In SEM the ordinal variables y are taken to be manifestations of some underlying continuous unobserved variables $y^*$. The connection between the ordinal variable $y_i$ and the underlying variable $y_i^*$ is

$$y_i = s \iff \tau_{s-1}^{(i)} < y_i^* \le \tau_s^{(i)}, \quad s = 1, 2, \ldots, c_i, \tag{8}$$

where

$$\tau_0^{(i)} = -\infty, \quad \tau_1^{(i)} < \tau_2^{(i)} < \cdots < \tau_{c_i - 1}^{(i)}, \quad \tau_{c_i}^{(i)} = +\infty,$$

are parameters called thresholds. For variable $y_i$ with $c_i$ categories, there are $c_i - 1$ threshold parameters. The measurement model is the classical factor analysis model extended with a new term that contains the covariates x:

$$y_i^* = \sum_{j=1}^{q} \lambda_{ij} z_j + \sum_{l=1}^{r} b_{il} x_l + u_i, \quad i = 1, 2, \ldots, p, \tag{9}$$

or in matrix form

$$y^* = \Lambda z + Bx + u \tag{10}$$

where $u_i$ is an error term representing a specific factor and measurement error and $y_i^*$ is an unobserved continuous variable underlying the ordinal variable $y_i$. In classical factor analysis $y_i^*$ is directly observed but here it is unobserved. The structural part of the model is

$$z = Dw + \delta \tag{11}$$

where z is a $q \times 1$ vector of latent variables, w is a $k \times 1$ vector of fixed covariates, D is a $q \times k$ matrix of regression coefficients and δ is a $q \times 1$ vector of error terms distributed as $N(0, I)$. It is further assumed that $z_1, \ldots, z_q, u_1, \ldots, u_p$ are independent and normally distributed with $z_j \sim N(d_j w, 1)$. Since no information is available about the origin and scale of the underlying variables $y_i^*$, the mean of $y_i^*$ is set to 0 and the variance of $y_i^*$ is fixed such that the conditional variance of $y_i^*$ for given x and w is 1. Other ways of fixing the scale of the underlying variables are also possible, see Jöreskog (2002).

The parameters of the model are the thresholds $\tau_s^{(i)}$, $i = 1, 2, \ldots, p$, $s = 1, 2, \ldots, c_i - 1$, and the factor loadings $\lambda_{ij}$, $i = 1, 2, \ldots, p$, $j = 1, 2, \ldots, q$. When $q > 1$, the matrix $\Lambda = (\lambda_{ij})$ of order $p \times q$ is determined only up to an orthogonal transformation of order $q \times q$. The additional parameters are the elements of B and D.

The PRELIS-LISREL approach (PLA) does not specify a model for the complete p-dimensional response pattern. Since it makes use of the data in the univariate and bivariate marginals only, it specifies a model only for the univariate and bivariate marginal distributions. The PLA involves two steps: a PRELIS step to estimate the thresholds, the joint covariance matrix of $y^*$, x and w, and its asymptotic covariance matrix, and a LISREL step to estimate all the other parameters. In the LISREL step, the model can be estimated either by robust maximum likelihood (MLR) or weighted least squares (WLS), see, e.g., Jöreskog, Sörbom, Du Toit, and Du Toit (2001), Chapter 4 and Appendix A.

PRELIS Step

Let $x^* = (x, w)$. As before, let y ($p \times 1$) be a vector of ordinal variables with underlying variables $y^*$. It is assumed that $y^* \mid x^* \sim N(\alpha + \Gamma x^*, \Psi)$. To fix the scale of $y^*$ there are two equivalent specifications: the standard parameterization and the alternative parameterization. In the applications section we use the standard parameterization. This fixes the origin of $y^*$ such that α is 0 and the unit of measurement of $y^*$ such that Ψ is a correlation matrix, see Jöreskog (2002).
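To make the underlying variable formulation in (8), (10) and (11) concrete, the following Python sketch simulates data from it. All numerical values are hypothetical, the thresholds are taken to be the same for every item purely for brevity, and the error variances are chosen so that the conditional variance of each underlying variable given x and w equals 1, as in the scaling described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical parameter values for p = 3 indicators, q = 1, r = 1, k = 2.
Lam = np.array([[0.7], [0.6], [0.8]])    # p x q factor loadings
B = np.array([[0.3], [0.0], [0.0]])      # p x r direct covariate effects
D = np.array([[0.5, -0.2]])              # q x k structural coefficients
tau = np.array([-1.0, 0.0, 1.0])         # c_i - 1 = 3 finite thresholds (shared here)

x = rng.normal(size=(n, 1))                          # covariate with a direct effect
w = rng.integers(0, 2, size=(n, 2)).astype(float)    # two dummy covariates

z = w @ D.T + rng.normal(size=(n, 1))     # equation (11), delta ~ N(0, I)
phi = 1.0 - (Lam ** 2).sum(axis=1)        # Var(u_i) giving Var(y_i* | x, w) = 1
u = rng.normal(size=(n, 3)) * np.sqrt(phi)
ystar = z @ Lam.T + x @ B.T + u           # equation (10)

# Equation (8): y_i = s exactly when tau_{s-1} < y_i* <= tau_s.
y = np.searchsorted(tau, ystar, side="left") + 1
```

The call to `np.searchsorted` with `side="left"` reproduces the half-open intervals in (8): a value equal to a threshold is assigned to the lower category.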

The rows of α and Γ and the diagonal elements of Ψ are estimated from the univariate margins and the off-diagonal elements of Ψ are estimated from the bivariate margins, see Jöreskog (2002). Denoting these estimates as $\hat{\alpha}$, $\hat{\Gamma}$ and $\hat{\Psi}$, we have the following:

• The estimated conditional covariance matrix of $y^*$ for given $x^*$ is $\hat{\Psi}$. In the standard parameterization this is a correlation matrix.
• The estimated unconditional covariance matrix of $y^*$ is $\hat{\Gamma} S_{xx} \hat{\Gamma}' + \hat{\Psi}$, where $S_{xx}$ is the sample covariance matrix of $x^*$.
• The estimated joint unconditional covariance matrix of $y^*$ and $x^*$ is

$$\hat{\Sigma} = \begin{pmatrix} \hat{\Gamma} S_{xx} \hat{\Gamma}' + \hat{\Psi} & \hat{\Gamma} S_{xx} \\ S_{xx} \hat{\Gamma}' & S_{xx} \end{pmatrix}. \tag{12}$$

In addition to $\hat{\Sigma}$, PRELIS also estimates the asymptotic covariance matrix of $\hat{\Sigma}$. There is no latent variable model imposed on the $\hat{\Sigma}$ in (12). It is an unconstrained covariance matrix, just like a sample covariance matrix S for continuous variables. It can therefore be used for modelling in LISREL just as if $y^*$ and $x^*$ were directly observed.

LISREL Step

The LISREL step is straightforward. Equation (10) is interpreted as a measurement model and equation (11) is interpreted as a structural model in LISREL (see, e.g., Jöreskog et al., 2001, Chapter 1). The covariance structure implied by these equations and their assumptions may be fitted to $\hat{\Sigma}$ by either MLR or WLS.
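The block structure of the joint covariance matrix in (12) can be assembled with a few lines of NumPy. This is only a sketch of the matrix algebra: the function name `joint_covariance` and the input values are hypothetical, and the estimation of the three input matrices from the univariate and bivariate margins (the actual PRELIS computation) is not shown.

```python
import numpy as np

def joint_covariance(Gamma, Psi, Sxx):
    """Assemble the joint covariance matrix of (y*, x*) as in equation (12)."""
    top_left = Gamma @ Sxx @ Gamma.T + Psi     # unconditional covariance of y*
    top_right = Gamma @ Sxx                    # covariance between y* and x*
    return np.block([[top_left, top_right],
                     [top_right.T, Sxx]])

# Hypothetical estimates for p = 2 underlying variables and one covariate.
Gamma = np.array([[0.5], [-0.3]])
Psi = np.array([[1.0, 0.4], [0.4, 1.0]])       # correlation matrix (standard parameterization)
Sxx = np.array([[1.0]])

Sigma = joint_covariance(Gamma, Psi, Sxx)
```

Since the covariate covariance matrix is symmetric, the lower-left block is simply the transpose of the upper-right block, so the assembled matrix is symmetric by construction.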

RELATIONSHIPS BETWEEN IRT AND PLA PARAMETERS

The parameters of the IRT and PLA approaches are not directly comparable because of different parameterizations. For the case of covariates affecting the ordinal variables only, i.e., the case without w-variables, the parameters of the two approaches are related as follows:

$$\alpha_s^{(i)} = \frac{\tau_s^{(i)}}{\sqrt{\phi_i}} \tag{13}$$

$$\alpha_{ij} = \frac{\lambda_{ij}}{\sqrt{\phi_i}} \tag{14}$$

$$\beta_{il} = -\frac{b_{il}}{\sqrt{\phi_i}} \tag{15}$$

where $\phi_i$ is the variance of $u_i$ in (9). This extends the results of Jöreskog and Moustaki (2001) to the case with covariate effects. To compare the PLA parameters $\lambda_{ij}$ and $b_{il}$ with the corresponding IRT parameters $\alpha_{ij}$ and $\beta_{il}$, we standardize the latter as follows:

$$\alpha_{ij}^* = \frac{\alpha_{ij}}{\sqrt{\sum_{j=1}^{q} \alpha_{ij}^2 + \sum_{l=1}^{r} \beta_{il}^2 \,\mathrm{Var}(x_l) + 1}} \tag{16}$$

$$\beta_{il}^* = -\frac{\beta_{il}}{\sqrt{\sum_{j=1}^{q} \alpha_{ij}^2 + \sum_{l=1}^{r} \beta_{il}^2 \,\mathrm{Var}(x_l) + 1}} \tag{17}$$

In the case where there are covariates affecting the latent variables, i.e., where w-variables are included in the model, no such standardization is needed because the LISREL specification can be done in such a way that the unstandardized parameters of the two approaches correspond.
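The conversions in (13)-(15) and the standardization in (16)-(17) amount to simple rescalings, which the following Python sketch applies to a single item. The function names `pla_to_irt` and `standardize_irt` and all input values are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np

def pla_to_irt(tau, lam, b, phi):
    """Equations (13)-(15) for one item whose error term u_i has variance phi."""
    s = np.sqrt(phi)
    return (np.asarray(tau, float) / s,     # cut-points alpha_s
            np.asarray(lam, float) / s,     # loadings alpha_ij
            -np.asarray(b, float) / s)      # regression coefficients beta_il

def standardize_irt(alpha, beta, var_x):
    """Equations (16)-(17) for one item."""
    alpha, beta, var_x = (np.asarray(v, float) for v in (alpha, beta, var_x))
    denom = np.sqrt(np.sum(alpha ** 2) + np.sum(beta ** 2 * var_x) + 1.0)
    return alpha / denom, -beta / denom

# Hypothetical PLA estimates for one item with q = 1 and r = 1.
cut, alpha, beta = pla_to_irt(tau=[-0.5, 0.5], lam=[0.5], b=[0.2], phi=0.25)

# Hypothetical IRT estimates, with a standardized covariate (variance 1).
alpha_std, beta_std = standardize_irt(alpha=[1.0], beta=[0.5], var_x=[1.0])
```

With these inputs the denominator in (16)-(17) is sqrt(1 + 0.25 + 1) = 1.5, so the standardized loading is 1/1.5 and the standardized regression coefficient is -0.5/1.5.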

APPLICATIONS

In this section we analyze a data set from the 1996 British Social Attitudes Survey (BSA). The data set has previously been analyzed with the logit IRT model in Moustaki (2003).

First example

The data set consists of five ordinal indicators. Respondents were asked the question: On the whole do you think it should or should not be the government’s responsibility to

• provide a job for everyone who wants one [JobEvery]
• keep prices under control [PriCon]
• provide a decent standard of living for the unemployed [LivUnem]
• reduce income differences between the rich and the poor [IncDiff]
• provide decent housing for those who can’t afford it [Housing]

The response alternatives given to the respondents were: definitely should be (DSB), probably should be (PSB), probably not be (PNB) and definitely should not be (DSNB). The sample size is 822. A covariate x (available in the BSA survey), constructed to measure left to right political identification, is used, after standardization, as a continuous explanatory variable for the ordinal indicators. There are $4^5 = 1024$ possible response patterns but only 252 appear in the sample. The fifteen most common response patterns are given in Table 1. We see that the response alternative ‘definitely should not be’ does not appear in any of them. Table 2 gives the observed proportions for each category of the five ordinal indicators. The bulk of the answers are in the first two categories, especially for the indicators PriCon and Housing.

Table 1: Fifteen most common response patterns.

| Frequency | Response pattern |
|---|---|
| 88 | 11111 |
| 41 | 22222 |
| 23 | 21212 |
| 22 | 22212 |
| 22 | 32222 |
| 19 | 22232 |
| 18 | 11211 |
| 17 | 32232 |
| 15 | 21222 |
| 14 | 11222 |
| 13 | 21111 |
| 11 | 22111 |
| 11 | 21211 |
| 10 | 12111 |
| 10 | 22221 |

Table 2: Observed proportions for the ordinal indicators.

| Category | JobEvery | PriCon | LivUnem | IncDiff | Housing |
|---|---|---|---|---|---|
| Definitely should be | 30.0 | 43.3 | 29.3 | 36.4 | 37.6 |
| Probably should be | 38.8 | 41.7 | 49.0 | 31.8 | 50.9 |
| Probably not be | 19.3 | 10.2 | 15.1 | 21.5 | 9.2 |
| Definitely should not be | 11.8 | 4.7 | 6.6 | 10.3 | 2.3 |

Data source: Social and Community Planning Research, British Social Attitudes Survey, 1996 [computer file]. Colchester, Essex: The Data Archive [distributor], 2 December 1998. SN: 3921.

One-Factor Model without Covariates. We begin the analysis by fitting a one-factor model to the five indicators without taking into account the covariate x, political identification. The results of that analysis using the IRT model with a logit link function can be found in Moustaki (2003). Here, we also consider IRT with the probit link function and compare it with the PLA approach. As stated in the underlying variable approach section, the PLA approach involves one PRELIS step to estimate the thresholds, the polychoric correlations and their asymptotic covariance matrix, and one LISREL step to estimate the factor loadings. In the LISREL step, the model can be estimated by either MLR or WLS. Details are given in Appendix 1. There are two IRT models, Probit and Logit, and two PLA methods, MLR and WLS. MLR means estimating the parameters by maximum likelihood and the standard errors by asymptotically robust methods using the asymptotic covariance matrix. WLS means estimating the parameters by weighted least squares using the inverse of the asymptotic covariance matrix as a weight matrix. Thus there are four approaches to be compared: IRT Probit, IRT Logit, PLA MLR and PLA WLS. We give parameter estimates and standard errors for all four approaches, but for evaluation of fit and for the models with covariates we concentrate on the comparison of IRT Logit and PLA WLS.

The standardized factor loadings with their estimated standard errors are given in Table 3. The IRT Logit estimates are generally larger than the IRT Probit estimates. Similarly, the PLA WLS estimates are generally larger than the PLA MLR estimates. It is also seen that the LISREL estimates are closer to the IRT Probit estimates than to the IRT Logit estimates, particularly for MLR. This is to be expected since the Probit link function corresponds to the assumption of underlying normality used in the PLA approach. The difference between IRT Probit and PLA MLR is not a difference between models but rather a difference between estimation methods. The IRT Probit is a full information maximum likelihood method whereas the PLA MLR is a limited information maximum likelihood method. Table 3 also shows that the standard errors are very similar across methods.

To compare the fit of the two approaches we first compare the fit to the bivariate contingency table of the first pair of variables, namely JobEvery and PriCon. The chi-square residuals for IRT Logit and PLA WLS are given in Tables 4 and 5 respectively. For IRT Logit, there are 6 chi-square residuals greater than 4. For PLA WLS there are 3 chi-square residuals greater than 4. The sum of the chi-square residuals is 51.75 for IRT Logit and 39.84 for PLA WLS. Thus, for this pair of variables, the fit is better for PLA WLS than for IRT Logit. But, as will be seen, for other pairs of variables, it is the other way around.

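Table 3: Standardized Loadings (standard errors in parentheses)

| Item | IRT Probit | IRT Logit | PLA MLR | PLA WLS |
|---|---|---|---|---|
| JobEvery | .71 (.03) | .87 (.02) | .69 (.03) | .79 (.02) |
| PriCon | .54 (.03) | .76 (.03) | .53 (.04) | .62 (.03) |
| LivUnem | .78 (.02) | .91 (.01) | .79 (.03) | .81 (.02) |
| IncDiff | .76 (.02) | .90 (.02) | .75 (.03) | .77 (.02) |
| Housing | .78 (.03) | .92 (.02) | .78 (.03) | .82 (.02) |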

Table 4: Chi-square residuals for JobEvery vs PriCon for IRT Logit

| JobEvery \ PriCon | DSB | PSB | PNB | DSNB |
|---|---|---|---|---|
| DSB | 4.93 | 10.90 | 0.34 | 0.25 |
| PSB | 2.65 | 8.87 | 1.58 | 5.07 |
| PNB | 5.01 | 1.65 | 1.44 | 0.10 |
| DSNB | 0.26 | 4.19 | 0.56 | 4.19 |
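The two summaries quoted in the text for PLA WLS (the number of chi-square residuals greater than 4 and their sum) are simple tallies over the 16 cells of a bivariate table. The short Python sketch below carries them out on the sixteen residuals reported in Table 5; the variable names are illustrative, and the arrangement of the values into rows and columns does not affect either tally.

```python
import numpy as np

# Chi-square residuals for JobEvery vs PriCon under PLA WLS (values from Table 5).
table5 = np.array([
    [3.04, 8.48, 0.17, 1.11],
    [2.48, 6.36, 1.55, 3.00],
    [3.28, 0.68, 0.59, 0.01],
    [4.51, 3.71, 0.12, 0.75],
])

n_large = int((table5 > 4).sum())   # residuals exceeding the value 4
total = float(table5.sum())         # sum over the 16 cells of the table
```

For these values the tallies agree with the figures given in the text for PLA WLS: three residuals greater than 4 and a sum of 39.84.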

We extend this analysis to the rest of the pairs and we see that there are chi-square residuals exceeding 4 in all pairs of items, and in most cases there are many. This is an indication that the one-factor model does not fit. Tables 6 and 7 give the total GF contributions for all pairs of variables under IRT Logit and PLA WLS, respectively. Although the results are not satisfactory under either approach, we see that the total GF contribution of IRT Logit is almost half of that of PLA WLS. Every pair of variables has 16 possible combinations of response categories, and if the GF contribution for a pair of items is larger than 16 × 4 = 64 then the fit is considered to be poor. We see from Table 6 that the IRT model shows a satisfactory fit for all pairs of variables, whereas from Table 7 we see that the GF contributions are smaller than 64 for 5 pairs. The striking difference is that some GF contributions are much larger for PLA WLS than for IRT Logit. For example, for the pair JobEvery and Housing, the total GF contribution for PLA WLS is more than three times that of IRT Logit.

Table 5: Chi-square residuals for JobEvery vs PriCon for PLA WLS

| JobEvery \ PriCon | DSB | PSB | PNB | DSNB |
|---|---|---|---|---|
| DSB | 3.04 | 8.48 | 0.17 | 1.11 |
| PSB | 2.48 | 6.36 | 1.55 | 3.00 |
| PNB | 3.28 | 0.68 | 0.59 | 0.01 |
| DSNB | 4.51 | 3.71 | 0.12 | 0.75 |

Table 6: Total GF Fits for IRT Logit

| Item | PriCon | LivUnem | IncDiff | Housing |
|---|---|---|---|---|
| JobEvery | 52.04 | 56.28 | 18.80 | 29.59 |
| PriCon | | 50.74 | 24.37 | 31.89 |
| LivUnem | | | 21.29 | 58.77 |
| IncDiff | | | | 24.52 |

Total(GF) = 367.72

Table 7: Total GF Fits for PLA WLS

| Item | LivUnem | IncDiff | Housing |
|---|---|---|---|
| JobEvery | 188.27 | 35.00 | 136.35 |
| PriCon | 117.15 | 36.40 | 80.01 |
| LivUnem | | 49.00 | 88.58 |
| IncDiff | | | 51.75 |

One-Factor Model with Covariate. We proceed by fitting a one-factor model to the five ordinal indicators allowing for

the covariate x to affect the ordinal indicators directly. The IRT model given in (2) with a logit link is fitted with one latent variable (q = 1) and one covariate x (r = 1). As before, the PLA approach involves a PRELIS step and a LISREL step. The PRELIS step estimates the joint unconditional covariance matrix of the underlying variables and the covariate, and its asymptotic covariance matrix, as described in Jöreskog (2002). The model is fitted to the unconditional covariance matrix by MLR and WLS, using a special trick to obtain correct standard errors for the standardized solution described in Jöreskog et al. (2001), Appendix C. Details of the PLA approach are given in Appendix 1. Table 8 gives standardized estimates of factor loadings and regression coefficients for the one-factor model with direct effects. Standard errors of the parameter estimates are also given. Both the loadings and the regression coefficients are larger for the IRT models, particularly for the IRT Logit.

Table 8: Estimated standardized loadings and regression coefficients (standard errors in parentheses)

| Item | Factor loading, IRT Logit ($\hat{\alpha}_{i1}$) | Factor loading, PLA WLS ($\hat{\lambda}_{i1}$) | Regression coefficient, IRT Logit ($\hat{\beta}_{i1}$) | Regression coefficient, PLA WLS ($\hat{b}_{i1}$) |
|---|---|---|---|---|
| JobEvery | .57 (.17) | .54 (.03) | .65 (.15) | .54 (.02) |
| PriCon | .46 (.09) | .41 (.04) | .58 (.07) | .43 (.03) |
| LivUnem | .81 (.04) | .73 (.02) | .46 (.04) | .40 (.03) |
| IncDiff | .58 (.03) | .50 (.03) | .69 (.03) | .58 (.02) |
| Housing | .82 (.003) | .71 (.02) | .46 (.001) | .42 (.03) |

We will next examine the goodness-of-fit of the IRT Logit and PLA WLS. We have looked at how well the models fit the two-way margins. The covariate is continuous and therefore takes many different values. To check the fit of the model we take three values (with many occurrences) and we check how well the model predicts the observed frequencies of the bivariate margins of the indicators for these values of the covariate. We select the values such that the first one comes from the left tail of the distribution, the second from the middle and the third from the right tail. We select the values -1.239, -0.126 and 0.987, with frequencies 44, 103 and 53 respectively.

Tables 9, 10 and 11 give the sum of the chi-square residuals for the three values of the covariate we have chosen a priori, for the IRT Logit model with one factor and direct covariate effect. The fit has improved considerably compared to the fit we get when the one-factor model is fitted without the covariate. The model shows bad fit only for a few pairs of categories, and the total GF has decreased in comparison with the one-factor model without the covariate in all three cases. We should note that although we give the total GF for three values of the covariate, we have checked the fit for many other values of the covariate and they all give similar results. Most of the ‘problematic’ chi-square residuals involve the response categories DSB and DSNB.

Table 9: Total GF Fits for IRT Logit for covariate value = -1.239

| Item | LivUnem | IncDiff | Housing |
|---|---|---|---|
| JobEvery | 41.28 | 13.12 | 6.96 |
| PriCon | 19.43 | 7.07 | 3.16 |
| LivUnem | | 34.30 | 14.15 |
| IncDiff | | | 12.41 |

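Table 10: Total GF Fits for IRT Logit for covariate value = -0.126

| Item | LivUnem | IncDiff | Housing |
|---|---|---|---|
| JobEvery | 21.64 | 8.72 | 26.59 |
| PriCon | 57.29 | 30.60 | 33.95 |
| LivUnem | | 18.74 | 11.20 |
| IncDiff | | | 28.67 |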

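Table 11: Total GF Fits for IRT Logit for covariate value = 0.987

| Item | LivUnem | IncDiff | Housing |
|---|---|---|---|
| JobEvery | 12.55 | 13.16 | 26.54 |
| PriCon | 19.38 | 18.78 | 14.53 |
| LivUnem | | 11.92 | 11.83 |
| IncDiff | | | 10.76 |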

Tables 12, 13 and 14 give the chi-squared residuals obtained when the factor model is fitted to the unconditional covariance matrix with WLS for the values -1.239, -0.126 and 0.987 of the covariate, respectively. In the LISREL model, for values of the covariate near the mean value 0 the fit has improved, whereas for values at the tails of its distribution the fit has deteriorated. The fit has deteriorated considerably for the value -1.239.

Table 12: Total GF Fits for PLA WLS for covariate value = -1.239

| Item | LivUnem | IncDiff | Housing |
|---|---|---|---|
| JobEvery | 265.71 | 539.85 | 214.31 |
| PriCon | 117.97 | 335.57 | 24.82 |
| LivUnem | | 518.42 | 96.82 |
| IncDiff | | | 870.34 |

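Table 13: Total GF Fits for PLA WLS for covariate value = -0.126

| Item | LivUnem | IncDiff | Housing |
|---|---|---|---|
| JobEvery | 27.62 | 11.16 | 68.70 |
| PriCon | 141.90 | 49.23 | 69.52 |
| LivUnem | | 22.22 | 18.34 |
| IncDiff | | | 40.50 |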

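Table 14: Total GF Fits for PLA WLS for covariate value = 0.987

| Item | LivUnem | IncDiff | Housing |
|---|---|---|---|
| JobEvery | 37.47 | 95.73 | 48.79 |
| PriCon | 39.22 | 111.47 | 37.44 |
| LivUnem | | 67.02 | 42.86 |
| IncDiff | | | 60.64 |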

Second Example

The second application is also from the 1996 British Social Attitudes (BSA) Survey. Five ordinal manifest variables were selected for the analysis. The items measure satisfaction with the National Health Service in the respondent’s area: respondents were asked whether each of the following aspects of the National Health Service in their area is, on the whole, satisfactory or in need of improvement:
• GP’s appointment systems [Appointment]
• Amount of time GP gives to each patient [AmountTime]
• Being able to choose which GP to see [ChooseGP]
• Quality of medical treatment by GPs [Quality]
• Waiting areas at GP’s surgeries [WaitingArea]

The response alternatives given to the respondents are: in need of a lot of improvement (LI), in need of some improvement (SI), satisfactory (S), and very good (VG). These are coded 1, 2, 3, 4, respectively. Item non-response varies between 1.5% and 2.5%. After excluding the missing values we were left with 841 respondents. Here, we are interested in building a model in which the relationships among the five indicators are explained by a latent variable and by an observed covariate, political identification, and in which the latent variable is affected by gender and age. There are only 205 different response patterns. The most common response patterns are given in Table 15.

Table 15: Example 2: Most frequent response patterns.

Frequency   Response pattern
   149      33333
    41      23333
    30      44444
    23      22333
    22      33343
    18      33332
    16      33334
    15      23233
    14      22222
    13      32333
    13      22233
    12      33233
    12      33323
    11      44443
    11      34344
    10      23343
    10      32233
     9      34444
     9      22323
     9      33443
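A table of the most frequent response patterns, like Table 15, can be produced by counting how often each combination of item codes occurs. The sketch below illustrates this with a handful of hypothetical respondents; the function name and the data are illustrative assumptions, not the BSA data.

```python
from collections import Counter


def pattern_frequencies(responses):
    """Count response patterns, most frequent first.

    Each response is a sequence of item codes (here 1-4); the pattern
    label is the codes written side by side, e.g. (3, 3, 3, 3, 3) -> "33333".
    """
    patterns = ("".join(str(code) for code in row) for row in responses)
    return Counter(patterns).most_common()


# Hypothetical responses to five items coded 1 (LI) to 4 (VG).
responses = [
    (3, 3, 3, 3, 3),
    (2, 3, 3, 3, 3),
    (3, 3, 3, 3, 3),
    (4, 4, 4, 4, 4),
    (3, 3, 3, 3, 3),
    (2, 3, 3, 3, 3),
]

table = pattern_frequencies(responses)
n_distinct = len(table)  # number of different response patterns observed
```

The number of distinct patterns, reported as 205 for the real data, is simply the length of this table.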

The percentages for each category for each item are shown in Table 16. We see that the majority of the responses fall in the two middle categories.

Table 16: Example 2: Frequency distribution for the observed ordinal items

               LI     SI     S      VG
Appointment    11.4   29.4   47.2   12.0
AmountTime      6.5   22.8   57.9   12.7
ChooseGP        6.7   20.9   58.3   14.1
Quality         3.8   19.0   53.9   23.3
WaitingArea     3.6   16.1   63.3   17.1

The one-factor IRT logit model was fitted first to the five ordinal indicators. Judging by the chi-square residuals, the fit on the two-way margins was satisfactory: there were pairs of categories that gave values greater than four, but the total GF across categories was not greater than 64 for any pair of items. The LISREL model gave more or less the same good fit except for two GF contributions. The GF statistic of the LISREL model is almost double that of the IRT model. In our example the latent variable can be taken to measure overall satisfaction with GPs. We would like to use the covariate political identification as an extra variable that accounts for the relationships among the indicators. Political identification is measured as an observed covariate with four categories: conservative, labour, liberal democrat (called liberal for short) and other. We also want to measure the effect of gender and age on the latent variable satisfaction. Age is given in four categories: 18-25, 26-44, 45-64, 65+. In theory the covariate political identification should be a continuous variable, but since it is categorical in the data it will be used as a set of three dichotomous dummy variables, one for labour, one for liberal, and one for other. The category conservative is not used, so it serves as the reference category. Similarly, since age is coded as an ordered categorical variable, it will be used as a set of three dichotomous dummy variables, one for age 26-44, one for age 45-64, and one for age 65+. The age group 18-25 is not used. Thus, in the model we estimate one latent variable Satisfaction, three x-variables Labour, Liberal, and Other, and four w-variables Female, SecondAgeGroup, ThirdAgeGroup, FourthAgeGroup. The details of the PLA approach are given in Appendix 2. Table 17 gives the estimated standardized factor loadings and regression coefficients for the IRT model and for the LISREL model.

Table 17: Standardized factor loadings and regression parameters

               Factor loadings      Regression coefficients
                                    Labour        Liberal       Other
Item           IRT α̂_i1  PLA λ̂_i1   IRT    PLA    IRT    PLA    IRT    PLA
Appointment    .86       .71        .39    .34    .31    .27    .20    .18
AmountTime     .93       .85        .38    .36    .17    .14    .19    .15
ChooseGP       .91       .77        .20    .17   -.11   -.09    .07    .07
Quality        .89       .78        .34    .30    .22    .20    .39    .37
WaitingArea    .80       .61        .28    .23    .01    .03    .35    .25
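The dummy coding of political identification and age described before Table 17 can be sketched as follows. This is only an illustration: the function names are hypothetical, the category labels are written as in the text, and coding Female as 1 for women is an assumption suggested by the variable name rather than something the paper states.

```python
# Category order matters: the first level is the one that is "not used",
# i.e. the reference category with no dummy variable.
PARTY_LEVELS = ["conservative", "labour", "liberal", "other"]
AGE_LEVELS = ["18-25", "26-44", "45-64", "65+"]


def dummy_code(value, levels):
    """Return 0/1 indicators for every level except the first one."""
    if value not in levels:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == level else 0 for level in levels[1:]]


def covariate_row(party, female, age):
    """Build the x-variables and w-variables for one respondent.

    x = [Labour, Liberal, Other]
    w = [Female, SecondAgeGroup, ThirdAgeGroup, FourthAgeGroup]
    """
    x = dummy_code(party, PARTY_LEVELS)
    w = [1 if female else 0] + dummy_code(age, AGE_LEVELS)
    return x, w


x, w = covariate_row("liberal", True, "45-64")
x_ref, w_ref = covariate_row("conservative", False, "18-25")
```

A respondent in both reference categories gets zeros on all dummy variables, so the estimated coefficients are contrasts with conservative respondents and with the 18-25 age group.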

Table 18 gives the estimated structural parameters for the IRT model and the LISREL model.

Table 18: Structural parameters

          IRT γ̂_i       PLA d̂_i
Female    -.06 (.04)    -.07 (.07)
26-44      .19 (.10)     .18 (.12)
45-64      .49 (.11)     .49 (.13)
65+        .70 (.11)     .70 (.14)

Testing the fit of the model is very difficult when there are many covariates, because the fit has to be checked for combinations of the values of the covariates, and a large sample is needed to do that.

CONCLUSIONS

We have considered a general type of model for analyzing ordinal variables with covariate effects and two approaches for analyzing data for such models, the item response theory (IRT) approach and the PRELIS-LISREL (PLA) approach. We have compared these two approaches on the basis of two examples, one involving only covariate effects directly on the ordinal variables, and one involving covariate effects on the latent variables in addition. On the basis of these two examples, we find that parameter estimates are often close but the IRT models fit the data better, often much better. We also find that the models with covariates fit better than the models without covariates, although not much better.

Both approaches have their advantages and disadvantages. It was expected that the IRT method would give a better fit, since it uses the whole response pattern and no information is lost, whereas LISREL uses only the univariate and the bivariate margins. LISREL requires a large sample for the estimation of the asymptotic covariance matrix, and the effects of violations of bivariate normality on the estimated parameters are not known. On the other hand, IRT models have been developed recently and there is no flexible software available for fitting them. If one wants to fit a model with many factors, one will probably have to use LISREL, Mplus or EQS. LISREL is very easy to use and gives the user considerable flexibility: it allows the user to make the latent variables dependent, to fix the dependence among them or to fix any other parameter in the model. The computational burden of the IRT models increases rapidly as the number of factors increases, whereas this is not the case with LISREL. One way to reduce the computational burden in IRT models is to decrease the number of quadrature points, but in that case the estimates may not be precise. LISREL provides many goodness-of-fit measures and model selection criteria, but not many are available for IRT models, particularly not for models with covariate effects.
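The remark on computational burden can be illustrated with a small calculation. The sketch below assumes that the integral over the latent variables is approximated on a full product grid with the same number of quadrature points in every dimension; the paper does not state which quadrature scheme is used, so this is an assumption, and the function name is hypothetical.

```python
def grid_size(points_per_factor, n_factors):
    """Number of nodes in a full product quadrature grid.

    Each likelihood evaluation visits every node for every response
    pattern, so this count drives the cost of one iteration.
    """
    if points_per_factor < 1 or n_factors < 1:
        raise ValueError("both arguments must be positive integers")
    return points_per_factor ** n_factors


# Growth in the number of nodes for 8 points per factor (hypothetical choice).
sizes = {k: grid_size(8, k) for k in (1, 2, 3, 4)}

# Halving the points per factor shrinks the grid sharply,
# at the price of a coarser approximation of the integral.
reduced = grid_size(4, 3)
```

Under this assumption the cost grows geometrically in the number of factors, which is why reducing the number of points per factor is tempting even though it may make the estimates less precise.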

Appendix 1

This Appendix gives the PRELIS and LISREL syntax files used in the analysis of Example 1. The data file is GVRESP.DAT. This is a text (ASCII) file where the six data values are given on one line per person and with spaces between the numbers. There are no missing values. The covariate PolIden is the last variable. This will not be used in the first part of the analysis. The following PRELIS syntax file is used to compute the polychoric correlations of the five ordinal variables and their asymptotic covariance matrix. These are saved in the files GVRESP.PM and GVRESP.ACP, respectively.

    Computing Polychoric Correlations and Asymptotic Covariance Matrix
    Data Ninputvars = 5
    Labels
    JobEvery PriCon LivUnem IncDiff Housing
    Rawdata = GVRESP.DAT
    Output MA=PM PM=GVRESP.PM AC=GVRESP.ACP

This gives the results reported in Table 1. The polychoric correlations are given in Table 19. To estimate the one-factor model with WLS we use the following SIMPLIS syntax file:

Table 19: Matrix of Polychoric Correlations

            JobEvery   PriCon   LivUnem   IncDiff   Housing
JobEvery    1.000
PriCon      0.558      1.000
LivUnem     0.505      0.314    1.000
IncDiff     0.573      0.451    0.552     1.000
Housing     0.462      0.328    0.712     0.566     1.000

    Estimating the One-Factor Model
    Observed Variables: JobEvery PriCon LivUnem IncDiff Housing
    Correlation Matrix from File GVRESP.PM
    Asymptotic Covariance Matrix from File GVRESP.ACP
    Sample Size: 822
    Latent Variables: Gvresp
    Relationships:
    JobEvery - Housing = Gvresp
    Method of Estimation: Weighted Least Squares
    Options: LX=LOADINGS
    End of Problem

This run gives the results reported in the last two columns of Table 3. To use MLR instead of WLS just omit the line

    Method of Estimation: Weighted Least Squares

The line

    Options: LX=LOADINGS

is used to save the factor loadings with six decimals in the file LOADINGS. These factor loadings are needed to compute the bivariate GF fits reported in Tables 5 and 7.

For the analysis of the five ordinal items with PolIden as covariate, we first compute the covariance matrix of the underlying variables and the covariate. This is done with the following PRELIS syntax file.

    Computing the Unconditional Covariance Matrix
    Data Ninputvars = 6
    Labels
    JobEvery PriCon LivUnem IncDiff Housing PolIden
    Rawdata=GVRESP.DAT
    Fixedvariable: PolIden
    Output MA=CM CM=GVRESP.CM AC=GVRESP.ACC

This will save the unconditional covariance matrix in the file GVRESP.CM and the corresponding asymptotic covariance matrix in the file GVRESP.ACC. As a byproduct this run will give the conditional correlation matrix given in Table 20. The unconditional covariance matrix is given in Table 21. Although we have taken the covariate into account, we see that all correlations remain highly significant. As we see from the conditional correlation matrix (see Table 20), the covariate alone does not account for the correlations of the variables underlying the ordinal indicators. Introducing a latent variable along with the covariate will probably account better for the correlations among the ordinal indicators.

Table 20: Conditional Covariance Matrix

            JobEvery         PriCon           LivUnem          IncDiff          Housing
JobEvery    1.000
PriCon      0.434 (0.037)    1.000
LivUnem     0.375 (0.037)    0.169 (0.043)    1.000
IncDiff     0.376 (0.038)    0.278 (0.041)    0.428 (0.036)    1.000
Housing     0.314 (0.040)    0.184 (0.044)    0.657 (0.028)    0.441 (0.037)    1.000
Note. Standard errors are given in brackets.

The joint unconditional covariance matrix that is going to be used for estimating factor loadings and regression coefficients under the second method is given in Table 21.

Table 21: Unconditional covariance matrix

            JobEvery   PriCon   LivUnem   IncDiff   Housing   PolIden
JobEvery    1.388
PriCon      0.714      1.202
LivUnem     0.645      0.364    1.188
IncDiff     0.818      0.598    0.736     1.504
Housing     0.593      0.386    0.851     0.760     1.201
PolIden     0.623      0.450    0.434     0.710     0.449     1.000
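The link between the unconditional covariance matrix in Table 21 and the conditional matrix in Table 20 can be checked numerically. The sketch below applies the standard formula for the covariance of a set of variables given one other variable, cov(i, j) - cov(i, x) cov(x, j) / var(x). That PRELIS arrives at Table 20 in exactly this way is an assumption; the function names are hypothetical, and the numbers are copied from Table 21.

```python
NAMES = ["JobEvery", "PriCon", "LivUnem", "IncDiff", "Housing", "PolIden"]

# Lower triangle of Table 21, one row per variable in NAMES order.
LOWER = [
    [1.388],
    [0.714, 1.202],
    [0.645, 0.364, 1.188],
    [0.818, 0.598, 0.736, 1.504],
    [0.593, 0.386, 0.851, 0.760, 1.201],
    [0.623, 0.450, 0.434, 0.710, 0.449, 1.000],
]


def full_symmetric(lower):
    """Expand a lower-triangular list of rows into a full symmetric matrix."""
    n = len(lower)
    return [[lower[max(i, j)][min(i, j)] for j in range(n)] for i in range(n)]


def conditional_covariance(cov, k):
    """Covariance matrix of all variables except k, given variable k."""
    keep = [i for i in range(len(cov)) if i != k]
    return [
        [cov[i][j] - cov[i][k] * cov[k][j] / cov[k][k] for j in keep]
        for i in keep
    ]


sigma = full_symmetric(LOWER)
cond = conditional_covariance(sigma, NAMES.index("PolIden"))

# Example: JobEvery and PriCon given PolIden,
# 0.714 - 0.623 * 0.450 / 1.000, which rounds to the 0.434 in Table 20.
job_pricon = cond[1][0]
```

With these inputs every conditional variance comes out within rounding error of 1 and the off-diagonal entries agree with Table 20 to within about 0.001, which is why the conditional covariance matrix can also be read as a correlation matrix.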

We see from Table 21 that the covariate is more related to items JobEvery and IncDiff than the other items. To estimate the LISREL model with the covariate we use the following LISREL syntax file.

    Estimating Standardized Solution
    da ni=6 no=822
    cm=gvresp.cm
    ac=gvresp.acc
    mo ny=6 ne=6 nk=2 ly=di,fr ph=id ps=di te=ze
    fi ly(6) ga(6,1) ga(6,2) ps(6)
    va 1 ly(6) ga(6,2)
    co ps(1)=1-ga(1,1)**2-ga(1,2)**2
    co ps(2)=1-ga(2,1)**2-ga(2,2)**2
    co ps(3)=1-ga(3,1)**2-ga(3,2)**2
    co ps(4)=1-ga(4,1)**2-ga(4,2)**2
    co ps(5)=1-ga(5,1)**2-ga(5,2)**2
    ou me=wls

This gives the WLS estimates and standard errors given in Table 8. In the output, factor loadings are given in the first column and the regression coefficients are given in the second column of the Gamma matrix. To obtain MLR estimates replace wls by ml on the ou line.
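The co lines in the syntax file above tie each residual variance ps(i) to the two Gamma coefficients of the same row. A brief sketch of what this constraint achieves, assuming (as ph=id on the mo line specifies) that the two Ksi-variables are uncorrelated with unit variance: the variance of the i-th Eta-variable is then ga(i,1)**2 + ga(i,2)**2 + ps(i), and the constraint forces it to equal 1. The numerical Gamma values below are hypothetical, not estimates from the paper.

```python
def constrained_psi(gamma_row):
    """Residual variance implied by  co ps(i) = 1 - ga(i,1)**2 - ga(i,2)**2."""
    return 1.0 - sum(g ** 2 for g in gamma_row)


def implied_variance(gamma_row, psi):
    """Variance of an Eta-variable when the Ksi-variables are
    uncorrelated with unit variance (ph=id)."""
    return sum(g ** 2 for g in gamma_row) + psi


# Hypothetical row of Gamma: factor loading first, regression coefficient second.
gamma_row = [0.6, 0.3]

psi = constrained_psi(gamma_row)
variance = implied_variance(gamma_row, psi)
```

Because each Eta-variable is held at unit variance in this way, the entries of Gamma can be read as standardized loadings and regression coefficients, while the free diagonal of ly absorbs the scale of the observed covariance matrix.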

Appendix 2

This Appendix gives the PRELIS and LISREL syntax files used in the analysis of Example 2. The data file is GPDATA.RAW. This is a text (ASCII) file where the 12 data values are given on one line per person and with spaces between the numbers. There are no missing values. The following PRELIS syntax file is used to compute the covariance matrix for all 12 variables and the corresponding asymptotic covariance matrix. These are saved in the files GP.CM and GP.ACC, respectively. These computations are done as explained briefly in the underlying variable approach section. The PRELIS syntax file is (long variable names can be used but PRELIS retains only the first eight characters):

    Computing Covariance Matrix
    Data Ninputvars = 12
    Labels
    Appointment AmountTime ChooseGP Quality WaitingArea
    Labour Liberal Other Female SecondAgeGroup ThirdAgeGroup FourthAgeGroup
    Rawdata = GPDATA.RAW
    Clabels Appointment-WaitingArea 1=LI 2=SI 3=S 4=VG
    Covariates: Labour - FourthAgeGroup
    Output MA=CM CM=GP.CM AC=GP.ACC WP

The LISREL model is fitted to the covariance matrix using the following SIMPLIS command file:

    MIMIC Model with Direct Effects
    Observed Variables: Appointment AmountTime ChooseGP Quality WaitingArea
    Labour Liberal Other Female SecondAgeGroup ThirdAgeGroup FourthAgeGroup
    Covariance Matrix from File GP.CM
    Asymptotic Covariance Matrix from File GP.ACC
    Sample Size = 841
    Latent Variables Satisfaction LABOUR LIBERAL OTHER
    Relationships
    Appointment - WaitingArea = Satisfaction LABOUR LIBERAL OTHER
    Satisfaction = Female SecondAgeGroup ThirdAgeGroup FourthAgeGroup
    LABOUR = 1*Labour
    LIBERAL = 1*Liberal
    OTHER = 1*Other
    Set the Error Variance of LABOUR - OTHER to 0
    Set the Error Variance of Satisfaction to 1
    Path Diagram
    LISREL OUTPUT: AD=OFF SO
    End of Problem

A few explanations may be needed.


• The three latent variables LABOUR, LIBERAL, and OTHER are defined to be equal to the corresponding observed variables Labour, Liberal, and Other, respectively. This is achieved by the four lines
    LABOUR = 1*Labour
    LIBERAL = 1*Liberal
    OTHER = 1*Other
    Set the Error Variance of LABOUR - OTHER to 0
  This trick is necessary because in a LISREL model there cannot be a path from an x-variable directly to a y-variable if there are Eta-variables in the model.
• The line
    Set the Error Variance of Satisfaction to 1
  corresponds to the assumption that the variance of δ in (11) is 1. This fixes the scale for Satisfaction in such a way that the LISREL solution is comparable to the IRT solution.
• The AD=OFF on the LISREL OUTPUT line is needed because the three latent variables LABOUR, LIBERAL, and OTHER have zero error variances. Hence, the matrix Ψ in LISREL is singular.
• The SO on the LISREL OUTPUT line is needed because the scales for the latent variables are not specified by fixed values in each column of Λy. SO tells LISREL to skip this check.

The SIMPLIS input gives maximum likelihood estimates with standard errors estimated under non-normality (RML). To obtain weighted least squares (WLS) estimates, just put WLS on the LISREL OUTPUT line. The ML estimates are given in Tables 17 and 18.

References

Bentler, P. M. (1992). EQS: Structural Equation Program Manual. BMDP Statistical Software.
Croon, M., & Bolck, A. (1997). On the use of factor scores in structural equations models (Tech. Rep. No. 97.10.102/7). Work and Organization Research Centre, Tilburg University. (WORC Paper)
Glas, C. (2001). Differential item functioning depending on general covariates. In A. Boomsma, M. van Duijn, & T. Snijders (Eds.), Essays on item response theory (pp. 131–145). New York: Springer-Verlag.
Jöreskog, K. G. (2002). Structural equation modeling with ordinal variables using LISREL (Available at http://www.ssicentral.com/lisrel/ordinal.htm).
Jöreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70, 631–639.
Jöreskog, K. G., & Moustaki, I. (2001). Factor analysis of ordinal variables: a comparison of three approaches. Multivariate Behavioral Research, 36, 347–387.
Jöreskog, K. G., & Sörbom, D. (1999). LISREL 8 user’s reference guide. Chicago: Scientific Software International.
Jöreskog, K. G., Sörbom, D., Du Toit, S., & Du Toit, M. (2001). LISREL 8: New statistical features. Chicago: Scientific Software International.
Moustaki, I. (2000). A review of exploratory factor analysis for ordinal categorical data. In R. Cudeck, S. Du Toit, & D. Sörbom (Eds.), Structural equation modeling: present and future. Scientific Software International.
Moustaki, I. (2003). A general class of latent variable models for ordinal manifest variables with covariate effects on the manifest and latent variables. British Journal of Mathematical and Statistical Psychology, 56, 337–357.
Moustaki, I., & Knott, M. (2000). Generalized latent trait models. Psychometrika, 65, 391–411.
Muthén, B. O. (1989). Latent variable modeling in heterogeneous populations. Psychometrika, 54(4), 557–585.
Muthén, B. O., & Muthén, L. (2000). Mplus: The comprehensive modeling program for applied researchers. 11965 Venice Boulevard, Suite 407, Los Angeles, CA 90066.
Sammel, M. D., Ryan, L. M., & Legler, J. M. (1997). Latent variable models for mixed discrete and continuous outcomes. Journal of the Royal Statistical Society, B, 59, 667–678.
Verhelst, N., Glas, C., & Verstralen, H. (1994). OPLM: Computer program and manual. Arnhem: CITO.
Zwinderman, A. (1997). Response models with manifest predictors. In W. van der Linden & R. Hambleton (Eds.), Handbook of item response theory. Springer.
