
A Joint Use of PLS Regression and PLS Path Modelling for a Data Analysis Approach to Latent Variable Modelling


Vincenzo Esposito Vinzi; Giorgio Russolillo; Laura Trinchera

ESSEC Business School of Paris, Avenue Bernard Hirsch - B.P. 50105, Cergy-Pontoise - 95021, France.
Università degli Studi di Napoli "Federico II" - Dipartimento di Matematica e Statistica, Via Cintia - Complesso Monte S. Angelo, Napoli - 80126, Italy.
Università degli Studi di Macerata - Dipartimento di Studi sullo Sviluppo Economico, Piazza Oberdan 3, Macerata - 62100, Italy.

E-mail: [email protected]; [email protected]; [email protected]

Introduction

Structural Equation Models (SEMs) [Bollen, 1989; Kaplan, 2000] are widely used to model complex causal relations such as those defining human behaviors. SEMs are indeed very suitable for describing and estimating conceptual structures in which latent variables (LVs), linked by linear relations, are measured by sets of manifest variables (MVs). A double level of relations characterizes each SEM: the first involves relations among the LVs (structural model), while the other concerns the links between each LV and its own block of MVs (measurement model). Several techniques exist to estimate SEM parameters. Among them, the PLS Path Modeling (PLS-PM) algorithm [Wold, 1975; Wold, 1982; Tenenhaus et al., 2005] is the most widely used, thanks to its features: PLS-PM does not require distributional hypotheses and is more prediction-oriented than classical covariance-based methods such as LISREL.
Moreover, according to Tenenhaus [2008], it provides systematic convergence of the algorithm due to its simplicity; it allows data to be handled with a small number of individuals and a large number of variables; it provides a practical interpretation of the LV estimates; and it represents a general framework for multi-block analysis. This is of major importance in several application domains, especially when modeling human behaviors. Regression is at the heart of the PLS-PM algorithm, and PLS Regression [Wold et al., 1983; Tenenhaus, 1998] can itself be used to estimate the measurement model parameters, as discussed below. However, PLS-PM only accounts for linear relations in both the measurement and the structural model. In the social sciences, in particular, it is important to be able to model non-linear relations as well. Here, a modified PLS Path Modeling algorithm able to model non-linear relations is proposed. In the next section we review the standard PLS Path Modeling algorithm and the use of PLS Regression to estimate measurement model parameters. Then, non-linear approaches to PLS Regression are presented.

PLS Regression in PLS Path Modeling

The PLS approach to SEMs (PLS Path Modeling, PLS-PM) [Wold, 1975; Wold, 1982; Tenenhaus et al., 2005] uses an iterative algorithm to obtain LV estimates through a system of multiple and simple linear regressions. The iterative algorithm alternates inner and outer estimation of the LVs. In particular, in the outer estimation step each LV is obtained as a weighted linear combination of its own MVs, while in the inner estimation step each LV is obtained as a weighted linear combination of the connected LVs (according to the path diagram structure).
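
To fix ideas, here is a minimal sketch of this alternating scheme, assuming standardized MVs, reflective outer weights and the centroid inner scheme; it is only an illustration under these assumptions, not a reference implementation, and the names pls_pm_weights and standardize are ours.

```python
import numpy as np

def standardize(v):
    return (v - v.mean()) / v.std()

def pls_pm_weights(blocks, adjacency, max_iter=100, tol=1e-6):
    """Sketch of the alternating PLS-PM estimation.

    blocks    : list of (n x p_j) arrays of standardized MVs, one per LV
    adjacency : (J x J) symmetric 0/1 matrix describing the path diagram
    Returns one vector of outer weights per block.
    """
    J = len(blocks)
    weights = [np.ones(X.shape[1]) for X in blocks]        # starting outer weights
    for _ in range(max_iter):
        # outer estimation: each LV as a weighted combination of its own MVs
        scores = np.column_stack([standardize(X @ w) for X, w in zip(blocks, weights)])
        # inner estimation (centroid scheme): signs of the correlations between connected LVs
        inner_w = adjacency * np.sign(np.corrcoef(scores, rowvar=False))
        inner = [standardize(scores @ inner_w[:, j]) for j in range(J)]
        # outer weight update (reflective case): simple regression of each MV
        # on the inner estimate of its own LV
        new_weights = [X.T @ z / len(z) for X, z in zip(blocks, inner)]
        if all(np.allclose(w, nw, atol=tol) for w, nw in zip(weights, new_weights)):
            return new_weights
        weights = new_weights
    return weights
```

Upon convergence, the standardized outer scores are taken as the LV estimates used in the structural model.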


Different ways exist to compute the outer and inner weights used in the two steps of the iterative algorithm. The inner weights, i.e. the weights associated with each LV in the inner estimation step, can be computed according to three different schemes: the centroid scheme, the factorial scheme and the path weighting scheme. In all of these schemes, the inner weights are a function of the linear correlation between the LVs. The computation of the outer weights, instead, depends on the type of relation between a block of MVs and the underlying LV. As a matter of fact, in SEMs each block of MVs in the model can be assumed to be of reflective or formative type. In a reflective block, the manifest variables are assumed to be the reflection in the real world of a latent concept; in other words, the LV is considered as a predictor of the MVs. As a consequence, the generic outer weight used in the outer estimate of the LV is the regression coefficient of the simple linear regression of each MV on the inner estimate of the corresponding LV. In a formative scheme, instead, each LV is formed by its own MVs, which measure different aspects of the same latent concept; in other words, the LV is a function of its own indicators. In this case, a multiple linear regression model defines the relation between the latent and manifest variables, and the outer weights are the regression coefficients of a multiple regression of the inner estimate of each LV on its own MVs. Formative blocks often violate the independence hypothesis of the classical multiple regression model, because they are affected by multicollinearity. In this case the variance of the OLS estimators sharply increases, so that the regression coefficients turn out to be non-significant. Moreover, multicollinearity may lead to weights that are difficult to interpret, because of differences in sign between the regression coefficient of a MV and its correlation with the LV. In particular, non-interpretable weights may occur if the manifest predictors are highly correlated and:


1. there exists a weak correlation between the manifest predictors and the external criterion of the regression (i.e. the LV);

2. the LV is strongly correlated with all the predictors; in this case removing a MV does not change the multiple determination index (redundancy effect);

3. there is some suppressor variable that is highly correlated with the other MVs but very poorly correlated with the LV: if we remove it, the multiple determination index significantly decreases.
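
To make the two outer weighting rules concrete, here is a minimal sketch under the assumption of standardized data; the names reflective_weights and formative_weights are ours and the code is illustrative only.

```python
import numpy as np

def reflective_weights(X, z):
    """Reflective block: the outer weight of each MV is the coefficient of the
    simple regression of that MV on the (standardized) inner estimate z of the LV."""
    return X.T @ z / len(z)                      # one simple regression per MV

def formative_weights(X, z):
    """Formative block: the outer weights are the coefficients of the multiple OLS
    regression of the inner estimate z on the whole block of MVs; this is the step
    that becomes unstable under multicollinearity."""
    return np.linalg.lstsq(X, z, rcond=None)[0]
```

The proposal recalled below replaces this multiple OLS regression with a PLS Regression of the inner estimate on the MVs.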


Starting from these considerations, a new way to compute the outer weights in the case of a formative block has recently been proposed by Esposito Vinzi et al. [2009]. This approach involves using PLS Regression (PLS-R) [Wold et al., 1983; Tenenhaus, 1998] in order to compute significant outer weights. In particular, Esposito Vinzi et al. [2009] propose to calculate, at each iteration, the outer weights as the coefficients of a PLS Regression of the LV inner estimate on the MVs. The PLS Regression (PLS-R) method has been extensively described in the literature [Wold et al., 1983; Tenenhaus, 1998]. PLS-R is a linear regression technique that relates a set of predictor variables to one (PLS1) or several (PLS2) response variables. PLS-R shrinks the predictor matrix by sequentially extracting orthogonal components which, at the same time, summarise the explanatory variables and allow modelling and predicting the response variables. Finally, it provides a classical regression equation, in which the response is estimated as a linear combination of the predictor variables. De Jong [1995] showed that the sequence of PLS coefficient estimators forms a sequence of vectors whose length strictly increases with the number of components, the upper bound being the length of the least squares estimator. Thanks to its usefulness in reducing the variance of the estimators, PLS-R has become a standard tool for dealing with multicollinearity problems such as the ones previously described.
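
As a minimal sketch of the mechanics just described (a generic, textbook-style PLS1 with a single response; the name pls1 is ours and this is not the authors' implementation):

```python
import numpy as np

def pls1(X, y, n_components):
    """PLS1 regression sketch: extract orthogonal components that summarise X while
    being predictive of y, then express y as a linear combination of the predictors."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    Xh, yh = X.copy(), y.copy()
    W, P, Q = [], [], []
    for _ in range(n_components):
        w = Xh.T @ yh
        w /= np.linalg.norm(w)                 # weight vector: direction of maximal covariance with y
        t = Xh @ w                             # component scores
        p = Xh.T @ t / (t @ t)                 # X-loadings
        q = (yh @ t) / (t @ t)                 # y-loading
        Xh = Xh - np.outer(t, p)               # deflation keeps the components orthogonal
        yh = yh - q * t
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    # regression coefficients in terms of the original (centered) predictors
    return W @ np.linalg.inv(P.T @ W) @ Q
```

With as many components as the rank of X the coefficients coincide with the least squares solution; with fewer components the coefficient vector is shorter, in line with De Jong [1995], which is what makes PLS-R attractive under multicollinearity.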


PLS-R assumes that there is a common structure underlying the two blocks of variables, and that this structure can be summarised by a few latent components, calculated as linear combinations of the predictor variables. When PLS-R is used as the outer estimation method for the LVs, it extracts a component for each sub-block of variables expressing a different concept with regard to the LV. This feature is very useful when one or both of the following situations occur:


1. we handle blocks composed of several groups of variables, each of them measuring a different concept;

2. some of these concepts are not related to the latent variable we are searching for.

Let us suppose that a block of MVs is composed of three different groups of variables, each highly correlated within its group; only two of these groups are correlated with the LV, while the last one consists of one or several suppressor variables. In this case PLS-R would provide no more than two components. As a consequence, the MVs belonging to the group not correlated with the LV will obtain very low (and non-significant) regression coefficients. On the contrary, an ordinary least squares regression would provide unreliable coefficients in terms of sign and intensity. From this point of view, PLS-R can be used as a powerful diagnostic tool, able to detect those latent dimensions underlying a block that are useful for explaining the LV. The usual diagnostic tools for assessing the unidimensionality or homogeneity of a block (Cronbach's α, Dillon-Goldstein's ρ, eigenvalue-one criterion), in fact, evaluate unidimensionality per se, and not with respect to the LV. However, a block of variables could be summarized by a few latent dimensions, not all of them necessarily related to the LV. If just one of these dimensions were related to the LV, the block should be re-thought as reflective.
It is worth noticing that in this case the PLS-R weights are the same as those we would have found had we modeled the block as reflective. In the opposite case, where all the MVs are related to the LV but are not correlated among themselves, PLS-R yields the same weights as an OLS multiple regression, which confirms that modeling the block as formative was appropriate. Finally, the PLS mode seems to be the only way to model such spurious situations. PLS Regression assumes linearity in the relations between variables. Following the work by Esposito Vinzi et al. [2009], we decided to use a non-linear approach to PLS-R in order to estimate the measurement model parameters in a non-linear PLS-PM approach to SEMs. In the next section the use of non-linear PLS-R in PLS-PM is investigated.
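
A small simulated illustration of this diagnostic use of PLS-R (the data, the group structure and every name below are hypothetical): a block made of three internally correlated groups of MVs, only two of which are related to the LV. Per the discussion above, the coefficients of the third group are expected to stay close to zero.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n = 300
# three groups of MVs, each highly correlated within its own group
g1 = rng.normal(size=(n, 1)) + 0.2 * rng.normal(size=(n, 3))    # related to the LV
g2 = rng.normal(size=(n, 1)) + 0.2 * rng.normal(size=(n, 3))    # related to the LV
g3 = rng.normal(size=(n, 1)) + 0.2 * rng.normal(size=(n, 3))    # unrelated to the LV
X = np.hstack([g1, g2, g3])
lv = g1.mean(axis=1) + g2.mean(axis=1) + 0.1 * rng.normal(size=n)   # proxy for the LV inner estimate

pls = PLSRegression(n_components=2).fit(X, lv)     # two components, as in the scenario above
coef = np.ravel(pls.coef_)
for name, cols in [("group 1", slice(0, 3)), ("group 2", slice(3, 6)), ("group 3", slice(6, 9))]:
    print(name, np.round(coef[cols], 3))           # inspect the coefficients group by group
```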

Non-linear PLS Regression in PLS Path Modeling

Using non-linear PLS-R can be very useful for estimating the measurement model parameters in a non-linear PLS Path Modeling with formative indicators. In the following, non-linear approaches to PLS Regression are investigated and the new proposal is explained. Several approaches have been proposed to provide non-linear models that retain the properties of a linear PLS Regression; at least three different approaches to non-linear modeling can be distinguished. The first one is based on a non-linear relation linking the predictor PLS components to the response PLS components. In other words, the first approach is characterised by the inner relation


\hat{u}_h = f(t_h) \qquad (1)

Various forms have been proposed for f(t_h), such as a quadratic form [Wold et al., 1989; Baffi et al., 1999a], a smoothing procedure [Frank, 1990], a spline function [Wold, 1992] and a neural network [Qin et al., 1992; Baffi et al., 1999b].
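
As an illustration of this first family of methods, a quadratic inner relation of the kind used in quadratic PLS [Wold et al., 1989; Baffi et al., 1999a] can be sketched as follows; only the fitting of the inner relation for one component is shown, the updating of the weight vectors is omitted, and the function names are ours.

```python
import numpy as np

def quadratic_inner_relation(t, u):
    """Fit the inner relation u_hat = c0 + c1*t + c2*t**2 between the X-component
    scores t and the Y-component scores u of one PLS dimension (ordinary least squares)."""
    T = np.column_stack([np.ones_like(t), t, t ** 2])
    c, *_ = np.linalg.lstsq(T, u, rcond=None)
    return c                                        # (c0, c1, c2)

def predict_u(t, c):
    """Y-component scores predicted by the quadratic inner relation."""
    return c[0] + c[1] * t + c[2] * t ** 2
```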


The second approach, instead, is based on the maximisation of different criteria while maintaining the properties of a PLS Regression [Durand et al., 1997; Taavitsainen et al., 1992; Haario et al., 1994]. The last way to cope with non-linearity is based on the transformation of the explanatory variables [Durand et al., 1997; Berglund et al., 1997, 2001; Durand 1999, 2001]. All methods following the latter approach rely on an a priori transformation function of the predictors. According to Durand's proposal [1997, 2001], predictors are transformed by spline functions, while Berglund's [2001] proposal involves the transformation of quantitative variables into a set of dichotomous variables, similarly to the coding of qualitative variables. In both these approaches the transformed variables are used as predictors in a linear PLS Regression. All these techniques strongly increase the number of predictors, which makes their application difficult when the number of original predictors is high. In order to overcome this problem, we propose to use another approach for handling non-linearity in PLS Regression: the PLS algorithm for CAtegorical Predictors (PLS-CAP) [Russolillo, 2009]. PLS-CAP was originally designed to handle categorical predictors in a PLS regression. It involves an optimal quantification of the predictors, searching for scaling parameters that maximize the covariance between the first PLS component and the response variable. PLS-CAP assigns a numerical value to each class of the suitably discretized variable. This implies that the number of predictors does not change when the variables are transformed. Another advantage of this technique is that the transformation of the variables is internal to the algorithm and not a priori. As a consequence, the transformations are more prediction-oriented and coherent with the model.

Starting from these considerations, we propose to use this non-linear PLS Regression instead of linear PLS Regression to compute the outer weights of formative blocks, so as to analyse non-linear relations between a LV and its formative MVs. In particular, we propose to perform, at each outer estimation step, a PLS-CAP Regression between the MVs (as predictors) and the LV (as response variable). This approach is compared to linear PLS-PM and to the approach by Berglund et al. [2001] as estimation methods for formative measurement models.
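
For context, the Berglund et al. [2001] baseline mentioned above can be sketched as a pre-processing step followed by an ordinary linear PLS-R: each quantitative MV is cut into a few classes and dummy-coded. This is only a hedged reading of that proposal (the quantile-based cut points, the number of classes and the helper names are ours); it also makes visible why the number of predictors grows with this family of methods, whereas the PLS-CAP quantification keeps one column per original MV.

```python
import numpy as np

def gifi_dummies(x, n_classes=4):
    """Dummy-code one quantitative variable: cut it into n_classes quantile-based
    classes and return one 0/1 indicator column per class."""
    edges = np.quantile(x, np.linspace(0, 1, n_classes + 1)[1:-1])
    classes = np.digitize(x, edges)                 # class label of each observation
    return (classes[:, None] == np.arange(n_classes)).astype(float)

def expand_block(X, n_classes=4):
    """Apply the coding to every MV of a block: p MVs become p * n_classes dummies,
    to be used as predictors in a linear PLS Regression."""
    return np.hstack([gifi_dummies(X[:, j], n_classes) for j in range(X.shape[1])])
```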


REFERENCES


Baffi G., Martin E.B., Morris A.J. (1999a). Non-linear projection to latent structures revisited: the quadratic PLS algorithm, Computers and Chemical Engineering, 23, pp. 395-411.
Baffi G., Martin E.B., Morris A.J. (1999b). Non-linear projection to latent structures revisited (the neural network PLS algorithm), Computers and Chemical Engineering, 23, pp. 1293-1307.
Berglund A., Wold S. (1997). INLR, Implicit non-linear latent variable regression, Journal of Chemometrics, 11, pp. 141-156.
Berglund A., Kettaneh N., Uppgard L., Wold S. et al. (2001). The GIFI approach to non-linear PLS modelling, Journal of Chemometrics, 15, pp. 321-336.
Bollen K.A. (1989). Structural Equations with Latent Variables, Wiley, New York.
De Jong S. (1995). PLS shrinks, Journal of Chemometrics, 9, pp. 323-326.
Durand J. (2001). Local polynomial additive regression through PLS and splines: PLSS, Chemometrics and Intelligent Laboratory Systems, 58, pp. 235-246.
Durand J., Sabatier R. (1997). Additive splines for partial least squares regression, Journal of the American Statistical Association, 92, pp. 1546-1554.
Esposito Vinzi V., Russolillo G. (2009). Partial least squares path modeling and regression, in: Wiley Interdisciplinary Reviews: Computational Statistics, Wegman E., Said Y. and Scott D., eds., John Wiley and Sons, to appear.
Frank I.E. (1990). A nonlinear PLS model, Chemometrics and Intelligent Laboratory Systems, 8, pp. 109-119.
Haario H., Taavitsainen V.M. (1994). Nonlinear data analysis II. Examples on new link functions and optimisation aspects, Chemometrics and Intelligent Laboratory Systems, 23, pp. 51-64.
Kaplan D. (2000). Structural Equation Modeling: Foundations and Extensions, Sage Publications Inc., Thousand Oaks, California.
Qin S.J., McAvoy T.J. (1992). Non-linear PLS modelling using neural networks, Computers and Chemical Engineering, 16, pp. 379-391.
Russolillo G. (2009). A proposal for handling categorical predictors in PLS regression framework, in: Post-Conference Proceedings of the First Joint Meeting of the Société Francophone de Classification and the Classification and Data Analysis Group of the Italian Statistical Society, Series Studies in Classification, Data Analysis, and Knowledge Organization, June 11-13th 2008, Caserta, Italy, Springer, in press.
Tenenhaus M. (1998). La Régression PLS: théorie et pratique, Technip, Paris.
Tenenhaus M. (2008). Component-based structural equation modelling, Total Quality Management & Business Excellence, 19, pp. 871-886.
Tenenhaus M., Esposito Vinzi V., Chatelin Y.M., Lauro C. (2005). PLS path modeling, Computational Statistics and Data Analysis, 48, pp. 159-205.
Wold H. (1975). Modelling in complex situations with soft information, in: Third World Congress of Econometric Society, Toronto, Canada.
Wold H. (1982). Soft modeling: the basic design and some extensions, in: Systems under Indirect Observation, Part 2, Jöreskog K.G., Wold H. (eds), North-Holland, Amsterdam, pp. 1-54.
Wold S., Martens H., Wold H. (1983). The multivariate calibration method in chemistry solved by the PLS method, in: Proc. Conf. Matrix Pencils, Lecture Notes in Mathematics, Ruhe A. and Kågström B., eds., Springer-Verlag, Heidelberg, pp. 286-293.