Copyright 1991 by the American Psychological Association, Inc. 0033-2909/91/$3.00

Psychological Bulletin 1991, Vol. 109, No. 3, 524-536

Suppression Situations in Psychological Research: Definitions, Implications, and Applications

Joseph Tzelgov and Avishai Henik
Department of Behavioral Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel

In 1941, Horst noticed that a variable can be totally uncorrelated with the criterion and still improve prediction by virtue of being correlated with other predictors. He christened such variables suppressors, a title that implies that such variables suppress criterion-irrelevant variance in other predictors. During the 50 years that have passed since Horst's original analysis, the concept of suppression has been extended and reanalyzed. What follows provides a general approach to the analysis of suppression situations. This approach is based on coupling the three-variate analysis of suppression situations with the application of the concept of suppressor to the general linear model. The implications of the analysis are discussed, and some applications of the concept of suppression are provided.

Suppose you are trying to predict the success of high-school students in college. You may ask the students in high school whom they expect to succeed in college. A sociometric score for each student can then be defined and used as a predictor. It seems much less likely that you would decide to include a sociometric variable based on the question "Who is your best friend?" in your regression equation. At first glance such a decision seems perfectly correct: The "friend" sociometric score may be expected to have, and in most cases will have, almost zero validity. It follows that if validity is the criterion for the inclusion of a variable in a prediction equation, invalid variables such as the friend sociometric score should not be included. However, almost 50 years ago, Horst (1941) noticed that a variable can be totally uncorrelated with the criterion and still improve prediction by virtue of being correlated with other predictors. He christened such variables suppressors, a title that implies that such variables suppress criterion-irrelevant variance in other predictors. The concept of suppressor variables has been extended beyond the case of a variable having null validity by several investigators (e.g., Lubin, 1957). Conger (1974) provided a most extensive definition of a suppressor variable. A slightly different definition has been suggested by Velicer (1978). Tzelgov and Stern (1978) analyzed suppression situations in the three-variate case. Holling (1983) extended the concept of suppression beyond its applications in multiple regression by taking Velicer's (1978) definition as a starting point. Tzelgov and Henik (1985) made a similar extension on the basis of Conger's (1974) definition.

In what follows, we provide an approach to the analysis of suppression situations, which resulted from coupling the three-variate analysis (see Tzelgov & Stern, 1978) and the application of the concept of suppressor to the general linear model originally suggested by Holling (1983). This approach is based on defining suppression situations in terms of regression weights. In the second part of the article, we discuss the implications of our conceptualization of suppression situations, and we compare our definitions to alternative approaches. Real-life applications of the concept of suppression are provided and discussed in the last part of the article.

Toward a General Definition

In this section, we first define suppression situations for the three-variate case and then extend our conceptualization to linear combinations. We then present a mapping of possible relations between two predictors and a criterion. Finally, we discuss the distinction between absolute and relative suppression that is critical when the notion of suppression is extended to more than two predictors.

The Three-Variate Case

The preparation of this article was inspired by an invitation to present our approach to suppression situations at the National Institute for Testing and Evaluation in Jerusalem. Some of the ideas and definitely the motivation for our work in this area stemmed from discussions with Daniel Kahneman when we were his students in Israel. We thank him for that. We also thank David Budesco, Jacob Cohen, Yoav Cohen, Clifford Lunneborg, Emda Orr, and two anonymous reviewers for their very valuable comments and suggestions. Correspondence concerning this article should be addressed to Joseph Tzelgov, Department of Behavioral Sciences, Ben-Gurion University of the Negev, P.O. Box 653, Beer-Sheva, Israel 84105.

We now present a framework for defining and detecting suppressor situations in the special case of two predictors p and s and a criterion variable c. This is done in terms of the intercorrelation between the two predictors, r_ps, and the ratio of their validities. For the sake of simplicity, in most of our discussion we consider correlations and regression weights as parameters and thus ignore the problem of sampling. We discuss the problem of statistics and parameter estimation in the last part of the article. Let us assume that all variables under consideration are standardized to have a mean of 0 and a standard deviation


of 1. In addition, let us assume that p is scored in a direction that invokes a positive correlation r_cp. This framework was originally proposed by Tzelgov and Stern (1978), and we mention it to introduce the multivariate case. Conger (1974) defined a suppressor variable as one that increases the validity of another variable by its inclusion in a regression equation. Following this definition, s will be considered a suppressor if the following condition holds:

β_p > r_cp.     (1a)

In intuitive terms, the inclusion of s suppresses the criterion-irrelevant part of p's variance, and as a result the weight of p in predicting the criterion is increased. Given the definition of β_p, Equation 1a can be rewritten as

(r_cp − r_ps·r_cs) / (1 − r_ps²) > r_cp.     (1b)

Let us now define the ratio of the two validities as k:

k = r_cp / r_cs.

Note that according to this approach k is always positive. The three variables can always be scored to ensure that two of the three correlations will be positive, so that this analysis does not lose generality. Because 1 − r_ps² > 0, substituting k in Equation 1b and applying a little algebra results in the following inequality:

(1 − r_ps/k) / (1 − r_ps²) > 1.     (2)

It follows that whenever Equation 2 holds, s acts as a suppressor.

A Linear-Combination Approach

Holling (1983) suggested defining suppression situations in terms of the relationships between the predictor and the predicted (rather than the actual) values of the criterion. His suggestion implies that the criterion to be predicted may be not only a single variable but a linear combination of variables. A definition based on predicted rather than actual scores applies immediately to cases in which the criterion is not a single variable but a combination of variables. As a result, Holling (1983) extended the concept of suppression to the general linear model. Following Holling's (1983) approach, Tzelgov and Henik (1985) showed that the definition suggested by Conger (1974) applies to the case of the general linear model. The linear-combination approach originated by Holling allows one to extend the analysis of suppression situations (Tzelgov & Stern, 1978) to the general multivariate case.

To examine the general case, let us assume that c is predicted by a number of (standardized) predictors that can be partitioned into two (mutually exclusive) sets: P and S. Let us further assume that p is the best linear combination of the members of P used to predict c, resulting, of course, from multiple regression. Formally,

p = α_1·p_1 + α_2·p_2 + … + α_J·p_J.

In parallel, s is the best linear combination of the members of S used to predict c. Again, formally,

s = γ_1·s_1 + γ_2·s_2 + … + γ_K·s_K,

where the α's and γ's are the optimal weights resulting from multiple-regression analysis. Because the sets P and S are mutually exclusive, the correlation between p and s may be written as

r_ps = (Σ_i Σ_j α_i γ_j r_{p_i s_j}) / (R_p R_s).

R_p and R_s are the multiple-correlation coefficients between the variables in the sets P and S, respectively, and the criterion c. However, it can easily be shown that R_p and R_s are also the standard deviations of the variables p and s, respectively.¹ We may now define the conditions for suppression situations in terms of p and s as linear combinations rather than single variables. Note that the coefficients R_p and R_s are always positive, being (by definition) the positive square roots of R². Therefore, k may now be defined as the ratio

k = R_p / R_s.

Let us further assume that, in addition, a multiple regression based on all predictors was run. This resulted in c′, a linear combination of all predictors (those belonging to P and to S). We refer to c′ as a linear-combination definition of the criterion. Because the analysis is based on the definition of the criterion as a linear combination of variables, it applies naturally to the more complicated cases of the general linear model, such as discriminant analysis or canonical correlation. Let us clarify this point. Tzelgov and Henik (1985) proved that the precondition for suppression (see Equation 1a) may also be stated as

β′_p > r′_cp,

where β′_p and r′_cp represent, respectively, the β weight and the correlation between the predictor p and c′, the predicted value of the criterion. To be more specific, r′_cp is the zero-order correlation between the predictor p and c′, the criterion predicted by both P and S. β′_p is the weight of p in the multiple-regression equation based on P and S and the criterion predicted by both P and S. Formally, β′_p and r′_cp are defined² respectively as

β′_p = β_p / R  and  r′_cp = r_cp / R,

where R is the multiple correlation between the full predictor set and c.
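Either form of the precondition (β_p > r_cp for the elementary case, or equivalently β′_p > r′_cp for linear combinations) can be checked numerically. A minimal sketch for the elementary three-variate case, using illustrative correlations of our own rather than values from the article:

```python
# Classic suppression (Horst, 1941): s has zero validity yet improves
# prediction. The correlations below are illustrative values of our own.
r_cp, r_cs, r_ps = 0.5, 0.0, 0.6

# Standardized weight of p when s is included (left side of Equation 1b).
beta_p = (r_cp - r_ps * r_cs) / (1 - r_ps**2)
print(beta_p > r_cp)  # True: Equation 1a holds, so s acts as a suppressor

# Squared multiple correlation with both predictors vs. p alone.
R2 = (r_cp**2 + r_cs**2 - 2 * r_cp * r_cs * r_ps) / (1 - r_ps**2)
print(round(R2, 4), r_cp**2)  # 0.3906 0.25 -- prediction improves
```

The gain comes entirely from s removing criterion-irrelevant variance from p, since r_cs = 0 contributes nothing directly.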

Such an approach permits the analysis of suppression situations not only in the case of multiple regression, but also in the cases of canonical correlation and discriminant analysis. In our discussion, for the sake of simplicity, we refer to the variables p, s, and c. When necessary we distinguish between these variables as elementary and as linear combinations.

¹ p is in fact a linear combination of variables based on a multiple-regression equation. The squared multiple R is the variance of p, and therefore R is its standard deviation. The same logic applies to s.
² The proof behind this definition can be found in Holling (1983). See also Tzelgov and Henik (1985).

We emphasize

that, unless explicitly stated otherwise, our arguments apply to these variables when they are univariate and when they are defined as linear combinations. The following analysis applies to p and s as linear combinations. However, one may also refer to R_p and R_s as the validities of p and s. In other words, R_p and R_s are the zero-order correlation coefficients computed (independently) between each of the two linear combinations p and s and some criterion variable c. For the sake of simplicity, we designate these validities r_cp and r_cs, respectively. This notation also stresses the fact that the application of the concept of suppression to linear combinations is straightforward.

Relations Among Predictors and Suppression

The concept of suppression may be better understood by defining Y_0 and Y_1 as

Y_0 = 1 − r_ps/k,  and  Y_1 = 1 − r_ps².

The precondition for the existence of suppression is the relation Y_0 > Y_1. This parallels the precondition to suppression as defined by Tzelgov and Stern (1978). However, k is defined in terms of multiple-regression coefficients to warrant generality. Following this definition, s will function as a suppressor whenever the following condition exists:

Y_0 > Y_1.     (3)

Figure 1 presents Y_0 and Y_1 as a function of r_ps. Y_1 is represented by a parabola with a maximum at (0, 1), passing through (−1, 0) and (1, 0), and truncated at the x-axis. Y_0 appears as a set of lines reflecting different values of k, each of them passing through the point (0, 1). (Only representative values of k are depicted.) The lines appearing in Figure 1 correspond to k values of 0.5, 1, 2, and ∞. (See Figure 1 for further description.) Y_0 always intersects the point (0, 1); thus, the lines for all k values meet at this point. Note that, according to our approach, k is always positive. Figure 1 depicts the possible relationships between two predictors (or, in more general terms, two linear combinations of predictors) and a criterion in multiple regression. Similar classifications appear in Conger (1974) and McFatter (1979). A detailed discussion of the different conditions appearing in Figure 1 can be found in Tzelgov and Stern (1978). However, a few points should be mentioned. Given the validities of p and s and their intercorrelation, the relation between p, s, and c can be mapped on Figure 1. On the basis of this mapping, these relations can be classified as belonging to one of the following categories.

Redundancy relationships. In the area circumscribed by the parabola and filled in with bars, suppression relationships are impossible because Y_1 is always larger than Y_0. Conger (1974) proposed designating variables that do not act as suppressors as redundant.

Classic suppression. The situations of classic suppression discussed by Horst (1941) and characterized by a zero validity of the predictor s (i.e., r_cs = 0) appear in Figure 1 as the line parallel to the x-axis at height y = 1, reflecting the fact that in such conditions k = ∞.

Negative suppression. Darlington (1968) defined suppressors as variables that, while having a positive correlation with the criterion, receive negative β weights in the (multiple) regression equation. Negative suppression conditions appear in Figure 1 above the parabola but under the line representing classic suppression.

Reciprocal suppression. Such situations have been defined by Conger (1974) as resulting from the fact that two predictors positively correlated with the criterion are negatively intercorrelated. Reciprocal suppression situations are characterized by the fact that for both predictors the β weights are (in their absolute values) higher than the respective validities. Such situations appear in Figure 1 above the line depicting the classic suppression condition, in the area of negative r_ps intercorrelation. The only case in which the less valid predictor may act as a suppressor is under the conditions of reciprocal suppression.

Relative Suppression

This general approach to suppression is in line with Conger's (1974) definition of suppression situations based on regression weights. Conger (1974) made a distinction between absolute and relative suppression. The case of relative suppression applies to a situation in which there is no indication of suppression in the three-variate case, but such a situation develops when additional predictors are added. Consider a case in which a predictor p_j is used to predict the criterion c in three different situations: by itself; as a member of a predictor set P consisting of the predictor p_j and another single predictor; and as a member of a larger predictor set P* (P is a subset of P*). Let β_j^0, β_j^P, and β_j^* represent, respectively, the standardized regression weights of p_j in bivariate, three-variate, and multivariate (more than two predictors) prediction equations. Note that β_j^0 is, in fact, the validity of p_j. A situation of relative suppression would reveal itself in the following relationship:³

β_j^P < β_j^*, with β_j^0 < β_j^P not holding.

Absolute suppression may be stated as

β_j^0 < β_j^P ≤ β_j^*.

Let us clarify the distinction between the two situations. The relationship β_j^0 < β_j^P is in fact equivalent to the definition of a suppression situation given in Equation 1a. The relationship β_j^P ≤ β_j^* results from applying Equation 1a to the regression weights of p in the three-variate case (β_j^P) and when more than two predictors are included in the equation (β_j^*). In the absence of a suppression situation, the maximal weight of a given predictor is obtained in a bivariate-regression equation (i.e., when it is the only predictor). Adding predictors to a bivariate-regression equation will not affect the regression weight of the predictor if the additional predictor is uncorrelated with the predictor already in the equation.

Figure 1. Y_0 and Y_1 as a function of r_ps. (The dotted line represents a k value of 0.5; the line of rectangles represents a k value of 1; the line of Xs represents a k value of 2; the solid line parallel to the horizontal axis represents a k value of ∞.)

³ In fact, Conger's (1974) definitions are based on weighted validities (i.e., the multiplications of the validities of the variables by their regression weights). However, when all predictors are scored to have positive validities, Conger's definition may be stated using only the beta weights.

Absolute suppression is defined by the relationship between the predictor's weight in a bivariate-regression equation and its weight in multivariate equations. It exists whenever adding predictors increases the weight of the variable relative to its weight in the bivariate equation (i.e., the validity of this predictor). In other words, it applies when β_j^0 < β_j^P. Relative suppression exists whenever the weight of one or more predictors is increased by adding variables to a regression equation (i.e., when for that predictor β_j^P < β_j^* holds), but none of them exceeds the respective weights of the predictors in the bivariate case (i.e., when β_j^0 < β_j^P does not hold for any of the predictors). The concept of relative suppression implies hierarchical or stepwise regression and comparison of the β weights of the predictors in the equation to both their validities and their β weights in an equation including additional predictors.

Furthermore, the concept of relative suppression is applicable only if at least three predictors are involved. In contrast, the general analysis, being an extension of the three-variate case, is parallel to a definition stated in terms of two rather than three sets of regression weights. From this aspect our analysis is similar to absolute suppression. Alternatively, the partition of the predictor set into the P and S subsets is arbitrary. Therefore, whenever the predictor p_j is not the only member of P and a suppression situation exists, it is relative suppression. In fact, because our analysis is based on linear combinations rather than single variables, one may say that, according to our definition, suppression situations are always relative. Our definition reduces to absolute-suppression situations when p_j is the only member of P. Thus, the general definition of suppression situations applies for both absolute and relative suppression. The only clear example of absolute suppression occurs in the three-variate case.
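The distinction between absolute and relative suppression can be illustrated numerically. In the sketch below, all correlations are illustrative values of our own, and the normal equations are solved with a tiny Gaussian-elimination helper; predictor 1 shows relative suppression, because its weight grows when a third predictor is added yet never exceeds its validity.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (small systems only)."""
    n = len(b)
    A = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

# Illustrative correlations; predictor 1 plays the role of p_j.
r_c = [0.5, 0.4, 0.0]            # validities
R12, R13, R23 = 0.4, 0.3, 0.0    # predictor intercorrelations

beta0 = r_c[0]                                    # bivariate weight (validity)
betaP = (r_c[0] - R12 * r_c[1]) / (1 - R12**2)    # weight with predictor 2
R = [[1, R12, R13], [R12, 1, R23], [R13, R23, 1]]
betaStar = solve(R, r_c)[0]                       # weight with predictors 2 and 3

# Relative suppression: betaP < betaStar, yet beta0 < betaP does not hold.
print(betaP < betaStar <= beta0)  # True
```

Here the third predictor has zero validity but, by correlating with predictor 1, pushes its weight back up toward (without exceeding) its bivariate value.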


Some Implications of the Analysis

Suppression Situations or Suppressor Variables?

Our general definition focuses on suppression situations rather than on suppressor variables. If P and S include more than one predictor each and the relationship β′_p > r′_cp holds, we know that a suppression situation exists. We do not know which variable (or subset of variables; see Conger, 1974) within the set S acts as a suppressor, nor do we know which variable or subset of variables within P is suppressed. It is only in the three-variate case that we can define the suppression relationships in terms of variables rather than situations. The concept of suppression was developed in relation to variables having some specific relationship to the criterion, such as zero validity (Horst, 1941) or β weights and validities with opposite signs. Nevertheless, it is clear, even in relationships among variables in the three-variate case, that there are suppression situations that cannot be stated in terms of some specific characteristics of the suppressor variable. The condition of reciprocal suppression is such a situation. It is not characterized by zero validities or by suppressor variables with negative regression weights (and positive validities). Rather, the reciprocal-suppression situation is defined in terms of the effects of these variables on the regression weights of the variables they suppress. As a result, analyses of suppression situations published in the 1970s (Conger, 1974; Conger & Jackson, 1972; Tzelgov & Stern, 1978; Velicer, 1978) revolved around the effects of suppression rather than the characteristics of suppressor variables themselves. A general definition, such as the one proposed previously, which is stated in terms of linear combinations, can be formulated only by specifying conditions and not by characterizing variables.
It should be noted that when we are interested in detecting suppression relationships rather than suppressor variables, the best strategy is to define the validity ratio k by having the more valid predictor as the numerator. If both validities are positive, suppression situations for k smaller than 1 are possible only under conditions of reciprocal suppression. To clarify this point, let us have another look at Figure 1. The dotted line corresponds to the Y_0 function for k = 0.5 and is a typical case of Y_0 for k smaller than 1. Note that it is only in the area of reciprocal suppression that the precondition for suppression (i.e., Y_0 > Y_1) holds. In contrast, when k = 2 (see the line made of Xs in Figure 1), the precondition for suppression holds in the area of negative suppression as well. In other words, the precondition for suppression holds for values of k smaller than unity only when both variables act as suppressors. Situations in which either only p or only s acts as a suppressor will not be detected with k smaller than unity. Thus, the more inclusive condition for detecting suppression situations results from setting k > 1.

The idea of suppressor variables implies a dichotomy of predictors: There are predictors whose function is to account for variance in the dependent variable (sometimes called redundant predictors), and there are other predictors (suppressor variables) whose function is to clear out criterion-irrelevant variance from the redundant predictors. This dichotomy is clearly wrong. Each variable in multiple regression and in its multivariate equivalents explains some variance in the criterion and

clears out criterion-irrelevant variance from other predictors. The limiting cases are the classic suppressor, whose only function is to clear criterion-irrelevant variance from other predictors, and a predictor totally uncorrelated with other predictors, whose only function is to explain some criterion variance. This point is quite clear if one writes down the normal equation for some variable p_i that is a member of P:

r_cp_i = β_i + Σ_{j≠i} r_{p_i p_j} β_j,

where the r's represent the validities and the β's the regression weights; β_i is the regression weight of the ith predictor. In path-analytic terms, under the assumption that all members of P act as exogenous variables,⁴ β_i is the direct effect of the ith predictor, whereas Σ_{j≠i} r_{p_i p_j} β_j is its indirect effect. The direct effect represents the contribution to prediction by explaining some variance of the criterion, whereas the indirect effect represents the effect resulting from correlation with other predictors. Although there is no simple way to identify suppressor variables, suppressed variables are easily identified by having direct and indirect effects with opposite signs. In other words, when a variable is suppressed, its indirect effect has a sign different from that of its validity. The fact that in most cases variables in multiple regression explain some variance in the criterion on the one hand and suppress criterion-irrelevant variance on the other emphasizes the importance of suppressor-oriented analysis of multiple regression. More specifically, identification of suppression situations helps clarify how variables contribute to prediction, whether their main role is to explain some of the criterion variance or to clean out variance that is criterion irrelevant.

Regression Weights or Part Correlations?

The definition we have been discussing up to this point was based on regression weights. McNemar (1962) suggested that whenever the linear effect of a variable s is cleared out from another variable p, and the residual of p is correlated with a third variable c, the resulting correlation, r_c(p.s), is the part (sometimes called semipartial) correlation between c and p.⁵ Formally, the part correlation between two variables c and p equals

r_c(p.s) = (r_cp − r_cs·r_ps) / √(1 − r_ps²).
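The identities above (the normal-equation decomposition into direct and indirect effects, and the part correlation just defined) can be verified numerically; the correlations below are illustrative values of our own.

```python
import math

# Illustrative standardized correlations: s suppresses p.
r_cp, r_cs, r_ps = 0.5, 0.1, 0.6

beta_p = (r_cp - r_ps * r_cs) / (1 - r_ps**2)
beta_s = (r_cs - r_ps * r_cp) / (1 - r_ps**2)

# Normal equation for p: validity = direct effect + indirect effect.
direct, indirect = beta_p, r_ps * beta_s
assert abs(direct + indirect - r_cp) < 1e-12

# p is suppressed: its indirect effect and its validity have opposite signs.
print(indirect < 0 < r_cp)  # True

# Part (semipartial) correlation of p with c, with s cleared out of p only.
part = (r_cp - r_cs * r_ps) / math.sqrt(1 - r_ps**2)
print(round(part, 3))  # 0.55
```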

As Conger and Jackson (1972) pointed out, the part correlation reduces to the definition of the classic suppressor whenever r_cs equals zero. More specifically, whenever the validity of s equals 0, the part correlation may be written as

r_c(p.s) = r_cp / √(1 − r_ps²),

⁴ Exogenous variables are those whose variability is assumed to be caused by variables outside the causal model (Pedhazur, 1982).
⁵ Although in this section we are discussing p and s as single variables in three-variate multiple regression, it should be clear that this is done just for the sake of simplicity. Our discussion applies to more than two predictors as well when p and s represent linear combinations of variables.


which is exactly Meehl's (1945) formula for classic suppression. Furthermore, as can be seen from the definition of the part correlation, the β weight of a predictor in a multiple-regression equation equals its part correlation with the criterion divided by the standard deviation of that part of the predictor p that is orthogonal to s. Therefore, it is not surprising that some of the definitions of suppression situations are stated in terms of part correlations rather than in terms of β weights. For example, Lord and Novick (1968) explained how suppressors work using validity relationships (although their formal definition is based on regression weights). In this section, we analyze the definition of suppression based on part correlation and compare it to our approach.

Velicer (1978) proposed that a suppression situation exists whenever the part correlation of one of the variables with the criterion in multiple regression is higher than its validity. Such a definition of suppression results in classifying as redundant some of the situations that on the basis of inequality (Equation 2; see also Conger, 1974; Tzelgov & Stern, 1978) are defined as suppression situations. Recall the fact that the β weight of the variable p may be stated as a function of the part correlation:

β_p = r_c(p.s) / √(1 − r_ps²).

Given this definition, the following relationship between the β weight and the part correlation is immediately evident:

β_p ≥ r_c(p.s).

Thus, as shown by Tzelgov and Henik (1981), a variable may be classified as a suppressor according to the definition proposed by Conger (and accepted by us), but it would be a redundant variable according to Velicer. Tzelgov and Henik (1981) showed that the precondition for suppression according to Velicer's definition may be stated as

r_c(p.s) > r_cp.

Let Y_2 = √(1 − r_ps²). Velicer's definition identifies a suppression situation whenever

Y_0 > Y_2.

The difference between the two definitions is very clear in Figure 2, which shows Y_0, Y_1, and Y_2 as a function of r_ps. Y_2 is represented by a semicircle (made of rectangles) of Radius 1, centered at (0, 0). Any point within this semicircle represents redundant relationships according to Velicer's definition of suppression. Hence, as can be seen in Figure 2, in the region enclosed between the semicircle and the parabola, the relationships among the predictors p and s and the criterion c define suppression situations according to Conger (1974) but not according to Velicer (1978).

The points of intersection between Y_0 and Y_2 (i.e., the points on the circle in Figure 2) are of special interest. They define conditions for which the zero-order validity of the predictors equals their part correlation with the criterion. As has been shown elsewhere (Stern & Tzelgov, 1978; Tzelgov & Stern, 1978), under such conditions the squared multiple correlation may be computed as the sum of the squared validities. It is known and well documented that whenever the intercorrelation among predictors is zero the squared multiple R can be computed as the sum of the squared validities. However, the zero-intercorrelation condition is a special case of the general condition defined by the equality of the zero-order and the part correlation.

Suppression and Partial Correlation

Although the idea of suppression has been defined and discussed in the context of multiple regression, it should be noted that it is applicable to partial correlation as well. Given a criterion variable c and two predictors (or sets of predictors) p and s, it is immediately evident that the numerator of the partial (and of the part) correlation between one of the predictors and the criterion and the numerator of the β weight of that predictor are identical. Therefore, if the sign of the partial correlation between a pair of variables differs from that of the zero-order correlation between the same pair of variables, it should be immediately evident that it is a situation of negative or classic suppression. The situation is more complicated when we are dealing with reciprocal suppression.

As mentioned in the previous section, the β weight of a given variable p may be defined as its part correlation divided by √(1 − r_ps²). By analogy, the partial correlation r_cp.s can be defined as the ratio between the part correlation r_c(p.s) and √(1 − r_cs²). It follows that an increase in the (partial) correlation, after controlling for a variable or a set of variables, does not necessarily mean that we encounter a suppression situation, although it may be a good hint that a suppression situation exists. Still it is possible to define a set of conditions that will enable detection of suppression situations on the basis of part correlation. If predictor p is one of the suppressed variables and β_p is its regression weight, then it follows that

β_p > r_cp.

β_p may be written as the ratio between the part correlation and √(1 − r_ps²), where s represents the suppressing variables. Because it is also the case that the partial correlation r_cp.s may be written in terms of the part correlation, as

r_cp.s = r_c(p.s) / √(1 − r_cs²),

by applying a little algebra, β_p may be expressed in terms of the partial correlation:

β_p = r_cp.s · √(1 − r_cs²) / √(1 − r_ps²).

It follows that whenever the following condition holds, we have encountered a suppression situation:


Figure 2. Different definitions of suppression: The relationships between Y_0, Y_1, and Y_2. (The line of rectangles represents Y_2; see text for details.)

r_cp.s > r_cp · √(1 − r_ps²) / √(1 − r_cs²).

Thus, whenever the partial correlation between p and the criterion is larger than the zero-order correlation multiplied by the ratio of two variance residuals (the residual of predicting p by s and the residual of predicting c by s), we encounter a suppression situation. Of course, the chances of finding such a situation are high if this ratio is lower than unity, and that will happen when the intercorrelation between the predictors is higher than the validity of s.
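The relations among the β weight, the part correlation, and the partial correlation can be verified numerically; the correlations below are illustrative values of our own.

```python
import math

# Illustrative standardized correlations (our own values).
r_cp, r_cs, r_ps = 0.5, 0.1, 0.6

beta_p  = (r_cp - r_ps * r_cs) / (1 - r_ps**2)
part    = (r_cp - r_cs * r_ps) / math.sqrt(1 - r_ps**2)  # r_c(p.s)
partial = part / math.sqrt(1 - r_cs**2)                  # r_cp.s

# beta_p recovered from the partial correlation.
beta_from_partial = partial * math.sqrt(1 - r_cs**2) / math.sqrt(1 - r_ps**2)
assert abs(beta_from_partial - beta_p) < 1e-12

# Suppression condition stated with the partial correlation agrees with
# the regression-weight condition beta_p > r_cp.
lhs = partial
rhs = r_cp * math.sqrt(1 - r_ps**2) / math.sqrt(1 - r_cs**2)
print(lhs > rhs, beta_p > r_cp)  # True True
```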

Suppression Situations Exist and Are Useful

Conger and Jackson (1972) discussed the relative utility of suppressors versus redundant predictors. They concluded that for any given degree of correlation a suppressor "does not yield as much incremental validity as an additional predictor" (p. 397). Their discussion, however, is limited to classic suppression. They also suggested that negative suppression situations are rare. The analysis of the situations appearing in Figure 1 implies that redundancy relations between two predictors (or linear combinations of variables) are only a relatively small subset of the possible relationships between a pair of variables. We suggest that suppression relationships are more frequent than is believed. In most cases, investigators are simply not aware that they are dealing with a suppression situation. One reason to look for suppression situations is that such situations yield, in most cases, higher multiple correlations than do redundant relations among predictors. To demonstrate this point, let us examine Figures 3 and 4.

Figures 3 and 4 display the improvement in prediction by multiple regression over prediction by the more valid predictor p as a function of the intercorrelation⁶ r_ps between the two sets

⁶ The intercorrelation between two predictors is constrained by the formula r_cp·r_cs ± (1 − r_cp² − r_cs² + r_cp²·r_cs²)^.5 (see McNemar, 1962).

SUPPRESSION SITUATIONS


Figure 3. Increment in prediction by adding s for k = 2. (Dotted line = p = .4; range = −0.818 ... r_ps ... 0.978. Bold line = p = .6; range = −0.583 ... r_ps ... 0.943.)

of predictors p and s for two possible values of k. In Figure 3, the ratio k equals 2, and in Figure 4 k equals 3. This increment is defined as the difference (D) between the multiple correlation coefficient R and r_cp, the validity of the more valid predictor. Formally, D = R − r_cp. Let us focus first on Figure 3. The dotted line represents the increment in prediction for the validities of 0.4 and 0.2. It should be noted that for these validities the effective range of r_ps is bounded between −0.818 and 0.978. The bold line represents the increment in prediction for the validities of 0.6 and 0.3, which constrain the intercorrelation between −0.583 and 0.943 (see Footnote 7). The parabola corresponds to Y1 as defined in the first section of this article, whereas the straight line corresponds to Y0 when k = 2. Figure 4 is very similar to Figure 3. It applies to sets of predictors having a validity ratio k of 3. The dotted line corresponds to validities 0.6 and 0.2 (for p and s, respectively), and the bold line corresponds to validities 0.9 and 0.3 (for p and s, respectively).
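The effective ranges just cited follow from the constraint given in Footnote 6 and can be reproduced with a few lines of Python (ours, for illustration):

```python
import math

def r_ps_bounds(r_cp, r_cs):
    """Admissible range of the predictor intercorrelation r_ps given the
    two validities, from the positive definiteness of the correlation
    matrix (the McNemar, 1962, constraint cited in Footnote 6)."""
    half_width = math.sqrt((1 - r_cp**2) * (1 - r_cs**2))
    return (r_cp * r_cs - half_width, r_cp * r_cs + half_width)

# Validities 0.4 and 0.2 (the dotted line of Figure 3):
lo, hi = r_ps_bounds(0.4, 0.2)
print(round(lo, 3), round(hi, 3))  # -0.818 0.978
# Validities 0.6 and 0.3 (the bold line of Figure 3):
lo, hi = r_ps_bounds(0.6, 0.3)
print(round(lo, 3), round(hi, 3))  # -0.583 0.943
```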

In both Figures 3 and 4, the ratio of validities is defined with p, the more valid predictor, as the numerator. It should be noted that if a suppression situation is not found for this definition of k, it will not be found for a k' value that equals 1/k. As can be seen in Figure 3, for k values of 2, nonsuppression situations are bounded between r_ps values of 0 and 0.5. In parallel, for k values of 3 (see Figure 4), the nonsuppression situations are bounded between r_ps values of 0 and 0.334 (see Footnote 8). Given the effective range of r_ps in our examples, it is clear that the nonsuppression or full-redundancy situations cover only a small portion of the actual relationships between variables in multiple regression. Another fact is evident from Figures 3 and 4: Maximal improvement in prediction, resulting from the addition of variables to the prediction equation, is not achieved when the added variable (or combination of variables) is uncorrelated with the variables already in the equation. This is consistent with the analysis of the effects of multicollinearity reported by Weber and Monarchi (1978). They have shown that, for the case of two predictors, R² is not a monotonic decreasing function of multicollinearity (see Footnote 9). Furthermore, it should be noted that the increment in prediction resulting from the addition of a variable (or a set of variables) is more likely to appear when there are suppression relationships between the added predictors and the predictors already in the equation. As a result, suppression relationships will usually lead to a higher multiple correlation. This conclusion is in contrast to the analysis of Conger and Jackson (1972), but, as already mentioned, their analysis was limited to classic suppression.

7 The effective ranges of the intercorrelations between p and s appear below each figure.

8 As Tzelgov and Stern (1978) showed, the upper limit is defined by the condition r_ps = 1/k (see also Stern & Tzelgov, 1978).

Figure 4. Increment in prediction by adding s for k = 3. (Dotted line = p = .6; range = −0.664 ... r_ps ... 0.904. Bold line = p = .9; range = −0.164 ... r_ps ... 0.668.)

Enhancement and Suppression

McFatter (1979) proposed a distinction between enhancement as a statistical phenomenon and suppression as a particular interpretation of such a phenomenon. His definition of an enhancer variable is identical with Conger's (1974) definition of suppression cited previously here. He defined suppression as an interpretation of enhancement in terms of Conger's (1974) two-factor model of suppression. The model formalized the approach, common to most investigators in the field (e.g., Lord & Novick, 1968), according to which variation in a given predictor arises from two sources, one relevant to the criterion and the other irrelevant. McFatter showed that the suppression interpretation is not the only one possible in the case of enhancement, and he suggested additional models that can provide an explanation. Yet it seems to us that the two-factor model is the most parsimonious; therefore, unless an alternative structural model has been specified in advance, the interpretation of enhancement as suppression is the simplest and should be preferred.

9 Their conclusion can be easily extended to two linear combinations by using the general approach of this article.
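Returning to Figures 3 and 4, the claim that suppression relations usually buy a larger increment D than redundancy does can be illustrated numerically. A sketch of ours, using the validities of the bold line in Figure 3:

```python
import math

def increment(r_cp, r_cs, r_ps):
    """D = R - r_cp: the gain of the two-predictor multiple correlation R
    over the validity of the more valid predictor p."""
    r_sq = (r_cp**2 + r_cs**2 - 2 * r_cp * r_cs * r_ps) / (1 - r_ps**2)
    return math.sqrt(r_sq) - r_cp

# Validities 0.6 and 0.3, so k = 2 (the bold line of Figure 3):
for r_ps in (0.0, 0.3, 0.5, 0.7, 0.9):
    print(r_ps, round(increment(0.6, 0.3, r_ps), 3))
# D vanishes at r_ps = 1/k = 0.5 (full redundancy) and is far larger
# near the upper admissible bound (0.943) than at r_ps = 0.
```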


Applications

Attempts to Improve Prediction

Following the original suggestions of Horst (1941) and Meehl (1945), several investigators attempted to improve prediction by applying the idea of a classic suppressor to various psychological scales. This was attempted in the case of the K scale of the Minnesota Multiphasic Personality Inventory (MMPI; Fricke, 1956; Fulkerson, 1958) and in the case of the stylistic scales of the California Psychological Inventory (CPI; Goldberg, Rorer, & Green, 1970), without much success. Recently, several proposals have been made to improve prediction by removing halo (Holzbach, 1978; Landy, Vance, Barnes-Farrell, & Steele, 1980) or leniency (Bannister, Kinicki, Denisi, & Horn, 1987) variance from rating scales. In contrast to the efforts made in the context of the MMPI (Fricke, 1956; Fulkerson, 1958) and the CPI (Goldberg et al., 1970), these attempts capitalize on the existence of suppressor relationships rather than testing data for their existence. Thus, both Landy et al. (1980) and Bannister et al. (1987) proposed using a rating scale after removing from it the variance it shares with a variable assumed to capture halo variance or leniency variance. This procedure results in a rating score that is orthogonal to the cleared-out variable. The validity of this score is the part correlation between the rating scale and the criterion, partialing the halo-capturing variable (see Footnote 10) out of the rating scale. Conger and Jackson (1972) showed that, when the correlation between the criterion and the halo-capturing variable is zero, the previously defined part correlation is equal to the multiple correlation of the criterion with the rating scale and the halo-capturing variable. Under such conditions, the halo-capturing variable acts as a classic suppressor, resulting in improved prediction (for a detailed analysis of this point, see Henik & Tzelgov, 1985; Tzelgov, 1988).
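Conger and Jackson's (1972) result, and the cost of partialing when the contaminating variable does correlate with the criterion, can both be checked numerically. A sketch of ours; the correlation values are invented for illustration:

```python
import math

def semipartial(r_cp, r_cs, r_ps):
    """Validity of the rating scale p after partialing s out of p only."""
    return (r_cp - r_cs * r_ps) / math.sqrt(1 - r_ps**2)

def multiple_R(r_cp, r_cs, r_ps):
    """Multiple correlation of the criterion with p and s together."""
    return math.sqrt((r_cp**2 + r_cs**2 - 2 * r_cp * r_cs * r_ps)
                     / (1 - r_ps**2))

# Case 1: the halo-capturing variable s is uncorrelated with the
# criterion (r_cs = 0): the part correlation equals the multiple R.
print(round(semipartial(0.5, 0.0, 0.6), 4),
      round(multiple_R(0.5, 0.0, 0.6), 4))   # 0.625 0.625
# Case 2: s correlates 0.2 with the criterion: partialing now discards
# criterion-relevant variance (0.475 < 0.5), while regression keeps it.
print(round(semipartial(0.5, 0.2, 0.6), 4),
      round(multiple_R(0.5, 0.2, 0.6), 4))   # 0.475 0.5154
```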
In more general terms, this partialing-out approach will increase the validity whenever the irrelevant, variance-capturing variable is characterized by suppression relations with the remaining predictors. Ghiselli, Campbell, and Zedeck (1981) proposed conceptualizing the halo effect as a suppressor variable. A detailed analysis of the halo effect in suppression terms has been made by Henik and Tzelgov (1985). It is evident from their analysis that when there are nonsuppression or redundancy relations among the predictors, the partialing-out approach will eliminate criterion-relevant variance (for a similar point, see Murphy, 1982). Furthermore, they proposed testing for a possible halo effect by including the halo-capturing variable in a multiple-regression equation together with the rating scales. A halo effect would reveal itself by an increase in the regression weights of the rating scales as a result of the inclusion of this variable. Alternatively, the presupposed halo-capturing variable may improve prediction as a redundant variable. In any case, such an approach does not result in the exclusion of criterion-relevant variance. In this context, it is interesting to discuss the attempt by Bannister et al. (1987) to control for leniency. To justify the partialing-out approach, they performed a multitrait-multimethod (Campbell & Fiske, 1959) analysis of two sets of ratings. The intercorrelations among the ratings scales from the two ratings were compared with a similar set of partial correlations


obtained by partialing out the leniency components. The correlations between scales within the same set of ratings, reflecting convergent validity, were only slightly attenuated by partialing out leniency. The correlations between rating scales from different sets, reflecting discriminant validity, were increased by partialing out leniency. As shown previously here, an increase in the correlation between a pair of variables after partialing the same variable out of both does not necessarily mean that the variable acts as a suppressor, but it usually does. It follows that the leniency scale Bannister et al. (1987) used does act as a suppressor, though not necessarily as a classic suppressor. Therefore, the use of rating scales after partialing out leniency is not recommended, because it may result in throwing away criterion-relevant variance (Henik & Tzelgov, 1985). A much better solution would be to use a linear combination of the relevant rating scales together with the leniency scale. The weights of the variables in this combination should be based on multiple regression, which would result in clearing out only the criterion-irrelevant variance from the rating scales. This analysis also implies that, contrary to Bannister et al. (1987), there is no general way to decide how one should clear out criterion-irrelevant variance; the optimal weight of the irrelevant-variance-capturing variable should be estimated by multiple regression.

Coping and Hardiness Studies

We believe that understanding suppression relations may significantly contribute to theoretical thinking. This point is evident given the close relationship between the suppression relationship and partial correlation. The concept of suppression implies that there are cases in which the effects of some (independent) variables of interest are blurred by criterion-irrelevant variance.
Therefore, multivariate relationships should be analyzed in search of possible suppression conditions before, or in parallel with, the process of providing a theoretical interpretation. This should help to distinguish between (a) cases in which a variable contributes to an explanation by its statistical feature of being correlated with other independent variables, and is therefore able to serve as an irrelevant-variance "cleaner" (i.e., suppressor), and (b) cases in which the contribution of a variable reflects a relationship of theoretical interest. Orr (1986, 1987) investigated coping strategies of women after mastectomy. She found that seeking information about one's medical condition has a low negative correlation with adjustment. According to Orr (1986), this variable focuses on facts or external reality. When the same variable was included in a multiple-regression equation together with variables that focus on the internal reality of the patient, such as openness toward one's internal feelings, the picture changes: Seeking information has a positive, significant regression weight. This fact clearly indicates that it acts as a negative suppressor. In other words, seeking information clears out the variance reflecting focus on the external world from the variables measuring focus on the internal world. In rather simplistic terms, it means that

10 Although this argument is made in terms of a halo variable, it applies equally well to leniency.


what really matters for patients adjusting to mastectomy is the internal world. An alternative to an interpretation in terms of suppression relations would be a rather complicated argument that would have to explain the change in the value of external focus as a result of introducing the possibility of focusing on the internal world. Hannah and Morrisey (1987) investigated the development of hardiness in high-school students. They used sex, feeling of happiness, grade, age, and religion as independent variables and hardiness as the dependent variable. The results of a stepwise multiple-regression analysis indicate a complex pattern of relative reciprocal suppression. These suppression relationships can be seen at the absolute level as well: The regression coefficients of age and grade in the three-variate multiple regression were both higher than their respective validities, reflecting a clear pattern of reciprocal suppression. In fact, the zero-order correlation between age and hardiness was not significant. The investigators concluded that "the causal path linking age, grade and hardiness is through the combined effects of both age and grades on the development of hardiness ..." (p. 343). Taking into account the reciprocal suppression relations between the two predictors, a better interpretation can be suggested. It is not the effects of age and grades as latent variables on hardiness that are combined; it is their measures. Only after the criterion-irrelevant variance of each predictor is taken out can its real contribution to the development of hardiness be appreciated. Thus, the two predictors contribute to the development of hardiness independently, but the problem is the measurement of their contribution.
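Reciprocal suppression of this kind, with each weight exceeding its validity, is easy to reproduce with hypothetical numbers. The correlations below are ours, chosen for illustration (with two positive validities, reciprocal suppression requires a negative intercorrelation); they are not Hannah and Morrisey's data:

```python
def beta_weights(r_cp, r_cs, r_ps):
    """Standardized regression weights in the two-predictor case."""
    denom = 1 - r_ps**2
    return ((r_cp - r_cs * r_ps) / denom, (r_cs - r_cp * r_ps) / denom)

# Two predictors of equal validity 0.30 whose criterion-irrelevant
# parts are negatively correlated:
b_p, b_s = beta_weights(r_cp=0.30, r_cs=0.30, r_ps=-0.50)
print(round(b_p, 2), round(b_s, 2))  # 0.6 0.6 -- each double its validity
```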

Suppression Relations in an Experimental Setting

Because the use of multiple prediction is not limited to correlational designs, the detection of suppression relations in correlational analyses of experimental data may contribute to the understanding of complex psychological relationships. One such example can be found in Neely, Keefe, and Ross (1986; see also Neely, Keefe, & Ross, 1989). In what follows, we analyze the data from their first two experiments. Their work concentrates on the semantic priming (SP) effect in lexical decision tasks. In a typical experiment, the subject has to make lexical decisions about target stimuli, some of which are words and some of which are nonwords. Each target stimulus is preceded by a prime stimulus. When the targets are words, the prime may be either a word semantically related to the target (e.g., bird-robin), a word semantically unrelated to the target (e.g., furniture-robin), or a neutral stimulus (e.g., xxx-robin). The SP effect is reflected in faster lexical decisions in response to related targets. Neely et al. (1986) contrasted two interpretations of the effect. According to the prelexical account, prime-evoked processes that affect the speed of lexical access are completed before the target appears (e.g., Balota & Lorch, 1986; Becker, 1980; Forster, 1979; Meyer & Schvaneveldt, 1971; Neely, 1976). In contrast, the postlexical account suggests that the effect is caused by processes that begin after the target has been presented, and that such processes affect the decision itself rather than the speed of lexical access (e.g., Balota & Lorch, 1986; de Groot, 1984; Forster, 1979). Accordingly, the postlexical interpretation accounts nicely for the nonword facilitation effect

(NWF; i.e., faster nonword responses following an unrelated word prime; de Groot, 1984; Neely, 1976). In their first two experiments, Neely et al. (1986) manipulated (across conditions and experiments) the proportion of related trials within the word-prime/word-target trials and the proportion of nonword targets within the "unrelated" (nonword and unrelated word) targets. To keep with the terminology of the present article, let us designate these two variables as p and s, respectively. Whereas p is assumed to capture strategic factors affecting the prelexical stage, s is supposed to capture postlexical strategic factors. The correlation between these two variables as computed across conditions was 0.72. Furthermore, their correlations with SP (i.e., the criterion when one is interested in word targets) were 0.950 and 0.894 for p and s, respectively. In parallel, the correlations of these two variables with NWF (the criterion when one is interested in nonword targets) were 0.289 and 0.839, respectively. If we focus on explaining the SP effect, k (the ratio of the validities of the two predictors) equals 1.062. This, taken together with the 0.72 intercorrelation among the predictors, maps their relation with the criterion SP into the nonsuppression area (see Figure 1). Thus, the two variables have equal status as redundant predictors. This is consistent with both prelexical and postlexical processes being involved in the SP effect. The situation is different in the case of the NWF effect. Here the ratio of the validities equals 2.903. This, together with the intercorrelation of 0.72, implies that the less valid predictor p acts as a negative suppressor. In other words, its contribution to the prediction of NWF is mainly by clearing out criterion-irrelevant variance from the other predictor.

Therefore, it should be concluded that the nonword facilitation effect reflects essentially postlexical factors, whereas SP reflects both pre- and postlexical processes rather than only one of them. These conclusions are, at least in part, possible without an analysis in terms of suppression (Neely et al., 1986). The negative partial correlation between NWF and p, while partialing out s, hints at these relationships. However, the analysis in terms of suppression situations provides a consistent theoretical framework for explaining changes in the sign of (part/partial) correlations as a result of partialing out other variables.
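The two patterns can be recovered from the quoted correlations with the standard two-predictor regression formula. Our sketch; the input correlations are those reported in the text, and the computed weights are approximate:

```python
R_PS = 0.72  # intercorrelation of the two proportion manipulations

def beta_weights(r_cp, r_cs, r_ps):
    """Standardized weights of p and s in a two-predictor regression."""
    denom = 1 - r_ps**2
    return ((r_cp - r_cs * r_ps) / denom, (r_cs - r_cp * r_ps) / denom)

# SP criterion: validities 0.950 (p) and 0.894 (s).
b_p, b_s = beta_weights(0.950, 0.894, R_PS)
print(round(b_p, 3), round(b_s, 3))  # ~0.636, ~0.436: both positive,
                                     # below their validities (redundancy)
# NWF criterion: validities 0.289 (p) and 0.839 (s).
b_p, b_s = beta_weights(0.289, 0.839, R_PS)
print(round(b_p, 3), round(b_s, 3))  # ~-0.654, ~1.31: p's weight turns
                                     # negative (negative suppression)
```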

Estimating the Parameters: Some Words of Caution

Up to this point, we have restricted our discussion of suppression relations to parameters. However, the problem of parameter estimation, given sample data, cannot be ignored in a discussion of applications of the theoretical framework we are developing. To be more specific, the question is: What are the implications of the existence of suppression situations for the standard errors of the β coefficients? We analyze the problem for the three-variate case. It should be clear that p and s can represent linear combinations rather than single variables. The formula for the standard error of β in the case of two predictors is given by

SE(β_p) = [(1 − r_cp² − r_cs² − r_ps² + 2r_cp·r_cs·r_ps) / ((1 − r_ps²)²(N − 3))]^(1/2),

where N represents the number of observations. Because the expression (1 − r_ps²) appears in the denominator, it may be


expected that high intercorrelation between the predictors will usually result in an increase of the standard error of estimate for the β coefficients. It is also evident from our analysis and from Figure 1 that suppression situations are more frequent under conditions of high correlation between predictors. A straightforward implication would be that β coefficients indicating suppression have very high errors of estimate or, in other words, that suppression situations are less replicable than (redundant) nonsuppression situations. It follows that suppression situations have to be carefully replicated. We emphasize the importance of replicating suppression results given the significance that detection of suppression situations may have for psychological theory. It should be clear, however, that β coefficients obtained under suppression situations are not, in general, less stable than the coefficients obtained under redundancy situations. The instability of β coefficients is related to the correlation between the predictors and is not specific to suppression situations. Furthermore, the impression that high intercorrelation among predictors always implies instability of regression coefficients may be misleading. A detailed analysis of the relationships between the variance of the regression coefficients and the intercorrelations between predictors can be found in Weber and Monarchi (1978). They also showed that the standard error of estimate is a nonmonotonic (rather than monotonic) function of multicollinearity. More specifically, it is not always true that maximal intercorrelation between predictors results in maximal instability of the coefficients. To conclude this point, we stress that additional research is needed to understand the relations between our conceptualization of suppression on the one hand and the issue of significance of regression weights on the other.
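How the standard error behaves as the intercorrelation grows can be sketched directly from the formula above. Our illustration, for one configuration of validities and N = 100:

```python
import math

def se_beta(r_cp, r_cs, r_ps, n):
    """Standard error of the standardized weight of p in the
    two-predictor case (the formula given in the text)."""
    num = 1 - r_cp**2 - r_cs**2 - r_ps**2 + 2 * r_cp * r_cs * r_ps
    return math.sqrt(num / ((1 - r_ps**2)**2 * (n - 3)))

# Validities 0.5 and 0.3, N = 100: in this configuration the standard
# error grows with r_ps, although Weber and Monarchi (1978) show that
# the growth need not be monotonic in general.
for r_ps in (0.0, 0.3, 0.6, 0.9):
    print(r_ps, round(se_beta(0.5, 0.3, r_ps, 100), 3))
```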

Summary In this article, we coupled the approach developed for the analysis of a three-variate case of suppression (Tzelgov & Stern, 1978) with the application of the concept of suppression to the general linear model (Holling, 1983). The resulting general framework permits one to map the possible relationships between two sets of predictors and a criterion. Analysis of this mapping implies that suppression conditions are useful because they result in increased validity more frequently than is generally believed. This also implies that the addition of variables with minimal multicollinearity to a multiple-regression equation will not necessarily result in maximal validity (see Weber & Monarchi, 1978). Analysis of empirical relationships among variables in terms of suppressor relations assumes that variance of predictors can be partitioned into criterion-relevant and criterion-irrelevant components. McFatter (1979) pointed out that this is not the only possible interpretation of the empirical relationships among variables that are evident in the case of suppression. Nevertheless, the examples described in the last part of this article indicate that such an interpretation may help clarify relationships among the constructs reflected in the measured variables. It has been shown that the framework developed for the three-variate case can be successfully applied when the criterion is defined as a linear combination of variables (Tzelgov & Henik, 1985). Therefore, the mapping and its analysis apply

not only to multiple regression but also to canonical correlation and discriminant analysis.

References

Balota, D. A., & Lorch, R. (1986). Depth of automatic spreading activation: Mediated priming effects in pronunciation but not in lexical decision. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 340-357.

Bannister, D. B., Kinicki, A. J., Denisi, A. S., & Horn, P. W. (1987). A new method for the statistical control of rating error in performance ratings. Educational and Psychological Measurement, 47, 583-596.

Becker, C. A. (1980). Semantic context effects in visual word recognition: An analysis of semantic strategies. Memory and Cognition, 8, 336-345.

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.

Conger, A. J. (1974). A revised definition of suppressor variables: A guide to their identification and interpretation. Educational and Psychological Measurement, 34, 35-46.

Conger, A. J., & Jackson, D. J. (1972). Suppressor variables, prediction, and the interpretation of psychological relationships. Educational and Psychological Measurement, 32, 579-599.

Darlington, R. B. (1968). Multiple regression in psychological research and practice. Psychological Bulletin, 69, 161-182.

Forster, K. I. (1979). Levels of processing and the structure of the language processor. In W. E. Cooper & E. C. T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 27-85). Hillsdale, NJ: Erlbaum.

Fricke, B. G. (1956). Response set as a suppressor variable in the OAIS and MMPI. Journal of Consulting Psychology, 20, 161-169.

Fulkerson, S. C. (1958). An acquiescence key for the MMPI (Rep. No. 58-71). Randolph Air Force Base, TX: USAF School of Aviation Medicine.

Ghiselli, E. E., Campbell, J. P., & Zedeck, S. (1981). Measurement theory for the behavioral sciences. San Francisco: Freeman.

Goldberg, L. R., Rorer, L. G., & Green, M. M. (1970). The usefulness of "stylistic" scales as potential suppressor or moderator variables in predictions from the CPI. Oregon Research Institute Research Bulletin, 10(3).

de Groot, A. M. B. (1984). Primed lexical decision: Combined effects of the proportion of related prime-target pairs and the stimulus-onset asynchrony of prime and target. Quarterly Journal of Experimental Psychology, 36A, 253-280.

Hannah, J. E., & Morrisey, C. (1987). Correlates of psychological hardiness in Canadian adolescents. Journal of Social Psychology, 127, 339-344.

Henik, A., & Tzelgov, J. (1985). Control for halo error: A multiple regression approach. Journal of Applied Psychology, 70, 577-580.

Holling, H. (1983). Suppressor structures in the general linear model. Educational and Psychological Measurement, 43, 1-9.

Holzbach, R. (1978). Rater bias in performance ratings: Superior, self and peer ratings. Journal of Applied Psychology, 63, 579-588.

Horst, P. (1941). The role of predictor variables which are independent of the criterion. Social Science Research Council Bulletin, 48, 431-436.

Landy, F. L., Vance, R. L., Barnes-Farrell, J. L., & Steele, J. W. (1980). Statistical control of halo error in performance ratings. Journal of Applied Psychology, 65, 501-506.

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

Lubin, A. (1957). Some formulae for use with suppressor variables. Educational and Psychological Measurement, 17, 286-296.

McFatter, R. M. (1979). The use of structural equation models in interpreting regression equations including suppressor and enhancer variables. Applied Psychological Measurement, 3, 123-135.

McNemar, Q. (1962). Psychological statistics. New York: Wiley.

Meehl, P. E. (1945). A simple algebraic development of Horst's suppressor variables. American Journal of Psychology, 58, 550-554.

Meyer, D. E., & Schvaneveldt, R. (1971). Facilitation in recognizing pairs of words: Evidence of a dependence between retrieval operations. Journal of Experimental Psychology, 90, 227-234.

Murphy, K. R. (1982). Difficulties in the statistical control of halo. Journal of Applied Psychology, 67, 161-164.

Neely, J. H. (1976). Semantic priming and retrieval from lexical memory: Evidence for facilitatory and inhibitory processes. Memory and Cognition, 4, 648-654.

Neely, J. H., Keefe, D. E., & Ross, K. L. (1986, November). Retrospective postlexical processes produce the proportion effects in semantic priming. Paper presented at the meeting of the Psychonomic Society, New Orleans, LA.

Orr, E. (1986). Open communication as an effective stress management method for breast cancer patients. Journal of Human Stress, 12, 175-185.

Orr, E. (1987). Coping with breast cancer: Patients' communication with regard to their illness. The Family Physician, 14, 255-267.

Pedhazur, E. J. (1982). Multiple regression in behavioral research: Explanation and prediction. New York: Holt, Rinehart & Winston.

Stern, I., & Tzelgov, J. (1978). Comments on two statements about three-variate multiple regression. Psychological Reports, 43, 687-690.

Tzelgov, J. (1988). Why should partialling out leniency improve performance ratings, or should it? Unpublished manuscript, Department of Behavioral Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel.

Tzelgov, J., & Henik, A. (1981). On the difference between Conger's and Velicer's definitions of suppression. Educational and Psychological Measurement, 41, 1027-1031.

Tzelgov, J., & Henik, A. (1985). A definition of suppression situations for the general linear model: A regression weights approach. Educational and Psychological Measurement, 45, 281-284.

Tzelgov, J., & Stern, I. (1978). Relationships between variables in a three variable linear regression and the concept of suppressor. Educational and Psychological Measurement, 38, 325-335.

Velicer, W. F. (1978). Suppressor variables and the semipartial correlation coefficient. Educational and Psychological Measurement, 38, 953-958.

Weber, J. E., & Monarchi, D. E. (1978). Graphical representation of the effects of multicollinearity. Decision Sciences, 8, 534-546.

Received December 18, 1989
Revision received August 10, 1990
Accepted September 6, 1990