Copyright 1993 by the American Psychological Association, Inc. 0021-9010/93/$3.00

Journal of Applied Psychology 1993, Vol. 78, No. 4, 569-582

Comparing Correlations Based on Individual-Level and Aggregated Data

Cheri Ostroff

Researchers are often interested in comparing correlations between variables at different levels of analysis (e.g., individual and organizational) to determine if the same relationship holds across the levels. A special situation emerges when correlations at higher levels are based on aggregated data. This article contains an analysis of the nature of the relationship between correlations based on individual-level data and correlations based on aggregated data from individuals. In particular, the conditions under which differences between individual correlations and correlations based on aggregates represent statistical artifacts or meaningful differences are explored.

I am grateful to David Harrison, Steve Kozlowski, and Paul Sackett for their helpful comments on an earlier version of this article. Thanks also to two anonymous reviewers for insightful comments and useful feedback on the article. Correspondence concerning this article should be addressed to Cheri Ostroff, Industrial Relations Center, University of Minnesota, 549 Management and Economics Building, 271 19th Avenue South, Minneapolis, Minnesota 55455-0430.

In recent years, levels-of-analysis issues and understanding relationships between levels (e.g., individual, group, and organizational) have become important themes in organizational research (e.g., Dansereau, Alutto, & Yammarino, 1984; Dansereau & Markham, 1987; Glick, 1985; Glick & Roberts, 1984; James, 1982; Mossholder & Bedeian, 1983; Roberts, Hulin, & Rousseau, 1978). As a result, some researchers have begun to hypothesize that stronger relationships between variables may be found at higher levels of analysis. For example, Schneider (1985) suggested that research is needed to assess relationships at the group or organizational level in many areas that have traditionally been studied at the individual level, such as motivation and leadership, leadership and organizational performance, and absenteeism and attitudes.

Comparisons of relationships between variables at different levels of analysis (e.g., individual and organizational) necessitate collection of data at each of the different levels. Oftentimes, researchers do not have a global index of the organizational variables of interest. Hence, they rely on aggregated (or averaged) data from individuals to represent the organizational-level variable (Roberts et al., 1978). Studies using aggregated data to represent organizational characteristics have often shown stronger correlations at the organizational level compared with the individual level. Stronger correlations at the group or organizational level (relative to individual-level relationships) have been found between climate perceptions and context, structure, and demographic variables (Jones & James, 1979); between job attitudes and stress (Schmitt, Colligan, & Fitzgerald, 1980); between commitment and turnover intentions (Angle & Perry, 1981); and between satisfaction and performance (Ostroff, 1992).

Work has been directed at describing and formulating correlations based on aggregates and at describing fallacies when using aggregated data (e.g., Duncan, Cuzzort, & Duncan, 1961; Firebaugh, 1978, 1979; Glick & Roberts, 1984; Hannan, 1971; Hannan & Burstein, 1974). Furthermore, Dansereau and his colleagues (Dansereau et al., 1984; Dansereau & Markham, 1987) have provided a framework for formulating theories and analyzing data across multiple levels of analysis.

One question that has not been fully addressed is why the correlations at the organizational level (correlations based on aggregated data) are higher than those at the individual level. Is it because of statistical artifacts, elimination of error variances, biased estimates, and the like? Or does a higher correlation at the organizational level indicate that something different is happening at this higher level than at the individual level, and hence there is some organizational effect?

My purpose in the research for this article was to describe the nature of the relationship between organizational and individual correlations when the organizational characteristics are represented by aggregated individual data. The relationship between correlations across levels depends on individual variation within organizations, the correlation within organizations, and the degree to which individual variation within organizations differs for the two variables investigated. Furthermore, the organizational-level correlation can be stronger than, weaker than, or equal to the individual correlation, depending on these factors. I show that failure to consider random error and measurement error can result in erroneous interpretations about the strength of relationships among variables at different levels of analysis. I also discuss the conditions under which organizational-level correlations are of greater magnitude than individual correlations and some of the factors that might result in spurious organizational-level correlations.
I was not concerned with the issue of whether or not to aggregate; rather, my purpose was to show what results to expect and how to interpret these results once the decision to aggregate and make cross-level comparisons has been made. The analyses presented here provide a basis on which researchers can determine whether differences between individual- and organizational-level correlations (based on aggregates) represent statistical artifacts or substantively meaningful differences in the relationships observed at different levels of analysis. I use the term organizational correlation here to refer to a correlation based on aggregated individual data that represent organizational characteristics; however, the same analyses can be applied to any correlation based on aggregated data.

Aggregation Issues

The assumption underlying the use of aggregates to represent a higher level or organizational characteristic is that the aggregated variable represents another form of the construct at a higher level of analysis (Rousseau, 1985). Several authors have discussed the question of whether and when it is appropriate to aggregate data and have proposed rules for judging whether the aggregated variable is a good measure of an organizational-level construct (see Glick, 1985, 1988; James, 1982; James, Demaree, & Wolf, 1984; James, Joyce, & Slocum, 1988; Roberts et al., 1978, for details of these issues). Researchers have described fallacies associated with using aggregated data (cf. Hannan, 1971; James, 1982; Roberts et al., 1978). In general terms, the fallacy of the wrong level occurs when correlations at a more macro level are used to make inferences about individuals, or vice versa. As shown by Robinson (1950), a macro-, or aggregated, correlation and an individual correlation cannot always be equated. Hence, researchers have been warned not to assume that a relationship among variables at one level represents the same relationship at another level of analysis.

Relationships Among Variables

Recently, many researchers and theorists have proposed the general hypothesis that the correlation between two variables will be stronger at the organizational level than at the individual level. This notion is somewhat vague as to the theoretical and conceptual underpinnings of the relationships. At least two interpretations of this hypothesis can be made. One interpretation is to expect homology across levels of analysis. Homology exists when the relationship between two variables is the same at the individual and organizational levels of analysis, assuming the constructs are appropriate at higher levels of conceptualization.
The same processes operate at both levels of analysis, and linkages among the individual data reflect the same constructs as those same linkages at the organizational level. Alternatively, it can be hypothesized that different processes operate at the two levels or that the aggregated data reflect linkages among different constructs than those at the individual level. For example, in a study of individual absence, Mathieu and Kohler (1990) found that mean absence for the group significantly predicted subsequent individual absence beyond that accounted for by previous individual absence. The assumption was that the mean absence for the group represented an absence culture at the organizational level. This shared perception of absence influences individual responses beyond that accounted for by individual difference factors. Similarly, the differences in the relationship between variables for the individual and organizational levels may result from interdependence (Glick & Roberts, 1984). For example, a stronger correlation may result at the organizational level for the relationship between satisfaction and performance than at the individual level, because at the organizational level the effects of interdependence are captured in the organizational-level variables. That is, when individuals in general are more satisfied, organizational performance may be enhanced through more cooperation, collaborative effort, better communication, participation, and mutual trust (Ostroff, 1992). It is important to denote whether similar or different processes are operating at the different levels. With homology, similar correlations at both levels are expected, whereas with different processes, the organizational correlation may be stronger (or weaker) than the individual correlation.

Components of Individual and Organizational Correlations

Suppose one is interested in the correlation between a pair of variables, x (technology) and y (satisfaction), at the individual level, and that individuals are also grouped into j organizations. For each organization, individual scores on x are aggregated (averaged) to form variable u, and individual scores on y within each organization are aggregated to form variable v. Technology is assessed by asking individuals their perceptions of the amount of standardization of tasks performed. The organizational-level technology (u) is taken as the aggregated individual perceptions of technology within each organization. Satisfaction is assessed by asking individuals to report their level of job satisfaction, and organizational-level satisfaction (v) is computed by taking the aggregate of satisfaction across individuals within each organization. One could compare the correlation between technology and satisfaction (x and y) to the organizational-level correlation between technology and satisfaction (u and v), remembering that u and v are based on aggregates of x and y.

Expanding on the work of Robinson (1950), researchers have derived population formulations for describing the components of the individual and organizational correlations (cf. Dansereau et al., 1984; Duncan et al., 1961; Hannan, 1971). The components of these population formulations are presented below:

$x$: value of variable x for individual i in organization j
$y$: value of variable y for individual i in organization j
$u$, or $m_x$: aggregated value of x for all individuals in organization j
$v$, or $m_y$: aggregated value of y for all individuals in organization j
$i_x$: individual deviation from the mean of their organization for variable x $(x - m_x)$
$i_y$: individual deviation from the mean of their organization for variable y $(y - m_y)$
$\sigma_x^2$: total variance of x across all individuals
$\sigma_y^2$: total variance of y across all individuals
$\sigma_u^2$: variance between for x: variance of the aggregate u scores
$\sigma_v^2$: variance between for y: variance of the aggregate v scores
$\sigma_{i_x}^2$: variance within for x: variance of individual deviations from the mean of their organization on x
$\sigma_{i_y}^2$: variance within for y: variance of individual deviations from the mean of their organization on y

Furthermore, by definition

$$x = u + i_x$$

an individual score for x is the sum of the mean for the organization to which the individual belongs and the individual's deviation from the mean.

$$y = v + i_y$$

an individual score for y is the sum of the mean for the organization to which the individual belongs and the individual's deviation from the mean.

$$\sigma_x^2 = \sigma_u^2 + \sigma_{i_x}^2$$

total variance for x is equal to the variance between plus the variance within for x.

$$\sigma_y^2 = \sigma_v^2 + \sigma_{i_y}^2$$

total variance for y is equal to the variance between plus the variance within for y.

Making the assumptions that $i_x$, $u$, and $v$ and $i_y$, $u$, and $v$ are independent yields

$$\rho_{xy} = \frac{\operatorname{covar}(u, v) + \operatorname{covar}(i_x, i_y)}{\sigma_x \sigma_y} \tag{1}$$

where covar indicates covariance, and $\rho_{xy}$ is the correlation between x and y for all individuals. Next,

$$\rho_{uv} = \frac{\operatorname{covar}(u, v)}{\sigma_u \sigma_v} \tag{2}$$

where $\rho_{uv}$ is the organizational correlation, that is, the correlation between aggregate x and y (u and v) for organizations. A major difference between the individual and organizational correlations is that individual variation from the mean organizational scores ($i_x$ and $i_y$) is not contained in the organizational-level correlation. Furthermore,

$$\rho_{xu} = \frac{\sigma_u}{\sigma_x} \tag{3}$$

$$\rho_{yv} = \frac{\sigma_v}{\sigma_y} \tag{4}$$

$$\rho_{xy|uv} = \frac{\operatorname{covar}(i_x, i_y)}{\sigma_{i_x} \sigma_{i_y}} \tag{5}$$

where $\rho_{xu}$ and $\rho_{yv}$ are the correlations between the individual scores and the aggregate scores for x and for y, and $\rho_{xy|uv}$ is the correlation within organizations. Combining Equations 1-5, the organizational correlation is

$$\rho_{uv} = \frac{\rho_{xy} - \sqrt{1 - \rho_{xu}^2}\,\sqrt{1 - \rho_{yv}^2}\,\rho_{xy|uv}}{\rho_{xu}\,\rho_{yv}} \tag{6}$$

and, rearranging terms, the individual correlation is

$$\rho_{xy} = \rho_{xu}\,\rho_{yv}\,\rho_{uv} + \sqrt{1 - \rho_{xu}^2}\,\sqrt{1 - \rho_{yv}^2}\,\rho_{xy|uv} \tag{7}$$

and the correlation within is

$$\rho_{xy|uv} = \frac{\rho_{xy} - \rho_{xu}\,\rho_{yv}\,\rho_{uv}}{\sqrt{1 - \rho_{xu}^2}\,\sqrt{1 - \rho_{yv}^2}} \tag{8}$$

Using Equations 1-8, one can examine the conditions under which the organizational correlation will be higher than, lower than, or equal to the individual correlation by focusing on the various components that make up the correlations. First, the relationship between the individual and organizational correlation can be examined when $\rho_{xu} = \rho_{yv}$. That is, the ratio of the variance within to the total variance for x is the same as the ratio of the variance within to the total variance for y. This ratio represents how much individual variation there is within an organization relative to the total variation across all individuals and organizations. The simplification of equal-variance ratios is made for several reasons. First, for many variables, there is no reason to assume a priori that the ratio of the variance within to total variance for one variable should be different than that for another variable. Second, as I show later, only large differences in variance-within ratios for the two variables have a substantial impact on the relationship between the individual and organizational correlations. Finally, because the analyses are fairly complex, this simplification makes it easier to see how the relationship between the individual and organizational correlations operates.

It is also assumed that mean differences across organizations exist. As an example, it would be assumed that the mean scores for technology and satisfaction differ by organization. Without mean differences, there would be no organizational correlation. Given these assumptions, Equation 6 for the organizational correlation reduces to

$$\rho_{uv} = \frac{\rho_{xy} - \dfrac{\sigma_w^2}{\sigma_t^2}\,\rho_{xy|uv}}{1 - \dfrac{\sigma_w^2}{\sigma_t^2}} \tag{9}$$

where $\sigma_w^2/\sigma_t^2$ represents the common variance-within ratio (variance within to total variance) for x and y, $\sigma_w^2/\sigma_t^2 = \sigma_{i_x}^2/\sigma_x^2 = \sigma_{i_y}^2/\sigma_y^2$. Dividing both sides by $\rho_{xy}$ gives

$$\frac{\rho_{uv}}{\rho_{xy}} = \frac{1 - \dfrac{\sigma_w^2}{\sigma_t^2}\,\dfrac{\rho_{xy|uv}}{\rho_{xy}}}{1 - \dfrac{\sigma_w^2}{\sigma_t^2}} \tag{10}$$

which indicates the ratio of the organizational correlation to the individual correlation. Equation 10 can be used to clarify the relationship between the organizational correlation and the individual correlation. In sociology, it has long been known that correlation coefficients increase as the size of the units used increases (Hannan, 1971). Studies have shown that as individuals are grouped into successively larger units on the basis of their areal proximity, the correlation of the aggregate scores increases. The explanation for this effect is that as smaller aggregates (e.g., states) are combined into larger units (e.g., geographical regions), the correlation within increases because of the increasing heterogeneity of the more inclusive aggregates. The values of variance within for x and for y increase (or those for variance between for x and for y decrease) as a result of the decrease in the heterogeneity of the means. Together, these produce a higher correlation for the aggregate scores relative to the correlation based on individuals. Unfortunately, this conclusion is not as straightforward as it appears, especially when one considers some organizational applications. In organizational research, one must consider that individuals are not randomly grouped into organizations, that measurement procedures are not perfectly reliable, and that macroorganizational characteristics can profoundly affect the extent to which individuals vary within an organization. Depending on the variables of interest, individuals within an organization may be more or less similar to each other. For example, a group of individuals within an organization may be fairly homogeneous in ability level because of the organization's


selection procedures. Across organizations, however, ability levels may vary widely. Likewise, the variability of individual performance within an organization may be fairly homogeneous because of situational constraints and reward practices, but performance may vary widely across organizations. The relationship between the organizational correlation and individual correlation depends not only on the variance within relative to the total variance, but also on one's definitions of error variance, the correlation within, and the effect the organization has on these components. Thus, the organizational-level correlation can be higher than, lower than, or equal to the individual-level correlation. Figure 1 shows the ratio of the organizational correlation to the individual correlation (when the ratio of the correlation within to the individual correlation is less than 1, greater than 1, and negative and less than 1) for the various levels of variance ratios (within variance to total variance). Also note that the variance-between ratio is 1 minus the variance-within ratio (variance within divided by total variance). To further understand how the components of the individual and organizational correlation operate, consider Figures 2 and 3. In both figures, the individual and organizational correlations are positive. However, in Figure 2, the correlation within is positive, whereas in Figure 3 the correlation within and the ratio of the correlation within to the individual correlation are negative and less than 1. I use these figures, along with the equations presented above, to help illustrate in the following sections the nuances of the relationship between the organizational and individual correlations.
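As an illustration only, the decomposition of the individual correlation into between and within components (Equation 7) and the ratio in Equation 10 can be checked numerically. The Python sketch below is not part of the original analysis: the function names, the simulated data, and the use of NumPy are assumptions made for illustration. Variances are computed in population form, and the organizational correlation is computed over individuals, so each organization is weighted by its size; with equal-sized organizations this equals the correlation across organization means.

```python
import numpy as np


def correlation_components(x, y, org):
    """Return the individual, organizational, and within correlations.

    x, y : individual scores; org : organization label per individual.
    Names and structure are illustrative assumptions, not the author's code.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    _, idx = np.unique(np.asarray(org), return_inverse=True)
    idx = idx.ravel()
    size = np.bincount(idx)
    # u and v: organization means, repeated for every member of the organization
    u = (np.bincount(idx, weights=x) / size)[idx]
    v = (np.bincount(idx, weights=y) / size)[idx]
    ix, iy = x - u, y - v  # individual deviations from organization means
    return {
        "r_xy": np.corrcoef(x, y)[0, 1],        # individual correlation
        "r_uv": np.corrcoef(u, v)[0, 1],        # organizational correlation
        "r_within": np.corrcoef(ix, iy)[0, 1],  # correlation within
        "r_xu": u.std() / x.std(),              # sigma_u / sigma_x
        "r_yv": v.std() / y.std(),              # sigma_v / sigma_y
    }


def individual_from_components(c):
    """Equation 7: between part plus within part of the individual correlation."""
    between = c["r_xu"] * c["r_yv"] * c["r_uv"]
    within = (np.sqrt(1 - c["r_xu"] ** 2) * np.sqrt(1 - c["r_yv"] ** 2)
              * c["r_within"])
    return between + within


def org_to_individual_ratio(within_ratio, within_over_individual):
    """Equation 10, assuming equal variance-within ratios for x and y.

    within_ratio : variance within divided by total variance.
    within_over_individual : correlation within divided by individual correlation.
    """
    return (1 - within_ratio * within_over_individual) / (1 - within_ratio)


if __name__ == "__main__":
    # Simulated data (hypothetical): 40 organizations of 25 individuals each.
    rng = np.random.default_rng(0)
    org = np.repeat(np.arange(40), 25)
    org_x = rng.normal(size=40)
    org_y = 0.6 * org_x + rng.normal(scale=0.8, size=40)
    x = org_x[org] + rng.normal(size=org.size)
    y = org_y[org] + rng.normal(size=org.size)

    c = correlation_components(x, y, org)
    print(c)
    print("Equation 7 reproduces r_xy:",
          np.isclose(individual_from_components(c), c["r_xy"]))
    for w in (0.1, 0.5, 0.9):
        print(w, [round(org_to_individual_ratio(w, k), 2)
                  for k in (-0.5, 0.5, 1.0, 2.0)])
```

Under these assumptions, the ratio returned by `org_to_individual_ratio` equals 1 whenever the correlation within equals the individual correlation, exceeds 1 when the correlation within is smaller than the individual correlation, and falls below 1 (and can change sign) when it is larger, which is the pattern the text attributes to Figure 1.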

Variance Within

Some have argued that the organizational-level characteristic can be represented by the aggregated variable when people within the organization respond in a similar fashion (e.g., James, 1982); in other words, that the variance within organizations is low relative to total variation. Given the previous technology-satisfaction example, suppose that the majority of jobs within organizations are similar, but not identical, in their levels of task standardization. Across organizations, there are fairly large differences in technology. Most jobs in one organization may be highly automated and the organization characterized as high technology because most individuals describe their jobs as highly automated, but some jobs may not be as highly automated. A similar situation exists for satisfaction, in that most people within the organization have a similar level of satisfaction but satisfaction levels differ across organizations. The variance-within ratios are fairly small. It can be seen in Figure 1 that when the variance within is very low (

variance-within ratio (or the variance-between ratio), ranged from 0 to 0.5, with a median of approximately 0.12. When the variance-between ratio is much lower (i.e., when the variance-within ratio is 0.5 or higher), the organizational correlation can be much higher, much lower, or equal to the individual correlation, depending on the view of error variance and the ratio of the within correlation to the total correlation. Issues related to this are considered in the following sections.

Figure 1. Ratio of organizational correlation to individual correlation in relationship to the ratio of the correlation within to the individual correlation and the ratio of the variance within to the total variance, when variance-within ratios for x and y are equal. (Horizontal axis: ratio of variance within to total variance for x and y, 0.1 to 0.9. Panel title: Equal Variance-Within Ratios.)

Figure 2. Situation in which correlation within, individual correlation, and organizational correlation are positive. (Ovals represent the scores for individuals within one organization; dots within ovals represent aggregate organizational scores. The solid line within an oval represents the best fitting straight line for the correlation within; the dotted line represents the best fitting straight line for the organizational correlation; the solid line across ovals represents the line for the individual correlation.)

Error Variance
