Personal Relationships, 9 (2002), 327–342. Printed in the United States of America. Copyright © 2002 ISSPR. 1350-4126/02
Estimating actor, partner, and interaction effects for dyadic data using PROC MIXED and HLM: A user-friendly guide
LORNE CAMPBELL^a AND DEBORAH A. KASHY^b
^a Simon Fraser University; and ^b Michigan State University
Abstract Data collected from both members of a dyad provide abundant opportunities as well as data analytic challenges. The Actor-Partner Interdependence Model (APIM; Kashy & Kenny, 2000) was developed as a conceptual framework for collecting and analyzing dyadic data, primarily by stressing the importance of considering the interdependence that exists between dyad members. The goal of this paper is to detail how the APIM can be implemented in dyadic research, and how its effects can be estimated using hierarchical linear modeling, including PROC MIXED in SAS and HLM (version 5.04; Raudenbush, Bryk, Cheong, & Congdon, 2001). The paper describes the APIM and illustrates how the data set must be structured to use the data analytic methods proposed. It also presents the syntax needed to estimate the model, indicates how several types of interactions can be tested, and describes how the output can be interpreted.
Lorne Campbell is now at the University of Western Ontario. Correspondence should be addressed to Lorne Campbell, Social Services Centre, University of Western Ontario, London, ON, N6A 5C2, Canada, or Deborah A. Kashy, Department of Psychology, Michigan State University, East Lansing, MI, 48824, e-mail: [email protected].

People involved in dyadic relationships (or even brief dyadic interactions) often influence each other’s cognitions, emotions, and behaviors. This notion is certainly applicable to romantic relationships, where the potential for mutual influence may be the quintessential feature of closeness in relationships (Kelley et al., 1983). For instance, virtually all major theories of romantic relationships acknowledge the concept of interdependence, including theories of equity (Messick & Crook, 1983; Walster, Walster, & Berscheid, 1978), commitment (Rusbult, 1980), trust (Rempel, Holmes, & Zanna, 1985), interdependence (Kelley & Thibaut, 1978; Thibaut & Kelley, 1959), and attachment (Bowlby, 1969, 1973, 1980). Mutual influence is also germane to other types of dyadic relationships (e.g., friendships and parent-child relationships). One consequence of interdependence is that the attributes and behaviors of one dyad member can impact the outcomes of the other dyad member.

Relationships researchers have struggled with ways to analyze dyadic data. Although some relationships researchers continue to analyze dyadic data by ignoring the interdependence and simply analyzing the data as if they were derived from a set of individuals, most relationships researchers are cognizant of the problems inherent in such an approach (i.e., biased significance tests; Kenny, 1995). Concern about nonindependence has led many researchers who study dating or married couples to conduct separate analyses for men and women. This circumvents the nonindependence issue, but it produces its own problems. Perhaps the most substantial problem with separate analyses for men and women is the implicit assumption that gender is an important factor, and that differences between men and women exist. All too often when researchers find different prediction equations for men and women, they interpret their results as implying that there are significant differences between
men and women. Finding that a variable significantly predicts outcomes for men but not for women does not necessarily indicate that the relationship between the variable and the outcome differs significantly between men and women. In addition, if dyad members cannot be readily distinguished by some variable such as gender (as would be the case with same-sex friends or homosexual couples), data analytic issues become even more challenging because the assignment of dyad members into two groups (as is done in the heterosexual case) would be arbitrary. Kenny and colleagues (Kashy & Kenny, 2000; Kenny, 1988; 1990; 1996; Kenny & Cook, 1999) have proposed a model of dyadic data analysis, the Actor-Partner Interdependence Model (APIM), that uses the dyad as the unit of analysis, and that allows for, but does not require, gender interactions. It can be used for dyads that have distinguishable members (e.g., married or dating heterosexual couples) or for dyads that have nondistinguishable members (e.g., same-sex friends). This model suggests that a person’s independent variable score affects both his or her own dependent variable score (known as the actor effect), and his or her partner’s dependent variable score (known as the partner effect). The partner effect from the APIM directly models the mutual influence that may occur between individuals involved in a dyadic relationship. The connections between one partner’s activities or qualities and the other partner’s outcomes (i.e., a partner effect) are, to some degree, what defines a close relationship. For instance, as Kelley et al. (1983, p. 24) note, ‘‘A prominent feature of a ‘relationship’ is that events associated with one person are causally connected to those associated with the other person. 
Indeed, this is a necessary feature of ‘relationship’ as we define it.’’ Therefore, in order to fully understand relationship processes, research that uncovers the interdependent nature of close relationships needs to be conducted. To facilitate this research, statistical tools capable of properly assessing mutual influence in relationships need to be available and implemented. Some relationships research is beginning to test for and report partner effects. For example,
Murray, Holmes, and Griffin (1996a, 1996b) reported that people are generally more satisfied in their relationships when their partners perceive them in a particularly rosy fashion (a partner effect). Campbell, Simpson, Kashy, and Fletcher (2001) demonstrated that people are happier in their relationships when they perceive their partners as more closely matching their image of an ideal partner (an actor effect), and also when they more closely match their partners’ image of an ideal partner (a partner effect). Also, Robins, Caspi, and Moffitt (2000) illustrated how both partners’ personality traits shape the quality of their relationships. The partner effects reported in each of these studies are of particular interest as they model the interdependence that exists in close relationships, and expand theory in novel and interesting ways. Several authors have suggested ways to estimate Actor and Partner effects. Kashy and Kenny (2000) proposed a pooled regression approach in which the results of two regressions are combined to estimate the APIM effects. Gonzalez and Griffin (1999) suggested an approach that uses structural equation modeling. Each of these two approaches, however, has drawbacks. The pooled-regression approach can be computationally cumbersome and is fairly inflexible for testing restricted models— for example, it does not allow the researcher to specify that some variables only have actor effects whereas other variables only have partner effects. Structural equation modeling, on the other hand, is best suited for testing linear effects. However, interactions between both discrete and continuous variables are often of interest to researchers. 
Although many methods have been suggested to test for interaction and nonlinear effects in structural equation modeling (see Schumacker & Marcoulides, 1998, for a review), these methods are not very effective when many interactions are tested in the same model, and when interactions involving more than two variables are included in the model. The purpose of this paper is to provide a detailed description of how to use two multilevel modeling programs, specifically PROC MIXED in SAS and HLM (Raudenbush, Bryk, Cheong, & Congdon, 2001) to analyze dyadic data using the APIM. Both of these programs
can be used for multilevel modeling (also known as hierarchical linear modeling; see Singer, 1998, for an excellent introduction to PROC MIXED) and, in the dyadic case we will describe, both treat the data from two dyad members as nested scores within a group that has an n = 2. Both programs provide estimates of actor and partner effects without requiring additional computations by investigators. Both programs also allow researchers to test the relative adequacy of various restricted models and both are well suited for the testing of interaction terms. Kenny and Cook (1999) briefly described how to use PROC MIXED to estimate actor and partner effects. However, because their paper did not provide detailed step-by-step information on how the procedure can be implemented, we feel that a paper that supplies a more detailed explanation will be useful for relationships researchers. Additionally, no detailed explanation of how to use HLM to estimate actor and partner effects is currently available. We hope that the step-by-step approach we take here will motivate (and help) researchers to adopt this data analytic model.

Types of variables in dyadic research

Before introducing the data analytic approaches, we must first introduce some definitions. There are three types of predictor variables in dyadic research: between-dyads variables, within-dyads variables, and mixed variables (Kenny, 1988, 1996). A between-dyads variable is one for which scores are the same for both members of a given dyad, but differ from dyad to dyad. An example would be a study in which some dyads are randomly assigned to one experimental condition and other dyads are randomly assigned to a different condition. Another between-dyads variable might be length of marriage. In contrast, a within-dyads variable is one for which the scores for partners within each dyad are different, but the average score is the same for all dyads.
In research involving heterosexual couples, gender is an example of a within-dyads variable. Another within-dyads variable might be the percentage of childcare done by each member of a couple (assuming that percentages add to 1.0 across the couple members).
A mixed predictor variable is one for which there is variation both within dyads and between dyads. Attachment avoidance is an example of a mixed variable because some people are more avoidant than others, and the average level of avoidance within a couple differs across different couples. Actor and partner effects can be directly estimated for mixed predictor variables only.

Interactions

It is important to note that although actor and partner effects cannot be estimated for purely between-dyads variables or purely within-dyads variables, they can be estimated for interactions between mixed variables and between- or within-dyads variables. Thus, we can estimate whether actor and partner effects are stronger for men or women. Similarly, we can test whether actor and partner effects differ across experimental conditions. In addition to interactions between actor effects or partner effects on mixed variables and between- or within-dyads variables, there may be interactions between actor and partner effects on the mixed variables. Kenny and Cook (1999) suggest several forms that such interactions may take. It may be that a person’s outcome is uniquely predicted by the combination of his or her standing on the mixed predictor variable and his or her partner’s standing on that variable. For example, a secure individual with an avoidant partner may be reasonably satisfied with the relationship, but an avoidant individual with a secure partner may not. Similarity interactions can also be estimated by the absolute value of the difference between dyad members’ scores on the mixed variable. Perhaps being especially similar (or dissimilar) on a variable is beneficial to the relationship. The multiplicative and absolute difference methods of calculating actor-partner interactions will yield highly correlated results. Kenny and Cook (1999) suggest that the researcher choose between the two methods based on the theoretical perspective adopted for the study.
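The two forms of actor-partner interaction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the security scores, and the sample mean are hypothetical and do not come from the example study.

```python
def actor_partner_terms(actor, partner, grand_mean):
    """Return (product term, absolute difference) for one person's record.

    Both scores are centered on the same grand mean before the product
    term is formed; the absolute difference is unaffected by centering.
    """
    a = actor - grand_mean
    p = partner - grand_mean
    return a * p, abs(a - p)


# Hypothetical security scores for the two members of one couple.
security = {"X": 6.0, "Y": 3.0}
grand_mean = 4.0  # hypothetical mean across all individuals in the sample

x_terms = actor_partner_terms(security["X"], security["Y"], grand_mean)
y_terms = actor_partner_terms(security["Y"], security["X"], grand_mean)
print(x_terms, y_terms)
```

Because both terms are symmetric in the two scores, each takes the same value for both members of a dyad, so once computed they behave like between-dyads variables.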
Another specification of a unique combinatorial effect requires that only one partner’s score serve as the interaction term because
sometimes only one person in a relationship needs to have a certain skill or deficit for differences between couples to emerge. For example, a couple with one highly depressed member is likely to evidence lower levels of satisfaction than a couple in which neither partner is depressed. We will discuss how various forms of interactions can be tested with both the PROC MIXED and HLM programs.

Example data set

Consider a hypothetical study of 16 heterosexual dating couples. In this study couples are asked to discuss either a major or minor problem in their relationship. Eight couples are randomly assigned to each condition. The interactions are videotaped, and trained observers rate how emotionally withdrawn each person is during the discussion. Prior to the discussion each partner answers questionnaires designed to assess attachment security. This fictitious study has three types of predictor variables and one outcome variable. Attachment security is a mixed predictor variable (it varies both within and between dyads), gender is a within-dyads predictor variable, and experimental condition (major or minor problem) is a between-dyads predictor variable. Emotional withdrawal is the outcome variable.

Estimating Actor and Partner Effects Using PROC MIXED in SAS

Structuring the data set

To use PROC MIXED, the data set needs to be arranged so that each individual’s outcome score is associated with his or her own predictor scores as well as with his or her partner’s predictor scores. Thus, there will be two lines of data for each couple. If the dyads are distinguishable, for example if each couple has a man and a woman, we would advise keeping the data entry consistent such that the first line for each couple contains the man’s
outcome scores and the second line contains the woman’s outcome scores. This type of approach allows greatest data analytic flexibility, although it is not necessary for the analyses we describe in this paper. If the dyads are nondistinguishable, order is irrelevant. To make the presentation of the data input structure easier, consider a couple in which there is a person X and another person Y. Each line of data would contain a variable that identifies dyad membership (ID). This number would be the same for both dyad members. After the value for ID, Person X’s outcome score(s) would be entered, followed by Person X’s predictor scores on any mixed variables, between-dyads variables, or within-dyads variables. On the same line, following Person X’s predictor scores, would be Person Y’s scores on all of the predictor variables. The second input line for the couple would contain the couple ID variable, Person Y’s outcome variable score(s) followed by Person Y’s predictor scores, followed finally by Person X’s predictor scores. For the example data set, the two lines of data would appear as shown in Table 1, where ID represents the couple identification number, DV the dependent variable (emotional withdrawal), Mixed IV the mixed predictor variable (attachment security), GEN the gender, and COND the experimental condition (discussion of a major or minor problem). Note that the value of COND will be equal across the two dyad members since it is a between-dyads variable in the example (see the Appendix for the structure of the example data set). Effect coding should be used for categorical variables. Thus, for gender men might be coded as 1 and women as −1, and so if person X is a man and Y is a woman the value of X_GEN should be 1 and the value for Y_GEN should be −1. Additionally, because there are two conditions in this study, the value of COND would be either 1 for both dyad members or −1 for both dyad members.
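The restructuring described above can be illustrated with a short Python sketch that turns one record per person into one actor/partner record per person. It is a minimal illustration, not the authors' procedure: the function name, the input keys, and the scores are hypothetical, and the output keys simply mirror the variable names used in the SAS input statement.

```python
def to_pairwise(people):
    """Build two actor/partner records per dyad from one record per person."""
    dyads = {}
    for person in people:
        dyads.setdefault(person["id"], []).append(person)

    rows = []
    for dyad_id, members in dyads.items():
        if len(members) != 2:
            raise ValueError(f"dyad {dyad_id} does not have exactly two members")
        first, second = members
        # Each member appears once as the actor, paired with the other as partner.
        for actor, partner in ((first, second), (second, first)):
            rows.append({
                "ID": dyad_id,
                "WDRAW": actor["wdraw"],      # actor's outcome
                "ASECURE": actor["secure"],   # actor's mixed predictor
                "AGEN": actor["gen"],         # actor's gender (effect coded)
                "PSECURE": partner["secure"], # partner's mixed predictor
                "PGEN": partner["gen"],       # partner's gender (effect coded)
                "COND": actor["cond"],        # between-dyads: same for both
            })
    return rows


# Hypothetical couple: man coded 1, woman coded -1, condition coded 1.
people = [
    {"id": 1, "wdraw": 5.0, "secure": 6.0, "gen": 1, "cond": 1},
    {"id": 1, "wdraw": 7.0, "secure": 3.0, "gen": -1, "cond": 1},
]
rows = to_pairwise(people)
for row in rows:
    print(row)
```

Each person contributes exactly one line on which he or she is the actor, so a study with 16 couples yields 32 lines.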
Table 1. Example data set

  ID   X-DV   X-Mixed IV   X-GEN   Y-Mixed IV   Y-GEN   COND
  ID   Y-DV   Y-Mixed IV   Y-GEN   X-Mixed IV   X-GEN   COND

This type of effect coding makes the intercept more interpretable. In the
present example we used 1 for men and −1 for women; for COND, the major problem discussion condition was set to equal 1 and the minor problem discussion condition was set to −1.

Setting up the data set in SAS

After the data are organized as described, the data set is read into the SAS program. The data are entered for each individual independently such that each individual is treated as one case and there are two cases for each couple. Therefore, with 16 couples in the example the input statement will read in 32 individual cases. For the hypothetical study we are considering, the SAS input statement would look like the following:

  DATA EXAMPLE;
    INPUT ID WDRAW ASECURE AGEN PSECURE PGEN COND;

The prefix ‘‘A’’ indicates that the variable refers to the actor, that is, the individual who generated the outcome score (WDRAW) on that line of data. The prefix ‘‘P’’ indicates that the variables are values for the partner. Also, WDRAW = emotional withdrawal, SECURE = attachment security, GEN = gender, and COND = experimental condition. This input statement allows us to say, as we will indicate shortly using SAS code, that a person’s level of withdrawal is a function of his or her own security (the effect of ASECURE, which is an actor effect) as well as his or her partner’s security (PSECURE, which is a partner effect), the person’s gender (AGEN; note that PGEN is not included because gender is a purely within-dyads variable and so knowing the actor’s gender automatically informs us of the partner’s gender), and treatment condition (COND). Centering the quantitative predictor variables around their means makes interpretation of the intercept more direct. It also is important when interactions might be of interest. To center the actor and partner scores for the mixed variable, the mean of that mixed variable must be computed and subtracted from each individual score.
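As a rough illustration of this centering step, the Python sketch below subtracts a single grand mean from both the actor and the partner scores. The function name and the scores are hypothetical; only the variable names ASECURE and PSECURE are taken from the text.

```python
def center_actor_partner(rows, actor_key="ASECURE", partner_key="PSECURE"):
    """Center actor and partner scores on one grand mean.

    The mean is taken over the actor column, which contains every
    individual in the sample exactly once.
    """
    grand_mean = sum(row[actor_key] for row in rows) / len(rows)
    centered = []
    for row in rows:
        new_row = dict(row)
        new_row[actor_key] = row[actor_key] - grand_mean
        new_row[partner_key] = row[partner_key] - grand_mean
        centered.append(new_row)
    return centered, grand_mean


# Hypothetical pairwise records for two couples (two lines per couple).
rows = [
    {"ID": 1, "ASECURE": 6.0, "PSECURE": 3.0},
    {"ID": 1, "ASECURE": 3.0, "PSECURE": 6.0},
    {"ID": 2, "ASECURE": 5.0, "PSECURE": 2.0},
    {"ID": 2, "ASECURE": 2.0, "PSECURE": 5.0},
]
centered, grand_mean = center_actor_partner(rows)
print(grand_mean)
print([(r["ASECURE"], r["PSECURE"]) for r in centered])
```

In the pairwise structure every individual appears once as an actor and once as a partner, so one mean serves both columns and both centered columns sum to zero.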
Note that the means for the actor and partner variables for the mixed variable should be identical because in both instances the means are calculated with data from all individuals in
the study. The SAS code for centering the actor and partner scores is:

  PROC MEANS DATA = EXAMPLE;
    VAR ASECURE;
    OUTPUT OUT = MEANDATA MEAN = MNSECURE;
  RUN;
  DATA NEW;
    IF _N_ = 1 THEN SET MEANDATA;
    SET EXAMPLE;
    ASECURE = ASECURE - MNSECURE;
    PSECURE = PSECURE - MNSECURE;
  RUN;

Using PROC MIXED

Consider first an analysis that examines actor and partner effects for security as well as the main effects of gender and treatment condition. No interactions are included in this analysis. The SAS code for this main-effects-only model is:

  PROC MIXED DATA = NEW;
    CLASS ID;
    MODEL WDRAW = ASECURE PSECURE AGEN COND / SOLUTION DDFM = SATTERTH;
    REPEATED / TYPE = CS SUBJECT = ID;
    TITLE 'PROC MIXED EXAMPLE: MODEL INCLUDES ONLY MAIN EFFECTS';
  RUN;

The default estimation method used by PROC MIXED is restricted maximum likelihood, and the estimates derived from this default exactly replicate those given using the pooled regression approach (Kashy & Kenny, 2000). As we will discuss shortly, there may be instances when maximum likelihood estimation is preferable. If maximum likelihood estimation is desired, the option METHOD = ML is added to the PROC MIXED line. The CLASS statement indicates the variable that identifies dyad membership (ID). WDRAW is the individual’s outcome score, ASECURE is that person’s security score, PSECURE is the partner’s security score, AGEN is the person’s gender, and COND is
treatment condition. Only one of the two gender variables (AGEN or PGEN) should be included in the model because the value of one dictates exactly the value of the other (i.e., gender is a within-dyads variable rather than a mixed variable). The regression estimate for the AGEN effect measures whether a person’s gender affects withdrawal. Note that if PGEN were included rather than AGEN, the regression estimate for PGEN would be exactly equal in magnitude to the AGEN effect, but it would be opposite in sign. The SOLUTION option in the MODEL statement requests that SAS print the estimates for the intercept, the actor and partner slopes for security, as well as the slopes for gender and treatment condition. The DDFM = SATTERTH option requests the Satterthwaite (1946) approximation to determine the degrees of freedom for the intercept and slopes (Kashy & Kenny, 2000). The degrees of freedom for mixed predictor variables using the Satterthwaite approximation will be somewhere between the number of dyads and the number of individuals in the study. The REPEATED statement treats the individual scores as repeated measures in the dyad and CS implies what is called compound symmetry, which means that the degree of nonindependence between dyad members is equal. Nonindependence is estimated as a correlation and not as a variance.

Results for the fictitious example data set

The results for this main-effects-only model are presented in Table 2. The model information section simply summarizes how the model has been specified. The class level information shows that there are 16 levels, or couples, within the classification variable ID. The model converged very quickly, as evidenced by the information presented in the iteration history section. The covariance parameter estimates give the variance and covariance information for the dyads.
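For readers who prefer an equation, the main-effects model and its compound-symmetry error structure can be written as follows. The notation is an assumption introduced here for illustration and does not appear in the original: i indexes dyads, j and j' index the two members of a dyad, X is centered security, G is effect-coded gender, and C is effect-coded condition.

```latex
\[
Y_{ij} = b_0 + b_a X_{ij} + b_p X_{ij'} + b_g G_{ij} + b_c C_i + e_{ij},
\qquad j' \neq j
\]
\[
\operatorname{Cov}\begin{pmatrix} e_{i1} \\ e_{i2} \end{pmatrix}
=
\begin{pmatrix}
\sigma_{CS} + \sigma^2 & \sigma_{CS} \\
\sigma_{CS} & \sigma_{CS} + \sigma^2
\end{pmatrix}
\]
```

Here b_a is the actor effect, b_p is the partner effect, sigma_CS corresponds to the CS estimate, and sigma^2 corresponds to the Residual estimate in the covariance parameter section of the output.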
The ratio of the CS (compound symmetry) estimate to the sum of the CS and residual estimates provides an estimate of the degree of nonindependence within a couple after controlling for the predictor variables in the model. In this example, this partial intraclass correlation is .82/(.82 + .40), which is .67. Thus,
in our example data set, after controlling for both partners’ security, gender, and treatment condition, withdrawal scores for the two dyad members were fairly strongly related. The fit statistics section presents four statistics assessing how well the data fit the model. Of particular interest here is the value of the −2 Res Log Likelihood. This statistic can be used to compare the relative fit of two nested models. For example, consider a more complex data set in which there are three mixed predictor variables. One model that could be estimated for this example might include actor and partner effects for each of the three variables. A simplified (i.e., nested) model might include only actor effects for two of the variables and both actor and partner effects for the third predictor. To test whether simplifying the model by removing the two partner effects significantly worsens the model fit, the likelihood ratio test can be conducted by computing the difference between the −2 Log Likelihood for the simpler model and the −2 Log Likelihood for the more complex model. This difference has a chi-square distribution with degrees of freedom equal to the difference in the number of parameters for the two models, which in this example is 2. If the test is significant, it indicates that the more complex model is a better fit with the data.^1
1. Multilevel modeling programs generally use either maximum likelihood (ML) or restricted maximum likelihood (REML) estimation. ML uses an iterative solution to derive estimates of both fixed effects (e.g., intercepts and slopes) and random effects (e.g., variances and covariances). REML uses maximum likelihood techniques to estimate random effects but it uses generalized least squares to estimate fixed effects. REML is the default estimation technique in both PROC MIXED in SAS and the data analysis program HLM (Raudenbush et al., 2001). In most cases REML may be preferred over ML for APIM analyses because estimates of fixed effects using ML tend to be biased, particularly with small data sets. However, the likelihood ratio test described can be used only to compare models that involve differences in parameters estimated using ML. Thus, if one wishes to compare two models that differ in their fixed effects, ML would have to be specified as the estimation approach. Note however, that both ML and REML approaches provide t tests of individual fixed effects. It is only in the case when a test of a subset (greater than 1) of predictors is desired that the likelihood ratio test for fixed effects is of interest.
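The partial intraclass correlation and the likelihood ratio test described above involve only simple arithmetic, which the following Python sketch illustrates. It is a hedged example rather than part of the SAS workflow: the function names are hypothetical, the chi-square tail probability is computed with a closed-form recursion that assumes positive integer degrees of freedom, and the numbers plugged in are the covariance estimates and the two −2 Res Log Like values printed in the example output of Table 2.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q, nu = math.erfc(math.sqrt(half)), 1   # exact result for df = 1
    else:
        q, nu = math.exp(-half), 2              # exact result for df = 2
    while nu < df:
        # Recursion: Q(x | nu + 2) = Q(x | nu) + (x/2)^(nu/2) e^(-x/2) / Gamma(nu/2 + 1)
        q += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q


def partial_icc(cs, residual):
    """Ratio of the CS estimate to the sum of the CS and residual estimates."""
    return cs / (cs + residual)


def likelihood_ratio_test(deviance_simple, deviance_complex, df):
    """Difference in -2 log likelihood values and its chi-square p value."""
    chi_square = deviance_simple - deviance_complex
    return chi_square, chi2_sf(chi_square, df)


icc = partial_icc(0.8213, 0.3993)
chi_square, p_value = likelihood_ratio_test(99.61413559, 91.34050626, df=1)
print(round(icc, 2), round(chi_square, 2), round(p_value, 4))
```

With these inputs the sketch gives a partial intraclass correlation of about .67 and a chi-square of about 8.27 on 1 degree of freedom, in line with the values in the example output. When the two models differ in their fixed effects, the deviances passed to the function should come from ML rather than REML runs, as Footnote 1 explains.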
The test of the Null Model evaluates whether it is necessary to model the covariance structure in the data at all. In this example, we specified that the covariance structure is one of compound symmetry. The significant null model test indicates that we have significantly improved the model fit by making the compound symmetry specification. The estimates for actor and partner effects, as well as the effects of gender and condition, are in the bottom section of Table 2. These estimates are unstandardized regression coefficients and can be interpreted in that fashion. Because we centered our quantitative variables (ASECURE and PSECURE) and used effect coding for our categorical variables (condition and gender), the intercept is an estimate of the mean for withdrawal at the mean levels (i.e., zero) of the quantitative variables. The effect of ASECURE estimates the degree to which a person’s level of security affects his or her own withdrawal. In the example, this value is b = .12, and is not significant. The partner effect is estimated by PSECURE. The estimate of b = −.63, t(18.6) = −4.16, p < .001, indicates that, holding the other predictor variables constant, for each one point of increase in a person’s partner’s security, the person’s withdrawal decreases by .63 points. Finally, although there is no evidence of an overall gender difference in withdrawal (b = .12, ns), there is an effect of condition, b = .94, t(13) = 2.59, p = .02, such that greater withdrawal occurred during major problem discussions relative to minor problem discussions.

Creating interaction terms with between-dyads and within-dyads predictors

As we have noted previously, if interactions among the variables are of interest, it is important to center the mixed predictor variables with the means derived across the entire sample prior to creating the interactions (Aiken & West, 1991).
To test for interactions between purely within-dyads variables and actor and partner effects for mixed variables, product terms would be computed between the actor and partner components of each mixed predictor variable and the scores on the within-dyads variable. In our example, gender is purely within-dyads and interactions with
gender estimate and test the degree to which there are differences in the sizes of actor effects and partner effects across men and women. For our hypothetical example, the interactions would be calculated as:

  AGEN_SEC = AGEN*ASECURE;
  PGEN_SEC = PGEN*PSECURE;

The AGEN_SEC interaction tests whether the relationship between an individual’s security and his or her withdrawal differs for men and women. The PGEN_SEC interaction measures whether the link between an individual’s withdrawal and his or her partner’s security differs for men and women. Note that in creating the interaction terms, AGEN is used for gender interactions involving actor effects, and PGEN is used for gender interactions involving partner effects. Similar procedures are used to test for actor and partner effect differences for couples in different conditions. For our hypothetical example, the interactions would be calculated as:

  ACON_SEC = COND*ASECURE;
  PCON_SEC = COND*PSECURE;

The ACON_SEC interaction tests whether the relationship between an individual’s own level of security and his or her own withdrawal differs between the major and minor problem discussion conditions. The PCON_SEC interaction tests whether the link between an individual’s withdrawal and his or her partner’s security differs depending on the severity of the problem being discussed. To keep the analyses reasonably simple, two separate models were estimated. In the first model we included all of the main effects variables as well as the interactions between gender and the actor and partner effects. Thus the model statement was:

  MODEL WDRAW = ASECURE PSECURE AGEN COND AGEN_SEC PGEN_SEC / SOLUTION DDFM = SATTERTH;
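To see what the gender interaction coefficients imply, the model can be written out and evaluated at each gender code. The coefficient symbols below are hypothetical labels introduced for this sketch, and the coding of men as 1 and women as −1 is assumed, as in the example.

```latex
\[
\begin{aligned}
\text{WDRAW} ={}& b_0 + b_a\,\text{ASECURE} + b_p\,\text{PSECURE} + b_g\,\text{AGEN} + b_c\,\text{COND} \\
 &+ b_{ga}\,(\text{AGEN}\times\text{ASECURE}) + b_{gp}\,(\text{PGEN}\times\text{PSECURE}) + e \\[6pt]
\text{Men's outcomes } (\text{AGEN}=1,\ \text{PGEN}=-1):\quad
 & \text{actor slope} = b_a + b_{ga}, \qquad \text{partner slope} = b_p - b_{gp} \\
\text{Women's outcomes } (\text{AGEN}=-1,\ \text{PGEN}=1):\quad
 & \text{actor slope} = b_a - b_{ga}, \qquad \text{partner slope} = b_p + b_{gp}
\end{aligned}
\]
```

Under this coding the actor slopes for men and women differ by 2b_ga and the partner slopes differ by 2b_gp. Because the partner term is built from PGEN, a positive b_gp corresponds to a larger partner slope for women's outcomes, that is, for the effect of men's security on women's withdrawal.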
Table 2. SAS results from the main-effects-only model

PROC MIXED EXAMPLE: MODEL INCLUDES ONLY MAIN EFFECTS
The Mixed Procedure

Model Information
  Data Set                     WORK.NEW
  Dependent Variable           WDRAW
  Covariance Structure         Compound Symmetry
  Subject Effect               ID
  Estimation Method            REML
  Residual Variance Method     Profile
  Fixed Effects SE Method      Model-Based
  Degrees of Freedom Method    Satterthwaite

Class Level Information
  Class   Levels   Values
  ID      16       1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Dimensions
  Covariance Parameters        2
  Columns in X                 5
  Columns in Z                 0
  Subjects                     16
  Max Obs Per Subject          2
  Observations Used            32
  Observations Not Used        0
  Total Observations           32

Iteration History
  Iteration   Evaluations   -2 Res Log Like   Criterion
  0           1             99.61413559
  1           1             91.34050626       0.00000000
  Convergence criteria met.

Covariance Parameter Estimates
  Cov Parm   Subject   Estimate
  CS         ID        0.8213
  Residual             0.3993

Fit Statistics
  -2 Res Log Likelihood        91.3
  AIC (smaller is better)      95.3
  AICC (smaller is better)     95.8
  BIC (smaller is better)      96.9

Null Model Likelihood Ratio Test
  DF   Chi-Square   Pr > ChiSq
  1    8.27         0.0040

Solution for Fixed Effects
  Effect      Estimate   Standard Error   DF     t Value   Pr > |t|
  Intercept   5.5625     0.2526           13     22.02
  ASECURE     0.1152     0.1502           18.6   0.77      0.4525
  PSECURE     -0.6254    0.1502           18.6   -4.16     0.0005
  AGEN        0.1181     0.1123           14     1.05      0.3111
  COND        0.9433     0.3647           13     2.59      0.0226
Results from this analysis showed no evidence of an interaction between either actor or partner effects on security and gender. The regression coefficient for the actor interaction was b = .08, t(13.7) = .54, p = .60, and the coefficient for the partner interaction was b = .03, t(13.7) = .18, p = .86. The interactions between treatment condition and actor or partner effects on security were included in the second model, along with all of the main effects variables. Thus the model statement was:

  MODEL WDRAW = ASECURE PSECURE AGEN COND ACON_SEC PCON_SEC / SOLUTION DDFM = SATTERTH;

Results from this analysis indicate that actor effects are marginally different in the two discussion conditions, and that partner effects are significantly different in the two conditions. The regression coefficient for the actor interaction was b = .26, t(18.9) = 1.89, p = .07, and the coefficient for the partner interaction was b = .33, t(18.9) = 2.43, p = .03. Taking the intercept and main effect coefficients into account [intercept = 6.136; actor effect for
security b = .04, ns; partner effect for security b = −.71, t(18.9) = −5.21, p < .001; condition main effect b = .78, t(13) = 2.47, p = .03], the actor and partner effects for security within each level of condition can be calculated. In the major problem condition (COND = 1), the actor effect for security was b = .30 [6.136 + .78(COND) + .04(ASECURE) + .26(ACON_SEC, which is COND*ASECURE) = 6.916 + .30(ASECURE)]. In the minor problem condition (COND = −1), however, the actor effect for security was b = −.22 [5.356 − .22(ASECURE)]. Thus, for this fictitious example, in the minor problem condition, a person who was higher in security was less withdrawn, whereas in the major problem condition a person who was higher in security was more withdrawn. Similarly, the partner effect for security was b = −.38 in the major problem condition and −1.04 in the minor problem condition. These coefficients indicate that the positive impact of a partner’s security on withdrawal (individuals with more secure partners are less withdrawn) is substantially greater in the minor problem condition.

Creating actor/partner and similarity interactions

Interactions between actor and partner effects may also be theoretically relevant. As has
been mentioned, it may be that the outcome is uniquely predicted by the combination of dyad members’ scores on a predictor variable. For example, there may be substantial benefits if both partners are highly secure. One possible way of specifying such an interaction would be to simply create a product term between the mean-deviated actor and partner components of the mixed predictor variable. For our hypothetical example, the multiplicative approach for the interaction could be calculated as

AP_INTERACTION = ASECURE * PSECURE;

This interaction estimates whether withdrawal differs for couples whose members have different combinations of security scores. For example, are couples more withdrawn when one partner is high on security but the other partner is low? When we added this interaction to the main effects model, the interaction coefficient was nonsignificant, b = -.154, t(12) = -1.18, p = .26. This negative coefficient indicates that fictitious couples in which one partner was above average on security and the other partner was below average tended to have slightly higher withdrawal scores.

As noted previously, the multiplicative approach to interactions is not the only type of interaction that can be tested. Kenny and Cook (1999) suggested that if researchers are more interested in investigating the effects of similarity between partners on the mixed variable, the absolute value of the difference between the two partners’ scores for the mixed variable can be calculated. The similarity approach would involve creating a new variable that is the absolute value of the difference between the partners’ security scores:

AP_SIMILARITY = ABS(ASECURE - PSECURE);

This interaction tests whether partners who are very similar in security are more or less withdrawn than couples where security varies greatly for the two partners. In our example, the similarity effect was nonsignificant, b = .09, t(12) = .31, p = .77. The positive value of this coefficient shows that there was a very slight tendency for members of couples who were less similar in security to be more withdrawn.

Up to this point, we have described testing individual interaction terms using the tests of whether those terms have statistically significant regression coefficients. One reason we were able to do this is that in our example, treatment condition (our between-dyads variable) was a dichotomy, and as such required only a single effect-coded variable (COND). If a categorical predictor variable has more than two levels, or if a researcher wants a global test of whether the addition of a set of predictor variables improves the model fit, a likelihood ratio test would be needed. This test would involve estimating both the simple model and the more complex model, specifying in each case that the estimation method is maximum likelihood. Then the difference between the two -2 Log Likelihood values would be computed and tested for statistical significance as a chi-square statistic with degrees of freedom equal to the number of added predictors.
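The likelihood ratio test just described can be sketched outside SAS in a few lines of Python. This is only an illustration under stated assumptions: the function names `chi2_sf` and `likelihood_ratio_test` are hypothetical, the two -2 Log Likelihood values are made up rather than taken from the fictitious data set, and the chi-square tail probability is computed with a standard recurrence so that no external library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q = math.erfc(math.sqrt(half))  # closed form for df = 1
        a = 0.5
    else:
        q = math.exp(-half)  # closed form for df = 2
        a = 1.0
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)


def likelihood_ratio_test(neg2ll_simple, neg2ll_complex, added_predictors):
    """Return the chi-square difference and its p value.

    Both models are assumed to have been fit with maximum likelihood (not REML).
    """
    chi_square = neg2ll_simple - neg2ll_complex
    return chi_square, chi2_sf(chi_square, added_predictors)


# Hypothetical -2 Log Likelihood values; the complex model adds two predictors.
chi_square, p_value = likelihood_ratio_test(100.00, 96.16, 2)
print(f"chi-square({2}) = {chi_square:.2f}, p = {p_value:.3f}")
```

In practice the two -2 Log Likelihood values would be read from the fit statistics of the two PROC MIXED runs, and the degrees of freedom would equal the number of fixed effects added in the more complex model.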
Estimating Actor and Partner Effects Using HLM

PROC MIXED is an excellent data analytic option for researchers who are conversant in SAS, but researchers who use other packages, SPSS being one of the most common, might want to investigate the HLM program (Raudenbush et al., 2001). Because we cannot detail all of the possible intricacies involved in using HLM, we strongly recommend that researchers interested in using HLM obtain a copy of the user manual to be used in concert with the discussion in this paper.

Structuring the data sets

Two separate data files need to be created to use the HLM program. The first data set is very similar to the data set described earlier for use with PROC MIXED. This Level 1 data set has one record for each individual, and for each individual contains a variable identifying dyad membership, as well as actor and partner values for any mixed predictor variables, and actor (or partner) values for any within-dyads variables. In our example, the Level 1 data set would include the couple ID (which is the same for both members of the dyad), as well as the outcome measure (WDRAW), actor and partner values for the mixed predictor variable (ASECURE and PSECURE), and actor values for within-dyads variables (AGEN). There would be 32 observations or records in this data set.

The second data set is referred to as the Level 2 data set in HLM; it has one record for each couple, and includes the variable that identifies dyad membership (identical to the identification variable in the Level 1 data set) as well as any variables that vary only between dyads. In our example, this data set would include ID as well as COND, and would have 16 observations or records. Because HLM allows for centering variables within the program, these two data sets should contain the raw scores rather than the mean-centered scores.

Specifying interactions with HLM requires some forethought. Interactions between actor and partner effects, or between within-dyads variables and actor (or partner) effects, cannot be created while running HLM. Instead they must exist as already computed values within the appropriate data set. For example, AGEN_SEC and PGEN_SEC would need to be included in the Level 1 data set. However, because interactions between actor and partner effects are dyad-level effects, AP_INTERACTION and AP_SIMILARITY would need to be included in the Level 2 data set. Interactions between one Level 1 variable (e.g., ASECURE) and one Level 2 variable (e.g., COND) can be estimated within the HLM program itself, and need not be included in the data sets.

HLM can run using data files imported from commercial software programs such as SPSS, as well as data files that are ASCII text files. Our experience has been that using imported files from SPSS is substantially easier, and we use this option in our discussion.
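The two-file structure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: the four person-level records are made up, the function name `build_hlm_data` and the input keys SECURE and GEN are hypothetical, and the gender interactions are assumed to be the product of the actor's gender code with the grand-mean-deviated security scores. Writing the resulting records to an SPSS or ASCII file for HLM is left out.

```python
from collections import defaultdict

# Made-up person-level records for two couples (not the appendix data).
persons = [
    {"ID": 1, "WDRAW": 3, "SECURE": 7, "GEN": 1, "COND": -1},
    {"ID": 1, "WDRAW": 5, "SECURE": 6, "GEN": -1, "COND": -1},
    {"ID": 2, "WDRAW": 8, "SECURE": 2, "GEN": 1, "COND": 1},
    {"ID": 2, "WDRAW": 6, "SECURE": 3, "GEN": -1, "COND": 1},
]


def build_hlm_data(persons):
    """Return (level1, level2) record lists in the layout described in the text."""
    grand_mean = sum(p["SECURE"] for p in persons) / len(persons)

    dyads = defaultdict(list)
    for person in persons:
        dyads[person["ID"]].append(person)

    level1, level2 = [], []
    for dyad_id, members in sorted(dyads.items()):
        if len(members) != 2:
            raise ValueError(f"dyad {dyad_id} does not have exactly two members")
        first, second = members

        # Level 1: one record per individual, with own (actor) and partner scores.
        for actor, partner in ((first, second), (second, first)):
            level1.append({
                "ID": dyad_id,
                "WDRAW": actor["WDRAW"],
                "ASECURE": actor["SECURE"],
                "PSECURE": partner["SECURE"],
                "AGEN": actor["GEN"],
                # Assumed definitions of the gender interactions.
                "AGEN_SEC": actor["GEN"] * (actor["SECURE"] - grand_mean),
                "PGEN_SEC": actor["GEN"] * (partner["SECURE"] - grand_mean),
            })

        # Level 2: one record per couple, with between-dyads variables only.
        level2.append({
            "ID": dyad_id,
            "COND": first["COND"],
            "AP_INTERACTION": (first["SECURE"] - grand_mean)
            * (second["SECURE"] - grand_mean),
            "AP_SIMILARITY": abs(first["SECURE"] - second["SECURE"]),
        })
    return level1, level2


level1, level2 = build_hlm_data(persons)
print(len(level1), "Level 1 records;", len(level2), "Level 2 records")
```

With the full example data set, the same function would be expected to return 32 Level 1 records and 16 Level 2 records, matching the counts given in the text.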
Reading the data files into the HLM program

The first step in an analysis using HLM is to create a Sufficient Statistics Matrix (SSM) from the raw data. The SSM contains the means, variances, and covariances for all of the variables in the analysis, and, in most cases, HLM uses this SSM for all subsequent analyses and does not refer back to the raw scores. To create this matrix, after opening the program, select FILE, then SSM, then NEW, then STAT PACKAGE INPUT. Four options are then available, from which HLM2 should be selected because there are two levels in the analysis (individual and dyad). A file name for the SSM needs to be provided, and then the input file type SPSS/WINDOWS should be selected.

After identifying the Level 1 data set, variables to be included in the analyses need to be indicated by selecting CHOOSE VARIABLES. One variable needs to be designated as the dyad identification variable (in our example this would be ID), and all other individual-level variables should be selected for inclusion in the SSM. A similar procedure is followed for specifying the Level 2 data set information (Footnote 2). The sufficient statistics matrix is created by selecting MAKE SSM. (If you have not already done so, you will need to save the response file.) After the SSM has been created, the program will not allow the user to proceed until the CHECK STATS button is pressed. It is a very good idea to look at this file to ensure that the proper number of observations has been read from each data file, and to check for any other problems that may have arisen. After the data have been checked, click the DONE button and HLM will move to the model specification stage.

2. HLM allows for inclusion of weight variables, but these are not necessary for an APIM analysis. If there are missing values, listwise deletion is recommended by the program authors.

Using HLM: Main effects model analysis and results

To estimate the main effects model for our fictitious data set, WDRAW needs to be identified as the outcome variable. Then ASECURE and PSECURE each need to be selected as predictors using the "Add Variable Grand Centered" option. This centers
the variable using the mean obtained across the entire sample. AGEN is also selected, but since it is effect coded, it is added uncentered. This should result in the following Level 1 model:

$$\mathrm{WDRAW} = b_0 + b_1(\mathrm{ASECURE}) + b_2(\mathrm{AGEN}) + b_3(\mathrm{PSECURE}) + r$$

This model suggests that each individual’s withdrawal is a function of that person’s own security, gender, and the partner’s security. The Level 2 models at this point should be:

$$b_0 = g_{00} + u_0$$
$$b_1 = g_{10} + u_1$$
$$b_2 = g_{20} + u_2$$
$$b_3 = g_{30} + u_3$$

These models suggest that each Level 1 coefficient (the intercept and three slopes) is a function of a fixed component, g, and a random component, u. Two modifications must be made to the Level 2 models. First, the random component must be removed from the Level 2 models of the slopes (b1, b2, b3). This is accomplished by selecting each Level 2 model and then selecting the "Error Term for Currently Selected Level-2 Equation" option. The random component is not removed from b0. This random component, represented as the Tau parameter in HLM, provides a measure of the intraclass correlation between dyad members on the outcome variable.

The second modification to the Level 2 models is that the between-dyads predictor variable, COND, must be entered as a predictor of the Level 1 intercept, b0. To add COND, the b0 Level 2 model is selected and COND is chosen from the list of Level 2 predictors using the "Add Variable Uncentered" option (as with gender, COND is already effect coded). The resulting Level 2 models are:

$$b_0 = g_{00} + g_{01}(\mathrm{COND}) + u_0$$
$$b_1 = g_{10}$$
$$b_2 = g_{20}$$
$$b_3 = g_{30}$$

Table 3. HLM results from the main-effects-only model

Program:    HLM 5 Hierarchical Linear and Nonlinear Modeling
Authors:    Stephen Raudenbush, Tony Bryk, & Richard Congdon
Publisher:  Scientific Software International, Inc. (c) 2000

The maximum number of level-2 units = 16
The maximum number of iterations = 100
Method of estimation: restricted maximum likelihood

Weighting Specification
             Weighting?    Weight Variable Name    Normalized?
Level 1      no                                    no
Level 2      no                                    no

The outcome variable is WITHDRAW

The model specified for the fixed effects was:

Level-1 Coefficients       Level-2 Predictors
INTRCPT1, B0               INTRCPT2, G00
                           COND, G01
#% ASECURE slope, B1       INTRCPT2, G10
#  AGEN slope, B2          INTRCPT2, G20
#% PSECURE slope, B3       INTRCPT2, G30

'#' - The residual parameter variance for this level-1 coefficient has been set to zero.
'%' - This level-1 predictor has been centered around its grand mean.

The model specified for the covariance components was:
Sigma squared (constant across level-2 units)
Tau dimensions
INTRCPT1

Summary of the model specified (in equation format)

Level-1 Model
Y = B0 + B1*(ASECURE) + B2*(AGEN) + B3*(PSECURE) + R

Level-2 Model
B0 = G00 + G01*(COND) + U0
B1 = G10
B2 = G20
B3 = G30

The value of the likelihood function at iteration 1 = 4.483489E+001
The value of the likelihood function at iteration 10 = 4.475131E+001
Iterations stopped due to small change in likelihood function
******* ITERATION 11 *******

Sigma_squared = 0.39927

Tau
INTRCPT1, B0    0.82129

Tau (as correlations)
INTRCPT1, B0    1.000

Random level-1 coefficient    Reliability estimate
INTRCPT1, B0                  0.804

The value of the likelihood function at iteration 11 = 4.475131E+001

The outcome variable is WITHDRAW

Final estimation of fixed effects:

Fixed Effect               Coefficient    Standard Error    T-ratio    d.f.    Approx. P-value
For INTRCPT1, B0
  INTRCPT2, G00            5.562500       0.252601          22.021     14      0.000
  COND, G01                0.943284       0.364713           2.586     14      0.022
For ASECURE slope, B1
  INTRCPT2, G10            0.115244       0.150193           0.767     27      0.450
For AGEN slope, B2
  INTRCPT2, G20            0.118065       0.112348           1.051     27      0.303
For PSECURE slope, B3
  INTRCPT2, G30            0.625402       0.150193           4.164     27      0.000
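As a worked illustration of how the Tau parameter relates to the intraclass correlation, the variance components reported in Table 3 (Tau = 0.82129, Sigma_squared = 0.39927) can be combined using the standard random-intercept formula. The symbols $\rho$, $\tau$, and $\sigma^2$ are notation introduced here for the sketch, on the assumption that Tau is the between-dyad intercept variance and Sigma_squared is the within-dyad residual variance:

$$\rho = \frac{\tau}{\tau + \sigma^2} = \frac{0.82129}{0.82129 + 0.39927} = \frac{0.82129}{1.22056} \approx 0.673$$

Under that assumption, this value is the correlation between the two partners' withdrawal scores that remains after the predictors in the model are taken into account.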
The APIM model is estimated by first saving the file that contains the model specifications, and then selecting RUN ANALYSIS. To view the output, select VIEW OUTPUT from the FILE menu. The output contains four sets of results. The first two sets of results are from models that do not allow the intercepts to vary and are not of interest. The last two sets of results, which appear after the iteration history information, allow variance in the intercepts, and it is this variance that models the relationship between dyad members’ outcome scores. The first set of results that immediately follows the iteration history information is of primary interest (Footnote 3).

Table 3 presents a subset of the output for the main-effects-only model. Examination of the fixed effects shows that HLM and PROC MIXED estimated identical coefficients and standard errors for each effect in the model. One important difference between HLM and PROC MIXED is in the calculation of the degrees of freedom used to test the significance of the model parameters. PROC MIXED uses the Satterthwaite (1946) approximation to determine the degrees of freedom for the intercept and slopes, resulting in degrees of freedom that are between the number of couples and individuals in the sample. In contrast, HLM bases the degrees of freedom for the actor and partner effects on the number of individuals in the sample. Therefore, the significance tests are more liberal in HLM than in PROC MIXED.

Including interaction terms

Testing interactions with HLM requires making modifications to either the Level 1 or Level 2 main effects models, and the modifications depend on the nature of the variables included in the interaction. To test interactions between a between-dyads variable such as COND and actor and/or partner effects for a mixed predictor variable (e.g., ASECURE and PSECURE), the between-dyads variable, COND, needs to be added as a predictor of the slope of the actor and partner effects for security in the Level 2 model. The resulting Level 2 models will then be:

$$b_0 = g_{00} + g_{01}(\mathrm{COND}) + u_0$$
$$b_1 = g_{10} + g_{11}(\mathrm{COND})$$
$$b_2 = g_{20}$$
$$b_3 = g_{30} + g_{31}(\mathrm{COND})$$

With these additions, the model will test for differences in the slopes for the actor and partner effects across experimental conditions.

To determine if the actor and partner effects differ for men and women, the interactions between the actor and partner effects with gender need to be included in the Level 1 model. (Recall that these interactions need to already exist in the Level 1 data set.) To include these interactions, add AGEN_SEC and PGEN_SEC (grand mean centered) as predictor variables in the Level 1 model:

$$\mathrm{WDRAW} = b_0 + b_1(\mathrm{ASECURE}) + b_2(\mathrm{AGEN}) + b_3(\mathrm{PSECURE}) + b_4(\mathrm{AGEN\_SEC}) + b_5(\mathrm{PGEN\_SEC}) + r$$

When these predictors are added, a random error component for the slope of these variables is included in the Level 2 model section, and it is important to remove these random error components before running the analysis.

Finally, to test the actor-partner interactions (AP_INTERACTION and AP_SIMILARITY), each interaction needs to already exist in the Level 2 data set, and is included in the Level 2 model section. It is important to note that the multiplicative interaction (AP_INTERACTION) needs to be grand mean centered, whereas the similarity interaction (AP_SIMILARITY) is not centered since zero is a meaningful value for this type of interaction. To be added, either interaction variable must be entered as a predictor of the Level 1 intercept, b0. The resulting Level 2 model for AP_INTERACTION would be:

$$b_0 = g_{00} + g_{01}(\mathrm{COND}) + g_{02}(\mathrm{AP\_INTERACTION}) + u_0$$
$$b_1 = g_{10}$$
$$b_2 = g_{20}$$
$$b_3 = g_{30}$$
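Returning to the model in which COND predicts the security slopes, the Level 2 equation b1 = g10 + g11(COND) implies a separate actor slope within each condition. The Python sketch below works through that arithmetic. It is an illustration only: the function name `simple_slope` is hypothetical, and the two coefficients are the actor main effect (.04) and actor-by-condition interaction (.26) reported for the PROC MIXED analysis, used here on the assumption that HLM would return the same estimates.

```python
def simple_slope(main_effect, interaction, cond):
    """Slope of security within one condition: g10 + g11 * COND."""
    return main_effect + interaction * cond


# Actor coefficients reported for the PROC MIXED condition-interaction model.
ACTOR_MAIN = 0.04
ACTOR_BY_COND = 0.26

# COND is effect coded: 1 = major problem, -1 = minor problem.
for label, cond in (("major problem", 1), ("minor problem", -1)):
    slope = simple_slope(ACTOR_MAIN, ACTOR_BY_COND, cond)
    print(f"actor effect of security, {label} condition: {slope:.2f}")
```

The same function could be applied to the partner coefficients (g30 and g31) to obtain the partner effect of security within each condition.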
3. The last set of results uses robust standard errors to calculate the test statistic (to adjust for nonnormality in the data), and is only meaningful when a large number of observations are present.

Conclusion
Our intention in writing this paper was to provide a guide for dyadic researchers who are interested in expanding their data analytic repertoires. We have included the fictitious data set described in this paper in the Appendix, and we suggest that interested readers work with these data to replicate our analyses.

This paper has provided a step-by-step account of how to use PROC MIXED in SAS and HLM to estimate the effects of the APIM, and to some extent it raises the question: Which one should you choose? Our recommendation depends on your familiarity with statistical packages such as SAS and SPSS. If you are comfortable with SAS, then PROC MIXED is clearly preferable. Not only is this program included in the general SAS/STAT software, it also uses what we consider to be a more appropriate set of degrees of freedom to test actor and partner effects. If you have little or no SAS experience and are comfortable with SPSS, HLM is a reasonable option.
In summary, the accessibility of easy-to-use data analytic tools for dyadic data is essential for theories about dyadic relationships to be tested adequately. The approaches outlined in this paper allow researchers to assess both actor and partner effects, gender differences on the actor and partner effects, and differences on the actor and partner effects when couples differ on important variables (e.g., experimental condition). This approach is also very flexible, allowing the researcher to specify models that contain only actor or partner effects, or both. Additionally, various types of interactions can be added to the model without complication. We hope that the ease of implementing this approach will be appealing to relationships researchers and will encourage researchers to consider the actor-partner model in their own work.
References

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. London: Sage.
Bowlby, J. (1969). Attachment and loss. Vol 1, Attachment. New York: Basic Books.
Bowlby, J. (1973). Attachment and loss. Vol 2, Separation: Anxiety and anger. New York: Basic Books.
Bowlby, J. (1980). Attachment and loss. Vol 3, Loss. New York: Basic Books.
Campbell, L., Simpson, J. A., Kashy, D. A., & Fletcher, G. J. O. (2001). Ideal standards, the self, and flexibility of ideals in close relationships. Personality and Social Psychology Bulletin, 27, 447–462.
Gonzalez, R., & Griffin, D. (1999). The correlational analysis of dyad-level data in the distinguishable case. Personal Relationships, 6, 449–469.
Kashy, D. A., & Kenny, D. A. (2000). The analysis of data from dyads and groups. In H. T. Reis & C. M. Judd (Eds.), Handbook of research methods in social psychology (pp. 451–477). New York: Cambridge University Press.
Kelley, H. H., Berscheid, E., Christensen, A., Harvey, J. H., Huston, T. L., et al. (1983). Analyzing close relationships. In H. H. Kelley, E. Berscheid, A. Christensen, et al. (Eds.), Close relationships (pp. 20–67). New York: Freeman.
Kelley, H. H., & Thibaut, J. W. (1978). Interpersonal relations: A theory of interdependence. New York: Wiley.
Kenny, D. A. (1988). The analysis of data from two person relationships. In S. Duck (Ed.), Handbook of interpersonal relationships (pp. 57–77). London: Wiley.
Kenny, D. A. (1990). Design issues in dyadic research. In C. Hendrick & M. S. Clark (Eds.), Review of personality and social psychology: Research methods in personality and social psychology (pp. 164–184). Newbury Park, CA: Sage.
Kenny, D. A. (1995). The effect of non-independence on significance testing in dyadic research. Personal Relationships, 2, 67–75.
Kenny, D. A. (1996). Models of interdependence in dyadic research. Journal of Social and Personal Relationships, 13, 279–294.
Kenny, D. A., & Cook, W. (1999). Partner effects in relationship research: Conceptual issues, analytic difficulties, and illustrations. Personal Relationships, 6, 433–448.
Messick, D. M., & Crook, K. S. (Eds.). (1983). Equity theory: Psychological and sociological perspectives. New York: Praeger.
Murray, S. L., Holmes, J. G., & Griffin, D. W. (1996a). The benefits of positive illusions: Idealization and the construction of satisfaction in close relationships. Journal of Personality and Social Psychology, 70, 79–98.
Murray, S. L., Holmes, J. G., & Griffin, D. W. (1996b). The self-fulfilling nature of positive illusions in romantic relationships: Love is not blind, but prescient. Journal of Personality and Social Psychology, 71, 1155–1180.
Raudenbush, S. W., Bryk, A. S., Cheong, Y. F., & Congdon, R. (2001). HLM 5: Hierarchical linear and nonlinear modeling (2nd ed.). Scientific Software International.
Rempel, J. K., Holmes, J. G., & Zanna, M. P. (1985). Trust in close relationships. Journal of Personality and Social Psychology, 49, 95–112.
Robins, R. W., Caspi, A., & Moffitt, T. E. (2000). Two personalities, one relationship: Both partners’ personality traits shape the quality of their relationship. Journal of Personality and Social Psychology, 79, 251–259.
Rusbult, C. E. (1980). Commitment and satisfaction in romantic associations: A test of the investment model. Journal of Experimental Social Psychology, 16, 172–186.
Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2, 110–114.
Schumacker, R. E., & Marcoulides, G. A. (1998). Interactions and nonlinear effects in structural equation modeling. Mahwah, NJ: Erlbaum.
Singer, J. D. (1998). Using SAS PROC MIXED to fit multilevel models, hierarchical models, and individual growth models. Journal of Educational and Behavioral Statistics, 24, 323–355.
Thibaut, J. W., & Kelley, H. H. (1959). The social psychology of groups. New York: Wiley.
Walster, E., Walster, G. W., & Berscheid, E. (1978). Equity: Theory and research. Boston: Allyn & Bacon.
Appendix: Example dyadic data set

ID     WDRAW    ASECURE    AGEN    PSECURE    PGEN    COND
001    3        7          1       6          1       1
001    4        6          1       7          1       1
002    6        5          1       4          1       1
002    4        4          1       5          1       1
003    3        7          1       6          1       1
003    2        6          1       7          1       1
004    4        5          1       6          1       1
004    5        6          1       5          1       1
005    3        7          1       7          1       1
005    2        7          1       7          1       1
006    6        6          1       4          1       1
006    5        4          1       6          1       1
007    5        7          1       5          1       1
007    3        5          1       7          1       1
008    4        4          1       7          1       1
008    7        7          1       4          1       1
009    8        2          1       3          1       1
009    7        3          1       2          1       1
010    6        5          1       4          1       1
010    5        4          1       5          1       1
011    5        3          1       5          1       1
011    6        5          1       3          1       1
012    7        2          1       4          1       1
012    8        4          1       2          1       1
013    9        5          1       1          1       1
013    6        1          1       5          1       1
014    8        6          1       6          1       1
014    8        6          1       6          1       1
015    8        5          1       4          1       1
015    6        4          1       5          1       1
016    7        3          1       4          1       1
016    8        4          1       3          1       1