The Extrinsic Affective Simon Task

Jan De Houwer
Ghent University, Belgium
Abstract. A modified version of the Implicit Association Test (IAT) is described that is based on a comparison of performance on trials within a single task rather than on a comparison of performance on different tasks. In two experiments, participants saw white words that needed to be classified on the basis of stimulus valence and colored words that were to be classified on the basis of color. On trials where the colored word referred to a positive target concept (e.g., “flowers,” “self”), performance was superior when the correct response was the response that was also assigned to positive white words. The reverse was true on trials where the colored word represented a negative target concept (e.g., “insect”). This variant of the IAT is less susceptible to nonassociative effects of task recoding and can be used to assess single and multiple attitudes.

Key words: attitudes, social cognition, stimulus-response compatibility
Experiment 1 was conducted while Jan De Houwer was a lecturer at the University of Southampton, UK. I would like to thank Tom Randell for his help in collecting the data. Thanks also to Klaus Fiedler, Constantine Sedikides, Aiden Gregg, Brian Nosek, Dirk Wentura, Melanie Steffens, and several anonymous reviewers for their comments on earlier drafts of this paper. Inquisit software for running an EAST experiment can be downloaded at http://allserv.rug.ac.be/~jdhouwer/

DOI: 10.1027//1618-3169.50.2.77
Experimental Psychology 2003; Vol. 50(2): 77–85
© 2003 Hogrefe & Huber Publishers

During the past 15 years, a number of reaction time tasks have been developed that potentially allow researchers to study and assess attitudes indirectly (see De Houwer, in press; Fazio & Olson, in press, for reviews). Amongst other things, such indirect measures can be and have been used as a tool (a) to test general theories of attitudes (e.g., Fazio, Sanbonmatsu, Powell, & Kardes, 1986), (b) to study the way in which groups of people differ in the attitudes that they hold (e.g., de Jong, 2002), and (c) to measure individual differences in attitudes (e.g., McConnell & Leibold, 2001).

Currently, the most widely used indirect measure of attitudes is the Implicit Association Test (IAT) that was introduced by Greenwald, McGhee, and Schwartz (1998). In the IAT, participants are asked to categorize stimuli as belonging to one of four categories by pressing one of two keys. For instance, one might present names of flowers, names of insects, positive adjectives, and negative adjectives. Greenwald et al. (1998, Experiment 1) demonstrated that participants are faster to press one key for both flower names and positive adjectives and another key for both insect names and negative adjectives (compatible task) than to press one key for flower names and negative adjectives and the other key for insect names and positive adjectives (incompatible task). This pattern of results indicates that, as can be expected on a priori grounds, the attribute concept “positive” is more closely associated in memory with the target concept “flower” than with the target concept “insect” and/or that the attribute concept “negative” is more closely associated with “insect” than with “flower”. Assuming that an attitude is stored in memory as an association between the representation of the attitude object and the representation of positive and negative valence (e.g., Fazio, 1986), one can argue that the IAT provides an indirect way to measure attitudes.

There are several reasons why the IAT has become so popular in such a short period of time (see De Houwer, 2002; Greenwald & Nosek, 2001, for reviews). First, the IAT is very flexible. It can be used not only to measure attitudes toward a large variety of attitude objects, but also to measure nonevaluative beliefs and associations (i.e., associations that do not directly involve the representations of positive and negative valence). Second, IAT effects generally have a large effect size and are thus easy to replicate. Third, the IAT is easy to implement and software for doing so is widely available. Fourth, individual differences in IAT effects tend to be reliable and are, at least in some cases, related to interindividual differences in behavior that can be assumed to reflect the measured attitudes.

Nevertheless, a number of authors have pointed to potential problems with the IAT (e.g., Fiedler, Messner, & Bluemke, 2002; Karpinski & Hilton, 2001; Mierke & Klauer, 2001; Rothermund & Wentura, 2001). Several of these problems are related to the fact that IAT effects are based on a comparison of performance in two separate tasks: a compatible task in which associated concepts are assigned to the same response and an incompatible task in which associated concepts are assigned to different responses. Although instructions inform participants about how to tackle the tasks, it is possible that participants will try to recode the tasks with the aim of simplifying them. In many cases, recoding can be based on the associations that one tries to measure. For example, in the compatible task of a flower-insect IAT (press left for flower and positive; press right for insect and negative), participants can simplify the task by pressing a left key for all positive stimuli (including flowers) and a right key for all negative stimuli (including insects). As a result, only two rather than four category-response assignments need to be applied. Because such a recoding is not possible in the incompatible task (press left for flower and negative; press right for insect and positive), performance will be superior in the compatible task (Mierke & Klauer, 2001).¹ But recoding could sometimes also be based on information that is unrelated to the associations that one tries to measure. In principle, participants can exploit any type of similarity between concepts or stimuli. For instance, when asked to categorize pictures of snakes, rivers, coins, and pizzas, participants find it easier to press one key for snakes and rivers and the other key for coins and pizzas than to press one key for snakes and coins and the other key for rivers and pizzas (De Houwer & Geldof, 2002). This suggests that participants recode the tasks by exploiting the perceptual similarity between, on the one hand, snakes and rivers (both are winding) and, on the other hand, coins and pizzas (both are round).
Likewise, Brendl, Markman, and Messner (2001) found that participants perform better when nonwords and negative words are assigned to one key and positive words and insects to a second key than when the assignment is reversed (first key for nonwords and positive; second key for insects and negative). Rothermund and Wentura (2001) argued that this result is due to the fact that participants use the salience of the categories to recode the tasks (i.e., nonwords and negative are more salient categories than insect and positive). By influencing recoding, nonassociative attributes such as perceptual features and salience could influence IAT effects. This could, in certain cases, reduce the validity of the observed IAT effects. Although it is not yet clear to what extent recoding on the basis of nonassociative attributes actually occurs in standard IAT tasks, it would be useful to have an alternative method in which problems resulting from recoding are less likely to occur.

¹ Note that recoding does not need to be intentional or explicit. When concepts or stimuli that are assigned to the same response share a certain feature (such as valence), the correct response can also be activated automatically on the basis of this feature (De Houwer, 2001; Fiedler et al., 2002; Mierke & Klauer, 2001). If recoding on the basis of stimulus valence is intentional, however, one could argue that IAT effects at least partially reflect explicit attitudes.

Recently, De Houwer (2001, Footnote 4) briefly reported a variant of the IAT that allows for a comparison of performance within a task. This variant had two unusual features. First, the target concepts did not differ in valence. Second, the valence of the target concept stimuli was manipulated. Participants were either asked to press the first key for positive and person words and the second key for negative and animal words (Task 1), or to press the first key for positive and animal words and the second key for negative and person words (Task 2). On the target concept trials, both positive and negative exemplars of the target concept “person” (e.g., FRIEND, ENEMY) and of the target concept “animal” (e.g., SWAN, COCKROACH) were presented. The overall performance in the two tasks did not differ (i.e., no IAT effect), which is in line with the assumption that the target concepts “person” and “animal” have a similar valence. However, in both tasks, responses to person and animal words were faster when the correct response was associated with the same valence as the valence of the presented word. For instance, in the first task (press left for positive and person; press right for negative and animal), responses to positive person and negative animal exemplars were faster than responses to negative person and positive animal exemplars. Interestingly, exemplar valence had no effect in a second experiment in which the target concepts clearly differed in valence.
These results suggest an alternative way to assess attitudes by comparing performance on different trials within a task. Provided that the target concepts do not differ in valence, one can assess the attitude toward a stimulus by looking at target concept trials on which this stimulus is presented. If performance is better when the response associated with positive attribute stimuli is required, one can infer that the participants have a positive attitude toward the stimulus. If the reverse result is observed, one can infer that the stimulus is negative.

De Houwer (2001) pointed out that the target concept trials in this variant of the IAT are structurally similar to trials in an affective Simon task (e.g., De Houwer & Eelen, 1998; De Houwer, Crombez, Baeyens, & Hermans, 2001). In affective Simon studies, participants are asked to choose between a positive or negative response on the basis of a nonevaluative feature of valenced words. For instance, they might be asked to say “GOOD” whenever a person word is presented and to say “BAD” when an animal word is presented. Results show that responses are faster when the valence of the presented word and the correct response match (e.g., say “GOOD” to FRIEND because it is a person word) than when the valence of the word and response differ (e.g., say “GOOD” to ENEMY because it is a person word) (De Houwer et al., 2001, Experiment 1). The person and animal trials in the modified IAT of De Houwer (2001) are very similar to the trials in an affective Simon task. The main structural difference is that in the modified IAT participants give responses that are intrinsically unrelated to valence (i.e., press a left or right key) but that are extrinsically related to valence because of task instructions (see De Houwer, 2001, in press). Although pressing a left or a right key is not a positive or negative response as such, those responses are associated with positive or negative valence because, within the modified IAT task, one response is also assigned to positive words and the other response is also assigned to negative words. Just as it is more difficult to say “GOOD” to ENEMY (affective Simon effect), it seems to be more difficult to give a neutral response to ENEMY when that neutral response is extrinsically associated with positive valence because it is also assigned to positive stimuli. Therefore, the term extrinsic affective Simon task (EAST) provides an accurate structural description of the modified IAT (De Houwer, in press).² In the remainder of the manuscript, I will thus use the term EAST to refer to the modified IAT task. Whereas the original IAT effect corresponds to the difference in performance on two different tasks, an EAST effect can be calculated by comparing trials within the same task (i.e., trials on which the response and the target stimulus are associated with the same valence compared to trials on which they are associated with a different valence).
Therefore, EAST effects are less likely to be influenced by nonassociative variables that determine how participants recode tasks.

² Neither the name EAST, nor the structural analysis underlying the selection of this name, implies one particular theoretical account of EAST effects. As De Houwer (in press) pointed out, a structural analysis of tasks highlights invariance in content-independent elements of tasks (e.g., that the match between an irrelevant stimulus feature and a relevant response feature differs from trial to trial). From this perspective, the target concept trials in the EAST only differ from the trials in an affective Simon task with regard to the content-specific nature of the responses (i.e., the way in which the responses are related to positive or negative valence).

The EAST also has some other potential advantages compared to the original IAT. Most importantly, unlike the IAT, it could allow one to assess single associations and multiple associations. As pointed out by Greenwald and Farnham (2000), in the IAT, one has to use complementary pairs of concepts and attributes (e.g., positive-negative, black-white, self-other, ...). For this reason, the IAT can provide only a relative measure of associations and attitudes. For instance, the typical result in a flower-insect IAT (faster when flower and positive are assigned to one key and insect and negative to the other key, see above) suggests that the concept “flower” is more positive than the concept “insect.” But such an effect could be due to the fact that “flower” is positive and “insect” negative, that both concepts are positive but “flower” more so than “insect”, or that both concepts are negative but “insect” more so than “flower.” In the EAST, however, one can estimate the attitude toward a target concept by selecting stimuli that represent this concept and comparing the time needed to make an extrinsically positive response with the time needed to give an extrinsically negative response to those stimuli. Moreover, if one presents stimuli that represent several attitude objects, one should be able to estimate the attitude toward each of those attitude objects.

The aim of the present experiments was to further develop the EAST and to start exploring its usefulness as an indirect measure of attitudes. Both experiments had the same format. On some trials, white words were presented whereas on the other trials words were colored green or blue. Participants were instructed to press a left or right key in response to the valence of the white words and the color of the colored words. By assigning one response to positive white words and the other response to negative white words, responses became extrinsically associated with positive or negative valence.
The prediction was that performance would be superior on trials on which the participants needed to select the extrinsically positive response in response to a colored positive word and trials on which the extrinsically negative response had to be given in response to a colored negative word.

In Experiment 1, the colored words were normatively positive and negative nouns. The main aim of this experiment was to replicate the EAST effect observed by De Houwer (2001, Footnote 4) using color rather than semantic category as the relevant feature of the target concept stimuli. The obvious advantage of color as a relevant feature is that every word can be presented in several colors whereas the semantic categories to which a word belongs are more or less fixed. When word color is relevant, one can vary the response that needs to be given to a word by varying its color. As a result, each word functions as its own control.

The aim of Experiment 2 was to test whether the EAST can detect socially meaningful attitudes and to explore whether several attitudes can be measured simultaneously. On the colored trials, I presented the first name of the participant, the first name of another participant, the word FLOWERS, the word INSECT, and the meaningless letter-string XXXXX. Assuming that most individuals have a positive attitude toward themselves (Sedikides, 1993; Sedikides & Strube, 1997), one can predict that participants will find it easier to give the extrinsically positive response to their own name than to give the extrinsically negative response. The name of another participant was presented primarily to check whether this effect would be specific to the name of the participant.
Experiment 1

Method

Participants

Fifty-one undergraduate students from various departments at the University of Southampton were paid £6 for their participation in this experiment.
Materials

Five positive and five negative nouns were presented on the colored trials, whereas five positive and five negative adjectives were presented on the white trials (see Appendix). The nouns were presented in either a green or a blue color. The blue color was created by setting the red, blue, and green values in the Turbo Pascal program at 0, 38, and 46 respectively. The red, blue, and green values for the green color were 0, 46, and 38 respectively. As a result, the green and blue colors were quite similar. The default Turbo Pascal values were used for the white color. All words were presented on a black background. A letter was 7 mm high and 5 mm wide. Presentations were controlled by a Turbo Pascal 5.0 program that operated in graphics mode. The program was implemented on an IBM compatible 486 computer that was situated in a darkened research cubicle. Participants were seated in front of the computer at a distance of approximately 40 cm from the 15 inch screen. They could respond by pressing the key “q” or the key “p” of the (QWERTY) keyboard. The time between the onset of a word and the first key press was measured using a highly accurate (beyond 1 ms) Turbo Pascal Timer (Bovens & Brysbaert, 1990).
Procedure

Participants completed the experiment individually. After filling in an informed consent form, they were given written instructions on the computer screen. These instructions informed participants that words
would be presented in the middle of the computer screen. Their task was to classify these words by pressing the good key (i.e., key P) or the bad key (i.e., key Q) depending on the meaning or color of the presented word. They were told that, if the word was white (i.e., not colored), then the meaning of the word was important. All participants were instructed to press the good key (P) for white words with a positive meaning (e.g., KIND) and to press the bad key (Q) for white words with a negative meaning (e.g., HOSTILE). If the word was colored, however, they were instructed to press the good or bad key on the basis of the color of the word. Half of the participants were instructed to press the good key in response to words in a bluish color and the bad key in response to words in a greenish color. The other participants received the reversed color-response assignments. Next, participants were informed that a red cross would appear underneath the word if they made an incorrect response. Both the cross and the word would remain on the screen until the participant gave the correct response. Participants were asked to respond as quickly but also as accurately as possible. Finally, they were told that there would be two practice blocks of 20 trials followed by four test blocks of 30 trials and that the experiment would take about 15 minutes.

The experiment started with a practice block during which each of the 10 white words was presented twice in a random order. During the second practice block, each of the 10 nouns was presented, once in blue and once in green. Next there were four test blocks of 30 trials during which each of the 10 nouns was presented once in each color and each of the 10 adjectives was presented once in white. Instructions about the upcoming task were given before each practice and test block. These instructions informed the participants about what key to press in response to which type of stimulus.
After reading those instructions, participants started the presentations by pressing the return key. In all practice and test blocks, stimuli were presented in a random order with the restriction that the same word could not be presented on two or more consecutive trials and that the required response could not be the same on four or more consecutive trials. The first test block started with four warm-up trials; the other test blocks started with two warm-up trials. Each of the 10 adjectives was presented in white on one of the 10 warm-up trials. Which word was presented on which warm-up trial was determined randomly. Each practice, test, and warm-up trial consisted of the following sequence of events: a white fixation cross for 500 ms; the word until a correct response was given; if the participant made an incorrect response, a red cross appeared underneath the word until the participant pressed the correct key. The intertrial interval was 1500 ms.
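The composition of a test block and the two ordering restrictions described above can be illustrated with a short Python sketch. This is not the original Turbo Pascal program: the word labels are placeholders (the actual stimuli are listed in the Appendix), the function names are hypothetical, and rejection sampling is only one assumed way of meeting the restrictions.

```python
import random

GOOD_KEY, BAD_KEY = "P", "Q"

def build_test_block(nouns, adjectives):
    """One test block: every noun once per color, every adjective once in white."""
    trials = [(word, color) for word in nouns for color in ("blue", "green")]
    trials += [(word, "white") for word in adjectives]
    return trials

def required_key(trial, adjective_valence, color_to_key):
    """White words are classified by valence, colored words by color."""
    word, color = trial
    if color == "white":
        return GOOD_KEY if adjective_valence[word] == "positive" else BAD_KEY
    return color_to_key[color]

def is_valid(order, key_of):
    """True if no word repeats on consecutive trials and no response
    is required on four or more consecutive trials."""
    for previous, current in zip(order, order[1:]):
        if previous[0] == current[0]:
            return False
    keys = [key_of(trial) for trial in order]
    for i in range(len(keys) - 3):
        if len(set(keys[i:i + 4])) == 1:
            return False
    return True

def shuffle_block(trials, key_of, rng, max_tries=100_000):
    """Assumed approach: reshuffle until both restrictions hold."""
    order = list(trials)
    for _ in range(max_tries):
        rng.shuffle(order)
        if is_valid(order, key_of):
            return order
    raise RuntimeError("no valid trial order found")

# Placeholder stimuli; one of the two counterbalanced color-response assignments.
nouns = [f"NOUN_{i}" for i in range(10)]
adjective_valence = {f"ADJ_{i}": "positive" if i < 5 else "negative" for i in range(10)}
color_to_key = {"blue": GOOD_KEY, "green": BAD_KEY}

def key_of(trial):
    return required_key(trial, adjective_valence, color_to_key)

block = shuffle_block(build_test_block(nouns, list(adjective_valence)), key_of, random.Random(0))
```

Because every noun appears once with each color, each noun requires the extrinsically positive response on one trial and the extrinsically negative response on another within the same block, which is the sense in which each word serves as its own control.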
Results

I analyzed the results of the test trials on which colored words were presented, only taking into account the time and accuracy of the first response on those trials and discarding reaction times on trials with an incorrect response. In accordance with Greenwald et al. (1998), reaction times below 300 ms or above 3000 ms were recoded to 300 ms and 3000 ms respectively, and latencies were log-transformed. I then calculated the mean log-transformed reaction time and the percentage of errors separately for trials on which a positive word was presented and an extrinsically positive response was required (i.e., the response that was assigned to positive white words), trials with a positive word and an extrinsically negative response (i.e., the response that was assigned to negative white words), trials with a negative word and an extrinsically positive response, and trials with a negative word and an extrinsically negative response. The resulting mean log-transformed reaction times and percentage of errors (see Table 1) were analyzed using a 2 (experiment half: first or last 60 trials) × 2 (stimulus valence: positive or negative) × 2 (extrinsic response valence: positive or negative) ANOVA with repeated measures on all variables. Error data were also analyzed because standard affective Simon effects often also emerge in error data (e.g., De Houwer & Eelen, 1998).

The analysis of the log-transformed reaction times revealed a main effect of experiment half, F(1, 50) = 11.37, p = .001, resulting from slower responses in the first than in the second half of the experiment. The main effect of extrinsic response valence was marginally significant in the analysis of the reaction times, F(1, 50) = 3.78, p = .06, and significant in the analysis of the error data, F(1, 50) = 4.57, p = .04. Participants tended to be faster and made fewer errors when the response that was associated with negative valence was required. More importantly, the crucial interaction between stimulus valence and extrinsic response valence was significant, F(1, 50) = 29.30, p < .001, for the reaction time data, and F(1, 50) = 43.30, p < .001, for the error data. The ANOVAs did not reveal any other significant effects, Fs < 1.

Table 1. Mean Untransformed Reaction Times in ms and Percentage of Errors (SD in Parentheses) on Target Stimulus Trials as a Function of Stimulus Valence and Extrinsic Response Valence in Experiment 1

                              Extrinsic Response Valence
Stimulus Valence              Positive          Negative
Positive
  Reaction Time               660 (138)         678 (130)
  Percentage of Errors        3.92 (5.94)       9.90 (8.63)
Negative
  Reaction Time               707 (131)         636 (104)
  Percentage of Errors        11.47 (9.71)      2.26 (4.72)

An EAST score was calculated separately for positive and negative words by deducting the mean log-transformed reaction time and percentage of errors on trials with an extrinsically positive response from the mean log-transformed reaction time and percentage of errors on trials with an extrinsically negative response. A positive EAST score thus signifies a positive attitude. The effect size estimates d for the reaction time EAST scores were based on the log-transformed data. For reasons of clarity, I will, however, report the mean EAST scores as calculated on the basis of untransformed reaction times. On trials with colored negative stimuli, positive responses were given more slowly, t(50) = 5.01, p < .001, mean EAST score of -71 ms, effect size of d = 0.70, and less accurately, t(50) = 6.35, p < .001, M = -9.21%, d = 0.89, than negative responses. On trials with colored positive words, positive responses tended to be faster, t(50) = 1.72, p = .09, M = 19 ms, d = 0.24, and were emitted more accurately, t(50) = 4.58, p < .001, M = 5.98%, d = 0.64, than negative responses.

To assess the split-half reliability of the EAST effects, I calculated the EAST scores for positive and negative words, separately for the first and second half of the experiment. In the reaction time data, the correlation between the EAST score in the first half and the EAST score in the second half was significant for positive words, r = .35, p = .01, and marginally so for negative words, r = .26, p = .07. In the error data, the EAST score for negative, r = .47, p = .001, but not positive words, r = .15, was reliable.
None of the other possible correlations (including those between reaction time and error EAST scores) was significant, all rs < .19, except for the correlation between the second half effect for positive words and the second half effect for negative words, r = -.30, p = .03.
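The scoring steps described in this section (recoding latencies outside 300 to 3000 ms, log-transforming, averaging per condition, and subtracting the positive-response mean from the negative-response mean) can be sketched in Python as follows. The function names are hypothetical, and the use of the natural logarithm is an assumption, since the text does not state the base. The helper `cohens_d_from_t` is likewise an assumption: the d values reported for Experiment 1 are numerically consistent with t divided by the square root of the number of participants, but the text does not say how d was computed.

```python
import math

def mean_log_rt(rts_ms, floor=300.0, ceiling=3000.0):
    """Recode latencies outside [floor, ceiling] to the boundary value,
    then average the (natural) log latencies. Input: correct first responses only."""
    recoded = [min(max(rt, floor), ceiling) for rt in rts_ms]
    return sum(math.log(rt) for rt in recoded) / len(recoded)

def east_score_rt(rts_positive_response, rts_negative_response):
    """Mean log RT with the extrinsically negative response minus mean log RT
    with the extrinsically positive response. Positive score: positive attitude."""
    return mean_log_rt(rts_negative_response) - mean_log_rt(rts_positive_response)

def east_score_errors(pct_errors_positive_response, pct_errors_negative_response):
    """Same subtraction applied to the percentage of errors."""
    return pct_errors_negative_response - pct_errors_positive_response

def cohens_d_from_t(t, n):
    """Assumed one-sample effect size: d = t / sqrt(n)."""
    return t / math.sqrt(n)

# Illustrative latencies for one hypothetical participant and one word type.
score = east_score_rt([520, 610, 250, 580], [640, 700, 3400, 690])

# Applying the same subtraction to the Table 1 cell means for negative words.
negative_words_rt = 636 - 707                               # -71 ms
negative_words_errors = east_score_errors(11.47, 2.26)      # about -9.21 %
```

Note that the reported scores are means of per-participant differences, so recomputing them from rounded table means (as in the last two lines) can differ from the reported values by a rounding step.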
Experiment 2

Method

Participants

Forty-nine psychology undergraduates at Ghent University took part in exchange for course credits. All were native Dutch speakers.
Materials and Procedure

Experiment 2 differed from Experiment 1 with regard to the following points. First, on colored trials, the first name of the participant, the first name of the previous participant, the Dutch word BLOEMEN (flowers), the Dutch word INSECT (insect), and the letter-string “XXXXX” were presented. On the white trials, five positive and five negative Dutch adjectives were presented (see Appendix). Second, a Dutch translation of the instructions of Experiment 1 was presented. Third, the 20 white practice trials were followed by 20 colored practice trials during which each of the five colored words was presented twice in each color. The practice blocks were followed by 6 experimental blocks of 30 trials. In each experimental block, each of the colored words was presented four times, twice in each color, and each of the ten Dutch adjectives was presented once. In between blocks, participants were given information about how many blocks were already completed.
Results

Means were calculated and analyzed in the same way as in the previous experiment. The data of one participant were excluded because both her mean reaction time and percentage of errors were more than three standard deviations higher than those of the total group. However, analyses that did include her data led to the same conclusions as the analyses that did not include her data. All relevant means can be found in Table 2.

Table 2. Mean Untransformed Reaction Times in ms and Percentage of Errors (SD in Parentheses) on Target Stimulus Trials as a Function of Stimulus Category and Extrinsic Response Valence in Experiment 2

                              Extrinsic Response Valence
Stimulus                      Positive          Negative
Self-Name
  Reaction Time               625 (107)         666 (115)
  Percentage of Errors        4.34 (9.26)       7.99 (10.59)
Other-Name
  Reaction Time               682 (121)         652 (119)
  Percentage of Errors        6.94 (10.22)      2.95 (5.83)
Flowers
  Reaction Time               650 (130)         663 (117)
  Percentage of Errors        2.08 (4.38)       5.38 (7.19)
Insect
  Reaction Time               691 (157)         624 (118)
  Percentage of Errors        6.94 (8.65)       2.08 (4.38)
XXXXX
  Reaction Time               655 (132)         654 (121)
  Percentage of Errors        6.08 (10.70)      3.82 (6.42)
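As a reading aid for Table 2, the EAST score definition from Experiment 1 (negative-response mean minus positive-response mean) can be applied directly to the cell means. The short Python sketch below does only that arithmetic; the variable and function names are illustrative. Because the reported scores are averages of per-participant differences, values recomputed from rounded cell means may deviate slightly from those given in the text.

```python
# (extrinsically positive response, extrinsically negative response) cell means, Table 2
table2_rt_ms = {
    "self-name": (625, 666),
    "other-name": (682, 652),
    "FLOWERS": (650, 663),
    "INSECT": (691, 624),
    "XXXXX": (655, 654),
}
table2_pct_errors = {
    "self-name": (4.34, 7.99),
    "other-name": (6.94, 2.95),
    "FLOWERS": (2.08, 5.38),
    "INSECT": (6.94, 2.08),
    "XXXXX": (6.08, 3.82),
}

def east_from_cell_means(cells):
    """Negative-response mean minus positive-response mean, per stimulus.
    A positive score indicates a positive attitude toward the stimulus."""
    return {name: round(neg - pos, 2) for name, (pos, neg) in cells.items()}

rt_scores = east_from_cell_means(table2_rt_ms)
error_scores = east_from_cell_means(table2_pct_errors)

for name in table2_rt_ms:
    print(f"{name:>10}: RT {rt_scores[name]:+} ms, errors {error_scores[name]:+.2f} %")
```

The signs follow the predicted pattern: positive scores for the self-name and FLOWERS, negative scores for INSECT and the other-name, and a reaction time score near zero for the neutral letter-string.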
The Experiment Half × Stimulus × Response Valence ANOVAs with Greenhouse-Geisser corrections revealed a significant interaction between stimulus and response valence, both for the mean log-transformed reaction times, F(3.42, 160.80) = 6.40, p < .001, and for the percentage of errors, F(3.52, 165.58) = 5.97, p < .001. The main effect of experiment half was also significant in the analysis of the reaction times, F(1, 47) = 52.25, p < .001. These interactions were not modulated by experiment half, Fs < 1. No other effects approached significance, Fs < 2.03.

A priori t-tests performed on the reaction time EAST scores revealed a significantly positive score for the self-name, t(48) = 2.78, p = .008, M = 40 ms, d = 0.32, a significantly negative score for INSECT, t(48) = 3.79, p < .001, M = -68 ms, d = 0.52, and a marginally significant negative score for the other-name, t(48) = 1.79, p = .08, M = -30 ms, d = 0.23. The reaction time EAST score for FLOWERS, M = 13 ms, d = 0.17, and for the neutral letter-string, M = 0 ms, d = 0.02, did not differ significantly from zero, ts < 1. The error EAST score, however, was significantly positive for FLOWERS, t(48) = 3.25, p = .002, M = 3.23%, d = 0.47, and marginally so for the self-name, t(48) = 1.70, p = .10, M = 3.65%, d = 0.25. The error EAST score was significantly negative for INSECT, t(48) = 3.37, p = .002, M = -4.86%, d = 0.49, and for the other-name, t(48) = 2.23, p = .03, M = -3.99%, d = 0.32, but did not differ from zero for the neutral letter-string, t(48) = 1.15, M = -2.26%, d = 0.17.

To estimate the reliability of the EAST scores, I calculated for each stimulus an EAST score on the basis of the data of the first three experimental blocks and a second score on the basis of the data of the last three experimental blocks.
In the reaction time data, these two EAST scores were correlated for the self-name, r = .48, p = .001, but not for the other-name, r = .23, the word FLOWERS, r = -.21, the word INSECT, r = .16, or the Xs, r = .19. Of the remaining 40 possible correlations, only three were significant. In the error data, the EAST scores were correlated for the self-name, r = .44, p = .002, the other-name, r = .42, p = .003, and the Xs, r = .55, p < .001, but not for the words FLOWERS, r = -.15, and INSECT, r = .18. Of the remaining 40 correlations, only 5 were significant. Finally, I correlated the five reaction time EAST scores with the five error EAST scores (these scores were based on the data of all blocks). A significant positive correlation was found for the self-name, r = .50, p < .001, the other-name, r = .42, p = .003, and the word INSECT, r = .40, p = .005, but not for the word FLOWERS, r = .05, and the Xs, r = .13.
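The split-half procedure described above amounts to computing, for each participant and stimulus, one EAST score from the first half of the blocks and one from the second half, and then correlating the two sets of scores across participants. A minimal Python sketch follows, assuming a plain Pearson correlation; the function names and the example scores are hypothetical and are not data from the experiments.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_reliability(first_half_scores, second_half_scores):
    """Correlate per-participant EAST scores from the two halves of the task."""
    return pearson_r(first_half_scores, second_half_scores)

# Hypothetical self-name EAST scores (ms) for five participants.
first_half = [55.0, 10.0, -20.0, 80.0, 35.0]
second_half = [40.0, 25.0, -5.0, 60.0, 15.0]
r_self_name = split_half_reliability(first_half, second_half)
```

A correlation computed this way reflects the consistency of scores based on half the trials each, which is one reason why increasing the number of trials per stimulus would be expected to raise it.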
General Discussion

Despite the fact that the IAT was introduced only five years ago (Greenwald et al., 1998), it has become a popular tool to study attitudes and nonevaluative beliefs in an indirect way. In the present paper, I introduced a modified version of the IAT in which attitudes are assessed by comparing performance on different trials within the same task. Participants were asked to press one of two keys on the basis of the valence of white words and on the basis of the color of colored words. Results showed that responses to positive colored words (e.g., the first name of the participant, the word FLOWERS) were faster and/or more accurate on trials where the correct response was the response assigned to positive white words (i.e., the extrinsically positive response) than on trials where it was the response assigned to negative white words (i.e., the extrinsically negative response). The reverse was true for responses to negative colored words (e.g., the word INSECT). These results closely correspond to what one would predict on a priori grounds.³ This suggests that the EAST is a valid research tool that can be used to test theories of attitudes and differences in attitudes between groups of individuals.

The present studies also provide some preliminary information about the stability of EAST effects and the reliability of interindividual differences in EAST scores. First, as was evidenced by the lack of an interaction between experiment half, stimulus valence, and response valence, EAST effects seem to be fairly unaffected by practice with the task (and the resulting decrease in reaction times). Second, when each EAST score was calculated twice on the basis of two different portions of the data, the correlation between corresponding EAST scores varied from -.21 to .55 (mean of .25). This suggests that EAST scores are not reliable enough to detect interindividual differences in attitudes.
One should note, however, that there are several reasons why the reliability of the EAST effects was low in the present studies. First, one cannot expect that there are strong and meaningful interindividual differences in attitudes toward normatively positive and negative stimuli, such as the positive and negative words presented in Experiments 1 and 2. From this perspective, it is encouraging that the EAST score for the self-name (which can be regarded as an index of self-esteem) did tend to be reliable (correlations of .48 and .44 for the reaction time and error EAST score, respectively). Second, in Experiment 2, each attitude object was presented on only 12 trials with a positive response and 12 trials with a negative response. It is likely that the reliability of EAST scores can be improved by increasing the number of trials. Finally, the reliability of reaction time measures such as the EAST can be improved by keeping procedural elements such as task assignments and trial order constant (Banse, 2001).

³ A possible exception is the negative EAST score for the other-name in Experiment 2. It is possible that this EAST score was negative because participants (implicitly) contrasted the other-name with the self-name. A different EAST score for other-names might be found when the self-name is not presented during the same task. In another study, I did find a negative EAST score for the concept INSECT regardless of whether the other presented concept was FLOWERS or NONWORDS. However, more research is needed before any firm conclusions can be drawn with regard to whether EAST scores for a particular concept depend on the nature of the other concepts that are measured.

Nosek and Banaji (2001) recently introduced another modification of the IAT. Like the IAT, their go/no-go association task (GNAT) involves two separate tasks during which participants need to classify attribute and target stimuli. Unlike the IAT, however, participants have to give a response only to some but not all stimuli. For instance, a participant might see names of fruit, positive words, and negative words. To measure the attitude toward fruit, participants are either asked to press a key when they see the name of a fruit and when they see a positive word (Task 1) or to press the key when the presented word is the name of a fruit or a negative word (Task 2). Results typically show that performance in Task 1 is superior to performance in Task 2, which indicates that participants have a positive attitude toward the concept “fruit.” Neither the EAST nor the GNAT requires the presence of a second target concept, and both therefore offer the possibility of assessing attitudes separately rather than in comparison to each other. The GNAT effect, however, is based on a comparison of performance in two different tasks that participants might sometimes recode in different ways.
Therefore, as is the case for the IAT effects, GNAT effects could reflect nonassociative variables that influence task recoding.

Although EAST effects appear to be smaller in size than IAT and GNAT effects, the EAST may thus be a valuable addition to existing indirect measures of attitudes. Its main strengths are that (1) confounds due to task recoding are less likely because EAST effects are based on a comparison of trials within a single task, (2) single attitudes can be measured, and (3) multiple attitude objects can be examined in one task. Furthermore, it provides virtually the same flexibility as the IAT and GNAT. To study attitudes, the white words need to be positive and negative stimuli that are classified on the basis of their valence whereas the colored words can represent any attitude object that one wishes to assess. But the EAST could also be used to measure more specific, nonevaluative beliefs. For instance, if one wishes to examine whether the concept “intelligent” is associated more strongly with the concept “self” than with the concept “others,” one can present white words that can be classified according to whether they refer to the self or to others and present colored words that are related to the concept “intelligent.” In principle, one can also use stimuli other than words. For instance, on colored trials, one might present black-and-white pictures that have a color filter placed over them. Participants can then be asked to respond on the basis of the valence of white words (or pictures with no color filter) and the color of the filter of colored pictures. Such tests of the generality of EAST effects are currently being conducted and the results are encouraging.
References

Banse, R. (2001). Affective priming with liked and disliked persons: Prime visibility determines congruency and incongruency effects. Cognition and Emotion, 15, 501–520.
Bovens, N., & Brysbaert, M. (1990). IBM PC/XT/AT and PS/2 Turbo Pascal timing with extended resolution. Behavior Research Methods, Instruments, and Computers, 22, 332–334.
Brendl, C. M., Markman, A. B., & Messner, C. (2001). How do indirect measures of evaluation work? Evaluating the inference of prejudice in the Implicit Association Test. Journal of Personality and Social Psychology, 81, 760–773.
De Houwer, J. (in press). A structural analysis of indirect measures of attitudes. In J. Musch & K. C. Klauer (Eds.), The psychology of evaluation: Affective processes in cognition and emotion. Mahwah, NJ: Lawrence Erlbaum.
De Houwer, J. (2002). The Implicit Association Test as a tool for studying dysfunctional associations in psychopathology: Strengths and limitations. Journal of Behavior Therapy and Experimental Psychiatry, 33, 115–133.
De Houwer, J. (2001). A structural and process analysis of the Implicit Association Test. Journal of Experimental Social Psychology, 37, 443–451.
De Houwer, J., Crombez, G., Baeyens, F., & Hermans, D. (2001). On the generality of the affective Simon effect. Cognition and Emotion, 15, 189–206.
De Houwer, J., & Eelen, P. (1998). An affective variant of the Simon paradigm. Cognition and Emotion, 12, 45–61.
De Houwer, J., & Geldof, T. (2002). The Implicit Association Test as a general measure of similarity. Manuscript in preparation.
de Jong, P. (2002). Implicit self-esteem and social anxiety: Differential self-positivity effects in high and low anxious individuals. Behaviour Research and Therapy, 40, 501–508.
Fazio, R. H. (1986). How do attitudes guide behavior? In R. M. Sorrentino & E. T. Higgins (Eds.), Handbook of motivation and cognition (Vol. 1, pp. 204–243). New York: Guilford Press.
Fazio, R. H., & Olson, M. A. (in press). Implicit measures in social cognition research: Their meaning and use. Annual Review of Psychology.
Fazio, R. H., Sanbonmatsu, D. M., Powell, M. C., & Kardes, F. R. (1986). On the automatic activation of attitudes. Journal of Personality and Social Psychology, 50, 229–238.
Fiedler, K., Messner, C., & Bluemke, M. (2002). Unresolved problems with the “I”, the “A” and the “T”: Logical and psychometric critique of the Implicit Association Test (IAT). Manuscript submitted for publication.
Greenwald, A. G., & Farnham, S. D. (2000). Using the Implicit Association Test to measure self-esteem and self-concept. Journal of Personality and Social Psychology, 79, 1022–1038.
Greenwald, A. G., McGhee, D. E., & Schwartz, J. L. K. (1998). Measuring individual differences in implicit cognition: The Implicit Association Test. Journal of Personality and Social Psychology, 74, 1464–1480.
Greenwald, A. G., & Nosek, B. A. (2001). Health of the implicit association test at age 3. Zeitschrift für Experimentelle Psychologie, 48, 85–93.
Karpinski, A., & Hilton, J. L. (2001). Attitudes and the Implicit Association Test. Journal of Personality and Social Psychology, 81, 774–788.
McConnell, A. R., & Leibold, J. M. (2001). Relations among the Implicit Association Test, discriminatory behavior, and explicit measures of racial attitudes. Journal of Experimental Social Psychology, 37, 435–442.
Mierke, J., & Klauer, K. C. (2001). Implicit association measurement with the IAT: Evidence for effects of executive control processes. Zeitschrift für Experimentelle Psychologie, 48, 107–122.
Nosek, B., & Banaji, M. R. (2001). The go/no-go association task. Social Cognition, 19, 625–666.
Rothermund, K., & Wentura, D. (2001). Figure-ground asymmetries in the Implicit Association Test. Zeitschrift für Experimentelle Psychologie, 48, 94–106.
Sedikides, C. (1993). Assessment, enhancement, and verification determinants of the self-evaluation process. Journal of Personality and Social Psychology, 65, 317–338.
Sedikides, C., & Strube, M. J. (1997). Self-evaluation: To thine own self be good, to thine own self be sure, to thine own self be true, and to thine own self be better. In M. P. Zanna (Ed.), Advances in Experimental Social Psychology (Vol. 29, pp. 209–269). New York: Academic Press.
Received August 8, 2002 Final revision received November 29, 2002 Accepted December 2, 2002
Jan De Houwer Department of Psychology Ghent University Henri Dunantlaan 2 B-9000 Ghent Belgium Tel.: +32 9264 6445 Fax: +32 9264 6489 E-mail:
[email protected]
Appendix

Stimuli presented in Experiment 1
Positive Attribute Words: HEALTHY, HONEST, SMART, FUNNY, OUTSTANDING
Negative Attribute Words: EVIL, HORRIBLE, MEAN, VULGAR, REPULSIVE
Positive Target Words: FRIEND, SUMMER, FLOWER, RAINBOW, BUTTERFLY
Negative Target Words: MURDER, CANCER, COCKROACH, WAR, VOMIT

Stimuli presented in Experiment 2
Positive Attribute Words: GELUK (happiness), BLIJ (happy), VRIENDELIJK (friendly), PRETTIG (nice), GOED (good)
Negative Attribute Words: VALS (false), GEMEEN (mean), VIJANDIG (hostile), HATELIJK (hateful), VERVELEND (boring)
Coloured words: first name of participant, first name of previous participant, BLOEMEN (flowers), INSECT (insect), XXXXX
Experimental Psychology 2003; Vol. 50(2): 77–85