Journal of Counseling Psychology 1990, Vol. 37, No. 2, 169-177
Copyright 1990 by the American Psychological Association, Inc. 0022-0167/90/$00.75
Effects of Verbal and Mathematics Task Performance on Task and Career Self-Efficacy and Interest
Gail Hackett, Arizona State University
Nancy E. Betz, Ohio State University
This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
M. Sean O'Halloran and Deborah S. Romac, University of California, Santa Barbara

To test a hypothesis from self-efficacy theory, we randomly assigned 149 subjects to verbal or mathematics and success or failure conditions in which they attempted to solve easy or difficult anagram or number series tasks. Changes in task self-efficacy and task interest as a result of task success or failure were in accordance with predictions from self-efficacy theory. We also examined the generalizability of the effects of task performance. The results indicated that task performance effects generalized to self-efficacy and interest ratings on an irrelevant task and to global ratings of math and verbal ability. Task performance effects did not generalize to career self-efficacy and career interest measures, but consistent gender differences in self-efficacy emerged as a result of both math and verbal task performance.
Self-efficacy theory (Bandura, 1977, 1982, 1986) has received increasing attention over the past decade as a useful conceptual model for understanding various aspects of the career development process (Lent & Hackett, 1987). Hackett and Betz (1981) hypothesized that self-efficacy theory may be particularly useful in understanding women's career-related behavior and choices. Research thus far has demonstrated that career self-efficacy is predictive of academic persistence and achievement (Lent, Brown, & Larkin, 1984, 1986, 1987), career decision making (Taylor & Betz, 1983), math and science college major choices (Betz & Hackett, 1983; Hackett, 1985; Lent et al., 1984, 1986), willingness to engage in nontraditional career activities (Nevill & Schlecker, 1988), and perceived range and traditionality of occupational preferences (Betz & Hackett, 1981; Post-Kammer & Smith, 1985, 1986; Rotberg, Brown, & Ware, 1987; Wheeler, 1983). Significant gender differences in career-related self-efficacy and in educational and career choices have also been found (Betz & Hackett, 1981, 1983, 1987; Hackett, 1985; Post-Kammer & Smith, 1985, 1986; Wheeler, 1983). Further evidence of the utility of the construct has been provided by Lent et al.'s (1987), Siegal, Galassi, and Ware's (1985), and Wheeler's (1983) investigations, which compared alternate theoretical perspectives.

More recently, research on career self-efficacy has begun to move beyond investigations of the most general aspects of the theory to focus on specific hypotheses, including the relation of career self-efficacy to other important career-related variables, for example, ability and vocational interests. Hackett and Betz (1989) reported a significant correspondence between math self-efficacy and math performance, but they found math self-efficacy to be the better predictor of math-related college major choices. Career self-efficacy has also been reported to interact with academic aptitudes in predicting academic performance (Brown, Lent, & Larkin, 1988), to correlate moderately with inventoried vocational interests (Lent, Larkin, & Brown, 1989), and to mediate gender differences in interest profiles on the Strong-Campbell Interest Inventory (Lapan, Boggs, & Merrill, 1989).
Experimental Research

Researchers involved in a related line of inquiry have begun, by using experimental rather than correlational methods, to explore the manner in which self-efficacy expectations are amenable to change. Change mechanisms are not only a key component of self-efficacy theory but are also the aspect of the theory most directly relevant to the design of counseling interventions (Lent & Hackett, 1987). Bandura (1977) delineated four sources of information influential in modifying efficacy expectations: performance accomplishments, vicarious learning, emotional arousal, and verbal persuasion. Of these four informational sources, performance accomplishments are hypothesized to be the most powerful (Bandura, 1977, 1982).

In an initial experimental test of the effects of performance on vocationally related self-efficacy, Hackett and Betz (1984) investigated the effects of failure at a math or verbal task on general and specific measures of mathematics self-efficacy and on global math and verbal ability ratings. The findings indicated that task failure influenced self-efficacy expectations but not always in the expected direction. Gender × Task interactions were observed, and contrary to predictions, the effects of task failure in one domain, that is, the math or verbal task, generalized in some cases to positively influence self-efficacy expectations in the other domain.

Hackett and Campbell (1987) conducted a second experimental study of the effects of task performance on self-efficacy and on task interest in order to explicate the confusing findings from Hackett and Betz (1984). The subjects were exposed
Correspondence concerning this article should be addressed to Gail Hackett, Counseling Psychology Program, Division of Psychology in Education, Arizona State University, Tempe, Arizona 85287.
HACKETT, BETZ, O'HALLORAN, AND ROMAC
to success or failure experiences on a series of three verbal tasks (i.e., solving anagrams). The results were in keeping with theoretical predictions: Success at the verbal task caused task self-efficacy and task interest to rise, whereas task failure caused a corresponding drop in task self-efficacy and interest. Task performance similarly influenced a global measure of verbal ability. Global math ability ratings were affected by task success or failure but to a lesser degree than verbal ability estimates. Few gender differences emerged on this gender-neutral task. An analysis of the subjects' attributions of their performance revealed that women more often than men attributed their success to luck and their failure to their lack of ability.

In a companion study to the Hackett and Campbell (1987) investigation, Campbell and Hackett (1986) used a gender-linked (i.e., mathematical) task to explore potential gender differences and Gender × Treatment interactions in task self-efficacy and interest. The results of these two studies were very similar except for more consistent and stronger gender differences in Campbell and Hackett's study. The number series task used by Campbell and Hackett resulted in greater differentiation between the self-efficacy, interest, and ability ratings of men and women. Task interest scores were also found to be moderately correlated with self-efficacy level and strength. However, self-efficacy scores were more responsive to task success or failure than task interest ratings. The findings supported the contention that self-efficacy and interests are probably unique in their contributions to the prediction of career choice, a finding congruent with some of the correlational studies that have related career self-efficacy to vocational interests (Betz & Hackett, 1981; Lapan et al., 1989; Lent et al., 1989).
Finally, Zilber (1988) extended the Hackett and Campbell (1987) work by directly testing the influence of attributions on self-efficacy for performance on a gender-neutral task. Her results were congruent with previous findings. Self-efficacy and attributions were significantly correlated, but a full moderator effect was not supported by the results. In addition, although few gender differences emerged overall, different attributional patterns predicted men's and women's efficacy expectations.
Purposes of the Study

Because the general usefulness of self-efficacy theory in the career area has been supported and because the effects of performance accomplishments on self-efficacy are central to designing efficacy-based career counseling interventions, we believe that experimental investigations are now one of the most important research areas in the career self-efficacy literature. The purpose of our study, then, is to explore some of the remaining questions that have arisen out of experimental inquiry on the effects of performance on self-efficacy, interest, and attributions. In particular, we sought (a) to compare the influence of performance on a gender-linked (math) and a gender-neutral (verbal) task on self-efficacy, interests, and attributions within the context of the same study and (b) to examine the extent to which task performance effects on self-efficacy and interest generalize to other domains. This investigation may shed light on how performance experiences may ultimately be used to promote realistic career-related self-efficacy expectations in career clients.

Essentially our study is a replication and extension of three earlier studies (Campbell & Hackett, 1986; Hackett & Betz, 1984; Hackett & Campbell, 1987). The generalizability of performance effects on self-efficacy and interests was measured at three levels: a task-specific level, a moderate level of generalizability (career-related self-efficacy and interest with regard to college courses in nontraditional and traditional areas), and a global level of generalizability (global ability ratings). An additional measure of the generalizability of performance effects was assessed at the task level by obtaining ratings of self-efficacy and interest on a task irrelevant to the domain in which subjects performed.

The major hypotheses of the study were as follows: (a) Task success would result in an increase in the level and strength of task-relevant self-efficacy and interest; (b) task failure would produce a decrease in task-relevant self-efficacy and interest; (c) the effects of task performance would generalize moderately to other behavioral domains, that is, to task-irrelevant and career-related self-efficacy and interests; and (d) task success or failure would interact with gender, and this interaction would in turn result in differential self-evaluations of performance for men and women on the gender-linked (math) but not the gender-neutral (verbal) tasks. In addition to these major hypotheses, the effects of task performance on attributions were also examined.
Method

Subjects

The subjects were 149 undergraduates (78 women and 71 men) enrolled in introductory psychology courses at a middle-sized public university in the West. Participation was voluntary; subjects received course credit for their involvement in the study.
Instruments

Educational survey and global ability measures. A brief survey that contained a series of questions to elicit demographic information (e.g., age and gender) as well as information about educational background and major and career plans was administered. This survey was a modified version of the instrument used by Hackett and Betz (1984). In addition to the demographic information, two items to assess global ratings of subjects' perceptions of their mathematical and verbal skills were included. For both mathematical and verbal ability, the subjects were asked to rate themselves on a scale from extremely low ability (1) to extremely high ability (10) in comparison to other college students. These two measures were repeated on the postexperimental questionnaire (see later description).

Task self-efficacy and interest ratings. Ratings of the strength and level of self-efficacy and interest in the task were obtained through a three-item scale administered as a pretest and then after both task attempts. Self-efficacy strength was assessed by asking subjects to rate their confidence in passing the verbal anagram or math number series tests on a scale from not confident at all (1) to very confident (10). Level of task self-efficacy was assessed by asking subjects to estimate the number of problems they expected to successfully solve (0-12),
and task interest was rated from no interest at all (1) to a high degree of interest (10). The test-retest reliabilities of the self-efficacy level and strength ratings over a 1-week period have been found to be, respectively, .55 and .70 for the anagram task and .70 and .60 for the number series task. The test-retest reliability for the interest scores was .67 for the verbal task and .76 for the number series task (Hackett & O'Halloran, 1985).

Career-related self-efficacy and interest measures. The College Course subscale of the Math Self-Efficacy Scale (MSES; Betz & Hackett, 1983) was used as a measure of career-related self-efficacy with regard to traditional (non-math-related) and nontraditional (math-related) college courses. This scale was chosen, in contrast to the possible choice of an occupational self-efficacy or interest measure, because it represents a moderate level of generalizability for examining the effects of performance on career-related self-efficacy and interests. That is, self-efficacy with regard to occupational pursuits was regarded as a more general measure than self-efficacy with regard to college courses because of the closer relation between the experimental tasks (academically oriented) and academic coursework. The scale consists of 22 college courses, 6 traditional (e.g., education and comparative literature) and 16 nontraditional (e.g., algebra and calculus). The subjects were requested to indicate on a scale from none (0) to complete (9) how much confidence they had in completing each course with a grade of B or better. The internal consistency reliability for the scale was found to be .93 (Betz & Hackett, 1983); the test-retest reliability for the nontraditional course subscale was .91 (Hackett & O'Halloran, 1985). The College Course subscale from the MSES was revised to serve as a measure of career-related interests parallel to the self-efficacy assessment.
The subjects were asked to rate on a 0-9 scale from strongly dislike to strongly like their degree of interest in each of the 22 college courses. The College Course Interest scale therefore yielded the same two subscales, which indicated the level of subjects' interest in traditional (non-math-related) and nontraditional (math-related) courses.

Postexperimental questionnaire. As in the earlier studies, a brief questionnaire to elicit subjects' reactions to the experimental task was administered after the completion of the study. The first 2 items served as experimental manipulation checks that required subjects (a) to indicate whether they had passed or failed the tasks and (b) to rate on a 10-point scale how successful they felt they were in solving the task. Four questions were concerned with self-evaluations of performance, that is, ratings of potential ability, amount of effort expended, task difficulty, and luck in solving the task. These are all factors found in the attribution literature to be important in performance self-assessment and future expectations of performance (e.g., Feather, 1966, 1969; Feather & Simon, 1971). All attribution ratings were obtained on a 0-9 scale from not at all to extremely. For example, for the task difficulty item, the subjects responded to the question "How difficult did you think this task was?" on a scale of not at all difficult (0) to extremely difficult (9). One other question on this instrument required subjects to rate their satisfaction with their performance on the same 0-9 scale. Global ratings of math and verbal abilities were assessed as on the educational survey. Test-retest reliability for the math ability rating scale was .87 and for the verbal ability rating scale, .89 (Hackett & O'Halloran, 1985).
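The two-subscale structure of the College Course measures (22 courses rated 0-9, 6 traditional and 16 nontraditional) can be illustrated with a small scoring sketch. The article does not say whether subscale scores were sums or means, nor in what order the courses were listed; the use of means, the function name, and the item positions below are all assumptions made for illustration only.

```python
from statistics import mean

N_TRADITIONAL = 6      # traditional (non-math-related) courses, per the text
N_NONTRADITIONAL = 16  # nontraditional (math-related) courses, per the text


def score_course_scale(ratings, traditional_idx):
    """Return (traditional, nontraditional) subscale means for one respondent.

    ratings: 22 integers on the 0-9 scale, one per course.
    traditional_idx: positions of the 6 traditional courses
                     (hypothetical layout; the real item order is not given).
    """
    if len(ratings) != N_TRADITIONAL + N_NONTRADITIONAL:
        raise ValueError("expected 22 course ratings")
    if any(not 0 <= r <= 9 for r in ratings):
        raise ValueError("ratings must fall on the 0-9 scale")
    trad = set(traditional_idx)
    if len(trad) != N_TRADITIONAL:
        raise ValueError("expected 6 traditional course positions")
    # Assumption: each subscale is scored as the mean item rating.
    traditional = mean(r for i, r in enumerate(ratings) if i in trad)
    nontraditional = mean(r for i, r in enumerate(ratings) if i not in trad)
    return traditional, nontraditional


# Made-up respondent: the first 6 positions are treated as traditional courses.
example = [8, 7, 9, 8, 7, 9] + [4] * 16
trad_score, nontrad_score = score_course_scale(example, range(6))
print(trad_score, nontrad_score)
```

The same function would apply unchanged to the parallel interest ratings, since both measures use the 22 courses and a 0-9 response scale.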
Procedure

Before the experimental sessions the subjects were randomly assigned to one of four conditions: verbal task success, verbal task failure, math task success, or math task failure. All subjects were met
by one of two male experimenters, graduate students in counseling psychology. The subjects were administered the instruments in small groups of 3 or 4. They received instructions, completed the educational survey, received a written description of the experimental task, and were asked to complete the self-efficacy and interest rating scale. They were informed that the task was a test of their abilities and that they had to successfully solve at least 6 of the 12 problems in order to pass the test. The self-efficacy and interest rating scale was then collected, and the subjects were instructed that they had 10 min to finish the task. Two different anagram or two different number series tasks were administered, and subjects completed the self-efficacy and interest scale after each task attempt. Three task-relevant self-efficacy and interest assessments resulted (i.e., Pretest, Posttest 1, and Posttest 2). Next, the subjects were given a description of an alternate task and asked for self-efficacy and interest ratings in this task-irrelevant domain. These self-efficacy and interest measures were thus conceptualized as an assessment of the generalization of performance effects at the task-specific level. The subjects who actually attempted two verbal anagram tasks were asked in the final task self-efficacy and interest assessment for ratings of their expected performance on a number series task, ratings which were thus irrelevant to the actual task they had performed. Those in the number series condition were asked for verbal anagram self-efficacy and interest ratings in their final task assessment. Finally, the subjects completed the career-related self-efficacy and interest measures, the postexperimental instrument was administered, and subjects were thoroughly debriefed.
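The random assignment to the four Task by Outcome conditions described in the procedure can be sketched as follows. The article does not report how the randomization was carried out or how many subjects ended up in each cell; the shuffle-then-deal scheme, the seed, and the function name here are assumptions chosen only to show one common way of producing near-equal cells.

```python
import random
from collections import Counter

# The four conditions named in the text.
CONDITIONS = [
    ("verbal", "success"),
    ("verbal", "failure"),
    ("math", "success"),
    ("math", "failure"),
]


def assign_conditions(subject_ids, seed=0):
    """Shuffle subjects, then deal them round-robin into the four cells.

    Assumption: balanced assignment; the study may have used another scheme.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {sid: CONDITIONS[i % len(CONDITIONS)] for i, sid in enumerate(shuffled)}


assignment = assign_conditions(range(1, 150))  # 149 subjects, as in the study
cell_sizes = Counter(assignment.values())
print(cell_sizes)
```

With 149 subjects, this scheme cannot give four equal cells, so one cell receives one more subject than the others.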
Experimental tasks

The verbal anagram tasks consisted of sets of 12 disarranged six-letter words that subjects had to rearrange into meaningful English words. Each math task in this study consisted of a set of 12 incomplete number series that subjects were asked to solve by determining the formula that underlay the series of numbers and completing the sequence. For example, the formula for the series 3, 12, 15, 60, 63 is multiply by 4, then add 3; the solution is 252. The problems were derived from the tasks used in previous studies (see Campbell & Hackett, 1986; Hackett & Betz, 1984; Hackett & Campbell, 1987). Difficulty levels of the problems were determined as a result of a series of pilot tests with beginning graduate students and in earlier experimental research (Feather, 1966, 1969; Feather & Simon, 1971). The subjects in the success group received a list of 12 anagrams or number series for each task attempt, 6 of which were relatively easy to solve, 3 of moderate difficulty, and 3 somewhat more difficult. The subjects in the failure group received 6 very difficult, 3 difficult, and 3 easy anagrams or number series items.
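The number series example above (3, 12, 15, 60, 63, with the rule "multiply by 4, then add 3") can be checked mechanically. The sketch below applies the two steps in alternation; the function name and the representation of the rule as a list of steps are illustrative choices, not part of the original task materials.

```python
from itertools import cycle, islice


def extend_series(start, steps, length):
    """Generate `length` terms by applying `steps` cyclically, beginning at `start`."""
    terms = [start]
    for step in islice(cycle(steps), length - 1):
        terms.append(step(terms[-1]))
    return terms


# Rule from the text: multiply by 4, then add 3, repeating.
rule = [lambda x: x * 4, lambda x: x + 3]

series = extend_series(3, rule, 6)
given, solution = series[:5], series[5]
print(given, solution)  # [3, 12, 15, 60, 63] 252
```

The first five generated terms match the series as printed, and the sixth term is the stated solution, 252.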
Results

Experimental Manipulation Checks

In order to check the experimental manipulation, the number of problems correctly solved by each subject for each separate task was analyzed in a four-way analysis of variance (ANOVA), Experimental Task (verbal or math) × Group (success or failure) × Gender × Repeated Measures. Significant main effects for the group variable, F(1, 141) = 451.75, p < .001, provided support for the success of the experimental manipulation. A significant Task × Group interaction indicated that subjects performed better on the verbal than on the mathematical task.
As a further check on subjects' perceptions, the two items on the postexperimental questionnaire that required ratings of perceived success were analyzed. All subjects responded according to their group assignment on the first question (passed the test vs. failed the test). A three-way ANOVA (Task × Group × Gender) resulted in a significant main effect for success or failure group in the expected direction.
Self-Efficacy Strength, Level, and Task Interest
Significant main effects for experimental task (verbal or mathematical), F(1, 141) = 4.79, p < .05, group (success or failure), F(1, 141) = 57.5, p < .001, and gender variables, F(1, 141) = 4.84, p < .05, emerged on a four-way repeated measures ANOVA on self-efficacy strength. Significant two-way Repeated Measures × Task, F(3, 423) = 4.27, p < .006, and Repeated Measures × Group, F(3, 423) = 36.72, p < .001, interactions were also found. Table 1 presents the means and standard deviations for self-efficacy strength scores. The main effect for gender revealed that men scored higher than women on ratings of confidence in their abilities across all task and group conditions over time. Newman-Keuls post hoc analyses on the Repeated Measures × Task interaction revealed that self-efficacy strength scores were significantly higher (for all significant Newman-Keuls analyses, all ps < .05) for the verbal than for the math task for pretest and the first posttest but not for the second posttest. No significant differences were found between scores for the task-specific generalizability measure across task, nor were the task generalizability scores different from the scores at the second posttest. However, the generalizability scores were higher than pretest scores for subjects in the math task group; no significant differences were found between pretest and the task generalizability scores for subjects in the verbal task group. These findings suggest that subjects were generally more confident of their verbal than their math skills overall but particularly after a failure experience on the math task. Follow-up tests on the Repeated Measures × Group (success or failure) interaction revealed no significant differences between the success and failure groups on the pretest, but there were significant differences at every other testing.
The subjects in the success group rated the strength of their efficacy expectations higher at the first and second posttest than subjects in the failure group. For the task generalizability measures, significant differences appeared between the pretest and these scores, wherein subjects in the success group rated their confidence in their abilities on the irrelevant task higher than their initial confidence in their task-relevant abilities. The opposite trend occurred for subjects in the failure group. Task generalizability ratings were also significantly higher for the success group than the failure group. Evidently task success had some effect on confidence in subjects' abilities at other, nonrelated tasks.

The means and standard deviations for level of self-efficacy are presented in Table 1. As with the results for self-efficacy strength, a four-way ANOVA on the level scores yielded main effects for the group and gender factors, F(1, 141) = 76.66,
p < .001, and F(1, 141) = 5.97, p < .002, respectively. In addition, significant Repeated Measures × Task, F(3, 423) = 4.76, p < .003, and Repeated Measures × Group, F(3, 423) = 39.12, p < .001, interactions were found. The gender main effect revealed again that men expressed more confidence overall than women in their abilities. Newman-Keuls post hoc analyses of the Repeated Measures × Task interaction on level scores did not uncover any significant differences between groups across repeated measures. The follow-up tests on the Repeated Measures × Group interaction on level of self-efficacy revealed essentially the same significant differences that were found in the analysis of self-efficacy strength. It seems that for self-efficacy level as well as strength, there is a tendency for the higher or lower performance expectations produced by task success or failure to generalize to an alternate task.

Table 1 also displays the means and standard deviations for the measure of task interest. A main effect for task, F(1, 141) = 4.06, p < .05, and a Group × Gender interaction, F(1, 141) = 4.10, p < .05, were uncovered. Two two-way interactions involving the repeated measures factor emerged: Repeated Measures × Task, F(3, 423) = 9.48, p < .001, and Repeated Measures × Group, F(3, 423) = 11.56, p < .001. The Repeated Measures × Task × Gender interaction was also significant, F(3, 423) = 3.65, p < .01. Newman-Keuls tests on the Group × Gender interaction on task interest failed to reveal any significant differences. The follow-up tests on the significant three-way interaction (Repeated Measures × Task × Gender) showed no changes in the task interest scores from pretest to the second posttest, which indicates that the type of task performed had no impact on task interest.
The task generalizability scores for all subjects who attempted the verbal anagram task were not significantly different from their pretest scores; however, men's task generalizability scores (i.e., interest in the math task) were significantly higher than women's scores. For subjects in the math task condition, women's interest in the irrelevant (i.e., verbal) task was significantly greater than men's interest in that task and also significantly greater than women's own pretest interest in the math task. These results are congruent with past research that has demonstrated women's lower overall interest in math tasks (Campbell & Hackett, 1986). Finally, post hoc analyses of the Repeated Measures x Group interaction on task interest revealed significant pretest differences between subjects in the success and failure conditions. Pretest interest scores were equivalent to the task generalizability scores for subjects in the failure group, partly because pretest scores for these subjects were initially higher than pretest scores for subjects in the success group. For subjects in the failure group, task interest ratings were significantly lower at the first and second posttest, and then they were significantly higher on the task generalizability measure than on the second posttest. Pretest scores for subjects in the success group were significantly lower than interest ratings at any other assessment, including the task generalizability assessment. Thus, success and failure influenced task interest in the expected directions, and there is some evidence that increases in task interest as a result of task success generalize to an alternate task.
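The degrees of freedom reported for these analyses, F(1, 141) for between-subjects effects and F(3, 423) for effects involving the repeated measures factor, follow from the design. The sketch below recomputes them under two assumptions not stated outright in the text: that all 149 subjects entered each analysis, and that the repeated measures factor had four levels (pretest, first posttest, second posttest, and the task generalizability rating). The function name is illustrative.

```python
def mixed_anova_df(n_subjects, between_levels, within_levels):
    """Degrees of freedom for a mixed design with one within-subjects factor.

    between_levels: number of levels of each between-subjects factor.
    within_levels: number of repeated measurements per subject.
    """
    cells = 1
    for levels in between_levels:
        cells *= levels
    df_error_between = n_subjects - cells            # subjects within cells
    df_within = within_levels - 1                    # repeated measures effect
    df_error_within = df_within * df_error_between   # within x subjects in cells
    return df_error_between, df_within, df_error_within


# Task (2) x Group (2) x Gender (2), 149 subjects, 4 assessments (assumed).
df_between_error, df_rm, df_rm_error = mixed_anova_df(149, [2, 2, 2], 4)
print(df_between_error, df_rm, df_rm_error)  # 141 3 423
```

Because each between-subjects factor has two levels, the interaction of the repeated measures factor with any of them also has (4 - 1)(2 - 1) = 3 numerator degrees of freedom, which agrees with the F(3, 423) values reported for the interactions.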
TASK PERFORMANCE AND SELF-EFFICACY
[Table 1. Means and standard deviations for self-efficacy strength, self-efficacy level, and task interest.]
< .005; men scored higher than women. Significant interactions were also found for Repeated Measures × Group, F(3, 423) = 10.46, p < .002, and Repeated Measures × Task × Group, F(3, 423) = 11.27, p