Do I Know Whether I Like It?: Extending Work on Within-Alternative Conflict to Measures of Confidence in Consumer Judgments
Mary Frances Luce, The Wharton School, University of Pennsylvania
Jianmin Jia, Faculty of Business Administration, Chinese University of Hong Kong
Gregory W. Fischer, Fuqua School of Business, Duke University
Abstract

Consumer researchers often study the decision conflict resulting from consideration of the pros and cons of one alternative versus others. We focus on the richness of the conflict construct itself by replicating Fischer et al.'s findings that within-alternative conflict, associated with integrating the pros and cons of a particular alternative, influences responses to a single-item judgment task (Fischer, Jia and Luce 2000; Fischer, Luce and Jia 2000). While Fischer et al. focus on test-retest reliability in judgment as a dependent variable, we extend their work by developing a measure of explicit preference uncertainty using subjective confidence intervals placed around evaluative judgments, and by investigating the measure in consumer purchase contexts. We validate this measure by showing that the presence of our confidence-interval task influences neither test-retest reliability nor the expressed ratings themselves. Finally, we find a small but statistically reliable effect of conflict on expressed ratings, again extending prior work on within-alternative conflict by demonstrating that subjects apparently extend their feelings of conflict-induced uncertainty to evaluative ratings by adjusting the ratings of high-conflict alternatives downward.
Consumer and behavioral decision research often investigates conflict. Conflict directly influences ratings of decision difficulty (e.g., Chatterjee and Heath 1996; Bettman, Johnson, Luce and Payne 1993) and emotionality (e.g., Luce, Bettman and Payne 2000). Further, conflict has been demonstrated to influence such behaviors as the likelihood of delay (e.g., Luce 1998; Tversky and Shafir 1992), consideration set size (Hauser and Wernerfelt 1990), the extent of information search (Moorthy, Ratchford and Talukdar 1997), the degree of sensitivity to peripheral cues or arguments (Heath, McCarthy and Mothersbaugh 1994; Miniard, Sirdeshmukh and Innis 1992), and the degree to which decision processing proceeds by alternative or by attribute (Bettman et al. 1993).

In these literatures, conflict is generally operationalized as involving the relative advantages and disadvantages of one alternative as compared to another; that is, a between-alternative model of conflict is at least implicitly assumed. However, conflict can also be conceptualized on a within-alternative basis, as the degree to which a specific alternative is characterized by attributes that are evaluatively distinct from one another. This between- versus within-alternative distinction is a fundamental issue addressed in the early conflict work by Lewin (1935) and Miller (1959). In this work, approach-approach (avoidance-avoidance) conflicts are generally defined as involving distinct and mutually desirable (undesirable) alternatives. Approach-avoidance conflicts are defined as resulting from the same alternative containing both desirable and undesirable characteristics. Approach-avoidance conflicts are often considered the most psychologically difficult and meaningful, as they potentially result in vacillation: the draw of the desirable aspects of an alternative creates movement towards that alternative, which in turn increases the negative psychological force associated with the alternative's undesirable aspects.

Within-alternative conflict is likely to be psychologically prominent during product evaluation or judgment. These tasks comprise a good deal of consumer behavior. For example, a product evaluation is necessarily the basis of go / no-go decisions about particular, lone
alternatives. Decisions such as those regarding whether to buy an extra pair of shoes or another item for one's art collection likely involve a focus on the perceived merits of the specific item under consideration. In such cases, discrepancies between the item's attributes (e.g., a sculpture that is striking and by a desired artist, yet expensive and too big for one's house) are likely to have a meaningful impact on the decision maker. Single-alternative evaluations are also the building blocks of multi-alternative choice when the consumer implements sequential, alternative-based processing rules. For instance, a buyer choosing among cars or houses by sequentially investigating options might experience effects of within-alternative conflict as she attempts to draw a summary impression or evaluation of a particular item that is the current focus of attention (e.g., during a test drive).

Consumer researchers have largely neglected the issue of within-alternative conflict (for one partial exception, see Heath et al.'s (1994) discussion of between-alternative conflict leading to greater loss aversion in evaluating particular alternatives). However, Fischer, Luce and Jia (2000; Fischer, Jia and Luce 2000) provide a theoretical and measurement framework for addressing the construct. In the current paper, we seek to replicate and extend Fischer et al.'s basic finding regarding within-alternative conflict. Our overall goal is to introduce the concept and operationalization of within-alternative conflict to consumer researchers.

Fischer et al. argue that indeterminacy or lability in a decision maker's attribute importance weights contributes to uncertainty in preferences. More specifically, they argue that random fluctuation in weights will have a greater impact on overall evaluation as within-alternative conflict increases. Thus, in essence, they operationalize the important idea that preferences are constructed (see, for instance, Payne, Bettman and Johnson 1993 or Slovic 1990 on preference lability) by developing and supporting the general hypothesis that fluctuations in weights have more influence on evaluation as within-alternative conflict increases. For instance, in the extreme case of zero conflict (e.g., all of an alternative's attributes have the highest
possible rating), attribute importance weights are likely irrelevant to one's summary evaluation (e.g., the overall evaluation is likely to be the highest possible summary rating). As conflict increases, however (e.g., some attributes have high ratings and some have moderate ratings), they argue that fluctuations in importance weights should lead to fluctuations in mentally simulated overall evaluations, causing feelings of conflict. In summary, they argue that because attribute importance weights are labile, within-alternative conflict leads to uncertainty in the evaluation of alternatives.
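To make this weight-lability argument concrete, consider the following minimal simulation sketch. This is our own illustration, not Fischer et al.'s RandMAU model; the Dirichlet distribution over importance weights and its concentration value are assumptions chosen only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def eval_sd_under_weight_noise(attrs, n_draws=10_000, concentration=5.0):
    """SD of weighted-additive evaluations when importance weights fluctuate
    randomly (here: draws from a symmetric Dirichlet; an assumption)."""
    weights = rng.dirichlet(concentration * np.ones(len(attrs)), size=n_draws)
    return (weights @ np.asarray(attrs)).std()

zero_conflict = [0.50, 0.50, 0.50]   # identical attribute values
high_conflict = [1.00, 0.00, 0.00]   # maximally discrepant attribute values

print(eval_sd_under_weight_noise(zero_conflict))  # ~0: weights cannot matter
print(eval_sd_under_weight_noise(high_conflict))  # clearly > 0: weight lability shows through
```

Zero-conflict profiles pin the weighted sum regardless of which weights are drawn, while discrepant profiles let every fluctuation in weights propagate into the simulated evaluation.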
Fischer et al. therefore illustrate that the concept of conflict is more complicated than is typically assumed, as conflict involves a within-alternative component that is operationally distinct from between-alternative conflict.

Fischer et al. support their hypotheses through computer simulation and through a behavioral experiment using relatively subtle, implicit measures of preference uncertainty, namely variations in test-retest response errors (on average, less than 10% of the rating scale) and time differences (within overall judgment times of approximately 10 seconds). Thus, while these measures have the advantage of providing an unobtrusive demonstration of within-alternative conflict effects, they also have the disadvantage of questionable direct utility to consumer research. First, such measures are likely difficult to implement in many consumer research domains; for instance, measuring test-retest reliability requires that several trials be implemented with filler items or time delays, and it also requires substantial precision in actual evaluative estimates. Second, it seems that within-alternative conflict is more likely to influence consumer behavior if subjects can and do overtly express some feeling (e.g., of uncertainty or indeterminacy) associated with conflict. In this research, we seek to extend Fischer et al.'s work by moving towards more direct manifestations of preference uncertainty, in an attempt to increase the degree to which this work applies to consumer research.

We develop a relatively direct measure of preference uncertainty in judgment that involves asking subjects to place bounds, similar to confidence intervals, around evaluative ratings. This new measure has the advantage of requiring only one response per stimulus. We also extend Fischer et al.'s work by measuring whether within-alternative conflict influences expressed evaluations themselves. In particular, if the subjective uncertainty generated by stimulus characteristics has a discernible effect on a decision maker, then that decision maker may moderate her preferential judgment for the relevant alternative (e.g., see Kahn and Meyer 1989).

In summary, we seek to replicate and extend Fischer et al.'s work within a consumer research domain. Like Fischer et al., we focus on single-option judgment tasks in order to concentrate on the effects of within-alternative conflict on preference uncertainty where the additional, potentially complicating, effects of between-alternative conflict are absent. We seek to replicate Fischer et al.'s basic finding that increases in within-alternative conflict lead to increased test-retest discrepancies (denoted response error) in judgment. We also seek to extend their work to two dependent variables they did not consider, namely overt expressions of confidence in judgment and evaluative ratings themselves. In the context of our extension, we experimentally test for a disruption in the basic judgment process caused by asking subjects to overtly indicate confidence. We also experimentally manipulate task context. We believe this check on robustness is useful in that Fischer et al.'s work involved a single task outside the typical domain of consumer research (i.e., choosing elective college courses).

Experiment

Methods

Subjects and design. One hundred and thirty undergraduate students individually completed this experiment in return for course credit; three subjects were removed from the analysis for failing to follow task instructions. Each subject was randomly assigned to one of
three task contexts (automobiles, laptops, or vacations) and one of two confidence-measurement conditions (i.e., a subjective measure of confidence was either present or absent).

Procedure. Instructions and stimuli were delivered in a computerized task environment (written in Visual Basic 6.0). After receiving instructions, subjects completed four practice judgments and then judged 22 experimental stimuli twice. Unique random stimulus orders were generated for each subject's pre- and post-break judgment sets. A filler task asking subjects to respond to two hypothetical advertisements (for unrelated product classes) took approximately fifteen minutes and was inserted between the two judgment sessions. Subjects were instructed to rate items according to their expected liking or enjoyment of each. Rating scales were anchored by attribute endpoints, and each rating task began with a pointer positioned in the middle of the scale (see Figure 1a).

Half of the subjects were randomly assigned to complete an interval-assessment task, designed to be an explicit measure of the preference uncertainty generated by within-alternative conflict. Judgments completed by the remaining half of the subjects comprise an (interval-task-absent) control group against which we can judge the impact of the confidence-interval measure. For subjects in the interval-task-present group, each rating was followed by a request for upper and lower bounds (see Figure 1b). Instructions regarding this confidence-interval task stated:

“On the next several screens, you will continue to rate {items}. You can think of these ratings as indicating your expected liking or enjoyment of each {item} you rate, relative to your expected liking or enjoyment of the best and worst possible {items} (on the scale ends). You will be asked to complete a second task after each rating of {an item}. This second task involves indicating upper and lower bounds around your ratings. Of course, we can never be sure how much we will enjoy a product or service until we actually experience it. But we can place some bounds (or limits) on our expected enjoyment. That is, we can indicate a range within which we think our enjoyment of that product or service is very likely to fall. We will ask you to put bounds around the rating you give each {item}, such that you are 90% confident that your actual liking of that {item} would fall within the bounds you give….”
Each confidence-interval screen contained the description of the relevant alternative and a scale expressing the rating the subject had just indicated. In addition, this screen displayed one scale above the rating scale and one below it, used to indicate upper and lower bounds around the rating. Both the upper- and lower-bound scales were initially set to the value the subject had just expressed on the preference-rating scale (which remained visible), to reduce asymmetries in scale anchoring. Thus, if a subject rated her preference for an item as .40 on the (0 to 1) preference-rating scale, all three scales would be initialized at .40.

Task Contexts. We desired to extend Fischer et al.'s work in part by demonstrating that within-alternative conflict effects are robust across judgment contexts. Thus, we created three between-subjects task conditions. For each task, alternatives were described in terms of three attributes (corresponding to X1, X2, and X3 in Table 1). The automobile task asked subjects to imagine that they were purchasing cars described in terms of Purchase Price (attribute X1; with levels of $24,000, $21,000, $18,000, $15,000, and $12,000), Safety (X2; with attribute levels described as government-assigned one-star to five-star ratings), and Styling (X3; Very Poor, Poor, Average, Good, or Very Good). The laptop task asked subjects to imagine that their employer provided them a laptop computer and that they were to rate items according to how much they would like to be given each one. Laptops were described in terms of Processor Speed (233MHz, 266MHz, 300MHz, 333MHz, and 366MHz), Display or 'Monitor' Quality (Very Poor, Poor, Average, Good, or Very Good), and Weight (6lbs, 5lbs, 4lbs, 3lbs, and 2lbs). The vacation task asked subjects to imagine that they were choosing a resort vacation; the task involved Cost Per Day ($500, $400, $300, $200, or $100), Nightlife (Very Boring, Boring, Average, Exciting, or Very Exciting), and Sightseeing (Very Poor, Poor, Average, Good, or Very Good).

Note that the three tasks mix qualitative and more quantitative attribute descriptions. In addition, the three task contexts differ in terms of how the attribute price was operationalized
(e.g., overall price, price per day, or price irrelevant due to third-party payment). We believe that this task variation is important because price itself might be seen as a signal of higher quality, and therefore (desirable) low prices might actually be seen as conflicting with (desirable) good attribute ratings. While our three task contexts are not a direct test of the differential effects of price versus other attributes, they do present some opportunity to determine whether our conceptualization of within-alternative conflict in terms of the evaluative implications of attributes is robust across qualitatively different attribute types.

Stimulus characteristics and design. In order for the construct of within-alternative conflict to be useful to consumer researchers, it is necessary to develop a metric for it. Following Fischer et al., we do so by first normalizing attribute values between their best and worst levels, thereby allowing qualitative aspects of an alternative ("very good styling") to become quantitative. Fischer et al. operationalize conflict in terms of the standard deviation of attribute values. However, we use the measure of conflict defined below, which is correlated 0.97 with the standard deviation and gives equivalent substantive results. In the three-attribute case, using $x_1$, $x_2$, and $x_3$ to represent attribute values ranging from 0 (worst) to 1 (best), we calculate the
following measure of attribute conflict:

$$\text{Conflict} = \frac{(x_1 - x_2)^2 + (x_1 - x_3)^2 + (x_2 - x_3)^2}{3} \qquad (1)$$
The above measure can be derived from an additive utility model (see Jia, Luce and Fischer 2001). This measure could also be considered an extension of Chatterjee and Heath's (1996) use of tradeoff size to reflect between-alternative conflict in the two-attribute case. In general, our measure of attribute conflict is determined by differences between pairs of attribute values and is extremely simple to implement once attributes are scaled between best and worst values.
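Equation (1) transcribes directly into code. The minimal sketch below (written as a mean over all attribute pairs, which reduces to equation (1) in the three-attribute case; that generalization is our reading, not part of the original definition) includes spot checks against the Table 1 profiles:

```python
from itertools import combinations

def conflict(attrs):
    """Equation (1): mean squared pairwise difference among an alternative's
    normalized attribute values (0 = worst level, 1 = best level)."""
    pairs = list(combinations(attrs, 2))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)

print(round(conflict([0.50, 0.50, 0.50]), 3))  # 0.000 -> Table 1, trial 1
print(round(conflict([0.75, 1.00, 1.00]), 3))  # 0.042 -> Table 1, trial 2
print(round(conflict([1.00, 0.00, 0.00]), 3))  # 0.667 -> Table 1, trial 17
```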
As discussed in the Fischer et al. work, within-alternative conflict is related to a second, attribute-level stimulus characteristic, namely attribute extremity. Attribute extremity is likely to influence preference uncertainty, in that more moderate attribute values might be harder to associate with preference judgments than are more extreme values. This construct is important to the current endeavor because higher levels of conflict are only possible for alternatives defined by extreme values. Levels of an attribute close to the upper or lower bound should lead to less uncertainty regarding the relevant single-attribute value, and so extremity should be associated with reduced uncertainty. We calculated levels of average extremity for each alternative by again scaling attribute values between 0 and 1, and then calculating deviations from the 0.50 midpoint with the following measure:

$$\text{Extremity} = |x_1 - 0.5| + |x_2 - 0.5| + |x_3 - 0.5| \qquad (2)$$
The above measure ranges from 0, when all attributes are at the scale midpoint, to 1.5, when all attributes are at scale endpoints. Note that in order to operationalize extremity, one must somehow synthesize the implications of extremity on individual dimensions across an alternative. Our measure of extremity sums these attribute-level characteristics for simplicity, but other combination rules (e.g., weighting attribute extremity by importance and/or differentially weighting lower levels to account for loss aversion) also seem reasonable. The correlation between the conflict and extremity measures for our 20-item stimulus set is 0.71.¹
¹ Jia, Luce and Fischer (2001) derive the following alternative extremity measure: $\text{Extremity} = [(x_1 \ln x_1)^2 + (x_2 \ln x_2)^2 + (x_3 \ln x_3)^2]/3$. This alternative measure is theoretically derived from the notion of variance in utility and is correlated –0.92 with the measure we provide above. The two measures provide substantively identical results, with the exception that the extremity measure in equation (2) is, of course, negatively related to preference uncertainty while the measure developed in Jia, Luce and Fischer (2001) is positively related to uncertainty.
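Both extremity variants also transcribe directly. In the sketch below, the guard against ln 0 is ours (using the limit x ln x → 0 as x → 0); everything else follows equation (2) and the footnote:

```python
import math

def extremity(attrs):
    """Equation (2): summed absolute deviation from the 0.50 scale midpoint;
    ranges from 0 (all midpoints) to 1.5 (all endpoints) for three attributes."""
    return sum(abs(x - 0.5) for x in attrs)

def extremity_jlf(attrs):
    """Footnote 1 alternative from Jia, Luce and Fischer (2001). Because
    x*ln(x) -> 0 at both x = 0 and x = 1, this index peaks for mid-range
    values, which is why it runs opposite in sign to equation (2)."""
    return sum((x * math.log(x)) ** 2 if x > 0 else 0.0 for x in attrs) / len(attrs)

print(extremity([0.75, 1.00, 1.00]))  # 1.25 -> Table 1, trial 2
```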
Finally, one goal of the current work is to establish whether Fischer et al.'s within-alternative conflict effects extend to evaluations themselves. To establish this effect, we felt it necessary to first control for attribute values; that is, we sought to test for the presence of a unique influence of conflict over and above the obvious influence of attribute values on judgment. Thus, we constructed a particularly strong value covariate by calculating predicted values from nonlinear regressions run individually for each subject. These nonlinear regressions accounted for potential attribute curvature (e.g., loss aversion for individual attribute values) by adding a power-function parameter to each attribute value. That is, instead of basing the value predictor on the attribute value $x_i$, the value was scaled into utility through the use of a power parameter $\alpha_i$, as in $x_i^{\alpha_i}$. In addition, we allowed for attribute substitutability or complementarity by including interactions between attributes in each model.
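As an illustration of how such a per-subject value covariate could be fit, consider the sketch below. It is written under our assumptions, not as a transcription of the paper's actual regression code: the interactions are taken over the power-transformed attributes, and the data are synthetic stand-ins for one subject's 22 ratings.

```python
import numpy as np
from scipy.optimize import curve_fit

def value_model(X, b0, b1, b2, b3, a1, a2, a3, g12, g13, g23):
    """Nonlinear value predictor: power-transformed attributes (curvature)
    plus pairwise interactions (substitutability / complementarity)."""
    x1, x2, x3 = X
    u1, u2, u3 = x1 ** a1, x2 ** a2, x3 ** a3
    return (b0 + b1 * u1 + b2 * u2 + b3 * u3
            + g12 * u1 * u2 + g13 * u1 * u3 + g23 * u2 * u3)

rng = np.random.default_rng(1)
X = rng.uniform(0.05, 1.0, size=(3, 22))          # stand-in for the 22 profiles
true = (0.1, 0.4, 0.3, 0.2, 0.8, 1.2, 1.0, 0.05, 0.0, -0.05)
y = value_model(X, *true) + rng.normal(0, 0.02, size=22)  # one subject's ratings

p0 = [0, 0.3, 0.3, 0.3, 1, 1, 1, 0, 0, 0]         # start the powers at 1 (linear utility)
params, _ = curve_fit(value_model, X, y, p0=p0, maxfev=20_000)
value_covariate = value_model(X, *params)          # predicted values form the covariate
```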
Below, we test the influence of conflict on judgments after using hierarchical sums of squares to control for the effects of the value covariate (i.e., we use Type I sums of squares in the Proc GLM routine in SAS, and we enter the value covariate as the first within-subjects term in the model). Note that the effects for all of our dependent variables remain significant and substantively identical (e.g., in relative magnitude) after controlling for value in this manner. Because the value effect is less clearly an alternative hypothesis for our other dependent variables, we report the more standard tests (i.e., Type III sums of squares in SAS without a value covariate) for these variables. Note that in supplementary analyses, value has a significant but small effect on variables other than ratings.

We chose the 22 stimuli in Table 1 to cover the entire range of possible conflict levels. We included scale-endpoint alternatives (profiles 0.0, 0.0, 0.0 and 1.0, 1.0, 1.0) as zero-conflict benchmarks. However, the 'correct' responses to these scale-endpoint alternatives (i.e., ratings of 0 and 1.0, respectively) were almost undoubtedly obvious to subjects, so we deleted these trials from the analyses reported below. Results are substantively identical if these trials are retained.

Dependent Measures. For our direct replication of the Fischer et al. results, we calculated their primary measure of implicit preference uncertainty, namely the absolute value of the difference between an item's time-one and time-two evaluations (denoted 'response error').
We also calculated two measures that are more consistent with the types of variables typically considered by consumer researchers. A subjective confidence measure for each trial was calculated as the value of a subject's expressed upper bound minus his or her expressed lower bound (each at an accuracy of two decimal places and bounded by 0 and 1.0); the two confidence measures (time 1 and time 2) for each subject / trial combination were then averaged to form a measure denoted confidence interval width. This procedure has the disadvantage of constraining one side of the interval at extreme ratings (e.g., given a rating of 0.99, the upper confidence-interval boundary must be within .01 of the subject's rating, although the lower boundary would be unconstrained). Supplementary tests on an alternative measure, calculated as the absolute difference between the subject's rating and the unconstrained side of the confidence interval, indicate that our results are not an artifact of scale-boundary constraints. Finally, each subject's rating of each alternative was measured at an accuracy of two decimal places, assigning a 0 to the lower scale boundary and a 1 to the upper scale boundary (denoted 'evaluative rating').
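A compact restatement of the three trial-level measures follows. The averaging over the two judgment sessions follows the text above; the rule used to pick the 'unconstrained' side of the interval is our reading of the supplementary measure, not a quoted formula:

```python
def trial_measures(r1, r2, bounds1, bounds2):
    """Measures for one subject x alternative, all on the 0-1 rating scale.
    r1, r2: time-1 / time-2 evaluative ratings.
    bounds1, bounds2: (lower, upper) 90% bounds elicited after each rating."""
    response_error = abs(r1 - r2)           # implicit uncertainty (Fischer et al.)
    interval_width = sum(hi - lo for lo, hi in (bounds1, bounds2)) / 2

    def unconstrained_side(r, lo, hi):
        # the side with more room to its scale boundary is the unconstrained one
        return (r - lo) if r > 0.5 else (hi - r)

    alt_width = (unconstrained_side(r1, *bounds1)
                 + unconstrained_side(r2, *bounds2)) / 2
    return response_error, interval_width, alt_width

print(trial_measures(0.62, 0.70, (0.50, 0.75), (0.55, 0.80)))
# ~ (0.08, 0.25, 0.135): response error, interval width, boundary-robust width
```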
Results

Analyses of confidence intervals involve only one between-subjects variable (judgment context), while analyses of response error and evaluative ratings involve both judgment context and interval task (present or absent).

Response error. The R² for the model predicting response error is 0.19. The main effect of conflict is significant (F(1, 2435) = 11.70, p < .0001; β = .06; ω² = .03). The main effect of extremity is not significant (F(1, 2435) = 0.25, n.s.). Conflict and extremity interact (F(1, 2435) = 6.38, p < .01; ω² = .01); however, the effect of conflict is positive in sign (in both cases, β = .06) and highly significant (p < .001) within both low (below-median) and high extremity.

In addition to replicating Fischer et al.'s conflict effect, our response-error results are robust across both judgment context and the presence of the confidence-interval task. Specifically, response error shows no significant effects involving context, interval, or their interaction (Fs < 1). Similarly, there are no significant interactions between conflict and these between-subjects variables (Fs < 1). Thus, we demonstrate the robustness of Fischer et al.'s within-alternative conflict result across consumer judgment tasks. We also help to validate our new (confidence interval width) preference uncertainty measure by showing that the presence of this measure does not influence the degree of implicit preference uncertainty expressed.

Confidence Interval Width. The R² for the model predicting confidence interval width is 0.54. Conflict has a significant effect on confidence interval width (F(1, 1207) = 88.29, p < .0001; ω² = .04), increasing expressions of subjective uncertainty, as expected (β = 0.16). The extremity index also affects expressed confidence (F(1, 1207) = 50.64, p < .0001; ω² = .02), with greater extremity decreasing confidence interval width (increasing expressed confidence), as expected (β = –0.08). Conflict and extremity marginally interact (F(1, 1207) = 3.07, p = .08); the conflict effect is stronger within high-extremity trials (β = .13) than within low-extremity trials (β = .07), although it is statistically significant within both extremity ranges. The main effect of judgment context is not significant (F(2, 61) < 1), nor does context interact with conflict or extremity (Fs < 1.25, ps > .25). Thus, effects are robust across task contexts.

Evaluative Ratings. Finally, we desired to determine whether the effects of within-alternative conflict would extend to subjects' actual evaluations of items. Our extremely powerful value covariate correlates .96 with average ratings (p < .0001) across the data set; thus, the R² for the overall model is 0.92. The value covariate has a highly significant effect, as would be expected (F(1, 2431) = 2490.51, p < .0001; β = 1.05). However, even after controlling for value, the conflict effect is significant (F(1, 2431) = 18.72, p < .0001; β = –0.14; ω² = .001), while extremity is not (F < 1). There is a conflict-by-extremity interaction (F(1, 2431) = 37.61, p
< .0001; ω² = .01); however, the negative, significant (but small) effect of conflict is consistent across the range of extremity. The effects of context, interval, and their interaction are nonsignificant (Fs < 1). The lack of a main effect for the interval task, in particular, indicates that the presence of the confidence-width measurement did not disrupt subjects' processes of rating the alternatives. Context does, however, interact with conflict (F(1, 2431) = 4.23, p < .01). Again, though, robustness is indicated because the effects of conflict are consistently significant and negative across the three task contexts (βs between –.07 and –.13). Thus, across task and measurement contexts, our subjects seem to adjust their ratings downward to account for increasing conflict. While this effect is extremely small in magnitude, note that it is obtained even after implementation of a strong statistical control for an alternative's estimated value.

Discussion

Summary

In this paper, we replicate Fischer, Luce and Jia's (2000) finding that within-alternative conflict leads to implicit manifestations of preference uncertainty, operationalized through test-retest reliability in evaluative ratings. In doing so, we demonstrate that within-alternative conflict effects appear robust across task contexts, extending, for instance, to situations where the attribute price is relevant and consumers might therefore be expected to associate (undesirable) high prices with (desirable) high quality. In addition, we extend Fischer et al.'s results to a new task environment, one where subjects are requested to provide a more explicit manifestation of preference uncertainty by indicating bounds around ratings. We believe consumer researchers are likely to be concerned with such explicit manifestations rather than more implicit ones, for two reasons. First, consumer researchers may want to study consumers' direct feelings about confidence in their judgments. Second, it is more reasonable to conduct pretests of within-alternative conflict effects if subjects can be asked to provide an explicit expression of certainty.
While we used a computer-mediated task in order to provide the many, precise ratings necessary for test-retest reliability, we believe that the lack of (confidence-interval present versus absent) task differences indicates that the confidence-interval task might be fruitfully used to assess particular alternatives in isolation (e.g., through paper-and-pencil ratings of a few alternatives). We also find that within-alternative conflict effects extend to actual evaluative ratings, in that higher-conflict alternatives are associated with lower evaluations, even after we implement a reasonably powerful control for estimated alternative values. While the effect on ratings is certainly small in magnitude, it does indicate that conflict-driven feelings of preference uncertainty may be extended (by consumers) to overall assessments of the goodness of alternatives. Further, our evaluative rating results are not influenced by whether subjects are asked to complete the subjective confidence task, indicating no disruption of evaluation.

Future Work

We believe that the construct of within-alternative conflict has important theoretical and methodological implications for consumer researchers. We replicate Fischer et al.'s main conflict results; in the process, we extend these results to a task and measurement (of preference uncertainty) domain that is potentially more relevant to consumer research. It is our hope that this work on within-alternative conflict can provide a building block for additional work on consumers' expressions of preference and their feelings about such expressions.

Methodological Implications. Consumer researchers often study conflict, operationalizing it in a between-alternative manner. However, conflict might have distinct between- versus within-alternative sources. For instance, consider the set of alternatives defined in Table 2. These alternatives represent stylized stimuli of the type commonly used in consumer research. A researcher seeking to operationalize conflict within this sort of context might inadvertently mix operationalizations of between- versus within-alternative conflict.
For instance, consider the choice between Alternatives 1 and 2 in the table; then, consider the choice between Alternatives 3 and 4. In terms of between-alternative conflict, both choices involve a tradeoff of a .25 advantage on attribute 1 against a .25 disadvantage on attributes 2 and 3; thus, the two choice sets might be characterized as having identical levels of between-alternative conflict. However, the first choice set involves alternatives with higher levels of within-alternative conflict; that is, it involves more discrepancy among attribute values within an alternative (the worked check following this subsection makes the contrast concrete). Consumer reactions to this choice might therefore implicate both sources of conflict. One basic concern, then, is that studies of reactions to between-alternative conflict will be complicated or confounded by unmeasured variations in within-alternative conflict.

Of course, the relative prominence of between- versus within-alternative conflict is likely to depend on predictable background conditions. For instance, within-alternative conflict should have more impact on decision processing when decision makers use more alternative-based processing (considering each alternative in isolation during choice, for instance when trying to calculate an expected utility for each option). Certain response modes (e.g., those requiring some summary judgment of each alternative, such as a willingness to pay for each item) should encourage more alternative-based processing, again bringing within-alternative conflict to the forefront.

We hope that by providing and validating a way to easily measure within-alternative conflict (e.g., through pretest tasks measuring confidence intervals), we might allow consumer researchers to make better use of the more general conflict construct. We believe that our confidence-interval task will migrate very well to standard (e.g., paper-and-pencil pretest) consumer research environments. Thus, within-alternative conflict effects should be considered as a factor in research designs assessing preference and in considerations of market structure. The presence of within-alternative conflict effects should be relatively easy to predict, as the calculation of the proposed measure requires only knowledge of consumers' perceptions of attribute values; this kind of information is likely to be gained via standard basic and applied consumer research techniques.
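To make the Table 2 example concrete, applying the conflict measure of equation (1) to the four profiles (rescaled to the 0-1 range) separates the two choice sets cleanly:

```python
from itertools import combinations

def conflict(attrs):  # equation (1), as in the earlier sketch
    pairs = list(combinations(attrs, 2))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)

alternatives = {1: [0.75, 0.25, 0.25], 2: [1.00, 0.00, 0.00],
                3: [0.00, 0.25, 0.25], 4: [0.25, 0.00, 0.00]}
for k, attrs in alternatives.items():
    print(k, round(conflict(attrs), 3))
# 1 -> 0.167, 2 -> 0.667: the {1, 2} set is high in within-alternative conflict
# 3 -> 0.042, 4 -> 0.042: the {3, 4} set is low, despite the identical .25 tradeoff
```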
Theoretical Implications. In this paper, we replicate and extend work demonstrating that within-alternative conflict can affect judgments about alternatives in a consumer setting. The current work does not establish whether and how reactions to within-alternative conflict might differ from reactions to between-alternative conflict. For instance, it is possible that within-alternative conflicts are associated with phenomenologically different types of reactions (e.g., more akin to pre-decisional ambivalence and post-decisional disappointment; see Otnes, Lowrey and Shrum 1997 and Bell 19XX) than are between-alternative conflicts (e.g., feelings of indecision or regret; see Bell 19XX). It is also possible that within-alternative conflict is simply another path to the same phenomenological outcomes as those generated by between-alternative conflict. We believe there are opportunities for future research regarding consumers' emotional reactions to various forms of conflict.

In a recent review of calibration in consumer knowledge, Alba and Hutchinson (2001) note that little is known about the determinants of confidence in preference, and in fact insight into one's own preferences often seems to be surprisingly poor (e.g., Nisbett and Wilson 1977). Most work regarding calibration in confidence involves items for which objective benchmarks can easily be constructed (e.g., general knowledge questions). The typical finding is that individuals are generally poorly calibrated, with the primary trend being towards overconfidence in one's own knowledge or judgment. We believe that the dependent measures developed for our experiment might be useful in beginning to bridge the gap between research on confidence in knowledge and the need to understand the determinants of confidence in consumer choice. In particular, our confidence-interval measure is similar in form to many measures used in research on calibration or confidence, but it can be more directly related to evaluative judgments of consumer items.

For example, our proposed manifestations of preference uncertainty allow for sensitive tests of interventions designed to reduce preference uncertainty. For instance, consumer information and education techniques designed to facilitate tradeoffs in choice should also lead
to alterations in both subjective confidence and test-retest response error. Similarly, levels of consumer expertise may influence the sensitivity of uncertainty measures to levels of conflict.

More generally, future work should address the theoretical pathways among conflict, error, confidence, and ultimate preference expression (e.g., ratings, or choices in the between-alternative situation). At least two underlying causal models may operate. First, conflict may generate response error (the basic Fischer et al. result), which in turn both reduces subjective confidence and moderates evaluative ratings. Alternatively, conflict may directly influence subjective confidence, response error, and expressed preferences. In the current study, we failed to find evidence that response error mediated conflict's effects on the other variables, suggesting that the latter process is operating. Other potential mediators include the difficulty of mentally integrating attribute values, which could lead to both increased error and reduced confidence. Again, we believe that these questions pose opportunities for future work.
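One hedged sketch of the kind of mediation check this implies appears below: Baron-Kenny-style regressions on synthetic trial-level data. The column names and the data-generating process are hypothetical, chosen only to mimic a direct-effect pattern like the one we describe; this is not the analysis code used in the study.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 500
df = pd.DataFrame({"conflict": rng.uniform(0.0, 0.7, n)})
df["response_error"] = 0.05 + 0.06 * df["conflict"] + rng.normal(0, 0.03, n)
df["rating"] = 0.5 - 0.14 * df["conflict"] + rng.normal(0, 0.10, n)  # direct path only

total    = smf.ols("rating ~ conflict", data=df).fit()
with_med = smf.ols("rating ~ conflict + response_error", data=df).fit()

# Under mediation through response error, conflict's coefficient should shrink
# once the mediator enters the model; with a direct path it barely moves.
print(total.params["conflict"], with_med.params["conflict"])
```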
References

Alba, Joseph W. and J. Wesley Hutchinson (forthcoming). "Knowledge Calibration: What Consumers Know and What They Think They Know," Journal of Consumer Research.

Bettman, J. R., Johnson, E. J., Luce, M. F., and Payne, J. W. (1993). Correlation, conflict, and choice. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 931-951.

Bettman, J. R., Luce, M. F., and Payne, J. W. (1998). Constructive consumer choice processes. Journal of Consumer Research, 25 (December), 187-217.

Chatterjee, S. and Heath, T. B. (1996). Conflict and loss aversion in multiattribute choice: The effects of trade-off size and reference dependence on decision difficulty. Organizational Behavior and Human Decision Processes, 67, 144-155.

Fischer, G. W., Luce, M. F., and Jia, J. (2000). Attribute conflict and preference uncertainty, Part 1: Effects on judgment time and error. Management Science, 46 (January), 88-103.

Fischer, G. W., Jia, J., and Luce, M. F. (2000). Attribute conflict and preference uncertainty, Part 2: The RandMAU model. Management Science, 46 (May), 669-684.

Fischer, G. W. (1976). Multi-dimensional utility models for risky and riskless choice. Organizational Behavior and Human Performance, 17, 127-146.

Heath, Timothy B., Michael S. McCarthy, and David L. Mothersbaugh (1994), "Spokesperson Fame and Vividness Effects in the Context of Issue-Relevant Thinking: The Moderating Role of Competitive Setting," Journal of Consumer Research, 20 (March), 520-534.
Jia, J., Luce, M. F., and Fischer, G. W. (2001). Consumer preference uncertainty: Measures of attribute conflict and extremity. Working paper.

Lewin, K. (1935). A Dynamic Theory of Personality. New York: McGraw-Hill.

Luce, M. F. (1998). Choosing to avoid: Coping with negatively emotion-laden consumer decisions. Journal of Consumer Research, 24, 409-433.

Miller, Neal E. (1959), "Liberalization of Basic S-R Concepts: Extensions to Conflict Behavior, Motivation, and Social Learning," in Psychology: A Study of a Science, Vol. 2, ed. Sigmund Koch, New York: McGraw-Hill, 196-292.

Miniard, Paul W., Deepak Sirdeshmukh, and Daniel E. Innis (1992), "Peripheral Persuasion and Brand Choice," Journal of Consumer Research, 19 (September), 226-239.

Nisbett, Richard E. and Timothy DeCamp Wilson (1977), "Telling More than We Can Know: Verbal Reports on Mental Processes," Psychological Review, 84 (May), 231-259.

Otnes, C., Lowrey, T. M., and Shrum, L. J. (1997). Toward an understanding of consumer ambivalence. Journal of Consumer Research, 24, 80-93.

Payne, J. W., Bettman, J. R., and Johnson, E. J. (1993). The Adaptive Decision Maker. Cambridge, England: Cambridge University Press.

Slovic, P. (1990).

Tversky, A. and Shafir, E. (1992). Choice under conflict: The dynamics of deferred decision. Psychological Science, 3, 358-361.
Table 1. Experimental Stimuli and Average Data

Trial  X1     X2     X3     Conflict  Extremity  Rating  Interval  Interval  Average    Average
                                                 Error   Width     Error     Estimated  Rating
                                                                             Value
1      0.50   0.50   0.50   0.000     0.00       0.10    0.22      0.13      0.45       0.50
2      0.75   1.00   1.00   0.042     1.25       0.08    0.17      0.08      0.89       0.90
3      0.25   0.00   0.00   0.042     1.25       0.10    0.13      0.08      0.10       0.12
4      0.50   0.75   1.00   0.125     0.75       0.11    0.19      0.09      0.77       0.78
5      0.00   0.25   0.50   0.125     0.75       0.12    0.19      0.11      0.23       0.22
6      0.00   0.50   0.00   0.167     1.00       0.12    0.18      0.09      0.20       0.24
7      0.75   0.25   0.25   0.167     0.75       0.13    0.20      0.10      0.36       0.29
8      0.25   0.75   0.00   0.292     1.00       0.13    0.23      0.09      0.37       0.39
9      1.00   0.50   0.25   0.292     0.75       0.15    0.22      0.09      0.47       0.45
10     0.75   0.00   0.00   0.375     1.25       0.13    0.17      0.09      0.21       0.24
11     0.25   1.00   0.25   0.375     1.00       0.11    0.24      0.09      0.49       0.41
12     0.00   0.75   0.75   0.375     1.00       0.15    0.23      0.09      0.51       0.56
13     0.50   1.00   0.00   0.500     1.00       0.13    0.25      0.10      0.48       0.49
14     0.75   0.00   1.00   0.542     1.25       0.13    0.25      0.12      0.48       0.47
15     0.00   1.00   0.25   0.542     1.25       0.13    0.26      0.12      0.41       0.37
16     0.00   0.75   1.00   0.542     1.25       0.13    0.24      0.12      0.66       0.62
17     1.00   0.00   0.00   0.667     1.50       0.11    0.18      0.12      0.26       0.26
18     0.00   1.00   0.00   0.667     1.50       0.15    0.23      0.11      0.35       0.37
19     0.00   0.00   1.00   0.667     1.50       0.13    0.25      0.13      0.31       0.30
20     1.00   0.00   1.00   0.667     1.50       0.13    0.26      0.10      0.52       0.51
21     0.00   0.00   0.00   0.000     1.50       0.05    0.09      0.06      --         0.04
22     1.00   1.00   1.00   0.000     1.50       0.04    0.11      0.08      --         0.97

Note: Trials 21 and 22 were not included in statistical analyses of conflict effects. Columns labeled X1, X2, and X3 represent attribute values (0.00 = worst value). Data are averaged over subjects, and time 1 / time 2 trials, where appropriate. Rating error is the absolute value of the difference between the two evaluative values assigned to an alternative. Interval width is the confidence interval reflecting subjective preference uncertainty. Interval error is the absolute value of the difference between the time 1 versus time 2 interval widths. Estimated value is based on subject-specific nonlinear regressions accounting for attribute curvature and attribute substitutability / complementarity. Average rating is the (0 – 1.0) evaluative value assigned to each alternative.
Table 2: Example Stimuli

                 Attribute 1   Attribute 2   Attribute 3
Alternative 1    75            25            25
Alternative 2    100           0             0
Alternative 3    0             25            25
Alternative 4    25            0             0

Note: For example purposes, assume attributes are scaled between 0 (= worst value) and 100 (= best value).
Figure 1. Experimental task screens (screen images not reproduced here). (a) Rating Task Screen. (b) Interval Assessment Task Screen.

Note: The rating task screen shown corresponds to the interval-task-present condition. The only difference associated with the interval-task-absent condition was that the lower button read: "Click here to go on to the next screen."