Research Report
Comparability Effects in Probability Judgments
Psychological Science, 23(8), 848–854. © The Author(s) 2012. DOI: 10.1177/0956797612439423
Timothy J. Pleskac, Michigan State University
Abstract
Psychological theories of subjective probability judgments assume that accumulated evidence (support) mediates the relation between the description of a to-be-judged event (hypothesis) and the judgment. To scale a probability from the support for a hypothesis, these psychological theories make a strong independence assumption. This assumption is stated in the form of the product rule, in which the support garnered for a particular hypothesis is independent of the support for the alternative hypothesis. In the study reported here, I asked participants to judge the likelihood of a bicyclist winning a simulated race. Results showed that the independence assumption was systematically violated. Observed judgments suggest that when a probability judgment is made, the comparability or similarity of the hypotheses on one dimension increases the weight that judges allocate to differences on the other dimensions. These results speak against the simple scalability-processing assumption of support theory, and they illustrate the need for a theory of judgment processes that describes how the similarity between hypotheses shapes judgment.

Keywords
judgment, cognitive processes, decision making, subjective probabilities, support, independence, comparability effect

Received 7/8/11; Revision accepted 1/24/12
Corresponding Author: Timothy J. Pleskac, Department of Psychology, Michigan State University, East Lansing, MI 48824. E-mail: [email protected]

How people should and do reason about uncertain events has long been a topic of interest (e.g., French, 1986; Griffiths & Tenenbaum, 2009; Pearl, 1988; Phillips & Edwards, 1966; Shafer, 1976; Tversky & Kahneman, 1974). These accounts often take as a given that people are comfortable describing their belief that an event will occur with a number between 0 and 1. Yet how people form these subjective probability judgments from the evidence at hand is much less well understood. Perhaps the most prominent descriptive theory of how people form subjective probabilities is support theory (Rottenstreich & Tversky, 1997; Tversky & Koehler, 1994). According to this theory, each description of an event or hypothesis (e.g., a favorite bicyclist winning a race) has a fixed value representing the strength of evidence, or support, for that event occurring, s(A). The probability that Bicyclist A will win is then estimated from the balance of support for the focal hypothesis relative to the total support for both Hypothesis A and Hypothesis B (i.e., that Bicyclist B will win) in the evaluation frame:

P(A, B) = s(A) / [s(A) + s(B)].   (1)

This model is used as a basis for estimating subjective probabilities in many theories (e.g., Brenner, 2003; Fox & Rottenstreich, 2003; Koehler, White, & Grondin, 2003; Merkle & Van Zandt, 2006; Thomas, Dougherty, Sprenger, & Harbison, 2008). The model is also assumed in analyses of subadditivity (e.g., Bearden, Wallsten, & Fox, 2007) and decisions made under uncertainty (Fox & Tversky, 1998).

Support theory is based on a set of strong assumptions. Considerable attention has been paid to the subadditivity assumption, which states that the support for a packed hypothesis (e.g., that a bicyclist from the United States will win a race) is less than the sum of the support given in total to the constituent hypotheses (e.g., Lance Armstrong, Levi Leipheimer, or any other bicyclist from the United States will win) (Dougherty & Sprenger, 2006; Fox, 1999; Koehler, Brenner, & Tversky, 1997; Rottenstreich & Tversky, 1997; Tversky & Koehler, 1994; though see Bearden et al., 2007; Sloman, Rottenstreich, Wisniewski, Hadjichristidis, & Fox, 2004). However, the independence assumptions of support theory are more pertinent to the present research. One assumption is binary complementarity, which states that the probability that Hypothesis A will occur plus the probability that the alternative Hypothesis B will occur should equal 1:

1 = P(A, B) + P(B, A).   (2)

This assumption is immediately apparent in Equation 1. Binary complementarity implies that the support generated for a hypothesis is independent of whether the hypothesis is focal or alternative. Testing this property is straightforward, and it holds under most conditions (Fox, 1999; Tversky & Koehler, 1994; though see Macchi, Osherson, & Krantz, 1999). Support theory also makes a second independence assumption, known as the product rule, which implies that there is no interaction between the competing hypotheses. It is stated as follows:

R(A, B) × R(B, D) = R(A, C) × R(C, D),   (3)
where R(A, B) is the probability ratio P(A, B)/P(B, A). Substituting Equation 1 into Equation 3 shows that both sides of the equation equal s(A)/s(D), which demonstrates that the support given to a hypothesis is the same regardless of the competing hypothesis. The product rule serves a role similar to that of strong stochastic transitivity in theories of choice: It helps ensure the existence of a positive, real-valued support function s from which a probability judgment is scaled (see Equation 1; Note 1). This is a powerful assumption. For example, it means that researchers can estimate a probability judgment directly from assessments of the strength of a hypothesis, or examine what happens to a judgment when the alternative hypothesis changes while the focal hypothesis is held constant, as is necessary for subadditivity analyses (Tversky & Koehler, 1994). Although people's behavior is sometimes consistent with the product rule (Fox, 1999), 50 years of work in psychology have shown that when people choose between options with multiple attributes, the comparability or similarity of the options in a choice set on one dimension influences the perceived differences between those options on the other dimensions (Krantz, 1967; Mellers & Biagini, 1994; Rumelhart & Greeno, 1971; Tversky & Russo, 1969). For example, when choosing between jobs on the basis of commute time and salary, people put more emphasis on a difference in salary when the jobs have similar commute times (Mellers & Biagini, 1994). Understanding this systematic violation of a simple consistency principle of choice helped pave the way for more psychologically plausible theories of choice that describe how the context of the choice set shapes people's choices (Rieskamp, Busemeyer, & Mellers, 2006). Do people commit a similar violation of consistency when estimating subjective probabilities? If the answer is yes, then violations of the product rule can occur, and even a single alternative hypothesis could shape the evaluation of the focal hypothesis.
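To make the scaling assumption concrete, the following minimal sketch (with hypothetical support values chosen only for illustration, not taken from the article) shows that judgments generated by Equation 1 from any fixed support function automatically satisfy binary complementarity (Equation 2), the product rule (Equation 3), and the chaining condition mentioned in Note 1:

```python
# A minimal sketch, not from the article: fixed, hypothetical support values.
support = {"A": 3.0, "B": 1.5, "C": 2.0, "D": 0.5}

def P(focal, alt):
    """Support-theory probability judgment (Equation 1)."""
    return support[focal] / (support[focal] + support[alt])

def R(focal, alt):
    """Probability ratio R(focal, alt) = P(focal, alt) / P(alt, focal)."""
    return P(focal, alt) / P(alt, focal)

# Binary complementarity (Equation 2): complementary judgments sum to 1.
assert abs(P("A", "B") + P("B", "A") - 1.0) < 1e-9

# Product rule (Equation 3): each ratio reduces to s(focal)/s(alt), so both
# sides equal s(A)/s(D) = 6.0 no matter which competitor appears in between.
assert abs(R("A", "B") * R("B", "D") - R("A", "C") * R("C", "D")) < 1e-9

# Chaining condition from Note 1: R(A, B) x R(B, D) = R(A, D).
assert abs(R("A", "B") * R("B", "D") - R("A", "D")) < 1e-9
```

Because each ratio reduces to a ratio of supports, any fixed support function passes these checks; the empirical question is whether human judgments do.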
The comparability effect may, however, not generalize to probabilities: Choice and judgment processes sometimes appear quantitatively and qualitatively quite different (Pettibone & Wedell, 2000). To investigate whether people systematically violate principles of consistency when estimating subjective probabilities, I asked participants in the study reported here to make a series of probability judgments predicting the outcome of a bicycle race between two competitors on the basis of each bicyclist's ability to sprint and climb. The comparability of the bicyclists on each dimension was manipulated to investigate whether participants put more emphasis on one dimension when the two bicyclists had similar abilities on the other.
Method

Participants
Twenty-seven participants were recruited from a university community for a 2-hr experiment. Participants were paid $8 per hour plus a performance reward (on average, about $6).
Design
During each trial, the computer presented two bicyclists along with their sprinting and climbing abilities. This information was presented in a 2 × 2 table, with a bicyclist in each column and a statistic in each row. The row order of the sprinting and climbing statistics was counterbalanced between subjects. The 225 experimental pairs of bicyclists were constructed from a 25 (Team A) × 9 (Team B) factorial design. The statistics for Team A bicyclists were determined by crossing five levels of sprinting ability with five levels of climbing ability. Sprinting ability was indexed by how quickly each bicyclist had completed an 800-m lap during a past race; times of 45 s, 55 s, 85 s, 115 s, and 125 s were used. Participants were told that the fastest time ever was 35 s, the average lap time was 85 s, and the slowest time was 135 s. Climbing ability was indexed by how far up a steep mountain a bicyclist had ridden in 20 min; distances of 3 km, 3.6 km, 5.5 km, 7.4 km, and 8 km were used. Participants were told that the best distance was 8.6 km, the average was 5.5 km, and the worst was 2.4 km. Team B had a similar structure but was created with a 3 (sprinting ability) × 3 (climbing ability) factorial design, with sprinting times of 45 s, 85 s, and 125 s and climbing distances of 8 km, 5.5 km, and 3 km. The computer simulated the winner of each race using Luce's (1959) choice rule, determining the probability of each bicyclist winning from the sum of that bicyclist's standardized sprinting and climbing statistics (i.e., overall fitness). This procedure ensured independence in the statistical environment.
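The article does not report the standardization constants or how the standardized fitness sums were made positive for Luce's ratio rule, so the following sketch fills those gaps with labeled assumptions (the anchors, spreads, and the exponential transform are all illustrative, not the article's implementation):

```python
import math
import random

# Illustrative standardization anchors (assumed; the article reports only the
# stated best, average, and worst values, not the scaling constants).
SPRINT_MEAN, SPRINT_SPREAD = 85.0, 40.0  # lap time in s; lower is better
CLIMB_MEAN, CLIMB_SPREAD = 5.5, 2.5      # distance in km; higher is better

def fitness(sprint_s, climb_km):
    """Overall fitness: the sum of the standardized statistics."""
    z_sprint = (SPRINT_MEAN - sprint_s) / SPRINT_SPREAD  # sign flipped: faster is better
    z_climb = (climb_km - CLIMB_MEAN) / CLIMB_SPREAD
    return z_sprint + z_climb

def p_win(a, b):
    """Luce's (1959) choice rule requires positive strengths; exponentiating
    the fitness sums (an assumption) keeps them positive."""
    va, vb = math.exp(fitness(*a)), math.exp(fitness(*b))
    return va / (va + vb)

def simulate_race(a, b, rng=random):
    """Simulate one race between bicyclists a and b, each (sprint_s, climb_km)."""
    return "A" if rng.random() < p_win(a, b) else "B"

# Example: a fast sprinter but weak climber vs. an average all-rounder.
print(round(p_win((45.0, 3.0), (85.0, 5.5)), 3))
```

Because each bicyclist's strength in this scheme is a fixed function of that bicyclist's own statistics, the simulated environment itself satisfies the product rule; any violation in the judgments therefore reflects the judges, not the races.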
Procedure
Participants completed three tasks. First, they completed training trials, in each of which they were shown two bicyclists and asked to predict who would win a two-person race. Climbing and sprinting statistics for each bicyclist were randomly generated from within the range of the values used in the experimental trials. For the first 10 trials, participants were shown only the sprinting abilities; on the next 10 trials, they were shown only the climbing abilities. On the final 50 trials, they were shown both the sprinting and climbing abilities. After each trial, the computer simulated the race, informed participants of the race's outcome, and reported how frequently, out of 100 races, the chosen bicyclist would have won. The second training task familiarized participants with the probability scale used in the experimental trials. The scale was a semicircular scale at the bottom of the screen that stretched from 0% (impossible) through 50% (toss-up) to 100% (certain). Participants clicked a mouse to select a probability. For the experimental task, participants made 450 judgments about the probability that the bicyclist in the right column of the table would win the race. Participants first made probability judgments about the 225 pairs of bicyclists (presented in random order). Then participants made judgments about the same 225 pairs, but with the bicyclists shown in the opposite positions. To minimize recall of previous responses, I presented these trials in the same order as the first set of prediction trials. Feedback was not given during the prediction trials. Participants were rewarded for accurate predictions on each trial using a linear transformation of the quadratic scoring rule.
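The exact linear transformation of the quadratic scoring rule is not reported, so the payoff sketch below uses assumed constants a and b purely for illustration:

```python
# Hedged sketch of the trial payoff: a linear transformation of the quadratic
# (Brier) scoring rule. The constants a and b are assumptions; the article
# does not report the exact transformation used.
def payoff(judged_p, right_won, a=0.25, b=0.25):
    """judged_p is the stated probability that the bicyclist in the right
    column wins; right_won is the simulated outcome of the race."""
    outcome = 1.0 if right_won else 0.0
    return a - b * (outcome - judged_p) ** 2  # maximized by honest reporting
```

Because the quadratic rule is a proper scoring rule, expected payoff is maximized by reporting one's true subjective probability, which makes it a standard incentive scheme for judgment tasks.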
Results
One subject was removed from the data set because of erratic behavior that resulted in many very short (< 1 s) and very long (> 10 s) trials; thus, data from 26 subjects were analyzed. The reliability of the judgments was examined via tests of binary complementarity (Equation 2). Figure 1 displays the relation between the complementary judgments for the median probability judgments assigned to each event, collapsed across participants. The mean total probability for these judgments was 1.03 (SD = 0.04), which is significantly different from 1.00 (p < .05). At the individual level, 18 out of the 26 subjects also had an average total probability that was significantly different from 1.00. These deviations were quite small: The average total probability across participants (assuming random effects) was 1.02 (range = 0.96–1.10; between-subjects SD = 0.03, SE = 0.01, p < .05).

Participants were also well tuned to the true objective probability of each event (determined from the overall fitnesses of both bicyclists). Figure 2 shows that the subjective probabilities (averaged across the complementary judgments for each pair of cyclists) tracked the objective probabilities. Using Goodman and Kruskal's gamma (G; Gonzalez & Nelson, 1996), I calculated the rank-order correlation between the median subjective probabilities and the objective probabilities as .92 (SE = .01). At the individual level, the relation between subjective and objective probabilities was slightly lower but still good: The average association was .78 (range = .57–.88; between-subjects SD = .08, SE = .01, p < .05). The plot does show some mild underestimation of probabilities below .5 and overestimation of probabilities above .5.
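For reference, Goodman and Kruskal's gamma is the number of concordant pairs minus the number of discordant pairs, divided by their sum, with tied pairs dropped (Gonzalez & Nelson, 1996). A minimal sketch with made-up data:

```python
# Minimal Goodman-Kruskal gamma: (concordant - discordant) / (concordant +
# discordant), ignoring tied pairs. Gamma is suitable here because the
# objective probabilities contain many tied values.
def gamma(xs, ys):
    concordant = discordant = 0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (concordant + discordant)

# Example with made-up judgments: perfectly monotone data give G = 1.0.
print(gamma([0.1, 0.4, 0.6, 0.9], [0.2, 0.3, 0.7, 0.8]))  # -> 1.0
```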
[Figure 1 appears here: a scatter plot with subjective probability (Cyclist A, Cyclist B) on the x-axis and subjective probability (Cyclist B, Cyclist A) on the y-axis, both running from .00 to 1.00.]
Fig. 1. Scatter plot showing complementary judgments. Each circle represents the median probability judgments for the two members of a given pair. Each pair was presented twice, with the left-right positions of the two cyclists switched. On each trial, participants indicated the probability that the cyclist on the right would win. Binary complementarity implies that the judgments should fall along the diagonal (dashed line).

[Figure 2 appears here: a scatter plot with objective probability (Cyclist A, Cyclist B) on the x-axis and subjective probability (Cyclist A, Cyclist B) on the y-axis, both running from .00 to 1.00.]
Fig. 2. Scatter plot showing the relation between the subjective probability and the objective probability that Cyclist A would beat Cyclist B. Each circle represents the median probability judgments for the two members of a given pair. The subjective probability was averaged across the two complementary judgments after transforming the complement to be in the same direction. Perfectly accurate subjective probability judgments would fall along the diagonal (dashed line).
Turning to the comparability effect, the hypothesis is that when two bicyclists in a pair are highly comparable on one dimension (e.g., climbing ability), their differences on the other dimension (e.g., sprinting ability) will have a greater impact than when the two bicyclists have low comparability on the first dimension. A subset of judgments was used as a
first test of the comparability effect. The subset consisted of the subjective probabilities for the nine cyclists from Team A who had the same sprinting abilities (45 s, 85 s, and 125 s) and climbing abilities (8 km, 5.5 km, and 3 km) as the nine cyclists from Team B. Figure 3 plots the median probabilities assigned to these bicyclist pairs (averaged across the complementary judgments for each pair) for each possible difference in sprinting ability and climbing ability. As Figure 3 shows, there was an interaction on the subjective-probability scale between the climbing and sprinting differences: The impact of the climbing difference changed depending on whether the cyclists were comparable (0-s difference) on their sprinting ability, and vice versa, F(16, 56) = 7.6, p < .05, η2 = .03. At the individual level, 22 out of 26 (85%) participants showed a similar significant interaction. The average effect size (η2) across participants was .07 (SD = .03). These interactions are evidence that the comparability effect, so pervasive in perceptual and preferential choice, generalizes to subjective probabilities (see Note 2).

To examine the impact of the comparability effect on the product rule, I calculated an empirical estimate of the probability ratios (R(A, B) = SP(A, B)/SP(B, A)) using the complementary judgments. Before doing so, all subjective probabilities of .0 and 1.0 were altered to .005 and .995, respectively; this adjustment affected 3.8% of the judgments. Next, I found all possible sets of four bicyclists (quartets) with which to test the product rule in Equation 3. Excluding judgments about bicyclists with identical climbing and sprinting abilities, there were 9,108 quartets in total. Finally, for each quartet, the following test statistic was calculated:
d = ln[R(A, B)] + ln[R(B, A′)] – ln[R(A, B′)] – ln[R(B′, A′)] (4)
(Fox, 1999), where A and A′ are cyclists from Team A, and B and B′ are cyclists from Team B. The product rule implies that d should be 0, but the comparability effect implies systematic deviations away from 0. Next, the quartets were separated into categories according to how many ratios within the quartet contained a dimension on which the two bicyclists were comparable; a quartet can contain zero, one, two, three, or four such ratios. The comparability effect predicts that a quartet with no comparable dimensions should not deviate from 0 and that as ratios containing comparable dimensions are added, the deviation from 0 should increase. An exception to this prediction is that if there are multiple ratios, each with a comparability effect, then these effects can be discordant and counteract each other (producing a predicted d score of 0). For example, looking at Equation 4, if A is a better sprinter than B but the two are comparable on climbing, and at the same time B is a worse climber than A′ but the two are comparable on sprinting, then this would produce discordant comparability effects on the product rule. This can happen for quartets with two and three such ratios, and it did happen for all quartets with four ratios. Figure 4 shows the average observed d score for the seven categories (recoded so that d was positive when it was in the predicted direction of the comparability effect).
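A minimal sketch of this test statistic, assuming a hypothetical mapping SP from an ordered pair of cyclists to the judged probability that the first-listed cyclist wins:

```python
import math

def clip(p, lo=0.005, hi=0.995):
    """Judgments of .0 and 1.0 were moved off the boundary before taking ratios."""
    return min(max(p, lo), hi)

def ln_R(SP, x, y):
    """Log probability ratio ln[R(x, y)] = ln[SP(x, y) / SP(y, x)]."""
    return math.log(clip(SP[(x, y)]) / clip(SP[(y, x)]))

def d_statistic(SP, A, B, A2, B2):
    """Equation 4: d = ln R(A, B) + ln R(B, A') - ln R(A, B') - ln R(B', A')."""
    return ln_R(SP, A, B) + ln_R(SP, B, A2) - ln_R(SP, A, B2) - ln_R(SP, B2, A2)
```

Under the product rule, each ln[R(X, Y)] reduces to ln s(X) − ln s(Y), so the four terms cancel and d = 0 for every quartet; systematic nonzero d scores therefore index violations.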
[Figure 3 appears here: subjective probability (Cyclist A, Cyclist B), from .00 to 1.00, plotted against the climbing difference (−5.0 to 5.0 km), with separate lines for sprinting differences of 80 s, 40 s, 0 s, −40 s, and −80 s.]
Fig. 3. Mean subjective probability that each cyclist in a pair would win a race as a function of the pair's differences in climbing and sprinting abilities. The subjective probability was averaged across the two complementary judgments. Error bars represent ±1 SE (estimated from the mean squared error of the analysis of variance).
[Figure 4 appears here: the observed d score (−0.2 to 1.4) for each of seven quartet categories: 0 ratios; 1 ratio; 2 ratios, discordant; 2 ratios, concordant; 3 ratios, discordant; 3 ratios, concordant; and 4 ratios.]
Fig. 4. Average observed d score as a function of quartet category. Quartets consisted of all possible combinations of four bicyclists and were separated into categories according to how many ratios of bicyclists within the quartet contained a dimension on which the bicyclists were comparable. Quartets with two and three ratios of comparable dimensions were further analyzed in terms of whether the comparable dimensions predicted deviations in the same (concordant) or different (discordant) directions. Error bars show standard errors of the mean (estimated from the mean squared error of the analysis of variance).
Consistent with these predictions, quartets containing bicyclists with no comparable dimensions did not deviate significantly from 0, and as more ratios with comparable dimensions were added, the deviation from 0 increased. This was confirmed with a single-factor repeated measures analysis of variance across the five quartet categories (using only the concordant quartets), F(4, 100) = 6.9, p < .05, ηG2 = .23. At the individual level, 23 out of the 26 participants showed a similar pattern (average η2 = .04, SD = .06). Also, as Figure 4 shows, the quartets with two or three comparable concordant dimensions had larger positive d scores than did the quartets with two or three comparable discordant dimensions, F(1, 25) = 5.8, p < .05, ηG2 = .17. Twenty-two of the 26 participants showed a similar pattern at the individual level.
Discussion
Descriptive theories of probability judgments assume that people estimate subjective probabilities using the balance of support for the focal hypothesis relative to the total support of all hypotheses in the evaluation frame (Tversky & Koehler, 1994). This support model makes an independence assumption analogous to the consistency assumptions of early choice theories (e.g., Luce, 1959): The level of strength or support assigned to each hypothesis in a given choice set is independent of the competing hypotheses in that choice set. The study reported here is the first to show that people systematically violate the independence assumption known as the product rule. The violation is a result of the comparability effect, a phenomenon common in choice, in which as two competing
hypotheses grow more similar on one dimension, the weight given to the differences on the other dimension increases.

The current study speaks strongly against a simple strength model in which the support for a given hypothesis (i.e., a bicyclist winning a race) is a constant regardless of the alternative hypotheses it is paired with. Perhaps, however, participants simply form hypotheses differently when they consider a two-event sample space (e.g., two competing bicyclists) as opposed to a larger sample space (e.g., four competing bicyclists). When asked to judge the probability that Adam will beat Bob, P(A, B), the estimator's focal hypothesis may be "Adam beats Bob" and the alternative "Bob beats Adam," rather than the focal hypothesis being "Adam wins" and the alternative "Bob wins." Another test of the product rule, then, would be to have people judge the probability of a particular hypothesis drawn from a larger sample space (e.g., one bicyclist winning a race among a larger set of bicyclists) rather than the probability of one hypothesis beating a single alternative. This would certainly be an interesting study to run, though I suspect the outcome would be much the same as that of the present study.

Cognitively, the comparability effect implies that violations of independence in probability judgment emerge not just because of memory search (Thomas et al., 2008) or a change in the distribution of support among multiple alternative hypotheses (Windschitl & Wells, 1998) but via the cognitive act of comparing one hypothesis with the other. The question, then, is how these results inform
thinking about how people accumulate evidence to make probability judgments. For an answer, it may be useful to look at how theories of choice account for the comparability effect. In general, preference formation has been modeled as a process in which options are compared intradimensionally, and the weight given to each dimension is partly influenced by the magnitude of the differences on the other dimension. Respondents may change the weight directly (Mellers & Biagini, 1994). Alternatively, the change in weight may result from a stochastic process of attention switching back and forth between dimensions, such that when two options are comparable on one dimension, there is lower overall variability in the dimensional evaluations, which in turn increases the relative impact of a difference on the other dimension (Roe, Busemeyer, & Townsend, 2001). Either way, these results raise the intriguing hypothesis that belief formation may operate under the same cognitive principles as preference formation.
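As a toy formalization of that contrast-weighting idea (in the spirit of Mellers & Biagini, 1994, but not a model fitted in this article; the similarity function, scale constants, and logistic mapping are all assumptions), the weight on one dimension can be made to grow with the similarity of the hypotheses on the other dimension:

```python
import math

def similarity(x, y, scale):
    """Similarity in [0, 1]; equals 1 when the two values are identical."""
    return math.exp(-abs(x - y) / scale)

def judged_probability(a, b, sprint_scale=40.0, climb_scale=2.5):
    """Toy judgment P(a beats b); a and b are (sprint_s, climb_km) tuples.
    Each dimension's weight rises when the pair is comparable on the other."""
    (sa, ca), (sb, cb) = a, b
    w_sprint = 1.0 + similarity(ca, cb, climb_scale)  # comparable climbers
    w_climb = 1.0 + similarity(sa, sb, sprint_scale)  # comparable sprinters
    adv = (w_sprint * (sb - sa) / sprint_scale        # faster sprint favors a
           + w_climb * (ca - cb) / climb_scale)       # longer climb favors a
    return 1.0 / (1.0 + math.exp(-adv))               # logistic map to (0, 1)

# The implied climbing weight is larger when the two are comparable sprinters:
print(1.0 + similarity(85, 85, 40.0))   # 2.0  (identical sprint times)
print(1.0 + similarity(45, 125, 40.0))  # ~1.14 (80-s sprint difference)
```

Because the dimension weights depend on both hypotheses at once, judgments generated this way violate the product rule while remaining monotone in each cyclist's statistics.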
Acknowledgments
I thank David McFarlane for help in programming the experiment, Don Zhang for help in designing the study and collecting data, and Matt Zeigenfuse, Thomas Wallsten, Jerome Busemeyer, and the Laboratory for Cognitive and Decision Sciences for input on the work.

Declaration of Conflicting Interests
The author declared that he had no conflicts of interest with respect to his authorship or the publication of this article.
Funding
This work was supported by a grant from the National Science Foundation (0955410).
Notes
1. If a third condition, R(A, B) × R(B, D) = R(A, D), is added, then binary complementarity and the product rule are necessary and sufficient for Equation 1 (Rottenstreich & Tversky, 1997).
2. All comparability conclusions remained the same when the analyses were repeated with a log-odds transformation of the subjective probabilities.
References
Bearden, J. N., Wallsten, T. S., & Fox, C. R. (2007). Contrasting stochastic and support theory accounts of subadditivity. Journal of Mathematical Psychology, 51, 229–241. doi:10.1016/j.jmp.2007.04.001
Brenner, L. A. (2003). A random support model of the calibration of subjective probabilities. Organizational Behavior and Human Decision Processes, 90, 87–110. doi:10.1016/S0749-5978(03)00004-9
Dougherty, M. R., & Sprenger, A. M. (2006). The influence of improper sets of information on judgment: How irrelevant information can bias judged probability. Journal of Experimental Psychology: General, 135, 262–281. doi:10.1037/0096-3445.135.2.262
Fox, C. R. (1999). Strength of evidence, judged probability, and choice under uncertainty. Cognitive Psychology, 38, 167–189. doi:10.1006/cogp.1998.0711
Fox, C. R., & Rottenstreich, Y. (2003). Partition priming in judgment under uncertainty. Psychological Science, 14, 195–200. doi:10.1111/1467-9280.02431
Fox, C. R., & Tversky, A. (1998). A belief-based account of decision under uncertainty. Management Science, 44, 879–895. doi:10.1287/mnsc.44.7.879
French, S. (1986). Calibration and the expert problem. Management Science, 32, 315–321. doi:10.1287/mnsc.32.3.315
Gonzalez, R., & Nelson, T. O. (1996). Measuring ordinal association in situations that contain tied scores. Psychological Bulletin, 119, 159–165. doi:10.1037/0033-2909.119.1.159
Griffiths, T. L., & Tenenbaum, J. B. (2009). Theory-based causal induction. Psychological Review, 116, 661–716. doi:10.1037/a0017201
Koehler, D. J., Brenner, L. A., & Tversky, A. (1997). The enhancement effect in probability judgment. Journal of Behavioral Decision Making, 10, 293–313.
Koehler, D. J., White, C. M., & Grondin, R. (2003). An evidential support accumulation model of subjective probability. Cognitive Psychology, 46, 152–197. doi:10.1016/S0010-0285(02)00515-7
Krantz, D. H. (1967). Rational distance functions for multidimensional scaling. Journal of Mathematical Psychology, 4, 226–245. doi:10.1016/0022-2496(67)90051-X
Luce, R. D. (1959). Individual choice behavior. New York, NY: John Wiley.
Macchi, L., Osherson, D., & Krantz, D. H. (1999). A note on superadditive probability judgment. Psychological Review, 106, 210–214. doi:10.1037/0033-295X.106.1.210
Mellers, B. A., & Biagini, K. (1994). Similarity and choice. Psychological Review, 101, 505–518. doi:10.1037/0033-295X.101.3.505
Merkle, E. C., & Van Zandt, T. (2006). An application of the Poisson race model to confidence calibration. Journal of Experimental Psychology: General, 135, 391–408. doi:10.1037/0096-3445.135.3.391
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Francisco, CA: Morgan Kaufmann.
Pettibone, J. C., & Wedell, D. H. (2000). Examining models of nondominated decoy effects across judgment and choice. Organizational Behavior and Human Decision Processes, 81, 300–328.
Phillips, L. D., & Edwards, W. (1966). Conservatism in a simple probability inference task. Journal of Experimental Psychology, 72, 346–354.
Rieskamp, J., Busemeyer, J. R., & Mellers, B. A. (2006). Extending the bounds of rationality: Evidence and theories of preferential choice. Journal of Economic Literature, 44, 631–661. doi:10.1257/jel.44.3.631
Roe, R. M., Busemeyer, J. R., & Townsend, J. T. (2001). Multialternative decision field theory: A dynamic connectionist model of decision making. Psychological Review, 108, 370–392. doi:10.1037/0033-295X.108.2.370
Rottenstreich, Y., & Tversky, A. (1997). Unpacking, repacking, and anchoring: Advances in support theory. Psychological Review, 104, 406–415. doi:10.1037/0033-295X.104.2.406
Rumelhart, D. L., & Greeno, J. G. (1971). Similarity between stimuli: An experimental test of the Luce and Restle choice models. Journal of Mathematical Psychology, 8, 370–381.
Shafer, G. (1976). A mathematical theory of evidence. Princeton, NJ: Princeton University Press.
Sloman, S., Rottenstreich, Y., Wisniewski, E., Hadjichristidis, C., & Fox, C. R. (2004). Typical versus atypical unpacking and superadditive probability judgment. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 573–582. doi:10.1037/0278-7393.30.3.573
Thomas, R. P., Dougherty, M. R., Sprenger, A. M., & Harbison, J. I. (2008). Diagnostic hypothesis generation and human judgment. Psychological Review, 115, 155–185. doi:10.1037/0033-295X.115.1.155
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124–1131. doi:10.1126/science.185.4157.1124
Tversky, A., & Koehler, D. J. (1994). Support theory: A nonextensional representation of subjective probability. Psychological Review, 101, 547–567. doi:10.1037/0033-295X.101.4.547
Tversky, A., & Russo, J. E. (1969). Substitutability and similarity in binary choices. Journal of Mathematical Psychology, 6, 1–12. doi:10.1016/0022-2496(69)90027-3
Windschitl, P. D., & Wells, G. L. (1998). The alternative-outcomes effect. Journal of Personality and Social Psychology, 75, 1411–1423.