Level-n Bounded Rationality in Two-Player Two-Stage Games
by
Dale O. Stahl, Department of Economics, University of Texas at Austin
and
Ernan Haruvy, School of Management, University of Texas at Dallas
November 25, 2003
ABSTRACT We extend the Level-n theory of bounded rationality from the domain of symmetric normal-form games to the domain of simple two-player, two-stage extensive-form games. We designed and conducted experiments to test pertinent hypotheses. The extended Level-n model fits the data remarkably well and significantly better than subgame perfect equilibrium theory. Moreover, we find that the vast majority of behavior appears individualistic, which we suggest is due to the "level playing field" induced by our multi-task design. Further, we characterize the non-individualistic behavior as stemming from a combination of utilitarian and Fehr-Schmidt other-regarding preferences.
1. Introduction. Experimental studies have cast doubt on the usefulness of game theoretic predictions such as subgame perfection. 1 Ultimatum game bargaining indicates that people are willing to punish unfair or unkind behavior at a monetary cost to themselves. In trust, gift exchange, and contribution games, people exhibit a willingness to incur a cost to attain a fair outcome and reward kind behavior. Possible explanations include other-regarding preferences, hot and cold emotional decisions, path-dependent phenomena, bounded rationality, and experimental procedures. Recent works (e.g., Johnson et al., 2002; Binmore et al., 2002) posit and find evidence for an alternative explanation to other-regarding behavior in sequential games. Specifically, it appears that many subjects are unsuccessful at backward induction. Though this is a useful insight, it is important to identify what models of beliefs subjects apply instead. Since the classification of behavior into hierarchies of bounded rationality has been shown useful in numerous games and settings2, and since the underlying econometric structure as a mixture model of behavioral types easily accommodates alternative hypotheses, we extend the Level-n theory of bounded rationality from its original domain of symmetric normal-form games (Stahl-Wilson, 1994-5) to the domain of two-player, two-stage games. We designed and conducted experiments to gather data to fit this model and test pertinent hypotheses. This extended Level-n model fits the data remarkably well and significantly better than subgame perfect equilibrium theory. We embrace the scientific methodology of constructing alternative theories/models followed by empirical testing. Drawing conclusions from falsifications entails the thorny problem of untangling joint hypotheses. 
For example, the poor performance of subgame perfection is a falsification of the joint hypothesis of (i) the theory of subgame perfection and (ii) the premise that the preferences of the experiment participants were aligned with the game payoffs as specified in the theory and implemented in the experiment. The mere appearance of other-regarding behavior implies that the latter premise was very likely false, thereby leaving us unable to falsify subgame perfection itself.
1 E.g., Güth, Schmittberger and Schwarze (1982), Andreoni (1988), Thaler (1988), Berg, Dickhaut and McCabe (1995), Charness (1996), and Fehr, Kirchler, Weichbold, and Gächter (1998).
2 Stahl, 1993; Stahl and Wilson, 1994, 1995; Nagel, 1995; Duffy and Nagel, 1997; Sonsino, Erev, and Gilat, 1999; Costa-Gomes, Crawford and Broseta, 2001; Haruvy, Stahl, and Wilson, 2001; Ho, Camerer, and Weigelt, 1998; Gneezy, 2002.
In our experimental design, we endeavor to implement the premise about preferences and game payoffs, thereby strengthening our conclusions about any hypothesis conditioned on that premise. For this paper we use a multi-task design that we suggest creates the perception of a "level playing field," which relieves the participants of responsibility for others since competition on a level playing field is presumed to be fair a priori.3 Thus relieved, the participants seek to maximize individual payoffs. There have been other attempts in the literature to reduce other-regarding behavior (e.g., Bolton, 1991), but we believe the present approach corresponds to many real-world situations4. The tasks in our design are familiar two-stage games such as Entry, Ultimatum, Gift-Exchange, Public Good Contribution, and Trust. Consistent with comparable studies5, we find that the vast majority of behavior appears individualistic. Nonetheless, a substantial amount of first-mover behavior is inconsistent with subgame perfection. Section 2 presents the extension of the Level-n model, and section 3 presents the experimental design and data. Section 4 gives the results of fitting the Level-n model to this data and finds that while the fit is reasonably good, there does appear to be other-regarding behavior. Section 5 shows how hypotheses about other-regarding behavior can be easily added to the Level-n model and tested. In particular, we find that a combination of utilitarianism and Fehr-Schmidt (1999) type preferences can explain the 16% of the data that is not well-fit by the basic Level-n model.
2. Level-n Bounded Rationality for Two-Player Two-Stage Games.
The Level-n model of bounded rationality (Stahl and Wilson, 1994-5) has a straightforward extension to two-player two-stage games. It postulates a population of types, called Level-n types, in which Level-0 types are uniformly random, Level-1 types believe that all other players are Level-0 types, and Level-2 types believe that all other players are Level-0 and Level-1 types. The proportions of each type in the population are free variables to be estimated. The simplest hypothesis is that these proportions are the same whether a player is a first-mover or a second-mover. Of course, this hypothesis can and will be tested. We start with the null hypothesis of role-independent proportions.
3 Diffusion of responsibility has been studied by Darley and Latané (1968), Fleishman (1980) and Barron and Yechiam (2002). According to diffusion of responsibility, when there are many individuals in a position to help, each individual feels less compelled to do so.
4 In the tournament approach of Bolton (1991), participants in alternating offer bargaining games were paid by their payoff rank relative to other participants in the same role, rather than directly by the amount they earned. Bolton found this method to move behavior towards equilibrium.
5 The comparable studies, to be described in section 3.2, use a multi-task design and discretized games.
2.1. Second-Movers. In accordance with backward induction, we begin our analysis with the second movers. Since the second movers are the last movers in these games, they do not need a mental model of the other players. A Level-n (n ≥ 1) type simply chooses a logit best response given the available choices at each node. Letting ν denote the precision of this logit best response, the probability of a Level-1 (for brevity) type choosing j at node m in game g is
$$P_{gm}(j,\nu) \equiv \frac{\exp(\nu\, y_{2gmj})}{\sum_k \exp(\nu\, y_{2gmk})} \qquad (1)$$
where k ranges over the choices at node m in game g, and y2gmk is the respective second mover payoff (as given in Figure 1). To accommodate unsystematic behavior and trembles, we also define a Level-0 type as being equally likely to choose each available action at each node. Letting α0 denote the proportion of the second mover population that is Level-0, leaving (1-α0) as the proportion of Level-1 types, and letting xgm(j) denote the aggregate second-mover choices of j at node m in game g, the log-likelihood of the second-mover data is
$$LL(x;\alpha_0,\nu) = \sum_g \sum_m \sum_j x_{gm}(j)\,\ln\!\left[\alpha_0/2 + (1-\alpha_0)\,P_{gm}(j,\nu)\right] . \qquad (2)$$
Note also that there is no difference between Nash behavior and Level-1 behavior, so Nash types and Stahl-Wilson worldly types are indistinguishable from Level-1 types.
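As an illustration of eqs (1) and (2), the following is a minimal Python sketch of the second-mover logit response and the mixture log-likelihood. It is not the authors' estimation code. The function names, the dictionary layout, and the single example node (payoffs 100 and 0, with 90 and 10 observed choices) are hypothetical. The Level-0 term is written as α0 divided by the number of choices, which equals α0/2 when every node has two branches.

```python
import math

def logit_probs(payoffs, precision):
    """Logit choice probabilities over the payoffs at one node, as in eq. (1)."""
    top = max(payoffs)  # subtract the max for numerical stability
    weights = [math.exp(precision * (y - top)) for y in payoffs]
    total = sum(weights)
    return [w / total for w in weights]

def second_mover_loglik(counts, payoffs, alpha0, nu):
    """Log-likelihood of aggregate second-mover choices, as in eq. (2).

    counts[node]  -> list of observed choice counts x_gm(j)
    payoffs[node] -> list of second-mover payoffs y_2gmj (same order)
    A node is identified by a hypothetical (game, node) key.
    """
    ll = 0.0
    for node, x in counts.items():
        level1 = logit_probs(payoffs[node], nu)
        n_choices = len(x)
        for j, x_j in enumerate(x):
            # Level-0 types randomize uniformly; Level-1 types use the logit response.
            mix = alpha0 / n_choices + (1.0 - alpha0) * level1[j]
            ll += x_j * math.log(mix)
    return ll

# Hypothetical data for a single node: payoffs 100 vs 0, chosen 90 and 10 times.
example_counts = {("game1", "A"): [90, 10]}
example_payoffs = {("game1", "A"): [100.0, 0.0]}
print(second_mover_loglik(example_counts, example_payoffs, alpha0=0.08, nu=0.173))
```

In practice this function would be maximized over α0 and ν with a numerical optimizer; the sketch only evaluates it at given parameter values.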
2.2. First Movers.
Level-0 first-movers choose each first-mover branch with equal probability. Level-1 types choose a logit best response to the belief that each second mover choice is equally likely.6 Letting ν denote the precision of this logit best response, the probability of a Level-1 first mover choosing the branch leading to node m in game g is
$$P_g^1(m,\nu) \equiv \frac{\exp(\nu\, y_{1gm})}{\sum_k \exp(\nu\, y_{1gk})} , \qquad (3)$$
where k ranges over the branches of game g, and y1gk is the average first-mover payoff following branch k. Level-2 first movers believe that second movers are Level-1 types with precision µ,7 and therefore believe that second-mover choice probabilities are given by eq(1): Pgm (j, µ). Hence, the first-mover's expected payoff for choosing the branch leading to node m is
$$E y_{1gm} = \sum_j \left[ P_{gm}(j,\mu)\, y_{1gm}(j) \right] . \qquad (4)$$
Then, the probability that a Level-2 first mover chooses the branch leading to node m is given by
$$P_g^2(m,\nu,\mu) \equiv \frac{\exp(\nu\, E y_{1gm})}{\sum_k \exp(\nu\, E y_{1gk})} . \qquad (5)$$
Let α0 denote the proportion of the first mover population that is Level-0, α1 the proportion that is Level-1, leaving α2 = (1-α0-α1) as the proportion that is Level-2. Let xg(m) denote the aggregate first-mover choices of the branch leading to node m in game g, and let Bg denote the number of first-mover branches in game g. Then, the log-likelihood of the first-mover data is
$$LL(x;\alpha_0,\alpha_1,\nu,\mu) = \sum_g \sum_m x_g(m)\,\ln\!\left[\alpha_0/B_g + \alpha_1 P_g^1(m,\nu) + \alpha_2 P_g^2(m,\nu,\mu)\right] . \qquad (6)$$
Note that subgame perfect behavior corresponds to Level-2 behavior, so such types are indistinguishable from Level-2 types.
6 Alternative specifications of level-1 behavior will be discussed later.
7 As noted in Stahl-Wilson (1994, 1995), allowing µ to be a free variable is observationally equivalent to specifying Level-2 beliefs as a convex combination of Pgm (j,0) and Pgm (j, ν) [i.e., Level-0 and Level-1 types].
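The first-mover choice probabilities of eqs (3) to (5), and the mixture inside the logarithm of eq (6), can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names and the two-branch example game are hypothetical, and the payoffs are assumed to be already on a common scale. Each branch is represented by the list of first-mover payoffs and the list of second-mover payoffs at the node it leads to.

```python
import math

def logit_probs(values, precision):
    """Logit choice probabilities over a list of (expected) payoffs."""
    top = max(values)
    weights = [math.exp(precision * (v - top)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def level1_probs(fm_payoffs, nu):
    """Eq. (3): respond to the average first-mover payoff following each branch."""
    averages = [sum(node) / len(node) for node in fm_payoffs]
    return logit_probs(averages, nu)

def level2_probs(fm_payoffs, sm_payoffs, nu, mu):
    """Eqs. (4)-(5): respond to expected payoffs given logit second movers."""
    expected = []
    for y1, y2 in zip(fm_payoffs, sm_payoffs):
        belief = logit_probs(y2, mu)  # believed second-mover choice probabilities
        expected.append(sum(p * y for p, y in zip(belief, y1)))
    return logit_probs(expected, nu)

def first_mover_mixture(fm_payoffs, sm_payoffs, alpha0, alpha1, nu, mu):
    """Mixture probability of each branch, the term inside ln[...] in eq. (6)."""
    n_branches = len(fm_payoffs)
    alpha2 = 1.0 - alpha0 - alpha1
    p1 = level1_probs(fm_payoffs, nu)
    p2 = level2_probs(fm_payoffs, sm_payoffs, nu, mu)
    return [alpha0 / n_branches + alpha1 * a + alpha2 * b for a, b in zip(p1, p2)]

# Hypothetical entry-style game: branch 0 = stay out, branch 1 = enter.
fm = [[30.0, 30.0], [80.0, 0.0]]   # first-mover payoffs at each second-mover choice
sm = [[70.0, 70.0], [20.0, 0.0]]   # second-mover payoffs at each second-mover choice
print(level1_probs(fm, 0.173))
print(level2_probs(fm, sm, 0.173, 0.400))
print(first_mover_mixture(fm, sm, 0.08, 0.327, 0.173, 0.400))
```

In this hypothetical game the Level-2 type enters with higher probability than the Level-1 type, because it expects the second mover to avoid the costly retaliation branch rather than to choose uniformly.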
3. The Experiment and Data.
Game theory assumes that a player's preferences over potential material consequences to all players are represented by the player's own expected utility function of those consequences and the player's beliefs (subjective probabilities). Any regard for the consequences impacting other players is implicitly included in this expected utility function. The definition of a game gives these expected utility "payoffs" as a primitive. Under the standard interpretation, a player's own expected utility payoffs and beliefs are sufficient to determine Bayesian rational choices. In other words, every player is by definition "individualistic" in the expected utility payoffs. Unfortunately, this strict interpretation presents an obstacle to empirical testing of game theory, since in practice the expected utility functions are not observable. While the study of specifications of expected utility functions that incorporate other-regarding preferences, path effects, etc. is a vital research area, the testing of theory predictions conditional on the utility payoffs is also important. The latter requires players to be individualistic in the experimenter designated payoffs. Experimental designs attempt to control for unobservables in various ways. For example, unobservable risk attitudes can sometimes be controlled for by using binary lotteries as payoffs. 8 Group sizes of six or more have been shown to induce individualistic behavior in many-person dictator games (Haruvy and Stahl, 2002) and in public good contribution games (Isaac and Walker, 1988), which suggests that mean-matching designs with typically sized laboratory groups tend to induce individualistic behavior. In this paper we report results using a multi-task design that we suggest creates the perception of a "level playing field," which relieves the participants of responsibility for others since competition on a level playing field is presumed to be fair a priori. 
Several variations in the experimental protocols were used, and the behavior was found to be robust.
3.1. The Experimental Design.
Figure 1 shows the six two-stage games used in these experiments. The payoffs are in terms of binary lotteries, giving the probability of winning $5.00 for each game. The games chosen are discretized versions of the sometimes more complex parallels in the literature. For example, offers in the traditional Ultimatum game are usually continuous. The traditional Gift-Exchange game similarly has more choices for wage and effort levels, whereas we have two choices for each. The Trust game traditionally has a continuum of choices, whereas we had two. These simplifications are necessary if these games are to be presented together in a unified fashion and in a way the subjects can and will comprehend following simple instructions. Nevertheless, we believe that the essence of each game was captured in these simple choices. Four sessions were run at the Harvard Business School; the first three of these involved 36 participants each and the first five games listed in Figure 1. The fourth Harvard session involved 30 participants, and the sixth game listed in Figure 1 was substituted for the fourth game; in other words, a fifth branch with payoffs of (100, 0) was added to the Ultimatum game. The Harvard subjects consisted of mostly undergraduate students from Harvard and surrounding schools (MIT, Boston University, Tufts, etc.). They were recruited through signs in these schools and through emails to subjects in the Computer Laboratory for Experimental Research database. Subjects who had participated in similar experiments were excluded. Fifth and sixth sessions with 18 participants each were conducted at the University of Texas at Austin; those participants were upper-division undergraduate and graduate students recruited campus-wide, except that Economics graduate students were not admitted since their training and experience would be atypical in the wider population.
All six sessions entailed reading of the instructions, and an independent choice prior to the five extensive-form games. The independent task was a six-person dictator game, in which the participants were anonymously assigned to groups of six, and one member of each group was anonymously picked to be the dictator for that group. Due to regulations at Harvard, individually signed receipts had to be obtained9, so we were concerned that the participants would worry that their decisions could be exposed using the signed receipts. To make such linkage impossible, all sessions at Harvard were preceded by a simple lottery that gave each participant a prize of $20 with probability 1/10. Neither the lottery outcome nor the dictator game outcomes were revealed prior to playing the extensive-form games; it was common knowledge that there was no linkage between the dictator game and the extensive-form games; and absolute privacy was guaranteed. Instructions are given in the Appendix. All lotteries were carried out independently at the end of the experiment after all decisions had been made.
In social psychology it is well known that group labels can create a "we-they" dichotomy (Messick and Mackie, 1989). Since the sequential games we investigate are natural for a two-population design (first movers and second movers), we initially called one population the Red team and the other population the Blue team. Theoretically, this Red-Blue team framing should induce a competitive environment between first and second movers, neutralizing other-regarding behavior across teams, and indeed we found a preponderance of individualistic behavior. The first session also used mean matching and the strategy method. Each first mover was matched with each second mover, and received the average payoff from all such matches. Each second mover had to provide a conditional choice at each decision node of the game.
To test whether the Red-Blue team framing was driving the individualistic behavior, the second session dropped the team framing, but kept everything else unchanged. This made no discernible difference in the behavior (measured by standard Chi-square tests at the 5% significance level). Next, to test whether the mean matching was driving the individualistic behavior, the third session also dropped the team framing and used pairwise random matching with reshuffled pairings for each game. Again, this made no discernible difference in behavior. To test whether the multiple prizes were driving the behavior, the fourth session was like the third except that only one of the five games was selected for payment, and the binary lottery prize was increased to $25; this game was selected by the random draw of a card at the beginning of the experiment, but not revealed until the end of the experiment. Once again, this made no discernible difference in behavior. The two sessions at the University of Texas at Austin replicated the second and third Harvard sessions, with no discernible difference in behavior. Finally, to test whether the preliminary choice task in the six-person dictator game was driving the individualistic behavior, we conducted two additional sessions at UT-Austin without the preceding dictator game, but otherwise identical to session 6. Once again, this made no discernible difference in behavior. Table I summarizes the protocols of all eight sessions.
In total, we have 111 observations of first-mover and second-mover choices for Games 1-3 and 5; 96 observations for Game 4; and 15 observations for Game 6. Chi-square tests of the aggregate choices across the eight sessions could not reject the hypothesis that they were generated by the same process.10 Therefore, our subsequent analysis focuses on the pooled aggregate data.
8 E.g., Cox, Smith, and Walker (1985, 1988), Walker, Smith, and Cox (1990), and Cox and Oaxaca (1995).
9 At the University of Texas at Austin, three non-participant witnesses signed an affidavit verifying the total amount of money disbursed to the session participants; thus, names, ID numbers, etc. could not be associated with any participant's earnings or choices.
Table I. Protocols of Experimental Sessions.
Session  Location  No. Part.  Games    Teams  Matching  Prizes   Dictator
1        Harvard   36         1-5      Y      Mean      5 x $5   Y
2        Harvard   36         1-5      N      Mean      5 x $5   Y
3        Harvard   36         1-5      N      R. Pair   5 x $5   Y
4        Harvard   30         1-3,5,6  N      R. Pair   1 x $25  Y
5        Austin    18         1-5      N      Mean      5 x $5   Y
6        Austin    18         1-5      N      R. Pair   5 x $5   Y
7        Austin    24         1-5      N      R. Pair   5 x $5   N
8        Austin    24         1-5      N      R. Pair   5 x $5   N
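The observation counts quoted in the text (111, 96 and 15) can be cross-checked against the session protocols with a few lines of Python. The sketch assumes that each session splits evenly into first movers and second movers, which the text does not state explicitly but which the reported totals imply; the variable and function names are hypothetical.

```python
# Participants per session (sessions 1-8) and the games each session played,
# as listed in the protocol table.
PARTICIPANTS = [36, 36, 36, 30, 18, 18, 24, 24]
GAMES = [
    {1, 2, 3, 4, 5}, {1, 2, 3, 4, 5}, {1, 2, 3, 4, 5},
    {1, 2, 3, 5, 6},                                    # session 4: Game 6 replaces Game 4
    {1, 2, 3, 4, 5}, {1, 2, 3, 4, 5}, {1, 2, 3, 4, 5}, {1, 2, 3, 4, 5},
]

def observations(game):
    """Observations per role for one game, assuming half of each session per role."""
    return sum(n // 2 for n, played in zip(PARTICIPANTS, GAMES) if game in played)

for g in range(1, 7):
    print(g, observations(g))
```

Under the even-split assumption this gives 111 observations per role for Games 1-3 and 5, 96 for Game 4, and 15 for Game 6, matching the figures in the text.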
3.2. The Data.
Figure 1 also displays the aggregate choices for each branch of the game trees. The data are remarkable in their lack of other-regarding behavior. Overall, 84.6% of all choices are consistent with subgame perfect equilibrium (indicated by thickened branches in Figure 1): 91.9% for second movers and 65.2% for first movers. It does not follow that first movers had more regard for others since such first-mover behavior could be rationalized with individualistic preferences and non-equilibrium beliefs. Of the first-mover choices that are not subgame perfect, two-thirds are consistent with level-1 types (indicated by “L1” in Figure 1; see footnote 11); in all, 88.5% of first-mover choices appear individualistic. Particularly remarkable are the Ultimatum game results, which contrast with the median division of 60:40 typically observed (e.g., Roth, 1995); further, the 80:20 offer was rarely rejected. In the Contribution, Gift-Exchange and Trust games, there is little positive reciprocity to first-mover generosity, and in the Entry and Ultimatum games, there is little negative reciprocity.
It is helpful to compare the behavior in our experiments with other experiments that have used similar games and multi-task designs. Güth, Huck and Müller (2001) examined a binary choice ultimatum game. They found that given an equal proposed split of (10, 10) and a selfish proposed split (17, 3), two thirds of proposers choose the equal split. Replacing the equal split (10, 10) with a slightly unequal split (9, 11) resulted in two thirds of proposers choosing the selfish outcome. This remarkable reversal indicates that without an exactly equal-split option, subjects behave selfishly. This is consistent with our result. Falk, Fehr and Fischbacher (2003) examine a binary ultimatum game with choices (8, 2) and (2, 8). Similar to Güth et al. (2001), 73% of proposers chose (8, 2). They also found that relative to a condition which allows for an equal split, (5, 5) vs. (8, 2), the condition without an equal split option resulted in a sharp drop of responder rejection of (8, 2) from 44.4% to 26.7%. Note that our two ultimatum games share the no-equal-split feature with Güth et al. (2001) and Falk et al. (2003). Our ultimatum game 4 closely resembles the Falk et al. (2003) game, with (8, 2) and (2, 8) being the extreme points, but with two additional intermediate choices (6, 4) and (4, 6). Our somewhat more level playing field (Falk et al. had a multi-game design with four games) resulted in a higher proportion of selfish behavior on the part of proposers (79.2% vs. Falk et al.’s 73%) and on the part of responders (rejection rates of 6.3% vs. Falk et al.’s 26.7%).
Results on trust games are somewhat less consistent with each other and with our results. There are many examples of binary choice trust games in various settings (e.g., Camerer and Weigelt, 1988; Cox and Deck, 2002; Engle-Warnick and Slonim, 2001; McCabe and Smith, 2000; McCabe, Rigdon and Smith, 2003). An interesting comparison, which provides some intuition for the selfish behavior we observe in our experiment, is that between McCabe and Smith (2000) and Cox and Deck (2002). McCabe and Smith (2000) studied a binary choice trust game with a first mover exit strategy of an equal split and a second mover defection strategy that gives nothing back to the first mover. In their game, second movers exhibited a low defection rate of 25%. In contrast, the same game run by Cox and Deck (2002) yielded a 76% defection rate by second movers. Cox and Deck explain this stark reversal by noting that in their experiment there was a higher “social distance.” That is, the large group size in the lab (14 to 20 subjects) and the inability of the experimenter to know any individual’s actions from the amount they received resulted in a greater willingness to behave selfishly. It stands to reason that with our experimental design (with groups even larger than in Cox and Deck, with a virtually double-blind protocol, and with the more level playing field), one would expect even greater rates of defection. It is also important to keep in mind that our defection action was not as extreme as in McCabe and Smith and Cox and Deck (where the first mover received 0 tokens), so our defectors may not suffer as much guilt.
In two-player entry games, the first mover (the entrant) can stay out and receive a smaller payoff than the second mover (the incumbent) or enter and potentially receive a bigger payoff at the expense of the second mover but at the risk of a costly retaliation. That is, the incumbent, upon observing entry, can retaliate by incurring some cost. Similar games, within a multi-game environment, were studied by Charness and Rabin (forthcoming). In one such game, the first mover could stay out (400, 1200) or enter. If the first mover enters, the second mover would face (400, 200) or (0, 0). In this game, 76.9% of first movers entered and 11.5% (3 out of 26) of second movers retaliated against entrants. In our game, 75.7% of first movers entered and 6.3% of second movers retaliated. The remarkable similarity of our results to Charness and Rabin can potentially be explained by the similarity of the experimental protocol: specifically, a multi-game design and a level playing field.
Charness and Rabin also have a “contribution” game, where the first mover chooses between (700, 200) and giving the move to the second mover. The second mover chooses between (200, 700) and (600, 600). 14/32 first movers (43.8%) chose to contribute, and 25/32 (78.1%) second movers contributed. Remarkably like Charness and Rabin, 43.2% of our first movers contributed, but in contrast to Charness and Rabin, only 23.4% of second movers contributed. This could be due to the fact that our second movers had a choice at the end of each first mover branch and so they could punish first movers’ selfish choices.
10 The p-values for first and second movers are 0.733 and 0.368, respectively.
11 Except for Game 5 in which all first-mover branches are consistent with level-1 behavior.
4. Maximum Likelihood Estimation of Level-n Model.
In previous research12, we have found strong evidence that when fitting choice data with probabilistic choice functions of the logit form with a single precision parameter, the fit is better when the payoffs are rescaled to [0, 100] for each game. Accordingly, in the subsequent analysis, each payoff (y) in game g was transformed to 100(y - mg)/(Mg - mg), where Mg (mg) is the maximum (minimum) payoff over both players in game g. Note that this is a zero-parameter transformation that depends only on the payoffs of the game, and hence is far more parsimonious than the alternative of allowing separate logit precision parameters for each game. Summing eqs(2) and (6) produces the log-likelihood function under the null hypothesis that the proportions of types and the logit precision are independent of roles. Maximizing this function yields LL = -803.99. As a benchmark the entropy of the data is -767.04, so we have a pseudo-R2 of 0.954. Table I gives the estimates and confidence intervals for this model.
Table I. Parameter Estimates of Level-n Model.
                  Confidence Intervals
Parameter  MLE    5%     95%
ν          0.173  0.150  0.204
µ          0.400  0.329  0.529
α0         0.080  0.061  0.107
α1         0.327  0.276  0.378
α2         0.592  0.541  0.639
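The payoff rescaling and the pseudo-R2 arithmetic described above can be sketched as follows. The function names are hypothetical, and the example payoffs are the Charness and Rabin entry game quoted in section 3.2, used purely for illustration. The reported pseudo-R2 of 0.954 is reproduced by the ratio of the entropy benchmark to the maximized log-likelihood; that this ratio is the authors' definition is an assumption here.

```python
def rescale_game(payoffs):
    """Rescale all payoffs of one game to [0, 100].

    payoffs: list of (y1, y2) pairs, one per terminal node. The minimum and
    maximum are taken over both players, as in 100 * (y - m_g) / (M_g - m_g).
    """
    flat = [y for pair in payoffs for y in pair]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        raise ValueError("all payoffs are equal; rescaling is undefined")
    return [tuple(100.0 * (y - lo) / (hi - lo) for y in pair) for pair in payoffs]

def pseudo_r2(entropy, max_loglik):
    """Assumed definition: entropy benchmark divided by maximized log-likelihood."""
    return entropy / max_loglik

# Illustration with the entry game of Charness and Rabin quoted in the text.
print(rescale_game([(400, 1200), (400, 200), (0, 0)]))
print(round(pseudo_r2(-767.04, -803.99), 3))
```

Because the transformation uses only the game's own payoffs, it adds no free parameters to the model.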
Setting α1 = 0 yields a model with level-0 types and level-2 types; the latter can be interpreted as logit subgame perfect Nash equilibrium types. The maximized LL drops dramatically to -885.10; thereby strongly rejecting the restriction (p < 0.001). In other words, level-1 behavior makes a significant contribution to explaining the data, so we must reject subgame perfection.
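The rejection of the α1 = 0 restriction can be illustrated with a likelihood-ratio calculation. This is a minimal sketch that assumes the standard chi-square approximation with one degree of freedom (one restriction); the text does not state which reference distribution was used, and a restriction on the boundary of the parameter space may call for a different one. The function name is hypothetical. For one degree of freedom the chi-square upper tail equals erfc(sqrt(x/2)), so only the standard library is needed.

```python
import math

def lr_test_1df(ll_restricted, ll_unrestricted):
    """Likelihood-ratio statistic and its chi-square(1) upper-tail p-value."""
    stat = 2.0 * (ll_unrestricted - ll_restricted)
    # For X ~ chi-square(1): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Log-likelihoods reported in the text: restricted (alpha1 = 0) vs. full model.
stat, p_value = lr_test_1df(-885.10, -803.99)
print(stat, p_value)
```

With the reported log-likelihoods the statistic is about 162, far beyond any conventional critical value, which is in line with the reported p < 0.001.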
12 E.g., Haruvy, Stahl and Wilson (2001).
The Root-Mean Squared Error (RMSE) is 0.064 for first movers and 0.042 for second movers. On the other hand, the Pearson Chi-Square (PCS) goodness-of-fit statistic is 72.31, which with 32 degrees of freedom13 has a p-value of 6×10^-5; thus, we can reject the hypothesis that the data were generated by this fitted model.
Figure 2 displays for each game the predicted choice frequencies of each Level-n type, the prediction of the mixture model, and actual choice frequencies, and three goodness-of-fit (GOF) measures: (i) the log-likelihood less the entropy, (ii) Pearson's Chi-squared (PCS), and (iii) root mean squared error (RMSE). The Level-0 type is not shown in Figure 2 since it is the uniform distribution over the available strategies. For second movers, since there are two branches at every such node, only the probability of choosing the left branch is displayed.
Comparing the predictions of the mixture model with the actual choice frequencies, the fit is remarkably good, with four exceptions. For second movers the fit is poor for the Contribution game (2) by PCS and RMSE standards; the participants show positive reciprocity when getting more than the first mover (23% choose LA versus the model prediction of 14%), and negative reciprocity when getting less (7% choose RB versus the model prediction of 4%). Utilitarian or egalitarian preferences could help predict the positive reciprocity, but spitefulness is needed to predict the negative reciprocity. For first movers, the fit is poor for the Entry, Ultimatum and Trust games (games 1, 4 and 5, respectively): more first movers enter in game 1 than predicted (76% versus 61%); more first movers offer close to half in the Ultimatum game (18% versus 9%); and more first movers invest fully in the Trust game than predicted (15% versus 8%). Utilitarian or egalitarian preferences could help predict the Ultimatum game behavior, and utilitarian but not egalitarian preferences could help predict the Trust game behavior. Extreme egalitarian preferences could help predict the Entry game behavior. Thus, it appears that the departures from the predictions of the fitted Level-n model cannot be explained by a single other-regarding preference type. In the next section, we present a parsimonious model of this apparently other-regarding behavior.
It is also interesting to note from Figure 2 that Level-2 first movers in the Gift-Exchange game (3) are somewhat ambivalent about choices A and B rather than strongly preferring the subgame perfect choice (B). The Level-n model predicts this because of the small payoff difference for the second mover following A, which implies that Level-2 first movers expect a significant proportion of second movers to choose LA (as if reciprocating).
13 From Figure 1, it is obvious that there are 14 independent first mover decision nodes and 18 independent second mover decision nodes.
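To make the two goodness-of-fit measures concrete, the sketch below computes a Pearson chi-square statistic and an RMSE for a single decision node. It is an illustration only: how the paper aggregates these statistics across nodes and games is not specified in the text, the function names are hypothetical, and the counts (84 of 111 first movers entering) are inferred here from the reported 75.7% entry rate rather than taken from Figure 1. The predicted entry probability of 0.61 is the mixture prediction quoted above.

```python
import math

def pearson_chi_square(observed_counts, predicted_probs):
    """Sum over choices of (observed - expected)^2 / expected at one node."""
    n = sum(observed_counts)
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed_counts, predicted_probs))

def rmse(observed_counts, predicted_probs):
    """Root mean squared error between observed and predicted choice frequencies."""
    n = sum(observed_counts)
    squared = [(o / n - p) ** 2 for o, p in zip(observed_counts, predicted_probs)]
    return math.sqrt(sum(squared) / len(squared))

# Entry game, first movers: inferred counts (enter, stay out) vs. model prediction.
entry_counts = [84, 27]
entry_predicted = [0.61, 0.39]
print(pearson_chi_square(entry_counts, entry_predicted))
print(rmse(entry_counts, entry_predicted))
```

For this node alone the chi-square contribution is roughly 10, which illustrates why the Entry game is singled out as a poor fit for the model without other-regarding types.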
5. A Level-n Model with Other-Regarding Behavior.
An attractive feature of the underlying mixture structure of the Level-n model is the ease with which alternative types can be introduced and tested. In particular, in addition to the Level-0 type, and the individualistic Level-1 and Level-2 types, suppose there is an other-regarding (OR) type, and let α3 denote the latter proportion in the population. For ease of presentation, we invoke the “projection” hypothesis of psychology: that a person tends to model others like oneself. 14 Specifically, we assume that an individualistic type believes the other players are individualistic, and we assume that the OR type (being forward looking) believes all other types are Level-1 types with OR preferences and precision µ. Appendix I presents statistical tests that confirm these assumptions. Deferring the specification of OR preferences until later, let Q g2 (m, ν, µ) denote the OR first-mover logistic choice function with precision ν given the belief that the second-mover has OR preferences and precision µ. Also, let Q gm (j,ν ) denote the OR second-mover logistic choice function with precision ν for game g, node m and strategy j. Weighting by preference type, for a random subject, the probabilistic choice function in the first-mover role is given by
$$P_g^{FM}(m) \equiv \alpha_0/B_g + \alpha_1 P_g^1(m,\nu) + \alpha_2 P_g^2(m,\nu,\mu) + \alpha_3 Q_g^2(m,\nu,\mu) , \qquad (7a)$$
and in the second-mover role by
$$P_{gm}^{SM}(j) \equiv \alpha_0/2 + (\alpha_1+\alpha_2)\, P_{gm}(j,\nu) + \alpha_3 Q_{gm}(j,\nu) . \qquad (7b)$$
We now come to the challenging problem of specifying the OR preferences. From section 4, instances of poor fit suggest that two types of OR behavior are present: utilitarianism to account for the amount of trust in game 5, and spite to account for negative reciprocity in game 2. To capture such behavior, we use a two-parameter model in the spirit of Fehr-Schmidt (1999)15:

$$\begin{aligned} U_1(y_1, y_2) &= y_1 + \theta_1 \min\{y_2 - y_1, 0\} + \theta_2 \max\{y_2 - y_1, 0\}, \text{ and} \\ U_2(y_1, y_2) &= y_2 + \theta_1 \min\{y_1 - y_2, 0\} + \theta_2 \max\{y_1 - y_2, 0\}, \end{aligned} \tag{8}$$

where θ1 ≥ 0 and (for convexity) θ2 ≤ θ1. Utilitarian preferences are represented by θ1 = θ2 = 1/2. Spiteful preferences are represented by θ2 < 0 < θ1. As an initial hypothesis, we assume that θ1 = 1/2, and that θ2 = 1/2 or -1 with equal probability.16 Thus, with one additional parameter (α3), we have an encompassing model with individualistic and OR preferences.

The maximized LL of this encompassing model is -782.99, an increase of 21.00, which has a p-value less than $9 \times 10^{-11}$. Therefore, this addition of OR preferences makes a very statistically significant contribution to explaining the data. The pseudo-R² is a remarkable 0.980. Moreover, the PCS goodness-of-fit statistic decreases to 32.72, which has a p-value of 0.431; therefore, we cannot reject the hypothesis that this model is the data generating process. Finally, the RMSE decreases to 0.032. Table II gives the parameter estimates and confidence intervals for this model.

Table II. Parameter Estimates of OR Level-n Model.

                    Confidence Intervals
Parameter   MLE     5%      95%
ν           0.230   0.196   0.278
µ           0.383   0.328   0.458
α0          0.021   0.009   0.045
α1          0.221   0.170   0.279
α2          0.600   0.548   0.645
α3          0.158   0.118   0.192

14 See, for example, Croson and Miller (2003).
15 There are several other prominent models of fairness (e.g., Bolton, 1991; Bolton and Ockenfels, 2000). We chose this specification because it nests several other models, as will be explained shortly.
16 This is equivalent to specifying two types of other-regarding preferences, each equally likely.
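The two-parameter preference specification in equation (8) can be illustrated with a minimal Python sketch. The function name and the example payoffs are hypothetical; the Ultimatum check additionally assumes that a rejection yields zero for both players.

```python
def or_utility(own, other, theta1=0.5, theta2=0.5):
    """Equation (8): other-regarding utility of a payoff pair.

    theta1 weights the (non-positive) gap when the other player earns less;
    theta2 weights the (non-negative) gap when the other player earns more.
    The defaults theta1 = theta2 = 1/2 give the utilitarian case.
    """
    gap = other - own
    return own + theta1 * min(gap, 0.0) + theta2 * max(gap, 0.0)


# Utilitarian case: utility equals the average of the two payoffs.
utilitarian_behind = or_utility(10, 30)               # (10 + 30) / 2
utilitarian_ahead = or_utility(30, 10)                # (30 + 10) / 2

# Spiteful Fehr-Schmidt case (theta1 = 1/2, theta2 = -1): a responder
# compares accepting a split with rejecting it (assumed to give 0 to both).
accept_80_20 = or_utility(20, 80, theta1=0.5, theta2=-1.0)
accept_60_40 = or_utility(40, 60, theta1=0.5, theta2=-1.0)
reject = or_utility(0, 0, theta1=0.5, theta2=-1.0)

print(utilitarian_behind, utilitarian_ahead)
print(accept_80_20 < reject, accept_60_40 > reject)
```

Under these assumptions the spiteful type prefers rejecting the 80-20 split to accepting it, but prefers accepting the 60-40 split.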
The estimate of the new parameter (α3) indicates that about 16% of the population behaves as if they have OR preferences, which is consistent with our analysis of poor fits when OR preferences are excluded. Conversely, about 84% of the population behaves as if they have individualistic preferences. In other words, the level playing field design appears to have worked for 84% of the participants. It is also noteworthy that the estimated proportion of Level-0 types decreases substantially and the estimated precision parameter (ν) increases, because the way a model with only individualistic preferences tries to fit OR behavior is as random error (i.e. a larger α0 or a smaller ν). The proportion of individualistic Level-2 types remains essentially the same at about 60%. Thus, allowing for OR behavior reduces the estimated proportions of Level-0 and individualistic Level-1 behavior, but not individualistic Level-2 behavior. Figure 3 displays the predicted choice frequencies of the Level-1, Level-2 and OR types for each game, as well as the prediction of the mixture model, and the three GOF measures (similar to Figure 2). Compared to the Level-n model without OR types, the overall fit is better, especially in those games for which the former model fits poorly. For first movers in Game 1 (Entry), the increased precision of Level-2 types raises the predicted frequency of entries (from 61% up to 68%), and the PCS test now passes. In the Ultimatum games (games 4 and 6), the spiteful FS types strongly prefer the 60-40 split, which makes the mixture prediction closer to the observed frequencies. In the Trust game (5), the utilitarian types prefer the maximal investment, while the more egalitarian Fehr-Schmidt types prefer the intermediate level of investment; both behaviors make the mixture prediction closer to the observed frequencies. 
For second movers in the Contribution game (game 2), both OR types exhibit positive reciprocity when getting more than the first mover, which increases the mixture prediction from 14% to 20%; at the other node (when the second mover is getting less than the first mover), the spiteful OR types are indifferent between LB and RB, whereas all other types (except Level-0) strongly prefer LB. This increase in predicted frequency of RB choices is offset somewhat by the decreased estimate of the proportion of Level-0 types, so the mixture prediction for negative reciprocity (RB) increases only slightly from 4% to 5%. Nonetheless, the PCS declines dramatically (from 12.27 to 1.77) and the RMSE declines (from 0.074 to 0.027). In all other games, the fit generally improves due to the replacement of Level-0 types with OR types. For instance, in the Ultimatum games (games 4 and 6), the spiteful FS type rejects offers of 80-20 or worse and accepts all better offers; clearly substituting this behavior for the Level-0 behavior fits the data better.

We tested the restrictions on OR preferences that θ1 = 1/2, and that θ2 = 1/2 or -1 with equal probability by considering an encompassing model with two subtypes of OR preferences: (i) utilitarian (θ1 = θ2 = 1/2), and (ii) general Fehr-Schmidt (θ1 ≥ 0 and θ2 ≤ θ1). Further, we allowed the relative proportion of these subtypes to be a free parameter. The maximized LL increased by only 0.67, which with 3 degrees of freedom has a p-value of 0.72. Therefore, we cannot reject the restrictions entailed by our specification of OR preferences.17

To test the validity of pooling first-mover and second-mover behavior, we fitted the model for each role separately. The increase in the maximized LL was 2.40, which with three degrees of freedom18 has a p-value of 0.188. Therefore, we cannot reject the hypothesis of a common set of parameters for both roles.

The overall earning performance of each type is the expected payoff that type would have received when facing the actual empirical distribution of play in the data set. Sophisticated learning (e.g., rule learning as proposed by Stahl, 2000) and evolutionary dynamics would predict that those types with above average earning performance would increase their share of the population, while those types with below average earning performance would decrease their share.19 Table III gives this overall performance measure20 for the five types; since the performance measures calculated separately for first movers and second movers are very similar, only the combined performance measure is reported here. The column labeled “Util” stands for the utilitarian component of the OR type, while “FS” stands for the Fehr-Schmidt component with θ1 = 1/2, and θ2 = -1.
The “Average” row is computed by weighting each game by the number of subjects.
17 Alternative specifications of OR preferences such as CES and Cox-Friedman (2002) made no improvement.
18 The first-mover model has the full 5 parameters, while the second-mover model has only 3 parameters (no µ and no α2); thus, the unrestricted model has 8 parameters, while the pooled model has 5 parameters.
19 We are assuming, as in Guth, Kliemt and Peleg (2000), that survival fitness depends on the material payoffs rather than the utility payoffs for the other-regarding types as well as the individualistic types.
20 In rescaled payoffs, which does not affect the relative performance comparisons.
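The likelihood-ratio comparisons reported above can be checked with a short calculation. The sketch below is an illustration, not the authors' code: it assumes the usual asymptotic result that twice the gain in maximized log-likelihood is chi-square distributed with degrees of freedom equal to the number of added parameters, and the helper names `chi2_sf` and `lr_test_p_value` are hypothetical. Only the standard library is used.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df.

    Starts from the closed forms for df = 1 (erfc) and df = 2 (exponential)
    and applies the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    with a = df / 2 and z = x / 2.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))
    else:
        k, q = 2, math.exp(-half)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def lr_test_p_value(ll_gain, df):
    """p-value of a likelihood-ratio test given the gain in maximized LL."""
    return chi2_sf(2.0 * ll_gain, df)


p_or_restrictions = lr_test_p_value(0.67, 3)   # restrictions on OR preferences
p_pooling = lr_test_p_value(2.40, 3)           # pooling first and second movers
p_or_type = lr_test_p_value(21.00, 1)          # adding the OR type (alpha3)

print(round(p_or_restrictions, 3), round(p_pooling, 3), p_or_type)
```

The first two values come out close to the reported 0.72 and 0.188, and the third is on the order of 10^-10 or smaller, in line with the very small p-value reported for adding the OR type.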
Table III. Relative Performance of Types.

Game      Level-0   Level-1   Level-2   Util.    FS
1         55.18     55.21     60.16     55.40    60.55
2         65.26     65.28     82.06     63.57    55.55
3         20.35     24.12     24.01     16.47    24.37
4         38.64     63.05     63.61     47.04    44.00
5         35.42     40.23     47.16     28.55    40.50
6         28.80     33.92     46.52     33.47    33.31
Average   42.70     48.79     54.94     41.84    44.71
The Level-2 type (which leads to subgame perfect behavior) is the highest overall performer, and hence we would expect to see the Level-2 share increase in a subsequent period. The Utilitarian type and the Level-0 type are the poorest performers, and hence we would expect to see their shares decrease. This change in the population would make the Level-2 type perform even better in subsequent periods. Consequently, rapid convergence to the subgame perfect equilibria is quite likely. This conjecture can be empirically tested in a future study.
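To make the direction of the predicted share changes concrete, the following Python sketch applies one step of a discrete-time replicator update to the Table II proportions and the Table III average payoffs. The replicator rule is an assumption chosen for illustration (the text refers only generally to rule learning and evolutionary dynamics), the OR share is assumed to be split equally between its utilitarian and FS components, and all names are hypothetical.

```python
def replicator_step(shares, payoffs):
    """One discrete replicator update: a type's share grows in proportion
    to its payoff relative to the population-average payoff."""
    mean_payoff = sum(shares[t] * payoffs[t] for t in shares)
    return {t: shares[t] * payoffs[t] / mean_payoff for t in shares}


# Table II proportions; the OR share (0.158) is split equally by assumption.
SHARES = {
    "Level-0": 0.021,
    "Level-1": 0.221,
    "Level-2": 0.600,
    "Util": 0.079,
    "FS": 0.079,
}

# "Average" row of Table III.
PAYOFFS = {
    "Level-0": 42.70,
    "Level-1": 48.79,
    "Level-2": 54.94,
    "Util": 41.84,
    "FS": 44.71,
}

next_shares = replicator_step(SHARES, PAYOFFS)
for name in SHARES:
    print(name, round(SHARES[name], 3), "->", round(next_shares[name], 3))
```

Because the payoffs in Table III are computed against the empirical distribution of play, iterating this update with fixed payoffs would only be a rough guide; the payoffs themselves would change as the population composition changes.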
6. Discussion.
We extended the Level-n theory of bounded rationality from the domain of symmetric normal-form games to the domain of simple two-player, two-stage extensive-form games. We designed and conducted experiments to test pertinent hypotheses. The extended Level-n model was found to fit the data remarkably well and significantly better than subgame perfect equilibrium theory. We found that about 60% of the population behavior is consistent with Level-2 behavior, about 22% is consistent with Level-1 behavior, and about 2% is random Level-0 behavior. The remaining 16% of the behavior is consistent with a combination of utilitarian and Fehr-Schmidt other-regarding preferences.

A large body of experimental literature has demonstrated that utility functions may be more complex than selfish monetary considerations alone. Unfortunately, it may take many more years before all possible non-pecuniary considerations are mapped and game theoretic predictions can be tested with these new ‘complete’ utility functions. Until then, we have argued that proper testing of game theoretic predictions requires experimental designs in which participants care only about their own payoffs. We found that participants were predominantly individualistic in our design. The maximum-likelihood results for both the individualistic Level-n model and the extended model with OR types suggest that our design succeeded in inducing the intended game-theoretic payoffs for at least 84% of the participants. Further, behavior appears quite robust to many variations of the multi-task design, including team framing, mean-matching versus random pairwise matching, and single versus multiple prizes. This evidence suggests that the multi-task design succeeds by creating the perception of a level playing field, thereby absolving participants of responsibility for other participants. Clearly, more research is needed into what design features are necessary and sufficient for predominantly individualistic behavior. Whatever the outcome of that research, however, our results on the performance of the Level-n model will stand.

We presented the OR extension to satisfy the anticipated curiosity of readers, and to gain some insight about the behavior that is not well-fit by the individualistic Level-n model. That behavior contains elements of utilitarianism and spitefulness. We presented a model with individualistic Level-0, Level-1 and Level-2 types and one hybrid OR type which was either utilitarian or spiteful with equal probability. The improvement in all goodness-of-fit measures was substantial and significant, and we could not reject the hypothesis that the data were generated by the fitted model.

We are surprised that the simple Level-n model of first movers does so well. Based on past studies of the Ultimatum game, which found near equal splits to be the modal choice, we expected that the naive Level-1 belief that second movers are equally likely to make any choice at every node would be unreasonable, since such belief would lead to the first mover demanding all or most of the pie. Instead, we contemplated an alternative conceptualization of second-mover strategies in terms of reservation levels: accept any offer equal to or exceeding, say, r.
Then, if each such reservation level is equally likely, the first mover's optimal proposal would be a 50-50 split. In contrast, the 80-20 split is the predominant first-mover proposal in our data, consistent with the naive Level-1 type. Also pertinent to explaining our Ultimatum behavior is the large proportion of Level-2 types (about 60%) in comparison to Stahl-Wilson (1995), and Haruvy, Stahl and Wilson (2001). However, since Level-2, naïve Nash and Worldly types are indistinguishable in our extensive-form data, the comparison should be with the sum of those types in the previous studies, and that sum is around 60%.
Future research will explore how robust our findings are to the games used in a multi-task design, and what design features are necessary and sufficient to create the perception of a level playing field. For instance, we suspect that it is important to present all games simultaneously, so participants can ascertain from the very beginning that there are no obvious biases in role assignments.
References
Andreoni, J. (1988), “Why Free Ride? Strategies and Learning in Public Goods Experiments,” Journal of Public Economics, 37(3), 291-304.
Berg, J., J. Dickhaut, and K. McCabe (1995), “Trust, Reciprocity, and Social History,” Games and Economic Behavior, 10(1), 122-142.
Barron, G. and E. Yechiam (2002), “Private E-mail requests and the diffusion of responsibility,” Computers in Human Behavior, 18, 507-520.
Binmore, K., J. McCarthy, G. Ponti, L. Samuelson and A. Shaked (2002), “A Backward Induction Experiment,” Journal of Economic Theory, 104, 48-88.
Bolton, G. (1991), “A Comparative Model of Bargaining: Theory and Evidence,” American Economic Review, 81, 1096-1136.
Bolton, G. and A. Ockenfels (2000), “ERC: A Theory of Equity, Reciprocity, and Competition,” American Economic Review, 90, 166-193.
Camerer, C. F., and K. Weigelt (1988), “Experimental Tests of a Sequential Equilibrium Reputation Model,” Econometrica, 56, 1-36.
Charness, G. (1996), “Attribution and Reciprocity in an Experimental Labor Market,” mimeo.
Charness, G. and M. Rabin (forthcoming), “Understanding Social Preferences with Simple Tests,” forthcoming in Quarterly Journal of Economics.
Costa-Gomes, M., V. Crawford, and B. Broseta (2001), “Cognition and Behavior in Normal-Form Games: An Experimental Study,” Econometrica, 69, 1193-1235.
Cox, J. C. and C. Deck (2002), “On the Nature of Reciprocal Motives,” working paper, University of Arizona.
Cox, J., and D. Friedman (2002), “A Tractable Model of Reciprocity and Fairness,” mimeo.
Cox, J., and R. Oaxaca (1995), “Inducing risk-neutral preferences: further analysis of the data,” Journal of Risk and Uncertainty, 11, 65-79.
Cox, J., V. Smith, and J. Walker (1985), “Experimental development of sealed-bid auction theory; calibrating controls for risk aversion,” American Economic Review, 75(2), 160-165.
Cox, J., V. Smith, and J. Walker (1988), “Theory and individual behavior of first-price auction,” Journal of Risk and Uncertainty, 1(1), 61-99.
Croson, R. and M. Miller (2003), “Explaining the Relationship between Actions and Beliefs: Projection vs. Reaction.”
Darley, J. and B. Latane (1968), “When will people help in a crisis?” Psychology Today, December 2, 54-57, 70-71.
Duffy, J. and R. Nagel (1997), “On the Robustness of Behavior in Experimental ‘Beauty Contest’ Games,” Economic Journal, 107, 1684-1700.
Engle-Warnick, J. and R. Slonim (2001), “Inferring Repeated Game Strategies From Actions: Evidence From Trust Game Experiments,” working paper, Case Western Reserve University.
Falk, A., E. Fehr and U. Fischbacher (2003), “On the Nature of Fair Behavior,” Economic Inquiry, 41, 20-26.
Fehr, E., E. Kirchler, A. Weichbold, and S. Gachter (1998), “When Social Forces Overpower Competition: Gift Exchange in Experimental Labor Markets,” Journal of Labor Economics, 16, 324-351.
Fehr, E., G. Kirchsteiger, and A. Riedl (1993), “Does Fairness Prevent Market Clearing? An Experimental Investigation,” Quarterly Journal of Economics, 108, 437-459.
Fehr, E. and K. Schmidt (1999), “A Theory of Fairness, Competition, and Cooperation,” Quarterly Journal of Economics, 114, 817-868.
Fleishman, J. A. (1980), “Collective action as helping behavior: effects of responsibility diffusion on contributions to a public good,” Journal of Personality and Social Psychology, 38, 629-637.
Gneezy, U. (2002), “On the Relation between Guessing Games and Bidding in Auctions,” working paper, The University of Chicago Graduate School of Business.
Guth, W., S. Huck and W. Muller (2001), “The Relevance of Equal Splits in Ultimatum Games,” Games and Economic Behavior, 37, 161-169.
Guth, W., H. Kliemt, and B. Peleg (2000), “Co-evolution of Preferences and Information in Simple Games of Trust,” German Economic Review, 1, 83-110.
Guth, W., R. Schmittberger and B. Schwarz (1982), “An experimental analysis of ultimatum bargaining,” Journal of Economic Behavior and Organization, 3, 367-388.
Haruvy, E., D. Stahl, and P. Wilson (2001), “Modeling and Testing for Heterogeneity in Observed Strategic Behavior,” Review of Economics & Statistics, 83, 146-57.
Haruvy, E., and D. Stahl (2002), “Other-Regarding Preferences: Egalitarian Warm Glow, Empathy, and Group Size,” mimeo.
Ho, T-H., C. Camerer and K. Weigelt (1998), “Iterated Dominance and Iterated Best Response in Experimental p-Beauty Contests,” American Economic Review, 88, 947-969.
Isaac, R. and J. Walker (1988), “Group Size Effects in Public Goods Provision: The Voluntary Contributions Mechanism,” Quarterly Journal of Economics, 103, 179-199.
Johnson, E., C. Camerer, S. Sen and T. Rymon (2002), “Detecting Failures of Backward Induction: Monitoring Information Search in Sequential Bargaining,” Journal of Economic Theory, 104, 16-47.
McCabe, K. and V. Smith (2000), “A Comparison of Naïve and Sophisticated Subject Behavior with Game Theoretic Predictions,” Proceedings of the National Academy of Sciences, 97, 3777-3781.
McCabe, K., M. Rigdon, and V. Smith (2003), “Sustaining Cooperation in Trust Games,” working paper, George Mason University.
Messick, D., and D. Mackie (1989), “Intergroup Relations,” Annual Review of Psychology, 40, 45-81.
Nagel, R. (1995), “Unraveling in Guessing Games: An Experimental Study,” American Economic Review, 85, 1313-1326.
Roth, A. (1995), “Bargaining Experiments,” in The Handbook of Experimental Economics, J. Kagel and A. Roth (eds.), Princeton University Press, Princeton, N.J.
Sonsino, D., I. Erev, and S. Gilat (1999), “On Rationality, Learning, and Zero-Sum Betting – An Experimental Study of the No Betting Conjecture,” mimeo.
Stahl, D. (1993), “Evolution of Smart_n players,” Games and Economic Behavior, 5, 604-617.
Stahl, D. (2000), “Rule Learning in Symmetric Normal-Form Games: Theory and Evidence,” Games and Economic Behavior, 32, 105-138.
Stahl, D. and P. Wilson (1994), “Experimental Evidence of Players’ Models of Other Players,” Journal of Economic Behavior and Organization, 25, 309-327.
Stahl, D. and P. Wilson (1995), “On Players’ Models of Other Players: Theory and Experimental Evidence,” Games and Economic Behavior, 10, 218-254.
Thaler, R. (1988), “Anomalies: The Ultimatum Game,” Journal of Economic Perspectives, 2, 195-206.
Walker, J., V. Smith, and J. Cox (1990), “Inducing risk neutral preferences: An examination in a controlled market environment,” Journal of Risk and Uncertainty, 3(1), 5-24.
Figure 1. Experiment Games. [Game-tree diagrams not reproduced here. Panels: Game 1 - Entry; Game 2 - Contribution; Game 3 - Gift Exchange; Game 4 - Ultimatum; Game 5 - Trust; Game 6 - Ultimatum. In each tree RED moves first and BLUE moves second.]
Figure 2. Level-n: Predicted and actual choice frequencies and goodness-of-fit (GOF) measures. The three GOF measures reported are (i) the Log-likelihood less the entropy, (ii) PCS, and (iii) RMSE.

First Movers:

Game 1   Level-1   Level-2   Mixture   Actual
A        0.0779    0.9197    0.6104    0.7568
B        0.9221    0.0803    0.3896    0.2432
GOF: (i) -5.335   (ii) 9.998   (iii) 0.1464

Game 2   Level-1   Level-2   Mixture   Actual
A        0.9624    0.0002    0.3554    0.4324
B        0.0376    0.9998    0.6446    0.5676
GOF: (i) -1.400   (ii) 2.878   (iii) 0.0771

Game 3   Level-1   Level-2   Mixture   Actual
A        0.0462    0.4034    0.2942    0.2973
B        0.9538    0.5966    0.7058    0.7027
GOF: (i) -0.0026   (ii) 0.0053   (iii) 0.0031

Game 4   Level-1   Level-2   Mixture   Actual
A        0.8851    0.9867    0.8944    0.7917
B        0.1018    0.0131    0.0611    0.1042
C        0.0117    0.0002    0.0240    0.0729
D        0.0013    0.0000    0.0205    0.0313
GOF: (i) -1.4138   (ii) 14.168   (iii) 0.0611

Game 5   Level-1   Level-2   Mixture   Actual
A        0.2000    0.7635    0.5338    0.5135
B        0.2000    0.1813    0.1889    0.1712
C        0.2000    0.0427    0.1068    0.1081
D        0.2000    0.0101    0.0875    0.0541
E        0.2000    0.0024    0.0829    0.1532
GOF: (i) -3.595   (ii) 8.289   (iii) 0.0368

Game 6   Level-1   Level-2   Mixture   Actual
A        0.8228    0.0054    0.2887    0.2667
B        0.1459    0.9632    0.6344    0.6000
C        0.0259    0.0304    0.0425    0.0667
D        0.0046    0.0010    0.0181    0.0667
E        0.0008    0.0000    0.0163    0.0000
GOF: (i) -0.935   (ii) 2.458   (iii) 0.0312

Second Movers:

Game 1   Level-1   Mixture   Actual
LA       0.0779    0.1117    0.0631
GOF: (i) -1.545   (ii) 2.651   (iii) 0.0487

Game 2   Level-1   Mixture   Actual
LA       0.1032    0.1350    0.2342
LB       0.9998    0.9598    0.9279
GOF: (i) -5.157   (ii) 12.270   (iii) 0.0737

Game 3   Level-1   Mixture   Actual
LA       0.2963    0.3126    0.2342
LB       0.0305    0.0681    0.0721
GOF: (i) -1.688   (ii) 3.203   (iii) 0.0555

Game 4   Level-1   Mixture   Actual
LA       0.9869    0.9479    0.9375
LB       0.9998    0.9598    0.9896
LC       1.0000    0.9599    0.9792
LD       1.0000    0.9599    1.0000
GOF: (i) -6.140   (ii) 7.355   (iii) 0.0273

Game 5   Level-1   Mixture   Actual
LA       0.0530    0.0888    0.0360
LB       0.0031    0.0430    0.0180
LC       0.0002    0.0402    0.0631
LD       0.0000    0.0401    0.0451
GOF: (i) -4.160   (ii) 7.070   (iii) 0.0314

Game 6   Level-1   Mixture   Actual
LA       0.5000    0.5000    0.4667
LB       0.9700    0.9319    0.9333
LC       0.9990    0.9590    1.0000
LD       1.0000    0.9599    1.0000
LE       1.0000    0.9599    1.0000
GOF: (i) -1.889   (ii) 1.962   (iii) 0.0347
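The RMSE entries in Figure 2 can be reproduced from the Mixture and Actual columns. The sketch below assumes that the RMSE is the root of the mean squared difference between predicted and observed frequencies, with equal weight on each listed choice within a game and role; that definition is an assumption here, checked only against the reported values, and the function name is hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and observed frequencies,
    giving equal weight to each listed choice."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# First movers, Game 1 (branches A, B): Mixture vs. Actual from Figure 2.
rmse_game1 = rmse([0.6104, 0.3896], [0.7568, 0.2432])

# First movers, Game 4 (branches A, B, C, D).
rmse_game4 = rmse(
    [0.8944, 0.0611, 0.0240, 0.0205],
    [0.7917, 0.1042, 0.0729, 0.0313],
)

print(round(rmse_game1, 4), round(rmse_game4, 4))
```

Both results agree with the reported RMSE values for these games (0.1464 and 0.0611) to within rounding.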
Figure 3. Level-n with other-regarding types: Predicted and actual choice frequencies and goodness-of-fit (GOF) measures. The three GOF measures reported are (i) the Log-likelihood less the entropy, (ii) PCS, and (iii) RMSE.

First Movers:

Game 1   Level-1   Level-2   Utilitarian   Spiteful FS   Mixture   Actual
A        0.0360    0.9620    0.0360        1.0000        0.6775    0.7568
B        0.9640    0.0380    0.9640        0.0000        0.3225    0.2432
GOF: (i) -1.6761   (ii) 3.188   (iii) 0.0792

Game 2   Level-1   Level-2   Utilitarian   Spiteful FS   Mixture   Actual
A        0.9868    0.0000    0.9455        1.0000        0.3821    0.4324
B        0.0132    1.0000    0.0545        0.0000        0.6179    0.5676
GOF: (i) -0.58628   (ii) 1.19   (iii) 0.0503

Game 3   Level-1   Level-2   Utilitarian   Spiteful FS   Mixture   Actual
A        0.0175    0.4043    0.5000        0.0010        0.2966    0.2973
B        0.9825    0.5957    0.5000        0.9990        0.7034    0.7027
GOF: (i) -0.0001   (ii) 0.0003   (iii) 0.0007

Game 4   Level-1   Level-2   Utilitarian   Spiteful FS   Mixture   Actual
A        0.9437    0.9968    0.2500        0.0000        0.8317    0.7917
B        0.0532    0.0032    0.2500        0.9998        0.1175    0.1042
C        0.0030    0.0000    0.2500        0.0002        0.0257    0.0729
D        0.0002    0.0000    0.2500        0.0000        0.0251    0.0313
GOF: (i) -3.0045   (ii) 8.804   (iii) 0.0318

Game 5   Level-1   Level-2   Utilitarian   Spiteful FS   Mixture   Actual
A        0.2000    0.8524    0.0004        0.0185        0.5614    0.5135
B        0.2000    0.1261    0.0027        0.1257        0.1342    0.1712
C        0.2000    0.0184    0.0184        0.8557        0.1284    0.1081
D        0.2000    0.0027    0.1254        0.0001        0.0600    0.0541
E        0.2000    0.0004    0.8531        0.0000        0.1160    0.1532
GOF: (i) -1.5752   (ii) 3.328   (iii) 0.0331

Game 6   Level-1   Level-2   Utilitarian   Spiteful FS   Mixture   Actual
A        0.8999    0.0010    0.2000        0.0000        0.2195    0.2667
B        0.0901    0.9889    0.2000        0.0000        0.6333    0.6000
C        0.0090    0.0100    0.2000        0.9990        0.1068    0.0667
D        0.0009    0.0001    0.2000        0.0010        0.0204    0.0667
E        0.0001    0.0000    0.2000        0.0000        0.0200    0.0000
GOF: (i) -1.0075   (ii) 2.2860   (iii) 0.0387

Second Movers:

Game 1   Level-1   Utilitarian   Spiteful FS   Mixture   Actual
LA       0.0360    0.0003        0.0003        0.0402    0.0631
GOF: (i) -0.6438   (ii) 1.5020   (iii) 0.0229

Game 2   Level-1   Utilitarian   Spiteful FS   Mixture   Actual
LA       0.0533    0.9467        0.9467        0.2037    0.2342
LB       1.0000    1.0000        0.5000        0.9500    0.9279
GOF: (i) -0.8089   (ii) 1.7690   (iii) 0.0266

Game 3   Level-1   Utilitarian   Spiteful FS   Mixture   Actual
LA       0.2404    0.9990        0.0000        0.2868    0.2342
LB       0.0099    0.9990        0.0000        0.0976    0.0721
GOF: (i) -1.2257   (ii) 2.3160   (iii) 0.0413

Game 4   Level-1   Utilitarian   Spiteful FS   Mixture   Actual
LA       0.9968    1.0000        0.0000        0.9079    0.9375
LB       1.0000    1.0000        0.9968        0.9891    0.9896
LC       1.0000    1.0000        1.0000        0.9894    0.9792
LD       1.0000    1.0000        1.0000        0.9894    1.0000
GOF: (i) -1.9583   (ii) 2.9920   (iii) 0.0165

Game 5   Level-1   Utilitarian   Spiteful FS   Mixture   Actual
LB       0.0211    0.5000        0.0000        0.0674    0.0360
LC       0.0005    0.5000        0.0000        0.0504    0.0180
LD       0.0000    0.5000        0.0000        0.0501    0.0631
LE       0.0000    0.5000        0.0000        0.0500    0.0451
GOF: (i) -2.8476   (ii) 4.6250   (iii) 0.0236

Game 6 (the rows as extracted carry one more value than the column headings; the first two values are shown as columns 1 and 2 under Level-1)
         Level-1             Utilitarian   Spiteful FS   Mixture   Actual
         (1)       (2)
LA       0.5000    0.5000    1.0000        0.0000        0.5000    0.4667
LB       0.9759    0.9901    1.0000        0.0001        0.9024    0.9333
LC       0.9994    0.9999    1.0000        0.9901        0.9885    1.0000
LD       1.0000    1.0000    1.0000        1.0000        0.9894    1.0000
LE       1.0000    1.0000    1.0000        1.0000        0.9894    1.0000
GOF: (i) -0.6172   (ii) 0.7256   (iii) 0.0220
EXPERIMENT INSTRUCTIONS

KEY: Explanatory text has been inserted into these Appendices and is italicized. All nonitalicized text is directly from the actual instructions which were read aloud, except for the text in brackets [].

Appendix A: Harvard Instructions for Team-Framing Treatment.
Welcome. [Introduce self and helpers. Say how everyone was recruited.] This is an experiment about economic decision making. If you follow the instructions carefully you might earn a considerable amount of money. This money will be paid at the end of the experiment in private and in cash. [Wave cash.] Any participant that showed up at least 5 minutes early will receive a $5 bonus. In addition, each participant will receive $5 as compensation for participating in this experiment, which will last about one hour. Additional bonus payments will depend on the decisions that you and other participants make in the experiment; how these additional payments will be determined will be explained in detail momentarily. It is important that during the experiment you remain SILENT. If you have any questions, or need assistance of any kind, RAISE YOUR HAND but DO NOT SPEAK. One of the experiment administrators will come to you and you may whisper your question to him. If you talk, laugh, or exclaim out loud, you will be asked to leave and will not be paid. We expect and appreciate your cooperation. You will be making choices using the computer mouse and keyboard. You may reposition the mouse pad so it is comfortable for you. Your mouse cursor should move when you slide the mouse on the pad. If not, please raise your hand. Do NOT click the mouse buttons or use the keypad until told to do so. [The screen will be blank until everybody clicks on the begin button - but not yet.] You will be making a number of decisions, but the outcomes of these decisions will not be revealed to anyone until the end of the experiment after all decisions have been made. Each decision will count in determining your total earnings. Please click on the BEGIN button on your screen. Once you have done this, the screen will display the first decision. Please listen to my instructions. We will now describe the first decision task. 
You will click on the box on the right side of Decision 1 and enter an integer from 1 to 10. At the end of the experiment, an integer from 1 to 10 will be randomly drawn, and if it is equal to your choice, you will win $20. Click on the box for Decision 1 now and make your decision. [Pause 30 seconds.] If you haven't made a decision yet, do it now and stop.
Following this decision, some of you will have earned $30 ($20 from Decision 1, $5 for participating, plus $5 for showing up early). Others will have earned $25, $10, or only $5. The remaining decisions in this experiment will add further earnings. In the end, there will be no way to infer from your total earnings what choice you made on any particular decision. Also, your consent form, which by now should be signed by you, made it clear that your receipts will not be kept by the researcher after the experiment, and that your name and identifying information will never be linked with any of the data. Any attempt to link decision to a person would be deemed a serious violation of the Human Subjects Protocol and Harvard Protection of Human Subjects. Therefore, rest assured that your choices are completely private and anonymous. Now please click on "WAIT FOR THE EXPERIMENTER'S SIGNAL". Your screen now says "to move onto the next decision click on the button below. Click on OK now. Please listen to my instructions. We will now describe the second decision task. The computer has randomly assigned each of you to six groups of six participants each (show slide of groups), as illustrated on the video monitors. One participant in each group has been randomly selected to be the Decision Maker. Your screen will indicate in the text for Decision 2 whether or not you are the Decision Maker in your group. If you are the Decision Maker, then you decide how much of $20 (0 - 20) to keep for yourself; the amount you do not keep will be divided equally among the other 5 members of your group. Let me repeat this ... [Pause 5 seconds] If you are not the Decision Maker, then another participant in your group is the decision maker. Nonetheless we would like to know what decision you would have made had you been selected as the decision maker. Everyone, please use the next minute to make your decision. [Pause 60 seconds.] If you haven't made your decision yet, do so now and stop. 
The groupings and designation of the Decision Makers for the previous decision task, will not apply to any of the remaining decision tasks. Each of the next five decisions you face will be described by a Tree that will be displayed on the video monitor. I will now display an example of a Tree in order to explain what it means. Please look at a video monitor while I describe the tree. CAUTION, this is an example and is not the Tree for any of your decision tasks. A Tree consists of nodes and branches. Some nodes and branches are Red and others are Blue. Participants in this room will be divided into a Red team and a Blue team with 18 participants on each team. At the Red nodes, a participant from the Red team gets to choose among the Red branches. At the Blue nodes, a participant from the Blue team gets to choose among the Blue branches [show Slide 1].
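Slide 1. [Example Tree. RED chooses A, B, or C; BLUE then chooses LA or RA after A, LB or RB after B, and LC or RC after C. Terminal payoffs (Red, Blue): LA (20, 35), RA (40, 15), LB (10, 20), RB (90, 75), LC (70, 50), RC (30, 50).]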
In the example Tree displayed, if you are on the Red team, you get to move first. If you are on the Blue team, you move second. The first mover in the above tree decides between A, B, or C. The second mover, a member of the blue team, must make three distinct choices: one choice between LA and RA that will be carried out if and only if the first mover chooses A; one choice between LB and RB that will be carried out if and only if the first mover chooses B; and one choice between LC and RC that will be carried out if and only if the first mover chooses C. Final payments will be determined as follows. The computer will match each Red team member with each Blue team member. The decisions of each matched pair will determine a unique path in the tree. For example, suppose the Red team member in the above tree chose A; and the Blue team member chose (LA, RB, RC), the unique path will consist of [show Slide 2] the A branch, followed by the LA branch, ending in the box with the numbers (20, 35). In this case, the Red team member will receive 20 points and the Blue team member will receive 35 points.

Slide 2. [The example Tree of Slide 1, repeated.]

On the other hand, if the Blue team member had chosen (RA, LB, RC) the unique path would have been [show Slide 3] branch A, followed by branch RA, ending in the box with numbers (40, 15). In this match, the Red team member would have received 40 points while the Blue team member received 15 points.

Slide 3. [The example Tree of Slide 1, repeated.]

From the point of view of a Blue team member, suppose they chose (LA, RB, RC). [Show Slide 4] Then when matched with a Red team member, if that Red team member chose A, the path would be (A, LA), ending in (20, 35) for (Red, Blue) respectively; if Red chose B, the path would be (B, RB), ending in (90, 75); and if Red chose C, the path would be (C, RC), ending in (30, 50). Notice that each of the three choices of the Blue team member is important, because A, B or C could be chosen by any of the Red team members.

Slide 4. [The example Tree of Slide 1, repeated.]
Since there are 18 members of each team, each Red team member will be matched 18 times (once for each of the 18 Blue team members). The points you received for each match will be summed and then divided by 18, so your total "payoff" will be the average points you receive. At the end of the experiment these payoff numbers will be converted to dollars in a lottery in which for each decision you can win $5 or nothing, and the probability of winning is equal to your payoff; in other words, a payoff of 75 will give you a 75% chance of winning $5. Thus, the higher your payoff for each decision, the greater your chance of winning $5 for that decision. The lotteries for each participant and each decision will be separate and independent. We will also report the sum of the payoffs for the Red team and Blue team. The five Trees, one for each decision task on your screen, will be displayed on the video monitors. We will also pass out hard copies of these Trees for your convenience. Your decisions must be entered as capitalized letters, so please press your Caps Lock key now. Press your Caps Lock key now. [Pause] You will have 7.5 minutes in which to make all five decisions. NO further instructions will be given. You may make these decisions in any order, and you may change any of these decisions by clicking on the corresponding box, deleting what's there using the backspace and/or delete keys, and retyping a new choice. The clock on your screen will display the time remaining. CAUTION: If you fail to make a choice for any of the decisions, $5 will be subtracted from your total earnings. [Begin passing out hard copies, with a front cover that says DO NOT TURN PAGE UNTIL TOLD TO DO SO. Pause until all hard copies are passed out.] You will now call up the screen with these decision tasks. Click on "WAIT FOR THE EXPERIMENTER'S SIGNAL". The clock will start as soon as click on the OK button. At that time, you may turn the page of the handout and begin making your decisions. 
Do not stop until you have made all five decisions. After making all five decisions, you may reconsider your decisions, but you may not stand up or speak. Click on OK now.
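The matching and payment rule described in these instructions can be illustrated with a minimal Python sketch. It uses only the four terminal payoffs quoted for the example Tree; the distribution of Red choices (six each of A, B and C) is purely hypothetical, and the function names are ours, not part of the experimental software.

```python
# Payoffs (Red, Blue) at the terminal nodes quoted for the example Tree.
# Only the four outcomes stated in the instructions are listed here.
EXAMPLE_PAYOFFS = {
    ("A", "LA"): (20, 35),
    ("A", "RA"): (40, 15),
    ("B", "RB"): (90, 75),
    ("C", "RC"): (30, 50),
}


def blue_average_payoff(blue_plan, red_choices, payoffs=EXAMPLE_PAYOFFS):
    """Average Blue points over one match with every Red team member."""
    total = 0
    for red in red_choices:
        # The Blue plan names one branch for each possible Red choice.
        total += payoffs[(red, blue_plan[red])][1]
    return total / len(red_choices)


def win_probability(average_points):
    """A payoff of 75 points corresponds to a 75% chance of winning."""
    return average_points / 100.0


# The Blue plan (LA, RB, RC) used in the instructions.
blue_plan = {"A": "LA", "B": "RB", "C": "RC"}

# Hypothetical Red choices: six each of A, B and C (18 matches in total).
red_choices = ["A"] * 6 + ["B"] * 6 + ["C"] * 6

average = blue_average_payoff(blue_plan, red_choices)
print(round(average, 2), round(win_probability(average), 4))
```

Because the payoff is an average over all 18 matches, every one of the Blue member's three conditional choices affects the result whenever at least one Red member selects the corresponding branch.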
Appendix B. Harvard Instructions for Non-Team Framing Treatments.
The welcoming remarks and instructions for the first and second decision tasks were identical to those given in Appendix A. The groupings and designation of the Decision Makers for the previous decision task, will not apply to any of the remaining decision tasks. For the One-Prize treatment, the following paragraph was inserted. "Only one of the next five decisions (numbered 3-7) will be used to determine additional monetary payments. I am going to ask one of you to pick a card from a stack of five cards numbered 2, 3,... 7 on the underside (show cards). We will put the card picked aside, and after everyone has made all five decisions, we will turn the card over and the revealed number will be the decision problem that is carried out to determine additional monetary payments." Each of the next five decisions you face will be described by a Tree that will be displayed on the video monitor. I will now display an example of a Tree in order to explain what it means. Please look at a video monitor while I describe the tree. CAUTION, this is an example and is not the Tree for any of your decision tasks. A Tree consists of nodes and branches. The first (top) node and its branches are Red, while the second (lower) nodes and branches are Blue. Half of the participants in this room (18) will choose among the Red branches and we will call them "First Movers", and the other half of the participants will choose among the Blue branches and we will call them "Second Movers". Your role as First or Second Mover will not be the same for all decision problems. [Show Slide 1.] In the example Tree displayed, if you are the First Mover, you choose between A, B, and C. 
If you are the Second Mover, you make three distinct choices: one choice between LA and RA that will be carried out if and only if the First Mover chooses A; one choice between LB and RB that will be carried out if and only if the First Mover chooses B; and one choice between LC and RC that will be carried out if and only if the First Mover chooses C. Final payments will be determined as follows. The computer will match each First Mover with each Second Mover. The decisions of each matched pair will determine a unique path in the tree. For pairwise matching treatments, the above paragraph read as follows. "Final payments will be determined as follows. The computer will randomly divide everyone into pairs: one member of each pair will be the First Mover and the other will be the Second Mover. The decisions of each matched pair will determine a unique path in the tree." For example, in the above tree suppose the First Mover chose A, and the Second Mover chose (LA, RB, RC); the unique path will consist of
[show Slide 2] the A branch, followed by the LA branch, ending in the box with the numbers (20, 35). In this case, the First Mover will receive 20 points and the Second Mover will receive 35 points. Note that the payoffs for the First Mover will always be shown first and in red, while the payoffs for the Second Mover will always be shown second and in blue. On the other hand, if the Second Mover had chosen (RA, LB, RC) the unique path would have been [show Slide 3] branch A, followed by branch RA, ending in the box with numbers (40, 15). In this match, the First Mover would have received 40 points while the Second Mover received 15 points. From the point of view of the Second Mover, suppose he/she chose (LA, RB, RC). [Show Slide 4.] Then when matched with a First Mover, if that participant chose A, the path would be (A, LA), ending in (20, 35) for (First/Second Mover) respectively; if the First Mover chose B, the path would be (B, RB), ending in (90, 75); and if the First Mover chose C, the path would be (C, RC), ending in (30, 50). Notice that each of the three choices of the Second Mover is important, because A, B or C could be chosen by any of the 18 First Movers. Since there will be 18 participants in each role, each First Mover will be matched 18 times (once for each of the 18 Second Movers). The points you received for each match will be summed and then divided by 18, so your total "payoff" will be the average points you receive. For pairwise matching treatments, the above paragraph read as follows. "For each subsequent decision task, the computer will reshuffle the pairing, so everyone will be paired with five different people. Recall that your role as First or Second Mover will not be the same for all decision tasks." 
At the end of the experiment these payoff numbers will be converted to dollars in a lottery in which for each decision you can win $5 or nothing, and the probability of winning is equal to your payoff; in other words, a payoff of 75 will give you a 75% chance of winning $5. Thus, the higher your payoff for each decision, the greater your chance of winning $5 for that decision. The lotteries for each participant and each decision will be separate and independent. For the One-Prize treatment, the above paragraph read as follows. "At the end of the experiment, we will turn the card over, and your payoff for that decision will be converted to dollars in a lottery in which you can win $25 or nothing, and the probability of winning is equal to your payoff; in other words, a payoff of 75 points will give you a 75% chance of winning $25. Thus, the higher your payoff for each decision, the greater your chance of winning $25 for that decision. The lotteries for each participant will be separate and independent." The five Trees, one for each decision task on your screen, will be displayed on the video monitors. We will also pass out hard copies of these Trees for your convenience. For each of the five decisions, your computer screen (and only your computer screen) will tell you whether you are a First Mover or a Second Mover. You will not be in the same role for all decisions.
Your decisions must be entered as capitalized letters, so please press your Caps Lock key now. Press your Caps Lock key now. [Pause] You will have 7.5 minutes in which to make all five decisions. NO further instructions will be given. You may make these decisions in any order, and you may change any of these decisions by clicking on the corresponding box, deleting what's there using the backspace and/or delete keys, and retyping a new choice. The clock on your screen will display the time remaining. CAUTION: If you fail to make a choice for any of the decisions, $5 will be subtracted from your total earnings. [Begin passing out hard copies, with a front cover that says DO NOT TURN PAGE UNTIL TOLD TO DO SO. Pause until all hard copies are passed out.] You will now call up the screen with these decision tasks. Click on "WAIT FOR THE EXPERIMENTER'S SIGNAL". The clock will start as soon as you click on the OK button. At that time, you may turn the page of the handout and begin making your decisions. Do not stop until you have made all five decisions. After making all five decisions, you may reconsider your decisions, but you may not stand up or speak. Click on OK now.
Appendix C. UT Instructions
Welcome. [Introduce self and helpers. Say how everyone was recruited.] This is an experiment about economic decision making. If you follow the instructions carefully you might earn a considerable amount of money. This money will be paid at the end of the experiment in private and in cash. [Wave cash.] Every participant will receive $5 as compensation for participating in this experiment, which will last about one hour. Additional bonus payments will depend on the decisions that you and other participants make in the experiment; how these additional payments will be determined will be explained in detail momentarily. It is important that during the experiment you remain SILENT. If you have any questions, or need assistance of any kind, RAISE YOUR HAND but DO NOT SPEAK. One of the experiment administrators will come to you and you may whisper your question to him. If you talk, laugh, or exclaim out loud, you will be asked to leave and will not be paid. We expect and appreciate your cooperation. You will be making choices using the computer mouse and keyboard. You may reposition the mouse pad so it is comfortable for you. Your mouse cursor should move when you slide the mouse on the pad. If not, please raise your hand. Do NOT click the mouse buttons or use the keypad until told to do so. The screen will be blank until everybody clicks on the begin button - but not yet. You will be making a number of decisions, but the outcomes of these decisions will not be revealed to anyone until the end of the experiment after all decisions have been made. Each decision will count in determining your total earnings. *** The text between here and the next “***” was not included for sessions 7 and 8, since those sessions did not include a prior Dictator game. We will now describe the first decision task. The computer has randomly assigned each of you to 8 groups of three participants each (show slide of groups), as illustrated on the overhead. 
One participant in each group has been randomly selected to be the Decision Maker for this decision task. Each of you has an equal chance of being the Decision Maker. Your screen (when it appears) will indicate whether or not you are the Decision Maker in your group. No one will know which group they belong to, or who is in their group, and the identity of the Decision Makers will never be made public. You will be identified only by a randomly assigned Participant number. Your money earnings will be given to you in an envelope identified only by your Participant number. You will not be asked to sign a receipt or in any way associate your name, Social Security number or any other personal information with your choices or money earnings. Thus, absolute anonymity and privacy will be ensured. Click on page down.
If you are the Decision Maker, then you decide how much of $10 (0 - 10) to keep for yourself; the amount you do not keep will be divided equally among the other 2 members of your group. If you are not the Decision Maker, then another participant in your group is the Decision Maker. Nonetheless we would like to know what decision you would have made had you been selected as the Decision Maker. Everyone, please use the next minute to make your decision. [Pause 60 seconds.] If you haven't made your decision yet, do so now and stop. The groupings and designation of the Decision Makers for the previous decision task will not apply to any of the remaining decision tasks. *** Click on page down. Each of the next five decisions you face will be described by a Tree that will be displayed on the overhead. I will now display an example of a Tree in order to explain what it means. Please look at the overhead while I describe the tree. CAUTION, this is an example and is not the Tree for any of your decision tasks. The remaining instructions were identical to the corresponding parts of Appendix B.
Appendix I. Alternative Specification of Beliefs.
In the model of Section 5, we assumed that individualistic types believed all others are individualistic, and that OR types believed all others are OR. We present here an encompassing model to test these assumptions. To allow for the possibility that individualistic Level-2 first movers believe second movers are OR types, we let

$$z_{1gm}(\mu) = \sum_j Q_{gm}(j, \mu)\, y_{1gm}(j)$$

denote the expected utility of the first mover from choosing branch m in game g given the belief that the second mover is an OR type with precision µ. Then, the probability that such a first mover chooses branch m is given by

$$P_g^3(m, \nu, \mu) \equiv \frac{\exp\big(\nu\, z_{1gm}(\mu)\big)}{\sum_k \exp\big(\nu\, z_{1gk}(\mu)\big)}.$$

Let γ denote the proportion of these subtypes. To allow for OR types to believe second movers are individualistic, we need to specify two additional subtypes. First, let $Q_g^0(m, \nu)$ denote the OR first-mover logistic choice function with precision ν for game g and node m given Level-0 beliefs, and let $\beta_0$ denote the proportion of OR types that have these beliefs. Second, let $Q_g^1(m, \nu, \mu)$ denote the OR first-mover logistic choice function with precision ν given the belief that the second mover has individualistic preferences and precision µ, and let $\beta_1$ denote the proportion of OR types that have these beliefs. Then, the combined probabilistic choice function for a first mover is given by

$$P_g^{FM}(m) \equiv \alpha_0 / B_g + \alpha_1 P_g^1(m, \nu) + \alpha_2 \big[ (1-\gamma)\, P_g^2(m, \nu, \mu) + \gamma\, P_g^3(m, \nu, \mu) \big] + \alpha_3 \big[ \beta_0\, Q_g^0(m, \nu) + \beta_1\, Q_g^1(m, \nu, \mu) + (1 - \beta_0 - \beta_1)\, Q_g^2(m, \nu, \mu) \big].$$
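As a rough illustration of the logistic choice rule for the Level-2 subtype that believes second movers are OR types, the following Python sketch computes the expected utility of each branch and the implied choice probabilities. The belief probabilities, the payoffs and the precision value are hypothetical numbers chosen for illustration; they are not taken from the experimental games or the estimates.

```python
import math


def expected_utility(q, y1):
    """z = sum_j q[j] * y1[j]: first-mover expected payoff for one branch,
    where q[j] is the believed probability that the second mover picks j."""
    return sum(qj * yj for qj, yj in zip(q, y1))


def logit_choice(z, nu):
    """P(m) = exp(nu * z_m) / sum_k exp(nu * z_k).

    Subtracting max(z) leaves the ratio unchanged and avoids overflow."""
    zmax = max(z)
    weights = [math.exp(nu * (zm - zmax)) for zm in z]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical beliefs about the second mover's (left, right) choice
# at each node, and hypothetical first-mover payoffs at those outcomes.
beliefs = {"A": [0.7, 0.3], "B": [0.2, 0.8], "C": [0.5, 0.5]}
payoffs = {"A": [20, 40], "B": [10, 90], "C": [60, 30]}
nu = 0.23  # illustrative precision value

branches = sorted(beliefs)
z = [expected_utility(beliefs[m], payoffs[m]) for m in branches]
probs = dict(zip(branches, logit_choice(z, nu)))
print(probs)
```

With precision 0 the rule reduces to uniform randomization over branches, and as the precision grows the probability concentrates on the branch with the highest expected utility.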
The maximized LL of this encompassing model is -782.22, an increase of only 0.77, which is not significant (p-value = 0.674 with 3 d.f.). Therefore, we cannot reject the projection hypothesis that β0 = 0, β1 = 0, and γ = 0, as asserted in Section 5.
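The likelihood-ratio comparison above can be checked with a short Python sketch. It assumes the standard test statistic of twice the log-likelihood gain, compared against a chi-square distribution with 3 degrees of freedom (one per restriction); the closed-form survival function used here is specific to 3 d.f., and the function name is ours.

```python
import math


def chi2_sf_3df(x):
    """Survival function P(X > x) for a chi-square variable with 3 d.f.

    Closed form: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


ll_gain = 0.77          # reported increase in the maximized LL
lr_stat = 2 * ll_gain   # likelihood-ratio statistic (assumed form)
p_value = chi2_sf_3df(lr_stat)
print(round(lr_stat, 2), round(p_value, 3))
```

The resulting p-value is roughly 0.67, in line with the reported value given that the LL gain is rounded to two decimals.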
Appendix II. Alternative Specifications of Other-Regarding Preferences
Many different specifications of other-regarding preferences were tested. The base model for comparison is the model of Section 5 with the addition of one parameter to allow for a different proportion of utilitarian and alternative other-regarding types. Specifically, let $\alpha_3$ denote the proportion of utilitarian types, and let $\alpha_4$ denote the proportion of Fehr-Schmidt types with $\theta_1 = 0.5$ and $\theta_2 = -1$. The MLE parameter estimates of this model and the maximized LL are given in the second column of the table below. Compared with the model of Section 5, the increase in the LL is only 0.10, which, with one degree of freedom, has a p-value of 0.752. While this is clearly statistically insignificant, we added the extra parameter (i.e., relaxed the restriction that $\alpha_3 = \alpha_4$) in order to allow greater flexibility for the alternative specifications to be tested.

In the model of Section 5, the implicit reference point for other-regarding preferences is (0,0). A reasonable conjecture is that the appropriate reference point for Fehr-Schmidt preferences varies with the maximin payoff ($m_i$) of each player. To test this, let the reference point for player i in game g be given by $r_{ig} = \beta m_{ig}$, and in the specification of Fehr-Schmidt preferences, eq. (8), let $(y_i - r_i)$ replace $y_i$. If the maximin payoffs have explanatory power, then $\beta$ will be significantly different from 0.
Parameter   Base      Maximin   Average   Node MM   Node Avg
ν           0.230     0.240     0.240     0.239     0.241
µ           0.380     0.380     0.380     0.377     0.376
α0          0.022     0.022     0.022     0.022     0.022
α1          0.222     0.221     0.221     0.219     0.221
α2          0.598     0.590     0.590     0.591     0.590
α3          0.074     0.075     0.075     0.076     0.075
α4          0.084     0.091     0.091     0.092     0.092
β           n/a       0.082     0.185     0.072     0.181
LL          -782.89   -782.06   -782.05   -782.27   -782.00
The MLE parameter estimates of this model and the maximized LL are given in the third column of the table above. The LL increase is only 0.83, which, with one degree of freedom, has a p-value of 0.362. Therefore, we reject this alternative.

Instead of the maximin payoffs, suppose the reference point is affected by the average payoff ($a_{ig}$) in the game. To test this hypothesis, let the reference point be given by $r_{ig} = \beta a_{ig}$. The MLE parameter estimates of this model and the maximized LL are given in the fourth column of the above table. The result is virtually the same as for the maximin version. Therefore, we reject this hypothesis as well.

The above two alternatives do not take account of differences among nodes. One way to remedy this failure is to compute the reference point by node (footnote 21), while maintaining the assumption that the "node-conditioned" reference point is common knowledge. The MLE parameter estimates for the node-conditioned maximin model and the node-conditioned average model are given in columns 5 and 6 of the above table. The largest increase in LL is only 0.89, which, with one degree of freedom, has a p-value of 0.345. Therefore, we reject these node-conditioned alternatives as well.

Another method of letting OR preferences depend on history is to assume preferences are linear:

$$u_1(y_1, y_2) = y_1 + \theta_{1n}(y_2 - y_1), \qquad u_2(y_1, y_2) = y_2 + \theta_{2n}(y_1 - y_2),$$

where $\theta_{in}$ depends on the node. Let $y_{in}$ denote the average payoff to player i at the terminal nodes that follow node n, weighting each terminal node equally. Then, define

$$z_{1n} = (y_{1n} - y_{2n}) - \beta(a_{1g} - a_{2g}), \qquad z_{2n} = (y_{2n} - y_{1n}) - \beta(a_{2g} - a_{1g}).$$

In words, $z_{in}$ is the advantage of player i at node n, adjusted for his advantage in the whole game. We then specify

$$\theta_{in} = \frac{\lambda z_{in}}{1 + \lambda z_{in}} \ \text{ if } z_{in} \ge 0, \qquad \theta_{in} = \gamma \lambda z_{in} \ \text{ otherwise},$$

where $\lambda$ and $\gamma$ are non-negative parameters. Thus, when player i has a positive advantage relative to the game, he becomes more generous ($0 < \theta_{in} < 1$), and when player i has a disadvantage relative to the game, he becomes more spiteful ($\theta_{in} < 0$). Despite the three extra parameters ($\beta$, $\lambda$ and $\gamma$), the maximized LL is -782.92, which is worse than our base model. Therefore, we reject this method of node-conditioning as well.
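To make the node-conditioned linear specification concrete, here is a minimal Python sketch of the advantage term, the generosity/spite weight and the resulting utility. The parameter values (β, λ, γ) and the payoff numbers are hypothetical, since no estimates are reported for this model, and the function names are ours.

```python
def advantage(y_own_n, y_other_n, a_own_g, a_other_g, beta):
    """z_in: player i's advantage at node n, adjusted for the advantage
    in the whole game (average payoffs a_ig), scaled by beta."""
    return (y_own_n - y_other_n) - beta * (a_own_g - a_other_g)


def theta(z, lam, gamma):
    """Weight on the payoff difference: in [0, 1) when z >= 0 (generous),
    negative when z < 0 (spiteful). lam and gamma must be non-negative."""
    if z >= 0:
        return lam * z / (1 + lam * z)
    return gamma * lam * z


def linear_utility(y_own, y_other, th):
    """u_i = y_i + theta_in * (y_j - y_i)."""
    return y_own + th * (y_other - y_own)


# Hypothetical parameter values, for illustration only.
beta, lam, gamma = 0.2, 0.05, 0.5

# Hypothetical node averages (player 1: 60, player 2: 40) and
# game-wide averages (player 1: 50, player 2: 45).
z1 = advantage(60, 40, 50, 45, beta)   # player 1 is ahead at this node
z2 = advantage(40, 60, 45, 50, beta)   # player 2 is behind at this node
th1, th2 = theta(z1, lam, gamma), theta(z2, lam, gamma)
print(z1, z2, th1, th2)
print(linear_utility(90, 75, th1), linear_utility(75, 90, th2))
```

With these definitions the two advantage terms are mirror images of each other, so at any node where one player's weight is positive the other player's weight is negative.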
21. I.e., at node n, the maximin level is computed over the set of nodes excluding node n.