Public Choice 76: 347-356, 1993. © 1993 Kluwer Academic Publishers. Printed in the Netherlands.
Fighting pollution when decisions are strategic
MANFRED J. HOLLER*
Institute of Economics, University of Hamburg, Von-Melle-Park 5, D-2000 Hamburg 13
Accepted 5 December 1991
Abstract. In this paper we analyse anti-pollution policies in a 2-by-2 game played between a "polluter" and the "police" in which the payoffs can be manipulated by an exogenous third player called the "policy-maker." We show that the efficiency of the policies may depend on whether the players of the 2-by-2 game choose Nash equilibrium strategies or prefer maximin.
1. The one-shot setting

Unfortunately, we cannot always hope to live with potential polluters in a world with an infinite time horizon so that the folk theorem¹ gives us a chance to implement an efficient solution in which polluters refrain from doing their evil job. Nor can we hope that information is incomplete and beliefs are strong enough² so that a polluter finds himself in a reputation game of the Kreps-Wilson type and mimics a non-polluting agent for a substantial time span. Most tankers sink only once, and once the forest is put to ashes, it will take more than a man's life to have this particular chance again. Pollution is often a hit-and-run act. Polluters are the least inclined to stay in a polluted area, especially since they are the first to be informed about the pollution's quality and dimension. Fortunately, polluters cannot always run away, and this, at least in some cases, constrains their activities. I do not deny that intertemporal solutions are relevant for the pollution problem, but I claim that the one-shot setting has its place, too, when it comes to pollution as a general phenomenon, and that one-shot games are relevant to the analysis of the phenomenon when decisions are strategic.
In the following, I will discuss a simple (one-shot) 2-by-2 game played between a polluter and an anti-pollution institution called the "police," in which the payoffs of the two players can be manipulated by an exogenous third player called the policy-maker, who is assumed to be interested in reducing pollution. We will see that there may be some surprises in this game, for naive players as well as blue-eyed policy-makers. However, before we come to the surprises, we have to hide the rabbit in the black silk hat.

* This paper was written while the author was a member of the Institute of Economics and Statistics at the University of Aarhus. I would like to thank Friedel Bolle and the anonymous reviewers of this journal for helpful comments on an earlier version of this paper.
2. The polluter-police game

Probably the simplest way to describe the problem of fighting pollution is to assume that there is a potential polluter, A, on the way to bury the rest of his rubber boat in the park, and a policeman, B, who is in charge of preventing the park from suffering this evil. A has the choice between "pollute" (strategy $s_{11}$) and "not pollute" (strategy $s_{12}$), and B has the choice between "enforce" (strategy $s_{21}$) and "not enforce" (strategy $s_{22}$) the law, i.e., go to the park and look for A, or not go to the park (or "patrol" and "not patrol"). If we combine pairs of these strategies and assign utilities for A and B to the events, we get the matrix game given in Figure 1. More specifically, if A chooses $s_{11}$ and B chooses $s_{21}$, then the outcome in the event set is "pollution and enforcement of punishment," which is evaluated $a$ and $\alpha$ by A and B, respectively. In the following, $a$ and $\alpha$ as well as the other entries of Figure 1 are assumed to be von Neumann/Morgenstern utilities, also called payoffs in standard game theory. It seems plausible to assume that $c > a$, $b > d$, $\alpha > \beta$, and $\delta > \gamma$. The difference $\alpha - \beta$ can be interpreted as a catch premium.

Imagine we are the third agent, the "policy-maker," who employs the police and whose major interest is to save the park from pollution. Then we might want to have an answer to the following two questions: (1) What is the outcome of the game in Figure 1, and (2) how can we increase the chance that the park will be spared from pollution? Obviously, the second question is linked to the first, and both boil down to the discussion of the solution to the game which reflects the strategic uncertainty contained in the decision situation. This is what follows.
$$
\begin{array}{l|cc}
 & \text{B: Enforce } (s_{21}) & \text{B: Not enforce } (s_{22}) \\ \hline
\text{A: Pollute } (s_{11}) & (a, \alpha) & (b, \beta) \\
\text{A: Not pollute } (s_{12}) & (c, \gamma) & (d, \delta)
\end{array}
$$

Figure 1. Generalized 2-by-2 game

3. The Nash solution and its implications

The Nash equilibrium is a first candidate to serve as a solution to the game in Figure 1. However, this game has no (Nash) equilibrium in pure strategies. For each strategy pair $(s_{1j}, s_{2k})$, where $j, k = 1, 2$, either player A or player B has the possibility to improve his payoff value by choosing an alternative strategy, given the strategy of the other player. Take, e.g., the strategy pair $(s_{11}, s_{22})$: player B can switch from strategy $s_{22}$ to strategy $s_{21}$ and, given $s_{11}$, thereby increase his payoff from $\beta$ to $\alpha$. But, since the game is finite, it must have a mixed-strategy equilibrium (see Nash, 1951). That is, we look for a pair of
probabilities $(p^*, q^*)$, where $p^*$ is the probability of "pollute" (and $1 - p^*$ of "not pollute") while $q^*$ is the probability of "enforce" (and $1 - q^*$ of "not enforce"), such that neither A nor B can gain a higher (expected) payoff by choosing an alternative strategy given the mixed strategy of the other player. This can be achieved (a) by choosing $p^*$ such that B is indifferent between playing $s_{21}$ and $s_{22}$, or any mixture of these strategies with $1 \geq q \geq 0$, and (b) by choosing $q^*$ such that A is indifferent between playing $s_{11}$ and $s_{12}$, or any mixture of the two strategies with $1 \geq p \geq 0$. Solving

$$p\alpha + (1-p)\gamma = p\beta + (1-p)\delta \tag{1a}$$

and

$$qa + (1-q)b = qc + (1-q)d \tag{1b}$$

we get

$$p^* = \frac{\delta - \gamma}{\alpha - \beta - \gamma + \delta} \tag{2a}$$

$$q^* = \frac{d - b}{a - b - c + d} \tag{2b}$$
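The indifference conditions (1a) and (1b) and the solutions (2a) and (2b) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the payoff numbers are illustrative assumptions chosen only to satisfy $c > a$, $b > d$, $\alpha > \beta$ and $\delta > \gamma$, and the helper name `pure_equilibria` is hypothetical.

```python
from fractions import Fraction as F

# Illustrative payoffs (assumed values, not taken from the paper).
# They satisfy c > a, b > d, alpha > beta and delta > gamma.
a, b, c, d = F(-3), F(3), F(1), F(0)                 # polluter A
alpha, beta, gamma, delta = F(3), F(0), F(1), F(2)   # police B

# Payoff tables indexed by [row of A][column of B]:
# row 0 = pollute (s11), row 1 = not pollute (s12),
# column 0 = enforce (s21), column 1 = not enforce (s22).
pay_a = [[a, b], [c, d]]
pay_b = [[alpha, beta], [gamma, delta]]


def pure_equilibria(pay_a, pay_b):
    """List the cells from which neither player gains by deviating alone."""
    cells = []
    for i in (0, 1):
        for j in (0, 1):
            a_stays = pay_a[i][j] >= pay_a[1 - i][j]
            b_stays = pay_b[i][j] >= pay_b[i][1 - j]
            if a_stays and b_stays:
                cells.append((i, j))
    return cells


# Mixed equilibrium probabilities from (2a) and (2b).
p_star = (delta - gamma) / (alpha - beta - gamma + delta)
q_star = (d - b) / (a - b - c + d)

# Both sides of (1a): B's expected payoff from enforcing / not enforcing.
b_enforce = p_star * alpha + (1 - p_star) * gamma
b_not_enforce = p_star * beta + (1 - p_star) * delta
# Both sides of (1b): A's expected payoff from polluting / not polluting.
a_pollute = q_star * a + (1 - q_star) * b
a_not_pollute = q_star * c + (1 - q_star) * d

print("pure equilibria:", pure_equilibria(pay_a, pay_b))
print("p* =", p_star, " q* =", q_star)
print("B indifferent:", b_enforce == b_not_enforce)
print("A indifferent:", a_pollute == a_not_pollute)
```

Exact fractions are used instead of floats so that the two sides of each indifference condition can be compared with `==`. For the assumed numbers the list of pure equilibria is empty, in line with the argument above.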
The probabilities $p^*$ and $q^*$ represent the Nash (equilibrium) strategies of players A and B. Being the policy-maker, we may expect that A and B randomize their strategies in accordance with (2a) and (2b). Consequently, there is a probability $p = p^*$ that the park will get polluted. The policy-maker might not be happy with this outlook and either try to decrease $p^*$ or even try to achieve $p^* = 0$. Obviously, the second goal can be reached by making $s_{11}$ a strictly dominated strategy, i.e., by decreasing the payoff $b$ such that $a < c$ and $b < d$ hold. Note that $b$ is the payoff of the polluter if he pollutes and the policeman is not around to enforce the law. This, of course, raises questions of enforceability, even more so since, once $s_{12}$ is dominant, the policeman would always choose $s_{22}$, i.e., stay away instead of enforcing the law if it is violated (since there will be no violation). But how can a decrease of $b$ be implemented? To be sure that "pollute" will not occur, there is no other way than to order the policeman, B, to guard the park irrespective of what the polluter, A, is expected to do. This is the iron law of police states and has been followed throughout history.

But a police state is costly in many ways, and we may ask whether the probability of pollution can be decreased in a less drastic way for the described game situation. How about giving a premium to A if he brings his rubber boat to a garbage dump instead of dumping it in the park? This is successful if, again, $s_{12}$ becomes a dominating strategy, but this might also be very costly. If, however, $s_{12}$ is not dominant, and $c > a$ and $b > d$ apply, then an increase of $c$ or $d$, or both, which honors "not pollute," has no influence on the equilibrium strategy of A, as is readily seen from (2a). The mixed equilibrium strategy of player A does not depend on his own payoffs, but on the payoffs of the other player. This is a puzzling result, and there is a list of policy measures which seem to ignore this fact (see Tsebelis, 1989, 1991).

Condition (2a) tells us that the probability of pollution, expressed by $p^*$ in the mixed Nash equilibrium, will decrease if the policy-maker increases the payoff $\alpha$, which results for B when A pollutes and B enforces the law. This is what we expect, but we would also expect a decrease of $p^*$ if the payoff $a$, which results for A for the identical strategy combination, is reduced. This, however, is not the case for the Nash equilibrium.
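The comparative statics of (2a) and (2b) described above can be illustrated with a short sketch. The baseline payoffs and the sizes of the policy changes are assumptions made for illustration only, and the function names `nash_mixed` and `not_pollute_dominates` are hypothetical.

```python
from fractions import Fraction as F


def nash_mixed(a, b, c, d, alpha, beta, gamma, delta):
    """Return (p*, q*) as given by (2a) and (2b)."""
    p_star = (delta - gamma) / (alpha - beta - gamma + delta)
    q_star = (d - b) / (a - b - c + d)
    return p_star, q_star


def not_pollute_dominates(a, b, c, d):
    """True if s12 strictly dominates s11 for A, i.e. a < c and b < d."""
    return a < c and b < d


# Illustrative baseline (assumed values, not from the paper).
base = dict(a=F(-3), b=F(3), c=F(1), d=F(0),
            alpha=F(3), beta=F(0), gamma=F(1), delta=F(2))

p0, q0 = nash_mixed(**base)
# Premium for "not pollute": c rises from 1 to 2.
p_premium, q_premium = nash_mixed(**{**base, "c": F(2)})
# Harsher punishment of a caught polluter: a falls from -3 to -5.
p_fine, q_fine = nash_mixed(**{**base, "a": F(-5)})
# Higher payoff for the police when catching a polluter: alpha rises to 5.
p_catch, q_catch = nash_mixed(**{**base, "alpha": F(5)})
# A drastic cut of b below d makes "pollute" strictly dominated.
dominated = not_pollute_dominates(base["a"], F(-1), base["c"], base["d"])

print("baseline      p* =", p0, " q* =", q0)
print("premium on c  p* =", p_premium, " q* =", q_premium)
print("lower a       p* =", p_fine, " q* =", q_fine)
print("higher alpha  p* =", p_catch, " q* =", q_catch)
print("b < d makes s11 dominated:", dominated)
```

In this sketch, changes to A's own payoffs ($c$ or $a$) leave $p^*$ untouched and only shift the enforcement probability $q^*$, whereas the change to B's payoff $\alpha$ lowers $p^*$.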
4. The maximin solution
The mixed strategy equilibrium is weak, i.e., if a player chooses a strategy different from his equilibrium strategy, he will not be worse off, given that the other player chooses the equilibrium strategy. This result is obvious from (1a) and (1b) and raises the question why a player should select an equilibrium strategy if the equilibrium is mixed, even more so since the maximin solution gives identical payoffs if it is mixed too. This result becomes obvious after the mixed maximin strategies are derived. However, to be sure that both the Nash equilibrium and the maximin solution are mixed, we have to specify the constraints on the payoffs as follows:³

(A.I)
(i) $b > a$, $b > d$, $c > a$, $c > d$
(ii) $\alpha > \beta$, $\alpha > \gamma$, $\delta > \beta$, $\delta > \gamma$

We see that (A.I) contains the conditions for a mixed strategy equilibrium as given in Figure 1. The other conditions will become obvious after we derive the mixed maximin solution.

Playing maximin implies that player $i$ makes himself independent of the strategy choices of the other player; then $i$ can guarantee himself a certain payoff, which is his security level. This implies that A randomizes on $s_{11}$ and $s_{12}$ such that the probability $p^+$ of choosing $s_{11}$ satisfies

$$pa + (1-p)c = pb + (1-p)d \tag{3a}$$

Correspondingly, B randomizes on $s_{21}$ and $s_{22}$ such that the probability $q^+$ of choosing $s_{21}$ satisfies

$$q\alpha + (1-q)\beta = q\gamma + (1-q)\delta \tag{3b}$$

It follows that

$$p^+ = \frac{d - c}{a - b - c + d} \tag{4a}$$

$$q^+ = \frac{\delta - \beta}{\alpha - \beta - \gamma + \delta} \tag{4b}$$

$p^+$ and $q^+$ are the maximin strategies of A and B. In the maximin solution, $p^+$ and $q^+$ are chosen simultaneously. If we plug (4a) into (3a), and (4b) into (3b), we get the maximin payoff values

$$U_1(p^+) = \frac{ad - cb}{a - b - c + d} \tag{5a}$$

$$U_2(q^+) = \frac{\alpha\delta - \beta\gamma}{\alpha - \beta - \gamma + \delta} \tag{5b}$$

$U_i(\cdot)$ is player $i$'s payoff function ($i = 1, 2$ for A and B, respectively); we assume that it is of the von Neumann-Morgenstern type. If we substitute $q$ in the right-hand side of (1b) by $q^*$ of (2b), we get

$$U_1(q^*) = \frac{ad - cb}{a - b - c + d} = U_1(p^+) \tag{6a}$$

Correspondingly, from (1a) and (2a) we derive

$$U_2(p^*) = \frac{\alpha\delta - \beta\gamma}{\alpha - \beta - \gamma + \delta} = U_2(q^+) \tag{6b}$$
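The maximin strategies (4a) and (4b), the security levels (5a) and (5b), and the payoff equalities (6a) and (6b) can be verified with a small numerical sketch. The payoffs below are assumed values that satisfy (A.I); the function names `u1` and `u2` and the grid of test probabilities are illustrative choices, not part of the paper.

```python
from fractions import Fraction as F

# Illustrative payoffs satisfying (A.I) (assumed values, not from the paper).
a, b, c, d = F(-3), F(3), F(1), F(0)
alpha, beta, gamma, delta = F(3), F(0), F(1), F(2)


def u1(p, q):
    """Expected payoff of A if A pollutes with prob. p and B enforces with prob. q."""
    return p * q * a + p * (1 - q) * b + (1 - p) * q * c + (1 - p) * (1 - q) * d


def u2(p, q):
    """Expected payoff of B for the same pair of mixed strategies."""
    return (p * q * alpha + p * (1 - q) * beta
            + (1 - p) * q * gamma + (1 - p) * (1 - q) * delta)


den_a = a - b - c + d
den_b = alpha - beta - gamma + delta
p_plus = (d - c) / den_a            # (4a)
q_plus = (delta - beta) / den_b     # (4b)
p_star = (delta - gamma) / den_b    # (2a)
q_star = (d - b) / den_a            # (2b)

level_a = (a * d - c * b) / den_a                    # (5a)
level_b = (alpha * delta - beta * gamma) / den_b     # (5b)

grid = [F(k, 4) for k in range(5)]  # probabilities 0, 1/4, ..., 1
# Maximin: the own payoff does not depend on what the opponent does.
maximin_a = {u1(p_plus, q) for q in grid}
maximin_b = {u2(p, q_plus) for p in grid}
# Nash: given the opponent's equilibrium strategy, the own payoff does not
# depend on the own strategy (the equilibrium is weak).
nash_a = {u1(p, q_star) for p in grid}
nash_b = {u2(p_star, q) for q in grid}

print("p+ =", p_plus, " q+ =", q_plus)
print("A:", maximin_a, nash_a, level_a)
print("B:", maximin_b, nash_b, level_b)
```

Each of the four sets collapses to a single value that coincides with the respective security level, which is what (6a) and (6b) state.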
These results are contained in Holler (1990) and Moulin (1982: Ch. IV). It is obvious that A is more likely to pollute in accordance with the maximin solution if his payoff $a$ is increased, given that $b$, $c$ and $d$ remain unchanged. This result is plausible inasmuch as now the choice of A is determined by his own payoffs (only) and the increase of $p$ follows the expected pattern. But if $\alpha$ is increased by a positive premium for enforcing the law against a polluter, then the probability of choosing $s_{21}$ (i.e., $q^+$) will decrease. Would a policy-maker expect that enforcement becomes less likely if he increases the "catch premium" which a policeman can gain? That is what maximin says. Similar surprises result from variations of other parameters of the game if maximin is played. The demonstration is left to the reader. Corresponding paradoxical results derive for the mixed Nash equilibrium, as can easily be demonstrated. (Compare Wittman, 1985, for a generalization.)

Before I discuss the question of what a policy-maker can expect from the applications of either maximin or Nash, or both of them, I want to point out that (2a), (2b), (4a) and (4b) contain a second set of conditions which, as an alternative to (A.I), also guarantees that both maximin and Nash are mixed. The conditions are

(A.II)
(i) $a > c$, $a > b$, $d > b$, $d > c$
(ii) $\beta > \alpha$, $\beta > \delta$, $\gamma > \alpha$, $\gamma > \delta$

Together, (A.I) and (A.II) describe a rather large class of 2-by-2 games. Thus, the problems which we discuss here are not negligible within game theory.

5. What can a policy-maker expect?
If a policy-maker increases the payoff $a$, what can we expect from this policy? Given that conditions (A.I) hold, there are three possibilities for A: pollution ($s_{11}$) becomes more likely, if the polluter A is a maximin player; or the increase of $a$ has no influence on the behavior of A, since A follows the Nash equilibrium strategy; or pollution becomes the sure thing, because the increase of $a$ is such that $a > c$ and $s_{11}$ becomes a dominating strategy. On the other hand, the increase of $a$ has no influence on the behavior of the police B if B is choosing maximin strategies and the increase of $a$ is moderate enough not to make $s_{11}$ a dominating strategy (i.e., so that $c > a$ still applies). (A dominating strategy $s_{11}$ induces the pure strategy $s_{21}$ ("enforce") as B's best reply.) If, however, B is a Nash player, then an increase of $a$, within the limit $c > a$, implies that the enforcement strategy becomes more likely.
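The case distinction for an increase of $a$ can be traced with a short sketch based on (2a), (2b), (4a) and (4b). The baseline payoffs, the size of the increases, and the helper name `solutions` are assumptions made for illustration.

```python
from fractions import Fraction as F


def solutions(a, b, c, d, alpha, beta, gamma, delta):
    """Nash and maximin probabilities from (2a), (2b), (4a) and (4b)."""
    den_a = a - b - c + d
    den_b = alpha - beta - gamma + delta
    return {
        "p_nash": (delta - gamma) / den_b,
        "q_nash": (d - b) / den_a,
        "p_maximin": (d - c) / den_a,
        "q_maximin": (delta - beta) / den_b,
    }


# Illustrative baseline satisfying (A.I) (assumed values, not from the paper).
base = dict(a=F(-3), b=F(3), c=F(1), d=F(0),
            alpha=F(3), beta=F(0), gamma=F(1), delta=F(2))

before = solutions(**base)
# Moderate increase of a from -3 to -1; c > a still applies.
after = solutions(**{**base, "a": F(-1)})

for key in before:
    print(key, before[key], "->", after[key])

# Large increase: a = 2 exceeds c = 1, so with b > d "pollute" is dominant
# and B's best reply follows from comparing alpha with beta.
big_a = F(2)
pollute_dominant = big_a > base["c"] and base["b"] > base["d"]
best_reply_b = "enforce" if base["alpha"] > base["beta"] else "not enforce"
print("s11 dominant:", pollute_dominant, " B's best reply:", best_reply_b)
```

With these assumed numbers, only the maximin probability of A and the Nash probability of B react to the moderate increase of $a$, and both move upwards.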
The expectation of the policy-maker, and thus the choice of an efficient policy (for the case of $c > a$), depends on the solution concept(s) which agents A and B apply, or on the policy-maker's assumption of which solution concept they will follow.

How plausible are Nash equilibrium and maximin if both are mixed? One may argue that for the given variable-sum game, the mixing of strategies is not plausible at all. The standard argument of hiding one's own behavior behind a veil of randomization in order to avoid exploitation by the other player, which is relevant for zero-sum games, is not relevant here. Basically, each player is interested in his own payoff and, contrary to zero-sum games, in the game of Figure 1 the payoffs of A are independent of the payoffs of B, and vice versa. There is, however, a strategic interrelationship which connects the choices of A and B, and thus the payoffs, via the solution concept the players apply.

If we accept mixed strategies, presupposing that we also accept expected utility theory, the result contained in (6a) and (6b) seems to be relevant to discriminate between Nash and maximin. Why should agent A "rely" on the strategy choice of B to gain a payoff $U_1(q^*)$ if A can guarantee this value to himself by choosing the maximin strategy $p^+$? Cautiously rationalizable strategies (see Pearce, 1984) clearly favor the maximin solution if Nash and maximin give identical payoff values. The latter is always the case if both solutions are mixed, i.e., if (A.I) or (A.II) applies. This case has been tested in a series of experiments (reported in Holler and Host, 1990). There is some empirical evidence which supports maximin.⁴ The following example (Figure 2, taken from Moulin, 1982: 150) shows that there are 2-by-2 games for which in fact it is costly to choose the mixed Nash strategy if the other player may choose maximin.

$$
\begin{array}{l|cc}
 & \text{B: } s_{21} & \text{B: } s_{22} \\ \hline
\text{A: } s_{11} & (0, 1) & (2, 0) \\
\text{A: } s_{12} & (2, 0) & (1, 1)
\end{array}
$$

Figure 2. A 2-by-2 game example
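For the game of Figure 2, the Nash and maximin strategies and A's expected payoffs can be computed directly from (2a), (2b), (4a) and (4b). The sketch below uses the payoffs of Figure 2; the function name `u1` and the grid of values for $q$ are illustrative assumptions.

```python
from fractions import Fraction as F

# Payoffs of Figure 2 (rows s11, s12; columns s21, s22).
a, b, c, d = F(0), F(2), F(2), F(1)                  # player A
alpha, beta, gamma, delta = F(1), F(0), F(0), F(1)   # player B


def u1(p, q):
    """Expected payoff of A if A plays s11 with prob. p and B plays s21 with prob. q."""
    return p * q * a + p * (1 - q) * b + (1 - p) * q * c + (1 - p) * (1 - q) * d


den_a = a - b - c + d
den_b = alpha - beta - gamma + delta
p_star = (delta - gamma) / den_b    # Nash strategy of A, (2a)
q_star = (d - b) / den_a            # Nash strategy of B, (2b)
p_plus = (d - c) / den_a            # maximin strategy of A, (4a)
q_plus = (delta - beta) / den_b     # maximin strategy of B, (4b)

nash_vs_nash = u1(p_star, q_star)       # both players choose Nash
nash_vs_maximin = u1(p_star, q_plus)    # A chooses Nash, B chooses maximin
# A's payoff from maximin for q = 0, 0.1, ..., 1.
maximin_values = {u1(p_plus, F(k, 10)) for k in range(11)}

print("p* =", p_star, " q* =", q_star, " p+ =", p_plus, " q+ =", q_plus)
print("U1(p*, q*) =", nash_vs_nash)
print("U1(p*, q+) =", nash_vs_maximin)
print("U1(p+, q) over the grid:", maximin_values)
```

In this game A's maximin strategy secures 4/3 against every $q$ on the grid, while the Nash strategy yields only 5/4 if B turns out to play maximin.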
It can easily be demonstrated that for the game in Figure 2, the relation $U_1(p^*, q^+) < U_1(p^+, q) = U_1(p^*, q^*)$ applies for all $q \in [0, 1]$. Why should A choose the Nash equilibrium strategy in this game?

Let us return to the more general game of Figure 1. What can we conclude
if the policy-maker changes a payoff and both A and B change their behavior, i.e., the probability of choosing the first strategy? For example, if the policy-maker reduces $a$ and A reduces $p$, then, with reference to the two solution concepts suggested in this paper, we conclude that A follows maximin. If, at the same time, B reduces $q$, we conclude that B follows Nash. A comparison of (2b) and (4a) shows that there is a conformity of reaction of the two players to a change of $a$. A decrease of $a$ always leads to a decrease of $p^+$ and $q^*$, given that (A.I) applies. There is a similar conformity of reaction with respect to changes of $\alpha$. If the policy-maker increases $\alpha$, then $p^*$ and $q^+$ decrease if there is a reaction at all. Figure 3 summarizes the results for an increase of $\alpha$. It is easy to see from a comparison of (2a), (2b), (4a), and (4b) that a similar conformity of reaction follows for changes of the payoffs $b$ and $\beta$. The policy-maker may wish to exploit these conformities but, note, they imply that the two agents apply different solution concepts.

$$
\begin{array}{l|cc}
 & \text{B: Nash} & \text{B: Maximin} \\ \hline
\text{A: Nash} & p^*\downarrow,\ q^* & p^*\downarrow,\ q^+\downarrow \\
\text{A: Maximin} & p^+,\ q^* & p^+,\ q^+\downarrow
\end{array}
$$

Figure 3. Effects of an increase of $\alpha$
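The pattern of reactions to an increase of $\alpha$ can be reproduced for each combination of solution concepts. This is a minimal sketch under assumed baseline payoffs satisfying (A.I); the helper names `solutions` and `reaction` are hypothetical.

```python
from fractions import Fraction as F


def solutions(a, b, c, d, alpha, beta, gamma, delta):
    """Nash and maximin probabilities from (2a), (2b), (4a) and (4b)."""
    den_a = a - b - c + d
    den_b = alpha - beta - gamma + delta
    return {
        "p_nash": (delta - gamma) / den_b,
        "q_nash": (d - b) / den_a,
        "p_maximin": (d - c) / den_a,
        "q_maximin": (delta - beta) / den_b,
    }


# Illustrative baseline satisfying (A.I) (assumed values, not from the paper).
base = dict(a=F(-3), b=F(3), c=F(1), d=F(0),
            alpha=F(3), beta=F(0), gamma=F(1), delta=F(2))

before = solutions(**base)
after = solutions(**{**base, "alpha": F(5)})  # higher catch premium


def reaction(key):
    """Direction in which a probability moves when alpha is increased."""
    if after[key] < before[key]:
        return "down"
    if after[key] > before[key]:
        return "up"
    return "unchanged"


# One line per combination of solution concepts chosen by A and B.
table = {}
for concept_a in ("nash", "maximin"):
    for concept_b in ("nash", "maximin"):
        cell = (reaction("p_" + concept_a), reaction("q_" + concept_b))
        table[(concept_a, concept_b)] = cell
        print(f"A: {concept_a:8} B: {concept_b:8} p {cell[0]:10} q {cell[1]}")
```

A policy-maker who observes both $p$ and $q$ falling after raising $\alpha$ would, in this sketch, be looking at the combination in which A plays Nash and B plays maximin.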
6. Conclusion

We have modelled the fighting of pollution as a two-person one-shot game with an exogenous third agent having the potential to manipulate the payoffs of the two players involved in this game. The one-shot setting has been justified by the hit-and-run nature of the pollution problem: even polluters do not like to live in polluted areas. The emigrations of the Mayas from Yucatan and much of the Viking mobility have been attributed to highly garbaged settlements. The average sea level of Aarhus centre has doubled (from 4 yards to 8 yards) during the last 1,000 years due to the generous littering of its inhabitants.

The restriction to a two-person game is, of course, only a first approximation. If we introduce a third player who chooses his strategy simultaneously with players 1 and 2, the equilibrium strategy of each player $i$ becomes dependent on $i$'s own payoffs. (See Moulin, 1982, for the result and Ordeshook, 1990, for the argument in a similar case.) However, polluters, in general, care little about whether other people suffer (or benefit) from their action. In fact, this may be considered one of the main characteristics of a polluter. Only direct sanctions promise to have some impact on their behavior; thus we introduced the police as the second player. The case of pollution seems to be specific inasmuch as (a) the polluter does not consider himself in a strategic situation with other polluters and (b) he may well consider the police a single agent. The police organization generally assigns a specific agent (an individual or a group of policemen) to a specific location at a specific date. Thus, the police can be considered as a single agent. The police may, however, expect more than one agent on the other side of the game. But since there is no strategic relationship among potential polluters, and no communication between them,⁵ the police can be seen to be in an isolated game with each potential polluter.

There is, however, a third player in the game, the policy-maker. His relation to the two other players is sequential, and the peculiarities of the 2-by-2 game carry over to the sequential game. It should be easy to introduce other players (e.g., the bureaucracy) in a sequential structure without invalidating the peculiar results which we demonstrated for the 2-by-2 game and the arguments which we derived from them. Very often, however, we will observe coalition formation at various stages so that, in fact, the n-person game boils down to a 2-person game.

The outcome of this game is not straightforward concerning the fighting of pollution if the manipulation of payoffs is not substantial enough to create a dominating strategy for one of the players, or for both. It depends on which solution concept the players apply, whether Nash equilibrium or maximin.
If the decision can be made for either concept, we get some unexpected results with respect to the relationship between payoffs and the probability of choosing a specific pure strategy when strategies are mixed. Pointing out the peculiarity of these results may help in discussing means of fighting pollution when the decision situation is strategic. The conclusion drawn from the analysis above is that a policy-maker should not rely on incentives (payoff increases or decreases) if the incentives are not substantial enough to create dominating strategies.
Notes

1. The folk theorem says that the players can achieve any feasible utility pair as the outcome of a Nash equilibrium if the game is iterated with an infinite time horizon and players do not discount the future. (See, e.g., Friedman, 1986: 103-106, where weaker conditions are also discussed.)
2. In the following I will refer to the polluter as male. Since he is potentially a bad person, it should be easy for female readers to accept this discrimination. I am afraid other discriminations will also be contained in this paper.
3. The relations $c > d$ and $\delta > \gamma$ correspond with the author's personal observation in parallel cases. For more than five years he crossed, on average once per week, the German-Danish border at Flensburg. On the one hand, it was obvious that people felt happier after they were examined by the customs officers and the examination confirmed their law obedience (or their choice not to take contraband with them) than when they were simply ignored by the officers. This supports $c > d$. On the other hand, customs officers had an air of disappointment when examinations did not bring positive results, quite different from their regular expression of indifferent authority which they showed in the case of graceful inactivity.
4. The axiomatization of the Nash equilibrium by Tan and Werlang (1988: 381-382) makes clear "that the Nash equilibrium is played when the actions which are going to be taken are common knowledge, before they have been taken." In other words, playing Nash strategies is justified if it is known that the other players also choose Nash strategies and the Nash equilibrium is unique. This, however, is immediate from the definition of Nash equilibrium.
5. This excludes that polluters A and A' agree on dumping their garbage at the same time at two different places in the park so that a single policeman can catch at most one of them.
References

Friedman, J.W. (1986). Game theory with applications to economics. New York and Oxford: Oxford University Press.
Holler, M.J. (1990). The unprofitability of mixed-strategy equilibria in two-person games: A second folk-theorem. Economics Letters 32: 319-323.
Holler, M.J. and Host, V. (1990). Maximin vs. Nash equilibrium: Theoretical results and empirical evidence. In R.E. Quandt and D. Triska (Eds.), Optimal decisions in markets and planned economies. Boulder, San Francisco and London: Westview.
Moulin, H. (1982). Game theory for the social sciences. New York and London: New York University Press.
Nash, J.F. (1951). Non-cooperative games. Annals of Mathematics 54: 286-295.
Ordeshook, P.C. (1990). Crime and punishment: Are one-shot, two-person games enough? American Political Science Review 84: 573-775.
Pearce, D.G. (1984). Rationalizable strategic behavior and the problem of perfection. Econometrica 52: 1029-1050.
Tan, T.C.-C. and Werlang, S.R. da Costa. (1988). The Bayesian foundations of solution concepts of games. Journal of Economic Theory 45: 370-391.
Tsebelis, G. (1989). The abuse of probability in political analysis: The Robinson Crusoe fallacy. American Political Science Review 83: 77-92.
Tsebelis, G. (1991). The effect of fines on regulated industries. Journal of Theoretical Politics 3: 81-101.
Wittman, D. (1985). Counter-intuitive results in game theory. European Journal of Political Economy 1: 77-89.