Game Strategies, Promises, and Rational Choice Jan van Eijck Centre for Mathematics and Computer Science (CWI), Amsterdam
[email protected] Social Software Mini-Conference, CUNY, Friday May 18
Abstract We will study game trees as representations of rational choice and of player preferences, and promises as public announcements of genuine intentions. Promises in a game change what players know about the preferences of other players. They can be modelled as operations that change a given game into a different game in which players know more about the effects of their strategies. Our point of departure is the work of Van Benthem [1] and [6].
Game Theory and Rational Choice Game theory: Problem of rational choice = problem of finding the best move, or: problem of finding an optimal strategy. Preferred method (for finite games of perfect information): backward induction.
Example: Challenge Game • Challenger chooses between getting in and staying out. • If the challenger stays out, the game is over. • If the challenger gets in, the incumbent chooses between giving in and fighting. • Payoffs (challenger, incumbent): Stay out gives (1,2); Get in, Give in gives (2,1); Get in, Fight gives (0,0).
Analysis • Challenge is a game with perfect information. • Suppose the challenger gets in. Then the best move for the incumbent is to give in, since that yields payoff 1 rather than 0. • Therefore the best move for the challenger is to get in: the incumbent will then play his best strategy in the resulting subgame, and the result will be (2,1), which is better for the challenger than the (1,2) of staying out. • This type of reasoning is an example of Backward Induction (BI).
Backward Induction, Definition BI is a technique for solving finite games of perfect information. First, one determines the optimal strategy of the player who makes the last move in the game. Then, taking these moves as given future actions, one determines the optimal strategy of the next-to-last player. And so on, backwards in time, until the beginning of the game is reached. As it turns out, the resulting strategy profile induces a Nash equilibrium in every subgame of the game.
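The BI procedure can be sketched as a short recursive function. This is a minimal illustration (not code from the talk): game trees are nested tuples, and players are numbered by their payoff coordinate.

```python
# Backward induction on a finite game tree of perfect information.
# A tree is either ("leaf", payoffs) or ("node", player, {action: subtree}).

def backward_induction(tree):
    """Return (payoff vector, list of actions on the BI path)."""
    if tree[0] == "leaf":
        return tree[1], []
    _, player, moves = tree
    best = None
    for action, subtree in moves.items():
        payoffs, path = backward_induction(subtree)
        # The player at this node maximizes her own payoff coordinate.
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [action] + path)
    return best

# The Challenge game: player 0 is the challenger, player 1 the incumbent.
challenge = ("node", 0, {
    "Stay out": ("leaf", (1, 2)),
    "Get in": ("node", 1, {
        "Give in": ("leaf", (2, 1)),
        "Fight": ("leaf", (0, 0)),
    }),
})

print(backward_induction(challenge))  # ((2, 1), ['Get in', 'Give in'])
```

Running it on the Challenge game yields payoff (2,1) along the path Get in, Give in, as in the analysis above.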
Nash equilibrium/strategic equilibrium A Nash equilibrium, also called strategic equilibrium, is a list of strategies, one for each player, with the property that no player can achieve a better payoff by unilaterally changing her strategy. Example: (Talk, Talk) is a Nash equilibrium in the Prisoner’s Dilemma game. A subgame is a piece of a sequential game beginning at some node such that each player knows every action of the players that moved before him at every point. Further details: [16], [13], [10]. “[...] Nash equilibrium [...] models a steady state in which each player has learned the other players’ actions from her long experience playing the game.” ([13], p. 377)
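The Nash property is easy to check mechanically. A sketch, with assumed Prisoner’s Dilemma payoffs (the talk does not list them); each entry is (payoff to player 1, payoff to player 2).

```python
# Checking the Nash property in a two-player strategic game.
# Payoffs are assumed standard PD values, chosen for illustration only.

pd = {
    ("Quiet", "Quiet"): (2, 2),
    ("Quiet", "Talk"):  (0, 3),
    ("Talk", "Quiet"):  (3, 0),
    ("Talk", "Talk"):   (1, 1),
}

def is_nash(payoffs, profile):
    """True iff no player gains by unilaterally deviating from `profile`."""
    actions = {a for pair in payoffs for a in pair}
    r, c = profile
    for r2 in actions:            # player 1 deviates
        if payoffs[(r2, c)][0] > payoffs[(r, c)][0]:
            return False
    for c2 in actions:            # player 2 deviates
        if payoffs[(r, c2)][1] > payoffs[(r, c)][1]:
            return False
    return True

print(is_nash(pd, ("Talk", "Talk")))    # True
print(is_nash(pd, ("Quiet", "Quiet")))  # False
```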
Strategic Games Sequential structure of decision making suppressed. Example: Bach or Strawinsky

             Bach    Strawinsky
Bach         (2,1)   (0,0)
Strawinsky   (0,0)   (1,2)
Strategic Game modelled as Extensive Game Player 1 chooses Bach or Strawinsky; then player 2, without observing 1’s choice (the two 2-nodes form one information set), chooses Bach or Strawinsky. Payoffs: (Bach, Bach) gives (2,1); (Strawinsky, Strawinsky) gives (1,2); mismatched choices give (0,0).
Ingredients of Extensive Games • Players • Terminal histories (‘maximal’ sequences of actions that may occur). • Player function P :: History → Player, where histories are all proper prefixes of terminal histories. • Preferences :: TerminalHistory → TerminalHistory → Ord (can be specified by a payoff function). • If there is imperfect information: for each player i, an information partition of the set Hi = {h | P(h) = i}. • If there is imperfect information: preferences are assigned to probability distributions (lotteries) over terminal histories.
Extensive Games: (Much Simplified Version of) Liar’s Dice Sequential structure of decision making explicitly described. • Players 1 and 2 both stake one dollar. • Player 1 rolls a die and observes the outcome (low/high), keeping it concealed from player 2. • Player 1 sees or raises. • If 1 sees high, she wins the stake and the game ends; if 1 sees low, she loses her stake and the game ends. • In response to 1 raising, 2 either passes or meets (so that both add 1 dollar to the stake). • If 2 passes, 1 wins the stake, and the game ends. • If 2 meets, she wins the stake if 1’s roll was low; otherwise 1 wins the stake. The game ends.
Game tree (payoffs for 1, 2; Chance moves first):
• High; 1 Sees: (1,-1). High; 1 Raises; 2 Passes: (1,-1); 2 Meets: (2,-2).
• Low; 1 Sees: (-1,1). Low; 1 Raises; 2 Passes: (1,-1); 2 Meets: (-2,2).
The two nodes where 2 moves lie in a single information set: 2 does not know 1’s roll.
Analysis • Player 1 has two information states: high and low; player 2 has a single information state. • Player 1 has four strategies: (Raise,Raise) (raise in case of high and in case of low), (Raise,See), (See,Raise), and (See,See). Player 2 has two strategies: Pass and Meet. • Strategic form of the game:

              Pass     Meet
Raise,Raise   1,-1     0,0
Raise,See     0,0      1/2,-1/2
See,Raise     1,-1     -1/2,1/2
See,See       0,0      0,0

• There is no pure strategy Nash equilibrium.
Analysis, continued

              Pass     Meet
Raise,Raise   1,-1     0,0
Raise,See     0,0      1/2,-1/2
See,Raise     1,-1     -1/2,1/2
See,See       0,0      0,0

(See,See) is strictly dominated by an equal mix of (Raise,Raise) and (Raise,See). (See,Raise) is not a good response to any 2-strategy that favours Meet over Pass. Thus, we are left with:

              Pass     Meet
Raise,Raise   1,-1     0,0
Raise,See     0,0      1/2,-1/2
Analysis, continued (2) • Suppose 2 plays Pass with probability q and Meet with probability 1 − q. Then 1’s expected payoff is q for (Raise,Raise) and 1/2(1 − q) for (Raise,See). For 1 to be willing to mix, these must be equal: q = 1/2(1 − q), i.e., q = 1/3. • Suppose 1 plays (Raise,Raise) with probability p and (Raise,See) with probability 1 − p. Then 2’s expected payoff is −p for Pass and −1/2(1 − p) for Meet. For 2 to be willing to mix: −p = −1/2(1 − p), i.e., p = 1/3. • The mixed strategy profile where 1 plays (Raise,Raise) in 1/3 of the cases and 2 plays Pass in 1/3 of the cases is a Nash equilibrium. • Thus, 1 always raises when her throw is high and bluffs in one third of the cases otherwise, and 2 responds by passing in one third of all cases.
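The indifference computations can be verified with exact rational arithmetic. A small sketch (not from the talk), using the reduced 2x2 matrix from the analysis above:

```python
# Verifying the mixed equilibrium of the reduced Liar's Dice game.
# Rows: (Raise,Raise), (Raise,See); columns: Pass, Meet. Zero-sum:
# player 2's payoffs are the negatives of player 1's.

from fractions import Fraction as F

u1 = {("RR", "Pass"): F(1), ("RR", "Meet"): F(0),
      ("RS", "Pass"): F(0), ("RS", "Meet"): F(1, 2)}

# 2 mixes Pass with probability q; 1 is indifferent between her rows
# when q * 1 = (1 - q) * 1/2, i.e. q = 1/3.
q = F(1, 3)
ev_rr = q * u1[("RR", "Pass")] + (1 - q) * u1[("RR", "Meet")]
ev_rs = q * u1[("RS", "Pass")] + (1 - q) * u1[("RS", "Meet")]
assert ev_rr == ev_rs == F(1, 3)

# 1 mixes (Raise,Raise) with probability p; 2 is indifferent when
# -p = -(1 - p) * 1/2, i.e. p = 1/3.
p = F(1, 3)
assert -p == -(1 - p) * F(1, 2)

print(p, q)  # 1/3 1/3
```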
Extensive Games and Strategies A strategy for player i in an extensive game is a function that assigns to each history h that ends in an i-choice (each h with P(h) = i) an action in A(h), the set of actions available after h. Challenge Game Strategy Strategies for the incumbent in the challenge game are Give in and Fight; strategies for the challenger are Get in and Stay out. Liar’s Dice Strategy Strategies for 1 in the Liar’s Dice game are (Raise,Raise) (raise in case of high and in case of low), (Raise,See), (See,Raise), and (See,See); strategies for 2 are Pass and Meet.
Games as Processes Process graphs are labeled transition systems; they appear in a different guise in modal logic as rooted Kripke models, and show up in formal language theory as finite automata. Games are a particular kind of process graph: they have internal nodes labeled by players, and end nodes labeled by payoffs. We can represent games as pairs (r, R), where r is the root node and R is a set of edges.
Game Operations as Process Operations Basic games are pairs (i, {(i, a, p)}), where i is an agent, a is an action, and p is a payoff. Notation: i -a-> p. The first element of the pair is the root. Operations: Prefix: if G = (i, R) is a game, j is an agent, and a is an action, then j ⊗_a G is the game with root j and set of edges R ∪ {(j, a, i)}. Choice: if G = (i, R) and G′ = (i, S) are games (both with root i), then G ⊕ G′ is the game (i, R ∪ S).
Example: Construction of Challenge Game
G1 = i -Give in-> (2,1)
G2 = i -Fight-> (0,0)
G3 = G1 ⊕ G2
G4 = c -Stay out-> (1,2)
G5 = c ⊗_Get in G3
G6 = G4 ⊕ G5
Note: these operations still define trees.
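The construction can be mirrored directly in code. A sketch (one possible encoding, not the paper’s formalism): games as (root, edge set) pairs, with edges as (node, action, target) triples and payoff tuples as leaves.

```python
# Basic-game, prefix, and choice operations as tree constructors.

def basic(i, a, payoff):
    """Basic game i -a-> payoff."""
    return (i, {(i, a, payoff)})

def prefix(j, a, game):
    """j (x)_a G: new root j with an a-edge into G's root."""
    root, edges = game
    return (j, edges | {(j, a, root)})

def choice(g1, g2):
    """G (+) G': merge two games sharing the same root."""
    (r1, e1), (r2, e2) = g1, g2
    assert r1 == r2, "choice requires a common root"
    return (r1, e1 | e2)

# Building the Challenge game, as on the slide:
g1 = basic("i", "Give in", (2, 1))
g2 = basic("i", "Fight", (0, 0))
g3 = choice(g1, g2)
g4 = basic("c", "Stay out", (1, 2))
g5 = prefix("c", "Get in", g3)
g6 = choice(g4, g5)

print(sorted(a for (_, a, _) in g6[1]))
# ['Fight', 'Get in', 'Give in', 'Stay out']
```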
From Game Trees to Game Graphs One can extend this to the generation of graphs by allowing an operation µx.G(x) that unifies the x-marked nodes of G with the root of G (see [12]). This liberalizes the definition of a game to allow cycles that generate runs of arbitrary length. (Picture: a variant of the Challenge game in which Fight loops back to the challenger’s initial choice.)
Rationality and Backward Induction You choose Left or Right; Left ends the game with (1,0). After Right, Me chooses: Left gives (0,100), Right gives (99,99). (Payoffs: You, Me.)
BI analysis: start at the bottom. As a ‘rational’ player, I will choose to go Left, since 100 is better than 99. You can see this coming: going Right then gives you only 0, whereas going Left gives you 1. Therefore, you will choose Left at the start, and we both end up getting very little, while I lose most of all.
Reasoning about Rationality Let a finite two-player extensive game G specify my preferences, but not yours. Moreover, let both our strategies σme, σyou for playing G be fixed in advance. When can we rationalize your given behaviour σyou so as to make our two strategies the BI solution of the game? Best Responsiveness: My strategy chooses a move leading to an outcome which is at least as good for me as any other outcome that might arise by choosing an action and then continuing with σme, σyou. Folklore result (also see Van Benthem [6]): In any game that is best-responsive for me, there exists a preference relation for you among outcomes making the unique path that plays our given strategies against each other the BI solution.
Example of ‘Rationalisation’
[Figure: two game trees with alternating You/Me moves and payoffs 3, 2, 4, 3 at the leaves, illustrating how a preference relation for You can be found that rationalises the given play.]
Rationalisation in Terms of Belief The same game as before: You choose Left, ending with (1,0), or Right; after Right, Me chooses Left, giving (0,100), or Right, giving (99,99).
Your initial choice of ‘Right’ is rational if you believe that I will choose ‘Right’ as well.
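The belief-based reasoning can be spelled out as a tiny computation over the pictured payoffs ((You, Me) pairs). A minimal sketch:

```python
# Your best reply in the example game, given a belief about my strategy.
# Payoffs to You, read off the game tree above.

your_payoff = {
    ("Left", None): 1,       # you go Left: game ends with (1, 0)
    ("Right", "Left"): 0,    # you Right, I Left: (0, 100)
    ("Right", "Right"): 99,  # you Right, I Right: (99, 99)
}

def best_for_you(belief_about_me):
    left = your_payoff[("Left", None)]
    right = your_payoff[("Right", belief_about_me)]
    return "Right" if right > left else "Left"

print(best_for_you("Right"))  # Right: 99 beats 1
print(best_for_you("Left"))   # Left: 1 beats 0
```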
Johan van Benthem’s Rationalisation Theorem Not-too-bad strategies for you: your strategy σyou never prescribes a move for which each outcome reachable via further play according to σyou and any moves of mine is worse than all outcomes reachable via some other move of yours. Theorem (Van Benthem [6]): In any game that is not-too-bad for you, there exists a strategy τ for me such that your σyou is optimal against τ if you believe that I will play τ. Note: it is your assumption about my strategy that makes your behaviour rational.
[Figure: example game trees, with your payoffs 1, 4, 2, 1, 3, 5, 5, 2, 6 at the leaves, illustrating the not-too-bad condition and the rationalisation theorem.]
Promises in Games Suppose I make a promise to you that I will choose ‘Right’. Suppose this promise absolutely convinces you. Then this changes the lefthand game into the righthand one:
Lefthand game: You: Left gives (1,0), Right passes to Me; Me: Left gives (0,100), Right gives (99,99).
Righthand game: You: Left gives (1,0), Right passes to Me; Me now has only Right, giving (99,99).
Promises as Public Announcements A public announcement φ restricts the current model M, s to a model M|φ, s consisting of just those worlds in M which satisfy φ. [!φ]Biψ is equivalent to φ → Bi(φ, [!φ]ψ). Here [!φ]ψ expresses that ψ holds after public announcement of φ, and Bi( , ) is used for conditional belief. Note that the axiom pushes the [!φ] operator past the belief operator.
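Model restriction by public announcement can be sketched in a few lines. This toy example (my own, not from the talk) uses an explicit set of worlds and per-agent accessibility relations:

```python
# Public announcement as model restriction: !phi keeps only the worlds
# where phi holds, and restricts the accessibility relations to them.

def announce(worlds, rel, phi):
    """Restrict the model (worlds, rel) to the worlds satisfying phi."""
    kept = {w for w in worlds if phi(w)}
    new_rel = {i: {(u, v) for (u, v) in edges if u in kept and v in kept}
               for i, edges in rel.items()}
    return kept, new_rel

# Three worlds; p holds in w1 and w2 only. Agent a cannot tell them apart.
worlds = {"w1", "w2", "w3"}
rel = {"a": {(u, v) for u in worlds for v in worlds}}  # a knows nothing

p = lambda w: w in {"w1", "w2"}
worlds2, rel2 = announce(worlds, rel, p)

print(sorted(worlds2))  # ['w1', 'w2']
# After !p, agent a knows p: every a-accessible world satisfies p.
print(all(p(v) for (_, v) in rel2["a"]))  # True
```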
PDL-Based Logics for Games Two-sorted PDL (over basic game actions, and over epistemic uncertainty relations), with promises.
a ::= any basic game action, or the reverse aˇ of a basic game action
α ::= a | ?φ | α1; α2 | α1 ∪ α2 | α*
π ::= i | ?φ | π1; π2 | π1 ∪ π2 | π*
φ ::= ⊤ | p | ¬φ | φ1 ∨ φ2 | ⟨α⟩φ | ⟨π⟩φ | ⟨!φ1⟩φ2
Abbreviate ¬⟨π⟩¬φ as [π]φ, ¬⟨α⟩¬φ as [α]φ, and ¬⟨!φ1⟩¬φ2 as [!φ1]φ2. Interpretation in epistemic game models.
Example Formulas • [(i ∪ j)*]p expresses that i, j have common knowledge that p. • ¬[i]⟨Bachˇ⟩⊤ expresses that i does not know that the previous move was Bach. • [i][B ∪ S](⟨B⟩p ∨ ⟨S⟩p) expresses that i knows that after a choice between B and S, either B or S leads to p. • [![L ∪ R; L]⊥]φ expresses that a promise not to choose L after a choice of L or R guarantees φ.
Shape of Reduction Axiom for Promises Previous format: [!φ]Biψ ↔ (φ → Bi(φ, [!φ]ψ)). New format: [!φ][i]ψ ↔ (φ → [(?φ; i)∗][!φ]ψ). Complete logic for this: PDL axioms for α, PDL axioms for π, interaction axioms for α and π, reduction axioms for promise operations. Reduction axioms ‘compile out’ the promise operations.
Conclusions • There are many ways of setting up logics for games, and many ways of getting complete logics (reduction axioms, or program transformation, in the style of [2]). • The above does not yet cover collective action and ability; see [14, 15], and the previous talk, for this. • Option: add strategies themselves as explicit operators, e.g., by means of special action symbols σi for i’s strategy in the game. See [4, 1, 7, 11]. • Option: add explicit preference relations and preference change. See [3]. • Background questions and (long) lists of open problems in [5, 6, 9]. • A more leisurely version appears in [8].
References
[1] J. van Benthem. Extensive games as process models. Journal of Logic, Language and Information, pages 289–313, 2001.
[2] J. van Benthem, J. van Eijck, and B. Kooi. Logics of communication and change. Information and Computation, 204(11):1620–1662, 2006.
[3] J. van Benthem, S. van Otterloo, and O. Roy. Preference logic, conditionals, and solution concepts in games. In H. Lagerlund, S. Lindström, and R. Sliwinski, editors, Modality Matters, pages 61–76, 2006.
[4] Johan van Benthem. Games in dynamic-epistemic logic. Bulletin of Economic Research, 53:219–248, 2001.
[5] Johan van Benthem. Open problems in logic and games. Technical Report PP-2005-06, ILLC, Amsterdam, 2005.
[6] Johan van Benthem. Rationalisations and promises in games. Manuscript, Amsterdam and Stanford, October 2006.
[7] Boudewijn de Bruin. Explaining Games. PhD thesis, ILLC, University of Amsterdam, 2004.
[8] J. van Eijck and R. Verbrugge, editors. Discourses on Social Software. Texts in Logic and Games. Amsterdam University Press, Amsterdam, 2007, to appear.
[9] Jelle Gerbrandy. Communication strategies in games, 2005.
[10] Robert Gibbons. A Primer in Game Theory. Prentice Hall, 1992.
[11] Paul Harrenstein. Logic in Conflict. PhD thesis, University of Utrecht, 2004.
[12] Robin Milner. A complete inference system for a class of regular behaviours. Journal of Computer and System Sciences, 28:439–466, 1984.
[13] Martin J. Osborne. An Introduction to Game Theory. Oxford University Press, New York/Oxford, 2004.
[14] Marc Pauly. Logic for Social Software. PhD thesis, ILLC, Amsterdam, 2001.
[15] Marc Pauly and Rohit Parikh. Game logic — an overview. Studia Logica, 75(2):165–182, 2003.
[16] Philip D. Straffin. Game Theory and Strategy. The Mathematical Association of America, New Mathematical Library, 1993. Fourth printing: 2002.