Tutorial: Introduction to Game Theory
Jesus Rios
IBM T.J. Watson Research Center, USA
[email protected]
July, 2013
© 2013 IBM Corporation
Approaches to decision analysis
Descriptive – understanding how decisions are made
Normative – models of how decisions should be made
Prescriptive – helping the DM make smart decisions
 – Use of normative theory to support the DM
 – Elicitation of the inputs of normative models
   • The DM's preferences and beliefs (psycho-analysis)
   • Use of experts
 – Role of descriptive theories of DM behavior
Game theory arena
Non-cooperative games
 – More than one intelligent player
 – Individual action spaces
 – Interdependent consequences
   • Players' consequences depend on their own and other players' actions
Cooperative game theory
 – Normative bargaining models
   • Joint decision making – binding agreements on what to play
   • Given the players' preferences and the solution space, find a fair, jointly satisfying and Pareto-optimal agreement/solution
 – Group decision making on a common action space (social choice)
   • Preference aggregation
   • Voting rules – Arrow's theorem
 – Coalition games
Cooperative game theory: bargaining solution concepts
Example: working alone, Juan earns $10 and Maria $20; working together they earn $100. How should they distribute the profits of the cooperation (x to Juan, y to Maria, with x + y = 100)?

[Figure: the feasible frontier x + y = 100 in the (x, y) plane, showing the disagreement point (10, 20), the bliss point, and the fair solution x = 45, y = 55]

Key concepts:
 – Disagreement point: BATNA, status quo
 – Feasible solutions: ZOPA
 – Pareto efficiency
 – Aspiration levels
 – Fairness: K-S, Nash, maxmin solutions
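The "fair" split x = 45, y = 55 coincides with what the Nash bargaining solution gives for this example: with a linear frontier x + y = S and disagreement point (d1, d2), maximizing the Nash product (x - d1)(y - d2) has a simple closed form. A minimal sketch:

```python
# Nash bargaining solution for a linear frontier x + y = S:
# maximize (x - d1) * (y - d2) subject to x + y = S, x >= d1, y >= d2.
# Closed form: x* = (S + d1 - d2) / 2.

def nash_bargaining(S, d1, d2):
    """Split a surplus S between two players with disagreement payoffs d1, d2."""
    x = (S + d1 - d2) / 2
    y = S - x
    return x, y

x, y = nash_bargaining(100, 10, 20)
print(x, y)  # Juan gets 45.0, Maria gets 55.0 -- the fair split on the slide
```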
Normative models of decision making under uncertainty
Models for a unitary DM
 – vN-M expected utility
   • Objective probability distributions
 – Subjective expected utility (SEU)
   • Subjective probability distributions
Example: an investment decision problem
 – One decision variable with two alternatives
   • In what to invest? Treasury bonds or IBM shares
 – One uncertainty with two possible states
   • IBM share price at the end of the year: High or Low
 – One evaluation criterion for consequences
   • Profit from the investment
The simplest decision problem under uncertainty
Decision Table

              High      Low
 IBM Shares   $2,000    -$1,000
 Bonds        $500      $500

The DM chooses a row without knowing which column will occur.
Does the choice depend on the relative likelihood of High and Low?
 – If the DM is sure that the IBM share price will be High, the best choice is to buy Shares
 – If the DM is sure that it will be Low, the best choice is to buy Bonds
 – So: elicit the DM's beliefs about which column will occur
Does the choice depend on the value of money?
 – Expected return is not a good measure of decision preferences
   • When High and Low are equally likely, the two alternatives give the same expected return ($500), yet most DMs would not feel indifferent between them
 – So: elicit the risk attitude of the DM
Decision tree representation

 What to buy ─┬─ IBM Shares ── price (uncertainty) ─┬─ High:  $2,000
              │                                     └─ Low:  -$1,000
              └─ Bonds (certainty): $500

What does the choice depend upon?
 – the relative likelihood of H vs. L
 – the strength of preferences for money
Subjective expected utility solution
If the DM's decision behavior is consistent with a set of "rational" desiderata (axioms), the DM decides as if he has
 – probabilities representing his beliefs about the future price of IBM shares
 – "utilities" representing his preferences and risk attitude towards money
and chooses the alternative of maximum expected utility.
The subjective expected utility model balances, in a "rational" manner, the DM's beliefs and risk attitudes.
Application requires
 – knowing the DM's beliefs and "utilities"
   • Different elicitation methods
 – computing the expected utility of each decision strategy
   • This may require approximation in non-simple problems
A constructive definition of "utility"
The Basic Canonical Reference Lottery ticket: p-BCRL

 p-BCRL:  $2,000 with canonical probability p
          -$1,000 with probability 1 - p

Preferences over BCRLs
 – p-BCRL > q-BCRL iff p > q, where p and q are canonical probabilities
Elicit prob. of the price of IBM shares
Events
 – H: IBM price High;  L: IBM price Low;  Pr(H) + Pr(L) = 1

Compare the two deals:
 – IBM shares: $2,000 if H, -$1,000 if L
 – p-BCRL: $2,000 with canonical probability p, -$1,000 with probability 1 - p

Move p from 1 to 0: which alternative does the DM prefer?
There exists a breakeven canonical probability pH such that the DM is indifferent:
 – pH-BCRL ~ IBM shares
 – The judgmental probability of H is pH
Elicit the utility of $500: U($500)?
Compare the two deals:
 – Bonds: $500 with certainty
 – p-BCRL: $2,000 with canonical probability p, -$1,000 with probability 1 - p

Move p from 1 to 0: which alternative does the DM prefer?
There exists a breakeven canonical probability u such that the DM is indifferent:
 – u-BCRL ~ Bonds
 – This scales the value of $500 between the values of $2,000 and -$1,000: U($500) = u

What, then, is U($500)?
 – The probability of a BCRL between $2,000 and -$1,000 that is indifferent (for the DM) to getting $500 with certainty
Comparison of alternatives
 – IBM shares (price: $2,000 if H, -$1,000 if L) ~ pH-BCRL over $2,000 and -$1,000
 – Bonds ($500 for sure) ~ U($500)-BCRL over $2,000 and -$1,000
The DM prefers to invest in "IBM Shares" iff pH > U($500)
Solving the tree: backward induction
Utility scaling: 0 = U(-$1,000) < U($500) = u < U($2,000) = 1

 What to buy ─┬─ IBM Shares ── price ─┬─ High (pH):     $2,000  → utility 1
              │                       └─ Low (1 - pH):  -$1,000 → utility 0
              └─ Bonds: $500 → utility u
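With this scaling the backward-induction rule reduces to comparing pH with u: EU(Shares) = pH and EU(Bonds) = u. A minimal sketch:

```python
# Backward induction on the slide's tree, with utilities scaled so that
# U(-$1,000) = 0 and U($2,000) = 1, hence EU(Shares) = pH and EU(Bonds) = u.

def solve_tree(p_high, u_500):
    """Return the EU-maximizing alternative and the expected utilities."""
    eu = {
        'IBM Shares': p_high * 1.0 + (1.0 - p_high) * 0.0,  # pH*U(2000) + (1-pH)*U(-1000)
        'Bonds': u_500,                                      # U($500) = u, a sure outcome
    }
    return max(eu, key=eu.get), eu

# The DM buys shares exactly when pH > u:
choice, eu = solve_tree(p_high=0.7, u_500=0.6)
print(choice)  # IBM Shares, since 0.7 > 0.6
```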
Preferences: value vs. utility
Value function
 – measures the desirability (intensity of preferences) of money gained,
 – but does not measure risk attitude
Utility function
 – measures risk attitude,
 – but not the intensity of preferences over sure consequences
Many methods to elicit a utility function
 – Qualitative analysis of risk attitude leads to parametric utility functions
 – Ask quantitative indifference questions between deals (one of which must be an uncertain lottery) to assess the parameters of the utility function
 – Consistency checks and sensitivity analysis
The Bayesian process of inference and evaluation with several stakeholders and decision makers (Group decision making)
Disagreements in group decision making
Group decision making assumes
 – a group value/utility function
 – group probabilities on the uncertainties
If our experts disagree on the science (the expert problem)
 – How to draw together and learn from conflicting probabilistic judgements?
 – Mathematical aggregation
   • Bayesian approach
   • Opinion pools: no opinion pool satisfies a minimal consensus set of "good" probabilistic properties
   • Issues: how to model knowledge overlap/correlation; expertise evaluation
 – Behavioural aggregation
 – The textbook problem
   • If we do not have access to the experts, we need meta-analytical methodologies for drawing together expert-judgment studies
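One common mathematical aggregation rule is the linear opinion pool, a weighted average of the experts' distributions. As the slide notes, no pool satisfies every desirable property, so this is only an illustrative sketch; the expert distributions and weights below are made up:

```python
# Linear opinion pool: combine expert probability distributions over the same
# outcomes by a weighted average. A simple but imperfect aggregation rule.

def linear_pool(expert_dists, weights):
    """expert_dists: list of dicts mapping outcomes to probabilities;
    weights: expertise weights summing to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    outcomes = expert_dists[0].keys()
    return {o: sum(w * d[o] for w, d in zip(weights, expert_dists))
            for o in outcomes}

# Two hypothetical experts disagreeing on P(IBM price High):
pool = linear_pool([{'H': 0.8, 'L': 0.2}, {'H': 0.4, 'L': 0.6}],
                   [0.5, 0.5])
print(pool)  # H ≈ 0.6, L ≈ 0.4
```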
Disagreements in group decision making
If group members disagree on the values
 – How to combine different individuals' rankings of options into a group ranking?
 – Arbitration/voting
   • Ordinal rankings: Arrow's impossibility results
   • Cardinal rankings (values, not utilities – decisions without uncertainty)
     - Interpersonal comparison of preference strengths
     - Supra-decision-maker approach (MAUT)
   • Issues: manipulation and truthful reporting of rankings
If they disagree on both the values and the science
 – Combine individual probabilities and utilities into group probabilities and utilities, respectively, form the corresponding group expected utilities, and choose accordingly
 – Impossibility of being Bayesian and Paretian at the same time
   • No aggregation method (of probabilities and utilities) exists that is compatible with the Pareto order
 – Behavioral approaches
   • Consensus on group probabilities and utilities via sensitivity analysis
   • Agreement on what to do via negotiation
Decision analysis in the presence of intelligent others
Matrix games against nature
 – One player: R (Row)
   • Two choices: U (Up) and D (Down)
 – Payoff matrix

          Nature
           L    R
      U    0    5
      D   10    3

If you were R, what would you do?
 – D > U against L
 – U > D against R
Games against nature
Do we know which column Nature will choose?
 – We know our best responses to Nature's moves, but not which move Nature will choose
Do we know the (objective) probabilities of Nature's possible moves? – YES

          Nature
           L (p)   R (1-p)   Expected payoff
      U     0        5        0p + 5(1-p)
      D    10        3       10p + 3(1-p)

U > D iff p < 1/6
Payoffs = vNM utils
Games against nature and the SEU criterion
Do we know the (objective) probabilities of Nature's possible moves? – No
 • Variety of decision criteria: maximin (pessimistic), maximax (optimistic), Hurwicz, minimax regret, …

          Nature
           L    R    Min   Max   Max Regret
      U    0    5     0     5       10
      D   10    3     3    10        2

 Maximin: D     Maximax: D     Minimax Regret: D

SEU criterion
 – Elicit the DM's subjective probabilistic beliefs about Nature's move (p)
 – Compute the SEU of each alternative: D > U iff p > 1/6
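The criteria in the table can be computed mechanically from the payoff matrix; a short sketch using the slide's payoffs:

```python
# Decision criteria for the slide's payoff matrix (rows U, D vs. Nature's
# columns L, R). All criteria here happen to select D.
payoffs = {'U': [0, 5], 'D': [10, 3]}   # payoffs[action] = [vs L, vs R]

maximin = max(payoffs, key=lambda a: min(payoffs[a]))   # pessimistic: best worst case
maximax = max(payoffs, key=lambda a: max(payoffs[a]))   # optimistic: best best case

# Regret of an action in a column = column's best payoff minus the action's payoff
col_best = [max(payoffs[a][j] for a in payoffs) for j in range(2)]
minimax_regret = min(payoffs, key=lambda a: max(col_best[j] - payoffs[a][j]
                                                for j in range(2)))

# SEU criterion with subjective p = P(L): D beats U iff p > 1/6
def seu_choice(p):
    eu = {a: p * payoffs[a][0] + (1 - p) * payoffs[a][1] for a in payoffs}
    return max(eu, key=eu.get)

print(maximin, maximax, minimax_regret)  # D D D
```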
Games against other intelligent players
Bimatrix (simultaneous) games
 – A second intelligent player: C (Column)
   • Two choices: L (Left) and R (Right)
 – Payoff bimatrix
   • We know C's payoffs and that he will try to maximize them

              C
           L         R
      U   0, 2      5, 4*
      D  10, 3      3, 8*

 – As R, what would you do?
 – Knowledge of C's payoffs and rationality allows us to predict C's move (R) with certitude
One-shot simultaneous bimatrix games
Two players
 – Trying to maximize their payoffs
Players must choose one out of two fixed alternatives
 – Row player chooses a row
 – Column player chooses a column
Payoffs depend on both players' moves
Simultaneous-move game
 – Players must act without knowing what the other player does
 – Played once
No other uncertainties involved
Players have full and common knowledge of
 – the choice spaces
 – the bimatrix payoffs
No cooperation allowed

              C
           L                  R
      U   uR(U,L), uC(U,L)   uR(U,R), uC(U,R)
      D   uR(D,L), uC(D,L)   uR(D,R), uC(D,R)
Dominant alternatives and social dilemmas
Prisoner's dilemma
 – (NC,NC) is mutually dominant
   • Players' choices are independent of information regarding the other player's move
 – (NC,NC) is socially dominated by (C,C)

              C
           C          NC
      C    5, 5      -5, 10
      NC  10, -5     -2, -2

Airport network security
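The dilemma structure can be checked directly from the bimatrix (payoff values as recovered from the slide):

```python
# Bimatrix for the prisoner's dilemma as recovered from the slide:
# entries are (Row utility, Column utility); C = cooperate, NC = not cooperate.
pd = {('C', 'C'): (5, 5),    ('C', 'NC'): (-5, 10),
      ('NC', 'C'): (10, -5), ('NC', 'NC'): (-2, -2)}

# NC strictly dominates C for the Row player (the game is symmetric)...
assert pd[('NC', 'C')][0] > pd[('C', 'C')][0]     # 10 > 5
assert pd[('NC', 'NC')][0] > pd[('C', 'NC')][0]   # -2 > -5
# ...yet the dominant outcome (NC,NC) is Pareto-dominated by (C,C):
assert all(pd[('C', 'C')][i] > pd[('NC', 'NC')][i] for i in (0, 1))
print("social dilemma confirmed")
```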
Iterated dominance
No dominant strategy for either player; however, there are iteratively dominated strategies:
 • L > R (R is dominated)
 • Now M is dominant in the restricted game: M > U and M > D
 • Now L > C in the restricted game: 20 > -10
 – (M,L) is the solution by iterated elimination of (strictly) dominated strategies
   • Requires common knowledge and rationality assumptions
Exercise
 – Determine whether there is a solution by iteratively eliminating dominated strategies
Solution: (D,C)
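Iterated elimination of strictly dominated strategies is easy to mechanize. The slide's 3x3 payoff matrix did not survive extraction, so the numbers below are hypothetical, chosen to reproduce the elimination order described (first R, then U and D, then C) and the solution (M,L):

```python
# Hypothetical 3x3 bimatrix consistent with the slide's elimination order.
row_pay = {('U','L'): 1,  ('U','C'): 2,   ('U','R'): 3,
           ('M','L'): 5,  ('M','C'): 6,   ('M','R'): 0,
           ('D','L'): 2,  ('D','C'): 1,   ('D','R'): 4}
col_pay = {('U','L'): 3,  ('U','C'): 5,   ('U','R'): 1,
           ('M','L'): 20, ('M','C'): -10, ('M','R'): -20,
           ('D','L'): 4,  ('D','C'): 3,   ('D','R'): 2}

def iterated_elimination(row_pay, col_pay, rows, cols):
    """Repeatedly delete any pure strategy strictly dominated by another one."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in rows[:]:   # r is dominated if some r2 beats it against every column
            if any(all(row_pay[(r2, c)] > row_pay[(r, c)] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r); changed = True
        for c in cols[:]:   # symmetric test with the column player's payoffs
            if any(all(col_pay[(r, c2)] > col_pay[(r, c)] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c); changed = True
    return rows, cols

print(iterated_elimination(row_pay, col_pay, ['U', 'M', 'D'], ['L', 'C', 'R']))
# (['M'], ['L'])
```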
Nash equilibrium
Games without
 – a dominant solution
 – a solution by iterated elimination of dominated alternatives

Battle of the sexes (standard payoffs; the two pure-strategy NEs are marked *):
              Ballet    Concert
  Ballet     *2, 1      0, 0
  Concert     0, 0     *1, 2

Matching pennies (standard payoffs; no pure-strategy NE):
             Heads     Tails
  Heads      1, -1    -1, 1
  Tails     -1, 1      1, -1
Existence of Nash equilibrium
(Nash) Every finite game has a NE in mixed strategies
 – Requires extending the original set of alternatives of each player
Consider the matching pennies game
 – Mixed strategies: choosing a lottery with given probabilities over Heads and Tails
 – Players' choice sets are defined by the lottery's probability
   • Row: p in [0,1];  Column: q in [0,1]
 – The payoff associated with a pair of strategies (p,q) is (p, 1-p) P (q, 1-q)^T, where P is the payoff matrix of the original game in pure strategies
   • Payoffs need to be vNM utilities
 – Nash equilibrium (p*, q*): an intersection of the players' best-response correspondences
   • uR(p*,q*) ≥ uR(p,q*) for all p;  uC(p*,q*) ≥ uC(p*,q) for all q
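For matching pennies with the standard payoffs (Row wins 1 on a match; Column, playing the zero-sum opposite, gets the negative), the indifference property at the mixed NE (p*, q*) = (1/2, 1/2) can be verified numerically:

```python
# Matching pennies in mixed strategies. Row's payoff matrix; Column's payoffs
# are the negatives (zero-sum). The unique NE is p* = q* = 1/2.
P = [[1, -1],   # rows: Row plays Heads/Tails
     [-1, 1]]   # cols: Column plays Heads/Tails

def row_eu(p, q):
    """Row's expected payoff when Row plays Heads w.p. p, Column w.p. q."""
    return (p * q * P[0][0] + p * (1 - q) * P[0][1]
            + (1 - p) * q * P[1][0] + (1 - p) * (1 - q) * P[1][1])

# Against q* = 1/2, every p gives Row the same EU (0), so p* = 1/2 is a best
# response -- and symmetrically for Column: a mixed-strategy equilibrium.
assert row_eu(0.5, 0.5) == row_eu(1.0, 0.5) == row_eu(0.0, 0.5) == 0.0
print("Row is indifferent against q* = 1/2")
```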
Nash equilibrium as a predictive tool
Supporting the Row player against the Column player in a game with multiple NEs:

              C
           L           R
      U   4, -100    10, 6 *
      D  12, 8 *      5, 4

Two NEs: (D,L) and (U,R), with (D,L) > (U,R), since 12 > 10 and 8 > 6
 – C may prefer to play R, to protect himself against the -100
 – Knowing this, R would prefer to play U
 – ending up at the inferior NE (U,R)
How can we model C's behavior?
 – Bayesian K-level thinking
K-level thinking
Row is not sure about Column's move
 – p: Row's belief that Column moves L
 – Row's SEU:  U: 4p + 10(1-p);  D: 12p + 5(1-p)
 – U > D iff p < 5/13 ≈ 0.38
How to elicit p?
 – Row's analysis of Column's decision
   • Assume Column behaves as an SEU maximizer
   • q: Column's belief that Row is smart enough to choose D (the best NE)
   • SEU of L: -100(1-q) + 8q;  SEU of R: 6(1-q) + 4q
   • L > R iff q > 53/55 ≈ 0.96
   • Since Row does not know q, his beliefs about q are represented by a CDF F
   • p = Pr(q > 53/55) = 1 - F(53/55)
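Both thresholds follow from the indifference conditions; a check with exact arithmetic:

```python
from fractions import Fraction

# Thresholds from the K-level analysis of the game above:
# Row's payoffs: U -> (4, 10), D -> (12, 5); Column's: L -> (-100, 8), R -> (6, 4).

# Row: U beats D iff 4p + 10(1-p) > 12p + 5(1-p), i.e. p < 5/13
p_star = Fraction(10 - 5, (12 - 4) + (10 - 5))
print(p_star, float(p_star))   # 5/13, about 0.3846 (the slide's 0.38)

# Column: L beats R iff -100(1-q) + 8q > 6(1-q) + 4q, i.e. q > 53/55
q_star = Fraction(6 - (-100), (8 - 4) + (6 - (-100)))
print(q_star, float(q_star))   # 53/55, about 0.9636 (the slide's 0.96)
```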
Simultaneous vs. sequential games
First-mover advantage
 – Both players want to move first
   • Credible commitment/threat
 – Example: the game of Chicken
Second-mover advantage
 – Players want to observe their opponent's move before acting
 – Both players try not to disclose their moves
 – Example: the matching pennies game
Dynamic games: backward induction
Sequential Defend-Attack games
 – Two intelligent players: Defender and Attacker
 – Sequential moves: first the Defender; afterwards the Attacker, knowing the Defender's decision
Standard game-theoretic analysis
 – Expected utilities at node S
 – Best Attacker decision at node A
 – Assuming the Defender knows the Attacker's analysis: the Defender's best decision at node D
 – Solution: [formulas in the original slide figure]
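A minimal sketch of this backward-induction computation for a sequential Defend-Attack game. The slide's utility formulas were in a figure, so the payoff numbers below are purely illustrative:

```python
# Backward induction in a sequential Defend-Attack game: the Attacker observes
# the defence d and best-responds; under common knowledge the Defender
# anticipates this. u[(d, a)] = (Defender utility, Attacker utility);
# all numbers are illustrative assumptions.
u = {('d1', 'a1'): (2, 5), ('d1', 'a2'): (6, 3),
     ('d2', 'a1'): (4, 1), ('d2', 'a2'): (3, 4)}
defences, attacks = ['d1', 'd2'], ['a1', 'a2']

def best_attack(d):
    """Attacker's best response to an observed defence d."""
    return max(attacks, key=lambda a: u[(d, a)][1])

# The Defender maximizes her utility anticipating the Attacker's best response.
best_d = max(defences, key=lambda d: u[(d, best_attack(d))][0])
print(best_d, best_attack(best_d))   # d2 a2
```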
Supporting an SEU-maximizing Defender
 – The Defender's problem
 – The Defender's solution of maximum SEU
 – Modeling input: ??
Example: Banks–Anderson (2006)
Exploring how to defend the US against a possible smallpox attack
 – Random costs (payoffs)
 – Conditional probabilities of each kind of smallpox attack, given that the terrorists know what defence has been adopted
   • This is the problematic step of the analysis
 – Compute the expected cost of each defence strategy
Solution: the defence of minimum expected cost
Predicting the Attacker's decision: the Defender's problem
 – The Defender's view of the Attacker's problem
Solving the assessment problem
 – The Defender's view of the Attacker's problem
 – Elicitation: the Attacker is an EU maximizer; D's beliefs about the Attacker's probabilities and utilities
 – MC simulation
Bayesian decision solution for the sequential Defend-Attack model
Standard Game Theory vs. Bayesian Decision Analysis
Decision analysis (unitary DM)
 – Use of decision trees
 – Opponents' actions treated as random variables
   • How to elicit probabilities on opponents' decisions??
   • Sensitivity analysis on (problematic) probabilities
Game theory (multiple DMs)
 – Use of game trees
 – Opponents' actions treated as decision variables
 – All players are EU maximizers
   • Do we really know the utilities our opponents try to maximize?
Bayesian decision analysis approach to games
One-sided prescriptive support
 – Use a prescriptive model (SEU) to support one of the DMs
 – Treat the opponent's decisions as uncertainties
 – Assess probabilities over the opponent's possible actions
 – Compute the action of maximum expected utility
The 'real' Bayesian approach to games (Kadane & Larkey 1982)
 – Weakens the common (prior) knowledge assumption
How to assess a probability distribution over the actions of intelligent others??
 – "Adversarial Risk Analysis" (DRI, DB and JR)
 – Development of new methods for eliciting probabilities of an adversary's actions
   • by modeling the adversary's decision reasoning
   • Descriptive decision models
Relevance to counterbioterrorism
Biological Threat Risk Assessment for DHS (Battelle, 2006)
 – Based on Probability Event Trees (PET)
   • Government and Terrorists' decisions treated as random events
Methodological improvements study (NRC committee)
 – PETs are appropriate for risk assessment of random failures in engineering systems, but not for adversarial risk assessment
   • Terrorists are intelligent adversaries trying to achieve their own objectives
   • Their decisions (if rational) can to some extent be anticipated
 – PETs cannot be used for a full risk-management analysis
   • The Government is a decision maker, not a random variable
Methodological improvement recommendations
Distinguish between risks from
 – Nature/accidents vs.
 – actions of intelligent adversaries
Need for models to predict Terrorists' behavior
 – Red-team role playing (simulations of adversaries' thinking)
 – Attack-preference models
   • Examine the decision from the Attacker's viewpoint (T as DM)
 – Decision-analytic approaches
   • Transform the PET into a decision tree (G as DM)
   • How to elicit probabilities of terrorist decisions??
   • Sensitivity analysis on (problematic) probabilities
   • Von Winterfeldt and O'Sullivan (2006)
 – Game-theoretic approaches
   • Transform the PET into a game tree (G & T as DMs)
Models to predict opponents' behavior
Role playing (simulations of adversaries' thinking)
Opponent-preference models
 – Examine the decision from the opponent's viewpoint
   • Elicit the opponent's probabilities and utilities from our viewpoint (point estimates)
 – Treat the opponent as an EU maximizer (= rationality?)
   • Solve the opponent's decision problem by finding his action of maximum EU
 – Assuming we know the opponent's true probabilities and utilities,
   • we can anticipate with certitude what the opponent will do
Probabilistic prediction models
 – Acknowledge our uncertainty about the opponent's thinking
Opponent-preference models
Von Winterfeldt and O'Sullivan (2006)
 – Should We Protect Commercial Airplanes Against Surface-to-Air Missile Attacks by Terrorists?
 – Decision tree + sensitivity analysis on probabilities
Parnell (2007)
Elicit the Terrorist's probabilities and utilities from our viewpoint
 – Point estimates
Solve the Terrorist's decision problem
 – Find the Terrorist's action that gives him maximum expected utility
Assuming we know the Terrorist's true probabilities and utilities,
 – we can anticipate with certitude what the terrorist will do
Parnell (2007): the Terrorist's decision tree
Paté-Cornell & Guikema (2002)
[Figure: Attacker and Defender models]
Paté-Cornell & Guikema (2002)
Assessing the probabilities of the terrorist's actions
 – From the Defender's viewpoint
   • Model the Attacker's decision problem
   • Estimate the Attacker's probabilities and utilities (point estimates)
   • Calculate the expected utilities of the Attacker's actions
 – Probability of each of the Attacker's actions proportional to its perceived EU
Feed these probabilities into the Defender's decision problem
 – The uncertainty about the Attacker's decisions has been quantified
 – Choose the defense of maximum expected utility
Shortcoming
 – If the (idealized) adversary is an EU maximizer, he would certainly choose the attack of maximum expected utility
How to assess probabilities over the actions of an intelligent adversary??
Raiffa (2002): asymmetric prescriptive/descriptive approach
 – Prescriptive advice to one party, conditional on a (probabilistic) description of how the others will behave
 – Assess the probability distribution from experimental data
   • Lab role-simulation experiments
Rios Insua, Rios & Banks (2009)
 – Assessment based on an analysis of the adversary's rational behavior
   • Assuming the opponent is an SEU maximizer: model his decision problem, assess his probabilities and utilities, and find his action of maximum expected utility
 – The uncertainty about the Attacker's decision stems from our uncertainty about his probabilities and utilities
 – Sources of information
   • Available past statistical data on the Attacker's decision behavior
   • Expert knowledge / intelligence
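This assessment strategy can be sketched as a Monte Carlo simulation: sample the Attacker's uncertain utilities from the Defender's beliefs, solve the Attacker's problem for each draw, and record the optimal attacks. The attack set and belief distributions below are illustrative assumptions:

```python
import random

# ARA-style Monte Carlo: instead of a point prediction, the Defender samples
# the Attacker's uncertain utilities from her beliefs and solves the Attacker's
# problem per draw, yielding a predictive distribution over attacks.

random.seed(0)
attacks = ['a1', 'a2']

def sample_attacker_utilities():
    # Defender's (assumed) beliefs about the Attacker's utility of each attack
    return {'a1': random.uniform(0, 10), 'a2': random.uniform(3, 7)}

N = 10_000
counts = {a: 0 for a in attacks}
for _ in range(N):
    ua = sample_attacker_utilities()
    counts[max(attacks, key=ua.get)] += 1   # each sampled Attacker maximizes utility

predictive = {a: counts[a] / N for a in attacks}
print(predictive)   # both probabilities near 0.5 under these symmetric beliefs
```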
The Defend–Attack–Defend model
Two intelligent players: Defender and Attacker
Sequential moves
 – First, the Defender moves
 – Afterwards, the Attacker, knowing the Defender's move
 – Afterwards, the Defender again, responding to the attack
Infinite regress
Standard game theory analysis
Under common knowledge of utilities and probabilities:
 – Expected utilities at node S
 – Best Attacker decision at node A
 – Best Defender decision at node D
 – Nash solution: [formulas in the original slide figure]
Supporting the Defender against the Attacker
 – Expected utilities at node S
 – At node A: ??
 – Best Defender decision at node D
Predicting the Attacker's decision: the Attacker's problem as seen by the Defender
Assessing …, given …
Monte Carlo approximation
 – Generate draws; approximate the required distribution [formulas in the original slide figure]
The assessment
The Defender may want to exploit information about how the Attacker analyzes her problem
Hierarchy of recursive analyses
 – Infinite regress
 – Stop when there is no more information to elicit
Games with private information
Example
 – Consider the following two-person simultaneous game with asymmetric information
   • Player 1 (Row) knows whether he is stronger than Player 2 (Column), but Player 2 does not
   • A player's type represents the information privately known by that player
Bayes Nash equilibrium
Assumption: a common prior over the Row player's type
 • Column's beliefs about the Row player's type are common knowledge
 • Why would Column disclose this information?
 • Why would Row believe that Column is disclosing her true beliefs about his type?
Row's strategy function (one action per type)
Bayes Nash Equilibrium
Is the common knowledge assumption realistic?
 – Column is better off reporting that …
Modeling opponents' learning of private information
Simultaneous decisions
 – Bayes Nash equilibrium
 – No opportunity to learn about this information
Sequential decisions
 – Perfect Bayesian equilibrium / sequential rationality
 – Opportunity to learn from the observed decision behavior
   • Signaling games
Models of adversaries' thinking to anticipate their decision behavior
 – need to model the opponents' learning of private information we want to keep secret
 – how would this lead to a predictive probability distribution?
Sequential Defend-Attack model with Defender's private information
Two intelligent players: Defender and Attacker
Sequential moves
 – First the Defender; afterwards the Attacker, knowing the Defender's decision
The Defender's decision takes into account her private information
 – The vulnerabilities and importance of the sites she wants to protect
 – The position of ground soldiers in the data-ferry control problem (ITA)
The Attacker observes the Defender's decision
 – The Attacker can infer/learn about the information she wants to keep secret
How to model the Attacker's learning?
Influence diagram vs. game tree representation
A game theoretic analysis
A game theoretic solution
Supporting the Defender
We weaken the common knowledge assumption
The Defender's decision problem
[Influence diagram with nodes D, S, A (marked ??) and V]
Defender’s solution
Predicting the Attacker’s move:
The Attacker's action of maximum expected utility
Assessing
How to stop this hierarchy of recursive analyses?
A potentially infinite analysis of nested decision models – where to stop?
 • Accommodate as much information as we can
 • Stop when the Defender has no more information
 • Use a non-informative or reference model at the last level
 • Sensitivity analysis test