Verifying Equilibria in Games of Complete and Perfect Information Represented by Presburger Formulas

Emmanuel M. Tadjouddine and Frank Guerin
email {etadjoud,fguerin}@csd.abdn.ac.uk
Department of Computing Science, King’s College, University of Aberdeen, Aberdeen AB24 3UE, Scotland.

ABSTRACT
Given a description of a game and a strategy profile, we are interested in the problem of deciding whether the strategy profile is an equilibrium of the game. In the present paper we restrict our attention to games of complete and perfect information. The game is represented by a sequence of Presburger formulas and the equilibrium property is checked by computing conservative approximate least fixpoints using abstract interpretation. This work has applications in open multi-agent systems, where agents will need to be able to verify that a recommended strategy profile is an equilibrium of a game.
1. INTRODUCTION
It is when we try to apply game theory in concrete computational settings that we often become aware of the difficulties involved in satisfying the standard assumptions on which much of the theory is based. The concept of equilibrium is a good example; the Nash equilibrium rests on assumptions of perfect rationality and common knowledge. Suppose that we present a group of agents with the rules of a game which has a unique Nash equilibrium. If we expect these agents to play the equilibrium, what assumptions are we making? We must assume that the agents are rational, and that they are capable of analysing the presented game and determining the equilibrium strategies. This can be difficult [GGS05, vS02] and has led some researchers to consider designing auctions where bounded rationality is a consideration [Par04b]. An alternative is to present the agents with the equilibrium strategy profile, as well as the rules of the game. The common knowledge which must be achieved is that each agent must know that it is not in its interest to deviate, given that the other agents follow the strategy. Also, each agent must know that the other agents know that they have no incentive to deviate. Thus each agent must check all possible moves at each choice point in the game.

This work is motivated by the desire to apply game theory in open multi-agent systems. By open we mean that foreign agents are free to enter and leave the system at will; such agents will need to be able to work with previously unseen game theory mechanisms, and hence they will need some way to attain common knowledge of the equilibria. This is the scenario in which we met the problem of how to achieve this state of common knowledge. The same problem will arise in many scenarios, for example where agents have autonomy to agree new rules for the game they are playing, or to propose to begin a new mechanism. The area of automated mechanism design is receiving increasing attention [DJP03], and it is envisaged that an agent might make a tailor-made mechanism to suit the unique scenario in a particular interaction [CS04]. Clearly all the agents in the group will need to attain common knowledge about the equilibrium of the mechanism.
To achieve common knowledge of the game and equilibrium we will represent both the game and the recommended strategy profile in a machine-readable formalism, and we will provide agents with a method for deciding whether the recommended strategy profile is indeed an equilibrium of the game. In this paper, we restrict our attention to verifying subgame perfect Nash equilibria for games with complete information.

Related work includes that of Van der Hoek [VdH04], Pauly [Pau05] and Guerin [Gue06]. In [VdH04], it is argued that the triple KRA (Knowledge, Rationality, Action) is a powerful tool for reasoning about multi-agent systems. Among other examples, the idea of checking equilibria for normal form games with complete information by using the KRA triple and the backward induction technique [Gib92, p. 58] is presented. In [Pau05], a WHILE-language is extended to represent game theory mechanisms with complete information for multiple agents, and a Hoare-like calculus with pre- and post-conditions is used to verify simple mechanisms such as Solomon’s dilemma or the Dutch auction. In [Gue06], SPL (Simple Programming Language) [MP95] is extended into SMPL (Simple Mechanism Programming Language) in order to represent games in open multi-agent systems where agents have implicit preferences. The equilibrium is checked using an exhaustive algorithm that builds the entire game tree and proceeds backwards to check equilibrium properties at each tree node. This algorithm has the advantage of being robust but has an exponential complexity of m^n, where m is the branching factor and n the number of choice points in the game. In particular, the required memory storage is huge, and we may be dealing with a setting where agents have limited computer resources. In the worst case this algorithm requires checking each possible alternative strategy for each agent, where the space of possible strategies includes every possible mapping from history to action.
However, in many practical cases, e.g., the Dutch auction, large portions of this strategy space can be lumped together and dealt with abstractly. The present work follows that of [Gue06] with the following main difference: the game representation as an extended SMPL program is translated into a transition system wherein each transition is a Presburger formula. The game mechanism, preferences and utilities are simply viewed as a sequence of Presburger formulas. The main contribution of this paper is the idea of viewing the game as a sequence of Presburger formulas and then using symbolic analysis based on model checking and abstract interpretation to verify subgame perfect Nash equilibria. The advantage this gives over the approach of [Gue06] is that it makes it possible to verify an equilibrium without necessarily needing to construct the entire game tree. The verification is carried out using a symbolic forward analysis that evaluates Presburger formulas using a satisfiability solver, e.g., the Omega library [Pug91]. The memory requirement of this algorithm is proportional to that required by the original program. However, deciding whether a Presburger formula is true or false has a triply exponential complexity in the size of the formula [Opp78]. The good news is that the Omega library implements efficient methods that have exhibited low-order polynomial time complexity in practice [Pug91].

The remainder of this paper is organised as follows. Section 2 reviews game theory mechanisms. Section 3 describes the process of representing a game as a sequence of Presburger formulas. Section 4 informally describes our approach to checking a subgame perfect Nash equilibrium with the help of a simple example (the cake cutting protocol). Section 5 describes in a formal way our verification approach based on symbolic forward analysis. Finally, Section 6 states some remarks and provides directions for future work.
2. PRELIMINARIES: GAME THEORY MECHANISM
A game theory mechanism [FT91, Gib92, Bin92, Par04a] aims at implementing a desired outcome in multi-agent systems, where agents are self-interested and have preferences, by providing them with incentive compatible strategies for playing the game. Let G denote the game, N the set of n agents, A the set of alternative actions (moves) and O the set of outcomes of the game. Each agent i has a preference θi, which is a partial ordering over the set of outcomes. We assume preferences are common knowledge to the agents; therefore the game is of complete information. We also assume that each player knows the entire history of the game when making a move. Hence we are dealing with games of perfect information.

A strategy σi for an agent i is a detailed plan of possible actions for all configurations of the game. It can be viewed as a function that maps the game history to an action ai ∈ A. A strategy profile σ = (σ1, . . . , σn) is a tuple of n mappings σi, i = 1, . . . , n, for the n agents of the game. We write σ−i for the tuple of the n−1 strategies of all players except player i, and (σi, σ−i) for the n-component tuple (σ1, . . . , σn).

The most studied stable strategy or equilibrium in this setting is the Subgame Perfect Nash Equilibrium (SPNE) [FT91, Gib92, Bin92]. In order to define an SPNE it is first necessary to define a Nash Equilibrium (NE). A strategy profile σ∗ = (σ1∗, . . . , σn∗) is a NE if for every agent i, every type θi, and every strategy σi, we have:

\[
u_i(\theta, o(\sigma_i^{*}, \sigma_{-i}^{*})) \;\geq\; u_i(\theta, o(\sigma_i, \sigma_{-i}^{*})) \qquad (2.1)
\]

Strategy profile σ∗ is a SPNE if the restriction of σ∗ to K is a NE of K for every proper subgame K of the game G. This means that for each agent i, the strategy σi∗ is optimal if all other agents behave rationally.
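Inequality (2.1) can be made concrete on a small normal form game. The following is only an illustrative sketch: the function name `is_nash`, the 2x2 payoff table and the strategy labels "C" and "D" are assumptions introduced here and do not come from the paper. It enumerates every unilateral deviation and rejects the profile as soon as one deviation is strictly profitable.

```python
from itertools import product


def is_nash(strategy_sets, utility, profile):
    """Check inequality (2.1): no agent gains by deviating unilaterally from `profile`."""
    for i, options in enumerate(strategy_sets):
        u_star = utility(profile)[i]
        for alternative in options:
            deviated = profile[:i] + (alternative,) + profile[i + 1:]
            if utility(deviated)[i] > u_star:
                return False
    return True


# Hypothetical 2x2 game (prisoner's dilemma style payoffs), used only as an illustration.
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}
strategy_sets = [("C", "D"), ("C", "D")]

equilibria = [
    profile for profile in product(*strategy_sets)
    if is_nash(strategy_sets, PAYOFF.__getitem__, profile)
]
print(equilibria)
```

For an SPNE the same test would have to hold in every proper subgame, which is what the later sections address for extensive form games.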
3. GAME REPRESENTATION AND PRESBURGER FORMULA
Following [Gue06], the game will be represented algorithmically by a computer program in a high level language. The chosen programming language is SMPL.
3.1 From Game to Program
As defined in [Gue06], a finite extensive form game G is a tuple ⟨N, A, T, INF, m, u⟩:
(i) A nonempty finite set N of n agents.
(ii) A nonempty finite set A of agent actions or choices.
(iii) A game tree T that captures all possible moves for each player at each stage of the game.
(iv) An information set function INF mapping each nonterminal node to the set of nodes in the same information set as it. It is required that each node in an information set has the same set of choices.
(v) A mover function m mapping each game node to the agent who has the move at that node (obviously m must map all nodes in the same information set to the same agent).
(vi) A utility function u mapping each terminal node to a tuple ⟨u1, u2, . . . , un⟩ giving the utility for each agent.

From this definition, we see the possibility of representing a game algorithmically. Thus, a game can be viewed as a computer program in a high level language. In this setting, a game tree represents the symbolic execution of the computer program. We choose the SMPL language for its simplicity both syntactically and semantically as well as its use in reactive systems. As we are dealing with games of complete information and explicit preferences, SMPL adds a choose construct for a player to select a move among a set of alternatives. SMPL was originally designed with imperfect information games in mind, so it allows us to state explicitly what messages are passed between what agents, and so to keep track of their information sets. However, this facility is not fully exploited in this paper.
3.2 SMPL Syntax
A program consists of a number of modules representing each of the agent processes. It is required that the module of exactly one agent will begin with a choose statement. The final statement in any terminating run of an SMPL program must assign a value to the special variable U, which is a tuple giving the utility for all agents. A channel is a variable whose value is a list of integers. We identify channel variables as follows: αi will be an input channel for agent i and any other agent can write to it; i.e. only agent i can read from this channel. There are no other channels. With the exception of channel variables and the utility variable, statements in each module refer only to variables which are local to that module.

Basic Statement            Description
u := e                     assignment: assign value e to variable u
choose u c1..c2            choose a value in the interval for variable u
await c                    wait for Boolean expression c
α ⇐ e                      send expression e on channel α
α ⇒ u                      receive on channel α and store in variable u
if c then S1 else S2       conditional statement
if c then S1               one branch conditional statement
S1; . . . ; Sk             concatenation: sequential execution
while c do S               repetition of S

3.3 SMPL Program Semantics
The SMPL semantics is defined via a transition system. The transition system has variables corresponding to the program’s variables, and it has transitions which describe how those variables change as program statements are executed. The program will identify a transition system, and the transition system will define the possible sequences of states it could produce. Thus the semantics of a program is given in terms of possible sequences of states (of variables) it could produce. A program identifies a unique transition system ⟨V, I, T⟩. The variables come from a universal set of typed variables V, called the vocabulary. From this we can construct expressions (such as x + 3y + 4), atomic formulae (such as (x + 3y) > 7) and assertions (such as x > y ∧ y < 4). A state s is an interpretation of V, assigning each variable u ∈ V a value s[u] over its domain.
The initial condition I is an empty value for all channels (α = λ) and the control variable equal to the set of entry locations for each agent. If a state s of the system satisfies the assertion I, then it is a state from which the system can start running. T is a set of transitions including one transition corresponding to each statement in the program. Denoting by π and Y the sets of control and non-control variables, we have V = π ∪ Y. Figure 1 summarises the SMPL semantics. Note that |α| means the length of the channel α and the • symbol is used to add an element to one end of a list; for example α′ = α • e means that the value of α in the successor state will be equal to what it was previously, but with e appended to the end. Denoting by ℓ the statement’s label and by ℓ̂ its post-label, the abbreviation m(ℓ, ℓ̂) means a move of control from location ℓ to location ℓ̂, i.e. the control variable which now points to ℓ will subsequently point to ℓ̂. Furthermore, p(U) means the variables in the set U are not changed by the transition.

Figure 1: SMPL Semantics.

SMPL Statement                      Transition Relation
u := e                              m(ℓ, ℓ̂) ∧ u′ = e ∧ p(Y − {u})
choose u c1..c2                     m(ℓ, ℓ̂) ∧ p(Y − {u}) ∧ ⋁_{c=c1..c2} u′ = c
await c                             m(ℓ, ℓ̂) ∧ c ∧ p(Y)
α ⇐ e                               m(ℓ, ℓ̂) ∧ α′ = α • e ∧ p(Y − {α})
α ⇒ u                               m(ℓ, ℓ̂) ∧ |α| > 0 ∧ α = u′ • α′ ∧ p(Y − {u, α})
if c then ℓ1: S1                    [m(ℓ, ℓ1) ∧ c ∧ p(Y)] ∨ [m(ℓ, ℓ̂) ∧ ¬c ∧ p(Y)]
if c then ℓ1: S1 else ℓ2: S2        [m(ℓ, ℓ1) ∧ c ∧ p(Y)] ∨ [m(ℓ, ℓ2) ∧ ¬c ∧ p(Y)]
while c do [ℓ1: S ℓ̂:]               [m(ℓ, ℓ1) ∧ c ∧ p(Y)] ∨ [m(ℓ, ℓ̂) ∧ ¬c ∧ p(Y)]
Each transition maps each state onto a set of possible successor states. If a transition τ maps a state s to a non-empty set of possible successor states then τ is enabled on s, if it maps s to the null set then the transition is disabled on state s. The transitions in the system tell us how one state can move to the next. A transition is taken at state s if the next state is related to s by the transition. A sequence of states (possibly infinite) s0 , s1 , s2 , s3 , . . . is called a computation of the program P (which identifies our transition system) if s0 satisfies the initial condition I and if each state sj+1 is accessible from the previous state sj via one of the transitions T in the system. If it is a finite computation then there will be a final state sn which has no successor state. A computation is a sequence of states that could be produced by an execution of the program. Given a fixed decision for each agent’s choice points, an SMPL program should produce a single computation; otherwise it is not a valid SMPL program. This means that at any state, all the agents, except one, should be at an await statement, or should have terminated. This restriction ensures that we have a unique history of the system corresponding to a single state of the game. Furthermore, a valid SMPL program must have no infinite computations; this ensures that all games represented by SMPL programs are finite. The program should also have a unique start state.
3.4 From Program to Presburger Formula
The semantics of the SMPL language is defined in terms of a transition system. As shown in Figure 1, each transition of the system is a first order logic formula. Note that the quantities |α|, m(ℓ, ℓ̂) and p(V) manipulate integer values and can be expressed as linear transformations. They are then viewed as representing Presburger formulas. In particular p(V) is a conjunction expressing that each variable of the set V remains unchanged. Consequently, each transition can be viewed as a Presburger formula. A Presburger formula consists of affine constraints over integer variables and constants, the logical connectives ¬, ∨ and ∧ and the quantifiers ∀ and ∃. As an example, the formula ∃p, n = 2p + 1 is true for every odd integer n. Presburger formulas are generated using the following grammar:

f    ::= Expr Op Expr | ¬f | ∃Var f | ∀Var f | f ∧ f | f ∨ f
Expr ::= Const | Var | (Expr) | Expr + Expr
Op   ::= = | < | > | ≤ | ≥
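To make the grammar tangible, here is a minimal sketch of an evaluator for formulas built from it. The tuple encoding, the names `value`, `holds` and `OPS`, and the bounded `domain` are assumptions made for illustration. Quantifiers are checked by enumerating a finite range, which is not a decision procedure for Presburger arithmetic and is not how the Omega library works; it only shows how a formula such as ∃p, n = 2p + 1 is read.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
       "<=": operator.le, ">=": operator.ge}


def value(expr, env):
    """Evaluate an Expr: ("const", k), ("var", name) or ("add", e1, e2)."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    if kind == "add":
        return value(expr[1], env) + value(expr[2], env)
    raise ValueError(f"unknown expression kind: {kind}")


def holds(f, env, domain):
    """Evaluate a formula; quantified variables range over the finite `domain` only."""
    kind = f[0]
    if kind == "atom":  # Expr Op Expr
        _, op, lhs, rhs = f
        return OPS[op](value(lhs, env), value(rhs, env))
    if kind == "not":
        return not holds(f[1], env, domain)
    if kind == "and":
        return holds(f[1], env, domain) and holds(f[2], env, domain)
    if kind == "or":
        return holds(f[1], env, domain) or holds(f[2], env, domain)
    if kind in ("exists", "forall"):
        _, var, body = f
        results = (holds(body, {**env, var: v}, domain) for v in domain)
        return any(results) if kind == "exists" else all(results)
    raise ValueError(f"unknown formula kind: {kind}")


p = ("var", "p")
# exists p. n = p + p + 1  (2p is written p + p because the grammar only offers +)
is_odd = ("exists", "p",
          ("atom", "=", ("var", "n"), ("add", ("add", p, p), ("const", 1))))

domain = range(-50, 51)
odd_values = [n for n in range(10) if holds(is_odd, {"n": n}, domain)]
print(odd_values)
```

A real verifier would hand such formulas to a solver that handles unbounded integers symbolically.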
To summarise, the game is represented by an SMPL program, which is described as a transition system over a set of states. If the game is finite, then the transition system is finite. Moreover, each transition represents a Presburger formula. Consequently, a finite game with complete information can be viewed as a sequence of elementary Presburger formulas allowing us to encode and manipulate the program states. Formally, a finite game with complete information can be defined as a transition system S = ⟨N, A, Q, I, F, PF, u, sat⟩ in which:
(i) N is a finite set of agents.
(ii) A is a finite set of actions or choices.
(iii) Q is a finite set of states.
(iv) I is the non-empty set of initial states.
(v) F is the non-empty set of final states.
(vi) PF is a set of Presburger formulas describing transition functions, one for each player i ∈ N, which maps each state of the system to the set of actions A.
(vii) u : F → Z^n is a utility function that assigns utilities to the n players for all possible final states of the game.
(viii) sat : Q × PF → {True, False} is the satisfiability function of Presburger formulas over the set of states.

The good news is that Presburger arithmetic is decidable, even though its decision procedure has a worst case time complexity that is triply exponential in the size of the formula [Opp78]. This indicates a prohibitive complexity, and one might believe that only very simple Presburger formulas can be verified. However, the Omega library [Pug91], developed for dependence analysis in compiler optimisations, uses integer linear programming techniques to implement efficient methods, which performed with low-order polynomial time complexity in practice [Pug91]. See also http://www.cs.umd.edu/projects/omega/
4. VERIFYING SPNE (SUBGAME PERFECT NASH EQUILIBRIUM)
Backwards induction is a straightforward method of checking a SPNE. However, backwards induction requires us to first build the game tree so that we know what the terminal nodes are, and work backwards from them. In many games it is not feasible to build and maintain a representation of the entire tree. We would like to be able to check a SPNE by stepping through the game from the initial node, without needing to maintain a representation of the entire tree. Our approach can be illustrated by the simple example depicted in Figure 2. Nodes have been labelled a . . . g. The relevant utilities have been labelled x1 . . . x6 . If we were doing backwards induction on this game, we would start by knowing the numerical values of these utilities, and then we would check that x3 ≥ x4 at node b. When this is verified, the utility of node d would be propagated up to node b. Similarly, once it is verified that x5 ≥ x6 , the utility of f will be propagated up to node c, and then we can check x1 ≥ x2 to verify that the choice made at node a is rational.
Figure 2: Example game: Dark lines show SPNE.
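The backwards induction check described for Figure 2 can be sketched as a short recursive procedure. The numerical utilities, the assumption that player 2 moves at nodes b and c, and the names `LEAVES`, `CHILDREN`, `MOVER`, `RECOMMENDED` and `check_spne` are all hypothetical; the paper gives only the shape of the tree and the symbolic utilities x1 . . . x6.

```python
# Terminal nodes carry a utility pair (player 1, player 2); the numbers are made up.
LEAVES = {"d": (3, 2), "e": (1, 1), "f": (2, 4), "g": (5, 0)}
CHILDREN = {"a": ("b", "c"), "b": ("d", "e"), "c": ("f", "g")}
MOVER = {"a": 0, "b": 1, "c": 1}                # index of the agent moving at each node (assumed)
RECOMMENDED = {"a": "b", "b": "d", "c": "f"}    # plays the role of the dark lines in the figure


def check_spne(node, recommended):
    """Return (ok, utilities): ok is True when the recommended choice is optimal
    at `node` and in every subgame below it; utilities is the vector propagated up."""
    if node in LEAVES:
        return True, LEAVES[node]
    results = {child: check_spne(child, recommended) for child in CHILDREN[node]}
    ok = all(flag for flag, _ in results.values())
    chosen = results[recommended[node]][1]
    i = MOVER[node]
    # The mover must not prefer the value propagated up from any other child.
    ok = ok and all(chosen[i] >= utilities[i] for _, utilities in results.values())
    return ok, chosen


print(check_spne("a", RECOMMENDED))
```

Note that the recursion holds results for a whole subtree before it can answer at the root, which is the memory cost the forward approach below tries to avoid.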
Figure 3: Simple Cake Cutting Mechanism represented as an SMPL program

P1 ::  [ ℓ0: choose c 0..100;
         ℓ1: α2 ⇐ c;
         ℓ2 ]

P2 ::  [ m0: await |α2| > 0;
         m1: α2 ⇒ c1;
         m2: c2 := 100 − c1;
         m3: choose s 1..2;
         m4: if s = 1 then m5: ⟨U1, U2⟩ := ⟨c1, c2⟩
                      else m6: ⟨U1, U2⟩ := ⟨c2, c1⟩;
         m7 ]
If we want to work in the forward direction, we start at the root node and step through each branch. At the root we know that we need to check that the utility player 1 obtains for playing left is greater than or equal to the utility he obtains for playing right. Having not yet built the entire game tree, we do not have values for these utilities; however, we do have their algebraic expressions, given by the game mechanism. Our approach is simply to create a constraint at each choice point, and to carry these constraints forward until reaching the terminal nodes, where they can be checked using a sat solver, e.g., the Omega library. For example, at the root node we create variables for the values we do not know: x1 will be the utility obtained by playing left and following the recommended strategies thereafter; x2 will be the utility obtained by playing right and following the recommended strategies thereafter. The constraint to be carried forward is x1 ≥ x2. We can find the value for x2 by following the right hand branch, verifying that x5 ≥ x6 at the end of it, and returning the x2 value. We then continue stepping through the equilibrium branch of the tree in this way. At node b we pick up the value for x4 from the right branch. By the time we reach the terminal node on the equilibrium branch, we will have the constraint x1 ≥ x2 ∧ x3 ≥ x4. The main advantage of this is that we do not need to maintain the tree; we can forget about the parts we have stepped through, so long as we keep track of the symbolic expressions of the relevant xi variables and the constraints to be checked.

This simple example provides an illustration of a symbolic forward analysis where the game can be represented by straight line code with branches but no loops. In the presence of loops, the flow of the game mechanism may be viewed as a cyclic digraph. Therefore, one single forward pass may not be sufficient for checking the SPNE property. Instead, we need to iteratively evaluate the equilibrium constraints over the set of states until we find a terminal state in which the equilibrium does not hold or the set of states satisfying the constraints remains constant. In other words, we need to calculate a fixed-point iteration; see Section 5.
4.1 Simple Example: Cake Cutting Protocol
Our cake cutting scenario has two players, and a long rectangular cake. Player 1 can cut the cake at any position along its length from 0 to 100. Player 2 then picks one part, leaving the remainder for Player 1. The equilibrium strategy profile is for Player 1 to cut the cake in half (at 50) and for Player 2 to pick the largest part (if they are equal he simply picks the first part). We represent the game mechanism by a single program. Figure 3 shows the cake cutting mechanism in our SMPL language. Note that |α2| means the number of messages waiting on channel α2. This SMPL program can be translated into Presburger formulas
as follows. Firstly there is the initial assertion:

  α2 := [ ] ∧ π := ⟨ℓ0, m0⟩

Each statement of the program has a transition relation that can be written as a Presburger formula. These formulas describe the variables that change or stay the same; for example, the first P2 transition is:

  π2 = m0 ∧ π2′ = m1 ∧ |α2| > 0 ∧ c′ = c ∧ c1′ = c1 ∧ c2′ = c2 ∧ s′ = s ∧ α2′ = α2 ∧ U′ = U

We will give the full program now in abbreviated form, where m is an abbreviation for the movement of control from one program location to another, and p(V) means the set of variables V is not altered:

  m(ℓ0, ℓ1) ∧ c′ ≥ 0 ∧ c′ ≤ 100 ∧ p(Y − {c})
  m(ℓ1, ℓ2) ∧ α2′ = α2 • c ∧ p(Y − {α2})
  m(m0, m1) ∧ |α2| > 0 ∧ p(Y)
  m(m1, m2) ∧ |α2| > 0 ∧ α2 = c1′ • α2′ ∧ p(Y − {c1, α2})
  m(m2, m3) ∧ c2′ = 100 − c1 ∧ p(Y − {c2})
  m(m3, m4) ∧ s′ ≥ 1 ∧ s′ ≤ 2 ∧ p(Y − {s})
  [m(m4, m5) ∧ s = 1 ∧ p(Y)] ∨ [m(m4, m6) ∧ s ≠ 1 ∧ p(Y)]
  m(m5, m7) ∧ ⟨U1, U2⟩′ = ⟨c1, c2⟩ ∧ p(Y − {⟨U1, U2⟩})
  m(m6, m7) ∧ ⟨U1, U2⟩′ = ⟨c2, c1⟩ ∧ p(Y − {⟨U1, U2⟩})

where s ≠ 1 is an abbreviation for ¬(s = 1). In addition to the variables explicitly mentioned in the program, we need history variables to keep a record of all communication sent or received by each agent. The initial assertion becomes:

  αh1 := [ ] ∧ αh2 := [ ] ∧ α2 := [ ] ∧ π := ⟨ℓ0, m0⟩

Only one of the program transitions involves a message being sent; we now give it in abbreviated form:

  m(ℓ1, ℓ2) ∧ α2′ = α2 • c ∧ αh1′ = αh1 • c ∧ αh2′ = αh2 • c ∧ p(Y − {α2, αh1, αh2})
Now we consider the equilibrium strategies. Recall that each player’s strategy describes what action he should take in any given history:

  S1 :: [ action := 50 ]
  S2 :: [ if history[1] > 50 then action := 2 else action := 1 ]

The history for each player p refers to the value of his special history channel αhp. These can be turned into symbolic representations of sets of states by stepping forward, resulting in these sets:

  S1 : {action := 50}
  S2 : {history[1] > 50 ∧ action = 2} ∪ {history[1] ≤ 50 ∧ action = 1}

The strategies in this game are particularly simple; player 1’s is so simple that he does not even consider the history, as there is only one game node in which he is asked to act. In more complex cases there may be long sequences of actions. In general the symbolic representation of a strategy will consist of several disjoint sets, each of which specifies the action to be taken in a set of states with some condition on their history.

Now we step forward through the cake cutting mechanism to check if these strategies are an equilibrium. Each time we meet a choice point, we branch in two: one branch for the choice recommended by the strategy, and one branch for all deviant choices. Note that the recommended strategy is one execution path of the game mechanism. It also represents a particular action to be taken by a player in a given configuration of the game. In the SMPL representation, this choice is performed by the choose statement. Although it is useful to encode the equilibrium property by a formula to be checked over the states of the transition system, such a formula may be hard to formalise in general games with complex flow and a high number of configurations. Here, to express the SPNE for the cake cutting protocol, we create new variables x1, x2, . . . to record the utility which the acting agent can gain from each branch. At each choice point we express the set of conditions that must hold according to equation (2.1). We obtain the following tree, where branch 1 follows the recommended strategies:

  initial state (branch 1)
    choose c
      comply (c = 50): branch 1
        choose s
          comply: branch 1
          deviate: branch 3
      deviate (c ≠ 50): branch 2
        choose s
          comply: branch 2
          deviate: branch 4
We will now give the sequences of states in each branch. To keep things uncluttered we will only specify the values of variables that have changed. While expanding the branches we add constraints to an assertion xConstraints which is the equilibrium property; i.e. if xConstraints can be satisfied, given the assignment of variables at the end of each branch, then the recommended strategy is an equilibrium.
initial:
  π = ⟨ℓ0, m0⟩ ∧ αh1 := [ ] ∧ αh2 := [ ] ∧ α2 := [ ]
    split to branch 2
      let x1 be the utility we get back for player 1
      we get back: 0 ≤ x1 ≤ 49
      add to xConstraints; xConstraints = U1 ≥ x1
  π = ⟨ℓ1, m0⟩ ∧ c = 50
  π = ⟨ℓ2, m0⟩ ∧ |α2| = 1 ∧ α2[1] = 50
  π = ⟨ℓ2, m1⟩
  π = ⟨ℓ2, m2⟩ ∧ c1 = 50 ∧ |α2| = 0
  π = ⟨ℓ2, m3⟩ ∧ c2 = 50
    split to branch 3
      let x3 be the utility we get back for player 2
      we get back: x3 = 50
      add to xConstraints; xConstraints = U1 ≥ x1 ∧ U2 ≥ x3
  π = ⟨ℓ2, m4⟩ ∧ s = 1
  π = ⟨ℓ2, m5⟩
  π = ⟨ℓ2, m7⟩ ∧ ⟨U1, U2⟩ = ⟨50, 50⟩
  therefore we can verify U1 ≥ x1 ∧ U2 ≥ x3

Note that α2[1] means the first message waiting on channel α2.

branch 2:
  π = ⟨ℓ1, m0⟩ ∧ c ≠ 50 ∧ c ≥ 0 ∧ c ≤ 100
  π = ⟨ℓ2, m0⟩ ∧ |α2| = 1 ∧ α2[1] ≠ 50 ∧ α2[1] ≥ 0 ∧ α2[1] ≤ 100
  π = ⟨ℓ2, m1⟩
  π = ⟨ℓ2, m2⟩ ∧ c1 ≠ 50 ∧ c1 ≥ 0 ∧ c1 ≤ 100 ∧ |α2| = 0
  π = ⟨ℓ2, m3⟩ ∧ c2 = 100 − c1
    split to branch 4
      let x2 be the utility we get back for player 2
      we get back: 0 ≤ x2 ≤ 49
      add to xConstraints; xConstraints = U2 ≥ x2
  π = ⟨ℓ2, m4⟩ ∧ [ (s = 1 ∧ c1 ≤ 49 ∧ c1 ≥ 0) ∨ (s = 2 ∧ c1 ≥ 51 ∧ c1 ≤ 100) ]
  [ (π = ⟨ℓ2, m5⟩ ∧ s = 1 ∧ c1 ≤ 49 ∧ c1 ≥ 0) ∨ (π = ⟨ℓ2, m6⟩ ∧ s = 2 ∧ c1 ≥ 51 ∧ c1 ≤ 100) ]
  π = ⟨ℓ2, m7⟩ ∧ [ (⟨U1, U2⟩ = ⟨c1, c2⟩ ∧ c1 ≤ 49 ∧ c1 ≥ 0) ∨ (⟨U1, U2⟩ = ⟨c2, c1⟩ ∧ c1 ≥ 51 ∧ c1 ≤ 100) ]
  therefore 0 ≤ U1 ≤ 49 and 51 ≤ U2 ≤ 100
  this U1 value is passed back to branch 1 as x1

branch 3:
  π = ⟨ℓ2, m4⟩ ∧ s = 2 ∧ x4 = U[2]
  π = ⟨ℓ2, m6⟩
  π = ⟨ℓ2, m7⟩ ∧ ⟨U1, U2⟩ = ⟨50, 50⟩
  this U2 value is passed back to branch 1 as x3

branch 4:
  π = ⟨ℓ2, m4⟩ ∧ [ (s = 2 ∧ c1 ≤ 49 ∧ c1 ≥ 0) ∨ (s = 1 ∧ c1 ≥ 51 ∧ c1 ≤ 100) ]
  [ (π = ⟨ℓ2, m6⟩ ∧ s = 2 ∧ c1 ≤ 49 ∧ c1 ≥ 0) ∨ (π = ⟨ℓ2, m5⟩ ∧ s = 1 ∧ c1 ≥ 51 ∧ c1 ≤ 100) ]
  π = ⟨ℓ2, m7⟩ ∧ [ (⟨U1, U2⟩ = ⟨c2, c1⟩ ∧ c1 ≤ 49 ∧ c1 ≥ 0) ∨ (⟨U1, U2⟩ = ⟨c1, c2⟩ ∧ c1 ≥ 51 ∧ c1 ≤ 100) ]
  this U2 value is passed back to branch 2 as x2, i.e. 0 ≤ x2 ≤ 49

Given the final xi values above, we can verify that xConstraints is true, and hence the recommended strategies are an equilibrium. This verification will be carried out by the sat solver. However, the task would be much easier if we had a formula that expresses the equilibrium property, since we could then step through the game forward in time and check the formula at the final states. In Section 4.3, we develop an approach aiming at expressing a game equilibrium as a temporal formula using CTL [HR00, p. 207].
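Because the cake cutting game is tiny, the result of the symbolic walk can be cross-checked by plain enumeration. The sketch below is not the symbolic procedure of this paper: it simply executes the utility assignment of the Figure 3 program for every cut c in 0..100 and every selection s in 1..2. The function names `outcome`, `sigma2` and `is_spne` are assumptions introduced for illustration.

```python
def outcome(c, s):
    """Utilities (U1, U2) assigned by the Figure 3 program for cut c and selection s."""
    c1, c2 = c, 100 - c
    return (c1, c2) if s == 1 else (c2, c1)


def sigma2(c):
    """Recommended strategy of player 2: action 2 if history[1] > 50, else action 1."""
    return 2 if c > 50 else 1


def is_spne(cut=50):
    """Check the profile (cut, sigma2) in every subgame by exhaustive enumeration."""
    # Player 2 must be best-responding after every possible cut, not only after `cut`.
    for c in range(0, 101):
        u_star = outcome(c, sigma2(c))[1]
        if any(outcome(c, s)[1] > u_star for s in (1, 2)):
            return False
    # Player 1 must not gain by another cut, given that player 2 follows sigma2.
    u_star = outcome(cut, sigma2(cut))[0]
    return all(outcome(c, sigma2(c))[0] <= u_star for c in range(0, 101))


# Best utility player 1 can obtain by deviating from the recommended cut of 50.
best_deviation = max(outcome(c, sigma2(c))[0] for c in range(0, 101) if c != 50)
print(is_spne(50), best_deviation)
```

The value of `best_deviation` corresponds to the upper bound on x1 obtained from branch 2 above.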
4.2 The SPNE Checking Procedure
In the case of a loop-free game mechanism, the SPNE checking is carried out by propagating forward the constraints expressing the SPNE, starting from the set of initial states I0 and ending at the final states of the transition system. Each transition is viewed as a Presburger formula and can be represented symbolically. We will explain the symbolic representation of the states in Section 5.2. The propagation requires the ability to determine the set of successor states succ(Q) for a given set of states Q. The operations involved in manipulating the states of the transition system are the usual set operations (intersection, union, and difference). Algorithm 4.1 takes as input a loop-free game mechanism represented by a transition system that encodes Presburger formulas, an equilibrium strategy profile (σi∗)i and a sat solver, and returns a boolean indicating whether the given strategy profile is an SPNE of the mechanism.

ALGORITHM 4.1 (SPNE CHECKING).
  Set spne := False
  Set Q := I0
  Constraints := empty
  While succ(Q) ≠ ∅ Do
    If succ(Q) is choice point Then
      Let i be the player choosing
      Let ui∗ be its utility from σi∗
      For all remaining choices σi Do
        Let ui be i’s utility for those choices
        Constraints := Constraints ∪ {ui∗ ≥ ui}
      End Do
    End If
    Q := Q ∪ succ(Q)
  End Do
  Check Constraints over the set of states Q using a sat solver
  If sat(Constraints) = True Then
    spne := True
  End If

This simple checking procedure for loop-free game mechanisms can be generalised to more complicated mechanisms containing loops. From now on, we assume the game mechanism contains loops and has an equilibrium point that can be expressed as a Presburger formula as well. The equilibrium must hold at the end of the game. Such a property can be expressed in a temporal logic. Furthermore, the complex nature of the control flow of the game suggests that one single forward pass may not be sufficient to ensure the equilibrium property holds. This allows us to express the equilibrium check as a fixed-point calculation over the set of states of the transition system. In the case of a loop-free game mechanism, we can see that computing a fixed-point will take at most 2 iterations to converge.
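The forward pass of Algorithm 4.1 can be sketched on an explicit, loop-free game as follows. This is a simplified illustration under several assumptions: the game is given as an explicit tree rather than as Presburger formulas, utilities are concrete integers so no sat solver is involved, a frontier of unexplored nodes replaces the growing set Q, and the names `GAME`, `UTILITY`, `follow` and `spne_check` as well as the numbers are hypothetical.

```python
# Choice points: node -> (moving player index, {action: child}); values are made up.
GAME = {
    "a": (0, {"L": "b", "R": "c"}),
    "b": (1, {"L": "d", "R": "e"}),
    "c": (1, {"L": "f", "R": "g"}),
}
UTILITY = {"d": (3, 2), "e": (1, 1), "f": (2, 4), "g": (5, 0)}


def follow(node, sigma):
    """Terminal utilities reached from `node` when every player follows `sigma`."""
    while node in GAME:
        node = GAME[node][1][sigma[node]]
    return UTILITY[node]


def spne_check(root, sigma):
    """Collect one constraint u_star >= u per deviation, then check them all."""
    constraints = []
    frontier = [root]
    while frontier:  # forward pass; terminates because the game is loop-free
        node = frontier.pop()
        if node not in GAME:
            continue  # terminal node, nothing to choose
        player, moves = GAME[node]
        u_star = follow(node, sigma)[player]
        for action, child in moves.items():
            if action != sigma[node]:
                constraints.append((u_star, follow(child, sigma)[player]))
            frontier.append(child)  # every subgame is visited, on or off the path
    # Stand-in for the final solver call: all collected constraints must hold.
    return all(u_star >= u for u_star, u in constraints)


sigma_star = {"a": "L", "b": "L", "c": "L"}
print(spne_check("a", sigma_star))
```

With concrete utilities each constraint could be tested as soon as it is created; the list is kept only to mirror the structure of the algorithm, where the symbolic constraints are checked once at the end.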
4.3 Temporal Equilibrium Game Properties
We are interested in expressing game equilibria in some logic. We need a logic whose formulas are interpreted over the set of states of the program transition system. Moreover, to analyse the game, we would like to reason about the next transition and express the equilibrium property as a state formula that eventually becomes true. Thus the logic L we used is based on the following CTL [HR00, p. 207] modal operators: the quantified next state operators ($\exists\bigcirc$ and $\forall\bigcirc$) and the quantified eventually operators ($\exists\Diamond$ and $\forall\Diamond$). The formulas of the game logic L are generated by the following grammar

$$\phi ::= f \mid \neg f \mid \exists\bigcirc f \mid \forall\bigcirc f \mid \exists\Diamond f \mid \forall\Diamond f \mid \phi \wedge \phi \mid \phi \vee \phi$$

for a Presburger formula $f$ in the program transition system representing the game. Model checking for this game logic and the transition system $S = \langle N, A, Q, I, F, PF, u, sat \rangle$ attempts to answer the following: given a logic formula $\phi$, find the states $s \in Q$ for which $\phi$ is satisfied. Following [BGP01], the semantics of a logic formula is defined over maximal computation paths in the transition system. The modal operators are defined as follows: a state $s_0$ satisfies $\exists\bigcirc\phi$ ($\forall\bigcirc\phi$) iff for some (all) maximal paths $(s_0, s_1, s_2, \ldots)$, $s_1$ satisfies $\phi$. A state $s_0$ satisfies $\exists\Diamond\phi$ ($\forall\Diamond\phi$) iff for some (all) maximal paths $(s_0, s_1, s_2, \ldots)$, there exists $i$ such that $s_i$ satisfies $\phi$.

The equilibrium strategy $\sigma^*$ can be viewed as a specified maximal computation path in the transition system. In other words, a run of the game following $\sigma^*$ yields a maximal path $(s_0, s_1, \ldots, s_m)$ for which $s_m \in F$ satisfies the formula for $\sigma^*$. If $f$ represents a dominant equilibrium property for $\sigma^*$ in our game logic L, then $f$ can be expressed as $f = \exists\Diamond g$, in which $g$ can be expressed as a sequence of Presburger formulas stating that no player will benefit by deviating from $\sigma^*$ regardless of the strategies followed by all other players. In general, checking such an equilibrium involves manipulating the game history and all possible mappings from history to actions. This is exponential in the number of Presburger formulas. To overcome this exponential complexity, we verify the equilibrium property by computing the least fixed point of some function over a lattice induced by the domain of Presburger formulas via abstract interpretation [CC77].
5.
APPROXIMATION-BASED VERIFICATION OF EQUILIBRIUM PROPERTIES
The verification of the equilibrium property is carried out as a reachability problem [ACH+95] that asks whether there is a maximal path in the transition system from a state $s$ to a final state $s'$. In our analysis, the set of states Q of the transition system is partitioned using the control variables $\pi$ of the program so that each subset can then be analysed within its context. This is facilitated by the modular nature of the SPL [MP95]. The set of reachable states can be computed using fixpoint iterations over the lattice of Presburger formulas. This can be carried out using forward or backward analysis techniques. The backward analysis starts with a given state $s$ or formula $f$, then derives the set of states that can reach $s$, and finally checks that the set of initial states is included in the derived set. The forward analysis starts with the initial states, then computes the reachable state space by iterating over the set of successor states until convergence.
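The two directions of analysis can be contrasted on a small explicit-state system. This is only a sketch under assumptions: states are plain integers, the transition relation is a finite set of pairs, and the function names (`post`, `pre`, `closure`, `forward_check`, `backward_check`) are hypothetical; the symbolic analysis works on regions described by Presburger formulas instead.

```python
def post(states, edges):
    """One-step successors of a set of states."""
    return {t for (s, t) in edges if s in states}


def pre(states, edges):
    """One-step predecessors of a set of states."""
    return {s for (s, t) in edges if t in states}


def closure(start, step):
    """Least fixpoint of U -> U | step(U), computed from `start` by iteration."""
    reached = set(start)
    while True:
        new = step(reached) - reached
        if not new:              # nothing new: the iteration has converged
            return reached
        reached |= new


def forward_check(initial, final, edges):
    """Forward: is some final state among the states reachable from `initial`?"""
    reachable = closure(initial, lambda u: post(u, edges))
    return bool(reachable & set(final))


def backward_check(initial, final, edges):
    """Backward: do all initial states lie in the set that can reach `final`?"""
    can_reach = closure(final, lambda u: pre(u, edges))
    return set(initial) <= can_reach


# Toy transition relation: 0 -> 1 -> {2, 3}, and 4 -> 2.
EDGES = {(0, 1), (1, 2), (1, 3), (4, 2)}

print(forward_check({0}, {2}, EDGES))    # True
print(backward_check({0}, {2}, EDGES))   # True
print(forward_check({0}, {5}, EDGES))    # False: state 5 is never reached
```

On a finite state set both iterations terminate, because each round either adds a state or stops. That guarantee is lost once the regions range over unbounded integers.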
5.1
Symbolic Forward Analysis
To carry out the forward analysis using the transition system S encoding the game and the logic L, we define the successor function $\mathrm{succ} : Q \to 2^Q$, which maps a state $s$ to the set $\mathrm{succ}(s) = \{s' : \exists\, \tau \in PF \mid s \xrightarrow{\tau} s'\}$ of states reachable from $s$ in one step. This approach is similar to that of [BGP01]. Let $[\![\phi]\!]$ denote the set of states of Q that satisfy $\phi$, and $\mathrm{lfp}(U_n)$ the least fixpoint of the recurrence relation $U_n$. For a given formula $f$, we have the following equations:

$$
\begin{aligned}
[\![\neg f]\!] &= Q - [\![f]\!]\\
[\![f_1 \wedge f_2]\!] &= [\![f_1]\!] \cap [\![f_2]\!]\\
[\![f_1 \vee f_2]\!] &= [\![f_1]\!] \cup [\![f_2]\!]\\
[\![\exists\bigcirc f]\!] &= \mathrm{succ}(f)\\
[\![\forall\bigcirc f]\!] &= Q - \mathrm{succ}(\neg f)\\
[\![\exists\Diamond f]\!] &= \mathrm{lfp}(U_n) \text{ where } U_0 = [\![f]\!],\ U_{n+1} = U_n \cup \mathrm{succ}(U_n)\\
[\![\forall\Diamond f]\!] &= \mathrm{lfp}(U_n) \text{ where } U_0 = [\![f]\!],\ U_{n+1} = U_n \cup (\mathrm{succ}(U_n) - \mathrm{succ}(Q - U_n))
\end{aligned}
\tag{5.1}
$$

Here, $\mathrm{succ}(f)$ is interpreted as $\mathrm{succ}([\![f]\!])$. We see that the semantics of the game logic operators involves manipulating regions (sets of states) with the set operators union, intersection and difference.

Remark 5.1. If $f$ is a Presburger formula, then $[\![f]\!]$ is an affine space-region (described by a set of affine relations).

Remark 5.2. If $f$ is a Presburger formula and $\mathrm{lfp}(f)$ converges, then $\mathrm{lfp}(f)$ is an affine space-region.

It follows that $\mathrm{lfp}(f)$ can be computed as a space-region by fixed-point iteration. Although Presburger arithmetic is decidable, computing a fixpoint over the set of Presburger formulas may keep iterating forever. To ensure convergence, we need an approach allowing for a conservative approximation of the semantics of the game (here a program). This idea of semantic approximation is at the heart of abstract interpretation, wherein the concrete semantics of a program is mapped to an abstract semantics using lattice theory, so that dynamic properties of the program can be determined statically.
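The risk of non-termination can be seen on a minimal worked example, which is an assumed toy system and not part of the analysed game: a single integer variable $x$, one transition $x' = x + 1$, and the formula $f \equiv (x = 0)$. Unfolding the recurrence $U_{n+1} = U_n \cup \mathrm{succ}(U_n)$ gives:

```latex
\begin{align*}
U_0 &= \{x \mid x = 0\},\\
U_1 &= U_0 \cup \mathrm{succ}(U_0) = \{x \mid 0 \le x \le 1\},\\
U_n &= \{x \mid 0 \le x \le n\},\\
U_n &\subsetneq U_{n+1} \quad \text{for every } n \ge 0 .
\end{align*}
```

Every iterate is a Presburger-definable region, and so is the limit $\bigcup_n U_n = \{x \mid x \ge 0\}$, yet no finite number of iterations reaches it. An over-approximation that discards the growing upper bound would reach $x \ge 0$ after finitely many steps, at the price of possibly including states that are not reachable in a system with guards.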
5.2
Approximations
Traditionally, semantic analyses are defined over lattices. Let us recall the following definitions:

Definition 5.3. A complete partial order is a set $E$ partially ordered by a relation $\sqsubseteq$; $E$ has a least element $\bot$; and every countable increasing sequence $e_0 \sqsubseteq \ldots \sqsubseteq e_n \ldots$ has a limit in $E$ denoted by $\bigsqcup_i e_i$.

Definition 5.4. A lattice is a complete partial order in which any two elements $x$ and $y$ have a greatest lower bound ($x \wedge y$) and a least upper bound ($x \vee y$). A lattice is complete if every subset has a greatest lower bound and a least upper bound.

An example of a complete lattice is the power set domain. Because we are dealing with Presburger formulas, the prime candidate for our analysis is the complete lattice formed by the domain of Presburger formulas, in which Tarski's fixpoint theorem guarantees the existence of $\mathrm{lfp}(f)$ when $f$ is an increasing mapping. For practical reasons, the space regions for the set of states are represented by convex polyhedra in our symbolic analysis. We then consider the domain of convex polyhedra with the usual set operators. We observe that the union and difference of two convex polyhedra are not, in general, convex polyhedra. A least fixpoint computation using equations (5.1) therefore requires approximations. As in [ACH+95, BGP01], we use the convex hull operator $\sqcup$ and the widening operator $\Delta$, yielding over-approximations. Denoting by $\overline{R}$ [$\underline{R}$] the over[under]-approximation of the region $R$, we have:

$$
\begin{aligned}
R_1 \sqcup R_2 &\equiv \{\lambda x + (1 - \lambda) y \mid x \in R_1,\ y \in R_2,\ 0 \le \lambda \le 1\}\\
\overline{R_1 \cup R_2} &\equiv R_1 \sqcup R_2 \subseteq R_1 \Delta R_2\\
\overline{R_1 - R_2} &\equiv R_1 - \underline{R_2}
\end{aligned}
$$

The widening operator is defined so that $R_1 \Delta R_2$ is formed by the set of affine constraints in $R_1$ that are also satisfied by $R_2$, and is chosen so that the domain of convex polyhedra is a lattice. The set difference over convex polyhedra requires a definition of best under-approximation. This is a hard problem which is context-dependent. One way of computing $R_1 - R_2$ is to represent it as a list of convex polyhedra and then to compute their convex hull [CI96].

For a given transition system $S = \langle N, A, B, Q, I, F, PF, u, sat \rangle$ and an equilibrium property expressed as a state formula $f = \exists\Diamond g$ with $g$ a Presburger formula, we verify the game equilibrium using the following algorithm:

ALGORITHM 5.1 (GENERIC FORWARD VERIFICATION).
  Set satisfied := True
  Set $Q' := \emptyset$; $Q := I$
  While (satisfied == True $\wedge\ \neg(Q == Q')$) Do
    Set $Q' := Q$
    Set $Q := Q'\ \Delta\ \mathrm{succ}(Q')$
    Evaluate satisfied := sat($g$) for the states $Q$
  End Do

At the end, $Q = [\![g]\!]$.
This algorithm runs forward in time until either it reaches a final state for which the formula $g$ is false, or $g$ remains true and the set of states stops changing.
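The loop of Algorithm 5.1 can be sketched on the simplest convex regions, namely intervals over one integer variable. This is an illustration under assumptions rather than the polyhedral implementation: a region is a non-empty pair `(lo, hi)`, the widening keeps exactly those bounds of the first interval that the second one satisfies, and `increment`, `nonneg` and `below_ten` are hypothetical stand-ins for the successor function and for evaluating the property g on a region.

```python
import math


def widen(r1, r2):
    """Keep the bounds of r1 that r2 also satisfies; drop the others to infinity."""
    (lo1, hi1), (lo2, hi2) = r1, r2
    lo = lo1 if lo2 >= lo1 else -math.inf
    hi = hi1 if hi2 <= hi1 else math.inf
    return (lo, hi)


def forward_verify(initial, succ, holds):
    """Iterate Q := Q' widen succ(Q') until g fails or Q stops changing.

    initial : non-empty interval (lo, hi) standing for the initial region I
    succ    : one-step image of an interval (assumed given by the caller)
    holds   : evaluation of the property g on a region (assumed given)
    """
    satisfied = True
    previous, current = None, initial
    while satisfied and current != previous:
        previous = current
        current = widen(previous, succ(previous))
        satisfied = holds(current)
    return satisfied, current


def increment(region):
    """Toy transition x' = x + 1 applied to an interval."""
    lo, hi = region
    return (lo + 1, hi + 1)


def nonneg(region):
    """Property g: every state of the region has x >= 0."""
    return region[0] >= 0


def below_ten(region):
    """Property g: every state of the region has x <= 10."""
    return region[1] <= 10


print(forward_verify((0, 0), increment, nonneg))     # (True, (0, inf))
print(forward_verify((0, 0), increment, below_ten))  # (False, (0, inf))
```

In the first run the upper bound is discarded after one step and the second step changes nothing, so the loop stops with the property still true. Because widening over-approximates the reachable region, a negative verdict such as the second one shows only that the property fails on the approximation.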
6.
CONCLUDING REMARKS
We have considered open multi-agent systems where agents are rational but have limited computational resources. The system is viewed as a game whose rules and equilibrium strategy profile are published in an extended SMPL program. The publication of this information allows the game and equilibrium to become common knowledge amongst the agents, and hence agents can work with previously unseen mechanisms. This is a small step towards a machine-readable formalism that will standardise game mechanism design and hence enable interoperability. We have presented an algorithm based on forward symbolic analysis, which verifies subgame perfect equilibrium for loop-free game mechanisms. We have then formalised our symbolic analysis using model checking and abstract interpretation techniques. In this work, our focus is on the verification of equilibria for games with complete information. The analysis is applied to a simple mechanism (the cake cutting protocol) with two agents. There is much work remaining. The cake cutting example we have chosen is a type of resource allocation, where actions are chosen from a range of integer values. Obviously, in more interesting resource allocation scenarios we must additionally deal with incomplete information. Our prime intended application of this work in future is auctions. In an auction, agents' preferences are private, so we need to be able to check Bayesian equilibria where each agent has a probability distribution over other agents' preferences. We also wish to study games that are representable as Presburger formulas to find out their verification complexity in practice. This may lead to a classification where, for a given game and equilibrium type, we recommend a verification technique along with its justification.
Acknowledgements We thank the UK EPSRC for funding this project under grant EP/D02949X/1.
References
R. Alur, C. Courcoubetis, N. Halbwachs, T. A. Henzinger, P.-H. Ho, X. Nicollin, A. Olivero, J. Sifakis, and S. Yovine. The algorithmic analysis of hybrid systems. Theoretical Computer Science, 138(1):3–34, 1995.
Tevfik Bultan, Richard Gerber, and William Pugh. Symbolic Model Checking of Infinite State Systems Using Presburger Arithmetic. In CLEVER: Divide and Conquer Combinational Logic Equivalence VERification with False Negative Elimination, 2001.
K. Binmore. Fun and Games: A Text on Game Theory. D.C. Heath and Company, Lexington, Massachusetts, 1992.
P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Conference Record of the Fourth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 238–252, Los Angeles, California, 1977. ACM Press, New York, NY.
Beatrice Creusillet and Francois Irigoin. Exact versus approximate array region analyses. In Languages and Compilers for Parallel Computing, pages 86–100, 1996.
V. Conitzer and T. Sandholm. Self-interested automated mechanism design and implications for optimal combinatorial auctions. In Proceedings of the ACM Conference on Electronic Commerce (ACM-EC), pages 132–141. ACM Press, 2004.
R. K. Dash, N. R. Jennings, and D. C. Parkes. Computational-mechanism design: A call to arms. IEEE Intelligent Systems, pages 40–47, November 2003. Special Issue on Agents and Markets.
D. Fudenberg and J. Tirole. Game Theory. MIT Press, 1991.
G. Gottlob, G. Greco, and F. Scarcello. Pure Nash equilibria: Hard and easy games. Journal of Artificial Intelligence Research, 24:357–406, 2005. Available at http://www.jair.org/abstracts/gottlob05a.html.
R. Gibbons. A Primer in Game Theory. Prentice Hall, 1992.
F. Guerin. An algorithmic approach to specifying and verifying subgame perfect equilibria. In Proceedings of the Eighth Workshop on Game Theoretic and Decision Theoretic Agents (GTDT2006), Hakodate, Japan. AAMAS 2006, 2006.
Michael R. A. Huth and Mark D. Ryan. Logic in Computer Science: Modelling and Reasoning about Systems. Cambridge University Press, Cambridge, England, 2000.
Z. Manna and A. Pnueli. Temporal Verification of Reactive Systems (Safety), vol. 2. Springer-Verlag, New York, Inc., 1995.
Derek C. Oppen. A $2^{2^{2^{pn}}}$ upper bound on the complexity of Presburger arithmetic. J. Comput. Syst. Sci., 16(3):323–332, 1978.
D. C. Parkes. Auction design with costly preference elicitation. 2004.
D. C. Parkes. On learnable mechanism design. In Kagan Tumer and David Wolpert, editors, Collectives and the Design of Complex Systems. Springer-Verlag, 2004.
M. Pauly. Programming and verifying subgame perfect mechanisms. Journal of Logic and Computation, 15(3):295–316, 2005.
William Pugh. The omega test: a fast and practical integer programming algorithm for dependence analysis. In Supercomputing '91: Proceedings of the 1991 ACM/IEEE conference on Supercomputing, pages 4–13, New York, NY, USA, 1991. ACM Press.
Wiebe Van der Hoek. Knowledge, rationality and action. In Proceedings of the AAMAS'04, New York, USA. AAMAS 2004, 2004.
Bernhard von Stengel. Computing equilibria for two-person games. In Handbook of Game Theory with Economic Applications, Vol. 3, eds. R. J. Aumann and S. Hart. Elsevier, Amsterdam, 2002.