2017 14th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE) Mexico City, Mexico September 20-22, 2017
An Iterative Method For Solving Stackelberg Security Games: A Markov Games Approach

Daniel Guerrero, Alin A. Carsteanu, Rocio Huerta, Julio B. Clempner
Instituto Politécnico Nacional (National Polytechnic Institute), Mexico City, Mexico
Emails: [email protected]

Abstract—Stackelberg security games are represented by a Stackelberg model with multiple defenders and attackers. In the dynamics of the game, defenders try to allocate their limited resources to defend important targets, while attackers observe the behavior of the defenders and look for the most advantageous target to harm. The computation of the equilibrium point is a fundamental issue for Stackelberg security games. This paper presents an iterative method for computing the equilibrium point in Stackelberg security Markov games. We first cast the problem as a Stackelberg game for multiple players in Markov chain games, conceptualizing security games as polylinear games. Defenders and attackers independently play non-cooperatively in a Nash game restricted by a Stackelberg game. We then develop a new method for solving security games that provides randomized patrolling strategies for optimizing resource allocation. To develop the method, we transform the problem into a system of independent equations, each of which is an optimization problem. The method involves two half-steps: the first employs a proximal approach and the second a projection gradient method. We present a numerical example showing the effectiveness of the method.
I. INTRODUCTION
A. Brief review
Computation of the equilibrium point is a fundamental problem in security games. Korzhyk et al. [1] showed that optimal Stackelberg strategies can be computed in polynomial time in some cases and are NP-hard to compute in others. Letchford and Conitzer [2] presented a solution for security games on graphs, showing that the game can be solved in polynomial time or is NP-hard under diverse conditions. Xu [3] suggested that security games can be characterized by a set system, showing that the complexity of a security game is essentially determined by this set system. Trejo et al. [4] presented an approach for representing a real-world attacker-defender Stackelberg security game theoretic model
based on Markov chain games, and employed the Lagrange principle and Tikhonov's regularization method to ensure the convergence of the cost functions to some equilibria. They based the result on a previous paper [5] developing the extraproximal method for computing the Stackelberg/Nash equilibria. Clempner and Poznyak [6] considered a game-theoretical approach for representing a real-world attacker-defender Stackelberg security game where the behavior of an ergodic system is represented by a Lyapunov-like function, non-decreasing in time. The representation of the Stackelberg security game is then transformed into a potential game in terms of Lyapunov. Clempner and Poznyak [7] suggested a security model that describes a strategic game in which the defenders cooperate and the attackers do not. Trejo et al. [8] presented an approach for adapting preferred strategies in controlled Stackelberg security games using a reinforcement learning (RL) approach for attackers and defenders employing an average reward criterion. In the case of a metric state space, the coalition of the defenders achieves its synergy by computing the Strong Lp-Stackelberg/Nash equilibrium [9], [10]. Clempner and Poznyak [11] extended the model presented in [6], solving the security game problem using the extraproximal approach for computing the shortest-path Lyapunov equilibrium in Stackelberg security games. The extraproximal method is employed to compute the mixed stationary strategies: attackers operate on partial knowledge of the defenders' strategies for fixed targets.

B. Main results
In this paper, we study the classic security game model for multiple defenders and attackers. Our proposal includes a solution method established for Stackelberg security Markov games. This paper presents the following main results:
• Considers a security game where defenders and attackers independently play non-cooperatively in a Nash game.
• Suggests an iterative method for solving Stackelberg security Markov games.
• Transforms the problem into a system of independent equations, where each is an optimization problem.
• Develops a method involving two half-steps: the first employs a proximal approach and the second a projection approach, and shows the rate of convergence of the method.
• Presents a numerical example showing the effectiveness of the method.

C. Organization of the paper
The paper is organized as follows. The next section presents the preliminaries needed for understanding the rest of the paper. Section III describes the formulation of the Stackelberg security game. Section IV suggests an iterative method for computing the Stackelberg security equilibrium point and shows the rate of convergence of the method. A numerical example shows the usefulness and effectiveness of the method in Section V. Section VI concludes with some remarks.
II. PRELIMINARIES
Let M = (S, A, {A(s)}_{s∈S}, P) be a Markov chain [12], [13], where S is a finite set of states, S ⊂ N, and A is a finite set of actions. For each s ∈ S, A(s) ⊂ A is the non-empty set of admissible actions at state s ∈ S. Without loss of generality we may take A = ∪_{s∈S} A(s), whereas K = {(s, a) | s ∈ S, a ∈ A(s)} is the set of admissible state-action pairs, a measurable subset of S × A. The variable P = (p_{j|ik}) is a stationary controlled transition matrix which defines a stochastic kernel on S given K, where

p_{j|ik} ≡ P(X_{t+1} = s_j | X_t = s_i, A_t = a_k), ∀t ∈ N,

represents the probability associated with the transition from state s_i to state s_j, i = 1, ..., N, j = 1, ..., N, under an action a_k ∈ A(s_i), k = 1, ..., K. A Markov Decision Process is a pair MDP = (M, C), where M is a controllable Markov chain and C : K → R is a cost function associating a real value to each state-action pair.

A Markov game consists of a set N = {1, ..., n} of players (indexed by l = 1, ..., n). The dynamics of a Markov game is described as follows. Each of the players l is allowed to randomize actions, with probabilities π^l_{k|i}(t) := P(A_t = a_k | X_t = s_i), over the pure action choices a^l_k ∈ A^l(s^l_i), i = 1, ..., N and k = 1, ..., K. From now on, we will consider only stationary strategies π^l_{k|i}(t) = π^l_{k|i}. In the ergodic case, when all Markov chains are ergodic for any stationary strategy π^l_{k|i}, the probabilities P^l(X_{t+1} = s^l_j) converge exponentially quickly to their limit probabilities, given by the solution to the linear system of equations

P^l(s^l_j) = Σ_{i=1}^{N} Σ_{k=1}^{K} p^l_{j|ik} π^l_{k|i} P^l(s^l_i), ∀j ∈ {1, ..., N}.

The matrix z^l := (z^l_{ik})_{i=1,...,N; k=1,...,K} (l = 1, ..., n), with elements z^l_{ik} = π^l_{k|i} P^l(s^l_i), satisfies the ergodicity and simplex constraints [12], [13].

III. STACKELBERG SECURITY GAME
Let us consider a game [5], [14], [15] with a set N of defenders l = 1, ..., n, whose strategies are denoted by x^l ∈ X^l, where X^l is a convex and compact set with

x^l := col(z^l_{ik}), X^l := Z^l_adm (l = 1, ..., n), X := ⊗_{l=1}^{n} X^l,

where col denotes the column operator which transforms the matrix z^l_{(i,k)} into a column. Let x = (x^1, ..., x^n)^T ∈ X be the joint strategy of the players and let x^{l̂} = x^{-l} represent the strategy of the rest of the players (the complement of x^l), such that

x^{-l} := (x^1, ..., x^{l-1}, x^{l+1}, ..., x^n)^T ∈ X^{-l} := ⊗_{t=1, t≠l}^{n} X^t,
where x = (x^l, x^{l̂}). In addition, let us consider a set M of attackers with strategies y^r ∈ Y^r, r = 1, ..., m, and let Y^r be a convex and compact set such that

y^r := col(z^r_{ik}), Y^r := Z^r_adm (r = 1, ..., m), Y := ⊗_{r=1}^{m} Y^r.

Let y = (y^1, ..., y^m) ∈ Y be the joint strategy of the attackers and let y^{r̂} = y^{-r} be the strategy of the rest of the attackers (the complement of y^r), such that

y^{-r} := (y^1, ..., y^{r-1}, y^{r+1}, ..., y^m)^T ∈ Y^{-r} := ⊗_{q=1, q≠r}^{m} Y^q,
so that we get y = (y^r, y^{r̂}), r = 1, ..., m.

The dynamics of the Stackelberg security game is as follows. The defenders play non-cooperatively and are assumed to anticipate the reactions of the attackers, trying to reach the Nash equilibria. To reach the goal of the game, the defenders first try to find a joint strategy x* = (x^{1*}, ..., x^{n*}) ∈ X satisfying, for any admissible x^l ∈ X^l and any l = 1, ..., n,

Γ(x) := Σ_{l=1}^{n} [ min_{x^l ∈ X^l} f_l(x^l, x^{-l}) − f_l(x^l, x^{-l}) ]   (1)

[16], [17]. Here f_l(x^l, x^{-l}) is the cost-function of the leader l, who plays the strategy x^l ∈ X^l while the rest of the players play the strategy x^{-l} ∈ X^{-l}. If we consider the utopia point

x̄^l := arg min_{x^l ∈ X^l} f_l(x^l, x^{-l}),   (2)

then we can rewrite Eq. (1) as follows:

Γ(x) := Σ_{l=1}^{n} [ f_l(x̄^l, x^{-l}) − f_l(x^l, x^{-l}) ].   (3)

The functions f_l(x^l, x^{-l}), l = 1, ..., n, are assumed to be convex in all their arguments. Likewise, in this process the attackers try to reach one of the Nash equilibria, trying to find a joint strategy y* = (y^{1*}, ..., y^{m*}) ∈ Y satisfying, for any admissible y^r ∈ Y^r and any r = 1, ..., m,

Ψ(y) := Σ_{r=1}^{m} [ min_{y^r ∈ Y^r} h_r(y^r, y^{-r}) − h_r(y^r, y^{-r}) ]   (4)

[16], [17]. Here h_r(y^r, y^{-r}) is the cost-function of the follower r, who plays the strategy y^r ∈ Y^r while the rest of the followers play the strategy y^{-r} ∈ Y^{-r}. If we consider the utopia point

ȳ^r := arg min_{y^r ∈ Y^r} h_r(y^r, y^{-r}),

then we can rewrite Eq. (4) as follows:

Ψ(y) := Σ_{r=1}^{m} [ h_r(ȳ^r, y^{-r}) − h_r(y^r, y^{-r}) ].

The functions h_r(y^r, y^{-r}), r = 1, ..., m, are assumed to be convex in all their arguments.

Defenders and attackers together are in a Stackelberg game: the model involves two non-cooperative Nash games restricted by a Stackelberg game, defined as follows.

Definition 1: A game with n defenders and m attackers is said to be a Stackelberg–Nash game if

Γ(x|y) := Σ_{l=1}^{n} [ f_l(x̄^l, x^{-l}|y) − f_l(x^l, x^{-l}|y) ],

where

x̄^l := arg min_{x^l ∈ X^l} f_l(x^l, x^{-l}|y)

and x^{-l} is the strategy of the rest of the defenders adjoint to x^l, namely

x^{-l} := (x^1, ..., x^{l-1}, x^{l+1}, ..., x^n) ∈ X^{-l} := ⊗_{t=1, t≠l}^{n} X^t,

such that

max_{u∈U} ρ(x|y) = Σ_{l=1}^{n} [ f_l(x̄^l, x^{-l}|y) − f_l(x^l, x^{-l}|y) ] ≤ 0,

and

Ψ(y|x) := f(y|x) := Σ_{r=1}^{m} [ h_r(ȳ^r, y^{-r}|x) − h_r(y^r, y^{-r}|x) ],

where

ȳ^r := arg min_{y^r ∈ Y^r} h_r(y^r, y^{-r}|x)

and y^{-r} is the strategy of the rest of the attackers adjoint to y^r, namely

y^{-r} := (y^1, ..., y^{r-1}, y^{r+1}, ..., y^m) ∈ Y^{-r} := ⊗_{q=1, q≠r}^{m} Y^q.

IV. ITERATIVE METHOD
Let us consider the regularized Lagrange function given by
L_{θ,δ}(x^l, x^{-l}, y^r, y^{-r}, α, β, λ) := θ Γ(x^l, x^{-l}|y) − α ρ(x^l, x^{-l}|y) + β ζ(y^r, y^{-r}|x) + λ^T (A^x_eq x − b_eq) − λ^T (A^y_eq y − b_eq) + (δ/2) ( ‖x^l‖² + ‖x^{-l}‖² + ‖α‖² − ‖y^r‖² − ‖y^{-r}‖² − ‖β‖² − ‖λ‖² ),   (5)

where the parameters θ, δ are positive and the Lagrange vector-multipliers, as well as the components of α, β, λ ∈ Λ ⊆ R, may have any sign. Then the optimization problem

L_{θ,δ}(x^l, x^{-l}, y^r, y^{-r}, α, β, λ) → min_{x^l ∈ X_adm} min_{x^{-l} ∈ X̂_adm} min_{α ≥ 0} max_{y ∈ Y_adm} max_{y^{-r} ∈ Ŷ_adm} max_{β ≥ 0, λ}   (6)

has a unique saddle point in (x^l, x^{-l}, y^r, y^{-r}, α, β, λ), since the regularized Lagrange function given in Eq. (5) is strongly convex if the parameters θ and δ > 0 provide the condition

H[ L_{θ,δ}(x^l, x^{-l}, y^r, y^{-r}, α, β, λ) ] > 0,   (7)

where H is the Hessian, and it is strongly concave in y, β and λ for any δ > 0. The regularized Lagrange function has a unique saddle point (x^{l*}(θ,δ), x^{-l*}(θ,δ), y^{r*}(θ,δ), y^{-r*}(θ,δ), α*(θ,δ), β*(θ,δ), λ*(θ,δ)) because

L_{θ,δ}(x^l, x^{-l}, y^{r*}, y^{-r*}, α, β*, λ*) ≥ L_{θ,δ}(x^{l*}, x^{-l*}, y^{r*}, y^{-r*}, α*, β*, λ*) ≥ L_{θ,δ}(x^{l*}, x^{-l*}, y^r, y^{-r}, α*, β, λ).   (8)

Note that the function L_{θ,δ}(x^l, x^{-l}, y^r, y^{-r}, α, β, λ) is poly-linear in x ∈ X_adm, x̂ ∈ X̂_adm, y ∈ Y_adm, ŷ ∈ Ŷ_adm and α, β, λ ∈ Λ, and therefore it cannot be solved analytically. Therefore, we need to apply an
iterative method to find a minimizing solution which, moreover, may not be unique. The general iterative version of the method for computing the Stackelberg security equilibrium point is as follows.

1. Proximal prediction step:

ᾱ_n = arg min_{α≥0} { (1/2)‖α − α_n‖² − γ L_{θδ}(x_n, x̂_n, y_n, ŷ_n, α, β̄_n, λ̄_n) }
β̄_n = arg min_{β≥0} { (1/2)‖β − β_n‖² − γ L_{θδ}(x_n, x̂_n, y_n, ŷ_n, ᾱ_n, β, λ̄_n) }
λ̄_n = arg min_{λ≥0} { (1/2)‖λ − λ_n‖² − γ L_{θδ}(x_n, x̂_n, y_n, ŷ_n, ᾱ_n, β̄_n, λ) }
x̄_n = arg min_{x∈X_adm} { (1/2)‖x − x_n‖² + γ L_{θδ}(x, x̂_n, y_n, ŷ_n, ᾱ_n, β̄_n, λ̄_n) }
x̂_n(x) = arg min_{x̂∈X̂_adm} { (1/2)‖x̂ − x_n‖² + γ L_{θδ}(x_n, x̂, y_n, ŷ_n, ᾱ_n, β̄_n, λ̄_n) }
ȳ_n = arg min_{y∈Y_adm} { (1/2)‖y − y_n‖² − γ L_{θδ}(x_n, x̂_n, y, ŷ_n, ᾱ_n, β̄_n, λ̄_n) }
ŷ_n(y) = arg min_{ŷ∈Ŷ_adm} { (1/2)‖ŷ − ŷ_n‖² − γ L_{θδ}(x_n, x̂_n, y_n, ŷ, ᾱ_n, β̄_n, λ̄_n) }   (9)
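Stripped of the game-specific structure, the pattern of the two half-steps (a proximal prediction followed by a gradient step evaluated at the predicted point) can be illustrated on a scalar saddle-point problem. The sketch below is a hypothetical toy, not the paper's L_{θδ}: it uses L(x, y) = xy + (δ/2)x² − (δ/2)y², for which the unconstrained proximal prediction reduces to an explicit gradient step.

```python
# Toy regularized saddle problem:  min_x max_y  L(x, y) = x*y + (d/2)x^2 - (d/2)y^2,
# whose unique saddle point is (0, 0). Two half-steps per iteration, as in the method:
# a prediction step at the current point, then a gradient step at the predicted point.
delta, gamma = 0.1, 0.2
x, y = 1.0, -1.0

def grad_x(x, y):
    return y + delta * x   # dL/dx

def grad_y(x, y):
    return x - delta * y   # dL/dy

for _ in range(300):
    # 1. proximal prediction half-step (explicit form for this smooth toy)
    xb = x - gamma * grad_x(x, y)
    yb = y + gamma * grad_y(x, y)
    # 2. gradient half-step, using the gradients at the predicted point
    x = x - gamma * grad_x(xb, yb)
    y = y + gamma * grad_y(xb, yb)

# x, y now approximate the unique saddle point (0, 0)
```

Plain simultaneous gradient descent-ascent on the bilinear term alone would spiral; the prediction half-step is what damps the rotation and yields convergence.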
2. Gradient approximation step:

α_{n+1} = α_n + γ ∇_α L_{θδ}(x̄_n, x̂_n, ȳ_n, ŷ_n, ᾱ_n, β̄_n, λ̄_n)
β_{n+1} = β_n + γ ∇_β L_{θδ}(x̄_n, x̂_n, ȳ_n, ŷ_n, ᾱ_n, β̄_n, λ̄_n)
λ_{n+1} = λ_n + γ ∇_λ L_{θδ}(x̄_n, x̂_n, ȳ_n, ŷ_n, ᾱ_n, β̄_n, λ̄_n)
x_{n+1} = Pr_{x∈X_adm}{ x_n − γ ∇_x L_{θδ}(x̄_n, x̂_n, ȳ_n, ŷ_n, ᾱ_n, β̄_n, λ̄_n) }
x̂_{n+1}(x) = Pr_{x̂∈X̂_adm}{ x̂_n(x) − γ ∇_x̂ L_{θδ}(x̄_n, x̂_n, ȳ_n, ŷ_n, ᾱ_n, β̄_n, λ̄_n) }
y_{n+1} = Pr_{y∈Y_adm}{ y_n + γ ∇_y L_{θδ}(x̄_n, x̂_n, ȳ_n, ŷ_n, ᾱ_n, β̄_n, λ̄_n) }
ŷ_{n+1}(y) = Pr_{ŷ∈Ŷ_adm}{ ŷ_n + γ ∇_ŷ L_{θδ}(x̄_n, x̂_n, ȳ_n, ŷ_n, ᾱ_n, β̄_n, λ̄_n) }   (10)

for l = 1, ..., n and r = 1, ..., m. The stepsize parameter is given by

γ_n = γ_n(α) < √(1−α) [ max{ n max_{l=1,...,n} K^l, m max_{r=1,...,m} K^r } ]^{-1},

such that 0 < γ_n ≤ γ_0, where, for any α ∈ (0, 1), K^l ≤ n max K^l and K^r ≤ m max K^r.

Let us define the following extended vectors:

x̃ := col(x, x̂, α) ∈ X̃ := X × X̂ × R_+,
ỹ := col(y, ŷ, β, λ) ∈ Ỹ := Y × Ŷ × R_+ × R_+.

Using these variables, the proximal format can be represented by

ỹ* = arg min_{x̃∈X̃} { (1/2)‖x̃ − ỹ*‖² + γ L_{θδ}(x̃, ỹ*) }.   (11)

Then the sequence {ỹ_n}_{n∈N} converges to some Stackelberg equilibrium point ỹ* ∈ Ũ × Ṽ. Now we are ready to show the following theorem for the convergence of the iterative method to a unique saddle point.

Theorem 2: Let us consider a Stackelberg game for a set N of defenders l = 1, ..., n and a set M of attackers r = 1, ..., m. Let L_{θδ}(x̃, ỹ) be differentiable in x̃ and ỹ, whose partial derivative with respect to ỹ satisfies the Lipschitz condition with positive constant K, such that ‖ỹ_{n+1} − ỹ_n‖ ≤ γK‖ỹ_n − ỹ_{n−1}‖. Let {ỹ_n}_{n∈N} be a sequence defined by the local search algorithm given by

ỹ_n = arg min_{x̃∈X̃} { (1/2)‖x̃ − ỹ_n‖² + γ L_{θδ}(x̃, ỹ_n) },
ỹ_{n+1} = Pr_{ỹ∈Ỹ}{ ỹ_n − γ ∇_ỹ L_{θδ}(x̃, ỹ_n) }.   (12)

V. NUMERICAL EXAMPLE
This example represents a Stackelberg security game consisting of two defenders and two attackers. The attackers try to maximize the damage and the defenders try to minimize the expected loss. We compute the Stackelberg security Markov game equilibrium point. For this purpose, we consider four states, N = 4, and two strategies, M = 2, for each player. The transition matrices are as follows:

P^(1)_{j|i1} =
[0.8095 0.3424 0.9575 0.9572]
[0.9058 0.0275 0.9549 0.4854]
[0.1170 0.2785 0.1576 0.8103]
[0.9034 0.5469 0.9706 0.1419]

P^(1)_{j|i2} =
[0.7094 0.6451 0.9397 0.7213]
[0.7547 0.1326 0.3404 0.2551]
[0.2960 0.1190 0.5753 0.5060]
[0.6597 0.4984 0.2238 0.6991]

P^(2)_{j|i1} =
[0.4218 0.6857 0.6887 0.6655]
[0.9157 0.1457 0.7577 0.1712]
[0.7922 0.8491 0.7431 0.7260]
[0.9095 0.9940 0.3822 0.0218]

P^(2)_{j|i2} =
[0.8709 0.1493 0.8043 0.1966]
[0.9293 0.2375 0.2435 0.2611]
[0.5472 0.8707 0.9393 0.6160]
[0.1386 0.2343 0.3500 0.4733]

P^(3)_{j|i1} =
[0.2669 0.6948 0.4387 0.1869]
[0.0462 0.3171 0.3816 0.489]
[0.0971 0.9502 0.7655 0.4456]
[0.8235 0.0344 0.7952 0.6463]

P^(3)_{j|i2} =
[0.3517 0.9172 0.3804 0.5308]
[0.8308 0.2858 0.5678 0.7792]
[0.5853 0.7572 0.0759 0.9340]
[0.5497 0.7537 0.0540 0.1299]

P^(4)_{j|i1} =
[0.2769 0.6948 0.4387 0.1869]
[0.0462 0.3131 0.3816 0.4898]
[0.0971 0.6502 0.7655 0.4456]
[0.8235 0.0344 0.7952 0.6463]

P^(4)_{j|i2} =
[0.5688 0.1422 0.1064 0.6892]
[0.4694 0.7943 0.6020 0.5482]
[0.0119 0.3112 0.2630 0.4505]
[0.3371 0.5285 0.6541 0.0838]
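Given transition matrices of this form and a stationary randomized policy, the limit distribution from Section II solves the linear system P(s_j) = Σ_{i,k} p_{j|ik} π_{k|i} P(s_i). The sketch below shows that computation on a small illustrative chain; the 2-state transition tensor and policy are hypothetical stand-ins, not the matrices above.

```python
import numpy as np

# Transition tensor p[j, i, k] = P(next = s_j | state = s_i, action = a_k)
# and a stationary randomized policy pi[k, i] = P(a_k | s_i).
# Illustrative 2-state, 2-action data (not from the example above).
p = np.array([[[0.9, 0.2], [0.3, 0.5]],
              [[0.1, 0.8], [0.7, 0.5]]])   # shape (N, N, K)
pi = np.array([[0.6, 0.5], [0.4, 0.5]])    # shape (K, N)

# Policy-induced chain: Q[j, i] = sum_k p[j, i, k] * pi[k, i]
Q = np.einsum('jik,ki->ji', p, pi)

# Solve P(s_j) = sum_i Q[j, i] P(s_i) together with sum_i P(s_i) = 1,
# i.e. the stationary distribution of the induced chain.
N = Q.shape[0]
A = np.vstack([Q - np.eye(N), np.ones((1, N))])
b = np.concatenate([np.zeros(N), [1.0]])
P_stat, *_ = np.linalg.lstsq(A, b, rcond=None)

# The variables z_{ik} = pi[k, i] * P(s_i) then satisfy the simplex constraint.
z = pi.T * P_stat[:, None]
assert abs(z.sum() - 1.0) < 1e-8
```

This is exactly the object the method optimizes over: the strategies x^l and y^r are flattened copies of such z matrices.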
Fig. 1. Convergence of the strategies of the defender 1.
and the cost matrices are as follows:

C^(1)_{j|i1} =
[26 81 44 92]
[19 87 27 58]
[15 55 14 15]
[79 39 25 41]

C^(2)_{j|i1} =
[86 63 36 52]
[10 58 14 6]
[95 24 96 36]
[49 44 45 31]

C^(3)_{j|i1} =
[83 2 5 17]
[51 65 52 38]
[82 82 80 54]
[91 98 44 12]

C^(4)_{j|i1} =
[79 9 93 78]
[26 61 41 72]
[60 23 27 12]
[32 93 44 19]

C^(1)_{j|i2} =
[41 19 91 34]
[8 24 95 91]
[24 42 50 37]
[13 5 49 12]

C^(2)_{j|i2} =
[65 74 65 46]
[55 69 30 19]
[75 37 19 63]
[36 94 88 56]

C^(3)_{j|i2} =
[63 59 21 31]
[48 23 24 18]
[85 23 20 44]
[30 32 43 51]

C^(4)_{j|i2} =
[1 10 24 53]
[27 74 46 24]
[81 49 97 49]
[3 58 55 63]

Fig. 2. Convergence of the strategies of the defender 2.
The resulting equilibrium point for the defenders is

c^(1)_{ik} =
[0.1658 0.1061]
[0.0775 0.0926]
[0.2424 0.0050]
[0.1725 0.1381]

c^(2)_{ik} =
[0.1882 0.1192]
[0.1249 0.0918]
[0.2707 0.0050]
[0.0798 0.1204]

and for the attackers

c^(3)_{ik} =
[0.1658 0.1061]
[0.0775 0.0926]
[0.2424 0.0050]
[0.1725 0.1381]

c^(4)_{ik} =
[0.1882 0.1192]
[0.1249 0.0918]
[0.2707 0.0050]
[0.0798 0.1204]
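Each entry of these 4 × 2 matrices is a variable z_{ik} = π_{k|i} P(s_i) from Section II, so the randomized patrolling strategy itself is recovered by normalizing each row. A short check on defender 1's equilibrium matrix (note that its entries sum to one, as the simplex constraint requires):

```python
import numpy as np

# Defender 1's equilibrium variables z_{ik} = pi_{k|i} * P(s_i)
# (4 states x 2 actions, values as reported above).
z = np.array([[0.1658, 0.1061],
              [0.0775, 0.0926],
              [0.2424, 0.0050],
              [0.1725, 0.1381]])

# Simplex constraint: all entries of z sum to one.
assert abs(z.sum() - 1.0) < 1e-6

# Marginal state distribution P(s_i) = sum_k z_{ik} ...
P = z.sum(axis=1)

# ... and the randomized patrolling strategy pi_{k|i} = z_{ik} / P(s_i).
pi = z / P[:, None]
assert np.allclose(pi.sum(axis=1), 1.0)

# e.g. in state s_1 defender 1 plays action 1 with probability ~0.61
print(np.round(pi[0], 2))
```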
Fig. 3. Convergence of the strategies of the attacker 1.

Fig. 4. Convergence of the strategies of the attacker 2.

Fig. 5. Realization of the game.

Figures 1 to 4 show the convergence of the strategies to the Stackelberg security equilibrium point. We consider a patrol schedule model. Figure 5 shows a realization of the game. As a result, we obtain that attacker 2 is caught at state 2 in the first step by defender 2, and attacker 1 is caught at state 3 after nine steps by defender 1 and defender 2 together, so the game is over.

VI. CONCLUSION
This paper presents a solution for classical Stackelberg security games: defenders and attackers independently play non-cooperatively in a Nash game restricted by a Stackelberg game. We presented an iterative solution method based on the proximal and projected gradient methods. In addition, we studied the rate of convergence of the Stackelberg equilibrium computation in security games. We believe that our results offer a theoretical basis for the design of new algorithms and complexity analysis in security games.

ACKNOWLEDGMENT
The authors would like to thankfully acknowledge the financial support of the project 20170245 of the Secretaría de Investigación y Posgrado del Instituto Politécnico Nacional.

REFERENCES
[1] D. Korzhyk, V. Conitzer, and R. Parr, “Complexity of computing optimal Stackelberg strategies in security resource allocation games,” in Twenty-Fourth Conference on Artificial Intelligence, Atlanta, Georgia, USA, 2010, pp. 805–810.
[2] J. Letchford and V. Conitzer, “Solving security games on graphs via marginal probabilities,” in Twenty-Seventh Conference on Artificial Intelligence, Bellevue, Washington, USA, 2013, pp. 591–597.
[3] H. Xu, “The mysteries of security games: Equilibrium computation becomes combinatorial algorithm design,” in Proceedings of the 2016 ACM Conference on Economics and Computation, Maastricht, The Netherlands, 2016, pp. 497–514.
[4] K. K. Trejo, J. B. Clempner, and A. S. Poznyak, “A Stackelberg security game with random strategies based on the extraproximal theoretic approach,” Engineering Applications of Artificial Intelligence, vol. 37, pp. 145–153, 2015.
[5] ——, “Computing the Stackelberg/Nash equilibria using the extraproximal method: Convergence analysis and implementation details for Markov chains games,” International Journal of Applied Mathematics and Computer Science, vol. 25, no. 2, pp. 337–351, 2015.
[6] J. B. Clempner and A. S. Poznyak, “Stackelberg security games: Computing the shortest-path equilibrium,” Expert Systems with Applications, vol. 42, no. 8, pp. 3967–3979, 2015.
[7] ——, “Conforming coalitions in Stackelberg security games: Setting max cooperative defenders vs. non-cooperative attackers,” Applied Soft Computing, vol. 47, pp. 1–11, 2016.
[8] K. K. Trejo, J. B. Clempner, and A. S. Poznyak, “Adapting strategies to dynamic environments in controllable Stackelberg security games,” in IEEE 55th Conference on Decision and Control (CDC), Las Vegas, USA, 2016, pp. 5484–5489.
[9] ——, “An optimal strong equilibrium solution for cooperative multi-leader-follower Stackelberg Markov chains games,” Kybernetika, vol. 52, no. 2, pp. 258–279, 2016.
[10] ——, “Computing the Lp-strong Nash equilibrium for Markov chains games,” Applied Mathematical Modelling, vol. 41, pp. 399–418, 2017.
[11] J. B. Clempner and A. S. Poznyak, “Using the extraproximal method for computing the shortest-path mixed Lyapunov equilibrium in Stackelberg security games,” Mathematics and Computers in Simulation, vol. 138, pp. 14–30, 2017.
[12] A. S. Poznyak, K. Najim, and E. Gomez-Ramirez, Self-Learning Control of Finite Markov Chains. New York: Marcel Dekker, Inc., 2000.
[13] J. B. Clempner and A. S. Poznyak, “Simple computing of the customer lifetime value: A fixed local-optimal policy approach,” Journal of Systems Science and Systems Engineering, vol. 23, no. 4, pp. 439–459, 2014.
[14] C. U. Solis, J. B. Clempner, and A. S. Poznyak, “Modeling multi-leader-follower non-cooperative Stackelberg games,” Cybernetics and Systems, vol. 47, no. 8, pp. 650–673, 2016.
[15] J. B. Clempner and A. S. Poznyak, “Analyzing an optimistic attitude for the leader firm in duopoly models: A strong Stackelberg equilibrium based on a Lyapunov game theory approach,” Economic Computation and Economic Cybernetics Studies and Research, vol. 4, no. 50, pp. 41–60, 2016.
[16] K. Tanaka and K. Yokoyama, “On ε-equilibrium point in a noncooperative n-person game,” Journal of Mathematical Analysis and Applications, vol. 160, pp. 413–423, 1991.
[17] K. Tanaka, “The closest solution to the shadow minimum of a cooperative dynamic game,” Computers & Mathematics with Applications, vol. 18, no. 1–3, pp. 181–188, 1989.