2013 American Control Conference (ACC) Washington, DC, USA, June 17-19, 2013
Multi-Pursuer Single-Evader Differential Games with Limited Observations*

Wei Lin, Student Member, IEEE, Zhihua Qu, Fellow, IEEE, and Marwan A. Simaan, Life Fellow, IEEE

Abstract— In this paper, closed-loop Nash equilibrium strategies for an N-pursuer single-evader differential game over a finite time horizon with limited observations are considered. The game setting is such that each pursuer has a limited sensing range and can observe the state vector of another player only if that player is within the pursuer's sensing range. The evader, on the other hand, has an unlimited sensing range, which allows it to observe the states of all pursuers at all times and implement a standard closed-loop Nash strategy. To derive strategies for the pursuers, a new concept of best achievable performance indices is proposed. These indices are derived so as to be the closest to the original performance indices while ensuring that the resulting pursuers' collective strategy forms a Nash equilibrium against the evader's strategy. The strategies obtained by this approach are independent of the initial state vector. An illustrative example is solved, and simulation results corresponding to different sensing ranges and performance indices of the game are presented.
I. INTRODUCTION

The pursuit-evasion game has been considered for decades as a model for illustrating a variety of applications of differential game theory [1], [2]. Basically, it models the process in which one or more pursuers try to chase one or more evaders while the evaders try to escape. Solving a pursuit-evasion game essentially involves developing strategies for the pursuers and evaders such that their prescribed performance indices are optimized. Historically, pursuit-evasion games were first investigated by Isaacs [1] in the 1950s. Saddle-point solutions for a type of zero-sum single-pursuer single-evader game were considered in [2]. Nonzero-sum pursuit-evasion games were introduced and investigated as an example of Nash equilibrium strategies in [3] and as an example of leader-follower Stackelberg strategies in [4]. Games with two pursuers and one evader were studied in [5]. Recently, there has been considerable interest in problems that involve many pursuers chasing one evader [6], [7], [8], [9], [10], [11]. These types of games present some interesting challenges in designing coordinated strategies among all the pursuers to accomplish the goal of capturing the evader. From a practical point of view, however, it is important to consider games with state information constraints, where global information is not available to all players. In this paper, we consider an N-pursuer single-evader game over a finite time horizon, where only the evader has global information with the ability to observe all the pursuers. The pursuers, on the other hand, have limited sensing capabilities. Each pursuer is only able to observe players that fall within its sensing range during any interval of time in the game. This might necessitate that each pursuer implement a strategy based only on the information available to it. Most of the results on pursuit-evasion games so far are built upon the assumption that all players have global or full information about all the players in the game in order to implement their closed-loop Nash strategies. Some of the recent papers on this subject have considered limited information structures either for the pursuers or for the evader [6], [9], [10]. In this paper, we consider a game between a single evader with a high sensing capability outnumbered by several pursuers with poor sensing capabilities. A practical example of such a situation occurs when a well-equipped drone with a very wide sensing range must evade several weakly equipped pursuing air vehicles. These types of problems need to be treated using an approach different from what currently exists in the literature. This paper builds on previous results obtained in [12] and further explores how inverse optimality can be used to secure a Nash equilibrium between the pursuers and the evader.

The remainder of the paper is organized as follows. The problem formulation is presented in Section II. The evader's Nash equilibrium strategy based on a global information structure is derived in Section III. The Nash strategies for the pursuers with limited observations are analyzed and derived in Section IV. An illustrative example and simulation results are shown in Section V. The paper is concluded in Section VI.

*This work is supported in part by a grant (CCF-0956501) from the National Science Foundation.
† W. Lin, Z. Qu, and M. A. Simaan are with the Department of EECS, University of Central Florida, Orlando, Florida, 32816, USA. Emails: [email protected], [email protected], [email protected].

978-1-4799-0178-4/$31.00 ©2013 AACC

II. PROBLEM FORMULATION

The differential game considered in this paper is between N pursuers and a single evader in unbounded Euclidean space. We assume that the state of each player is governed by the differential equation

\dot{x}_i = F x_i + G_i u_i,   (1)

for i = e, 1, \cdots, N, where the vector x_i \in R^n is the state variable, u_i \in R^{m_i} is the control input, and the matrices F and G_i are of proper dimensions. The subscript e stands for the evader and the subscripts 1 to N stand for pursuers 1 to N. We assume that all players have the same dynamics (i.e., the same F matrix), which allows for an explicit derivation of the dynamics of a vector z = [z_1^T \cdots z_N^T]^T, where z_i = x_i - x_e, whose entries represent the differences (or displacements) between the evader's state vector and those of the pursuers.
Hence, the overall dynamics become

\dot{z} = A z + B_e u_e + \sum_{i=1}^{N} B_i u_i = A z + B_e u_e + B_p u_p,   (2)

where u_p = [u_1^T \cdots u_N^T]^T, A = I_N \otimes F, I_N is the N \times N identity matrix, \otimes is the Kronecker product, B_e = -1_N \otimes G_e, 1_N is the N \times 1 vector with all elements equal to 1, B_j = d_j \otimes G_j, d_j is the N \times 1 vector with the jth element equal to 1 and all other elements equal to 0, B_p = blkdiag\{G_1, \cdots, G_N\}, and blkdiag stands for "block diagonal matrix".

We assume that the collective objective of the pursuers is to minimize a quadratic function of the vector z while at the same time minimizing their energy costs. Hence, denoting \|x_i\|_M^2 = x_i^T M x_i, we assume that all the pursuers try to minimize a common performance index

J_p = \frac{1}{2}\|z(T)\|_{M_p}^2 + \frac{1}{2}\int_0^T (\|z\|_{Q_p}^2 + \|u_p\|_{R_p}^2)\,dt,   (3)

where M_p = blkdiag\{M_{p1}, \cdots, M_{pN}\}, Q_p = blkdiag\{Q_{p1}, \cdots, Q_{pN}\}, and R_p = blkdiag\{R_{p1}, \cdots, R_{pN}\}. Matrices M_{pj} \in R^{n \times n} and Q_{pj} \in R^{n \times n} are positive semidefinite and matrix R_{pj} \in R^{m_j \times m_j} is positive definite for j = 1, \cdots, N.

Similarly, we assume that the objective of the evader is to maximize a quadratic function of the vector z while at the same time keeping its energy costs at a minimum. Hence, the evader will try to minimize the performance index

J_e = \frac{1}{2}\|z(T)\|_{M_e}^2 + \frac{1}{2}\int_0^T (\|z\|_{Q_e}^2 + \|u_e\|_{R_e}^2)\,dt,   (4)

where M_e = blkdiag\{M_{e1}, \cdots, M_{eN}\} and Q_e = blkdiag\{Q_{e1}, \cdots, Q_{eN}\}. Matrices M_{ej} \in R^{n \times n} and Q_{ej} \in R^{n \times n} are negative semidefinite for j = 1, \cdots, N and matrix R_e \in R^{m_e \times m_e} is positive definite.

Given the system dynamics (2) and the performance indices (3)-(4), a differential nonzero-sum game problem is formed. The Nash equilibrium for this game is defined as follows.

Definition 1: For the N-pursuer single-evader differential game described in (2) with the pursuers minimizing performance index (3) and the evader minimizing performance index (4), the strategy set \{u_p^*, u_e^*\} is a Nash equilibrium if the inequalities

J_p(u_p^*, u_e^*) \le J_p(u_p, u_e^*),   (5)
J_e(u_p^*, u_e^*) \le J_e(u_p^*, u_e)   (6)

hold for any u_p \in U_p and u_e \in U_e, where U_p and U_e are the admissible strategy sets for the pursuers and the evader, respectively.

III. NASH STRATEGIES FOR THE EVADER

From the perspective of the evader, since it has a wide enough sensing range to observe all the pursuers, it will naturally adopt its closed-loop Nash strategy derived from (2), (3), and (4) without knowing that the pursuers only have limited observations. Therefore, according to the well-known Nash equilibrium result from [3] on linear quadratic differential games, the evader's strategy is

u_e^* = -R_e^{-1} B_e^T P_e z,   (7)

where P_e is obtained by solving the coupled differential Riccati equations

\dot{P}_e + Q_e + P_e A + A^T P_e - P_e B_p R_p^{-1} B_p^T P_p - P_p B_p R_p^{-1} B_p^T P_e - P_e B_e R_e^{-1} B_e^T P_e = 0,   (8a)
\dot{P}_p + Q_p + P_p A + A^T P_p - P_p B_p R_p^{-1} B_p^T P_p - P_p B_e R_e^{-1} B_e^T P_e - P_e B_e R_e^{-1} B_e^T P_p = 0,   (8b)

with boundary conditions P_p(T) = M_p and P_e(T) = M_e.

IV. NASH STRATEGIES FOR THE PURSUERS

Since the evader will implement its Nash strategy in (7), clearly, the best strategy for the pursuers is to implement the corresponding Nash strategy

u_p = -R_p^{-1} B_p^T P_p z,   (9)

where P_p is obtained from the solution of (8). However, the pursuers are generally not able to implement (9) because the ith element of u_p in (9), namely the Nash strategy for pursuer i, is generally a full state feedback of z and hence requires pursuer i to have global information. Therefore, under the constraint of limited observations, a different strategy for the pursuers must be implemented. This strategy must form a new Nash equilibrium together with the strategy (7) of the evader. To accurately model the sensing capabilities of the pursuers, we define a sensing radius r_i for pursuer i. This radius determines whether the full state vector of another player in the game is observable to that pursuer. Consequently, we define a time-dependent Laplacian matrix [13] for the N pursuers as follows:

L(t) = \begin{bmatrix} L_{11}(t) & \cdots & L_{1N}(t) \\ \vdots & \ddots & \vdots \\ L_{N1}(t) & \cdots & L_{NN}(t) \end{bmatrix},   (10)

where

L_{ij}(t) = \begin{cases} -1 & \text{if } \|\Lambda z_i(t) - \Lambda z_j(t)\| \le r_i \text{ for } j \ne i, \\ 0 & \text{if } \|\Lambda z_i(t) - \Lambda z_j(t)\| > r_i \text{ for } j \ne i, \\ -\sum_{l=1, l \ne i}^{N} L_{il} & \text{if } j = i. \end{cases}

Also define a vector h(t) = [h_1(t) \cdots h_N(t)]^T whose ith entry h_i(t) represents the ability of the ith pursuer to observe the evader at time t. That is,

h_i(t) = \begin{cases} 1 & \text{if } \|\Lambda z_i(t)\| \le r_i, \\ 0 & \text{if } \|\Lambda z_i(t)\| > r_i. \end{cases}   (11)
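The sensing topology in (10) and (11) can be computed directly from the players' positions, since \Lambda z_i - \Lambda z_j = x_i - x_j. Below is a minimal numerical sketch; the function name `sensing_topology` and the use of NumPy are our own choices, and the off-diagonal entries are taken as -1 for observed neighbors, the convention that matches the Laplacian matrices reported for the scenarios in Section V:

```python
import numpy as np

def sensing_topology(pursuer_pos, evader_pos, radii):
    """Build the Laplacian L(t) of (10) and the observation vector h(t) of (11)
    from current planar positions.  Off-diagonal entries are -1 when pursuer j
    lies within pursuer i's sensing radius r_i; each diagonal entry is minus
    the sum of the off-diagonal entries in its row (the neighbor count)."""
    N = len(pursuer_pos)
    L = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            if j != i and np.linalg.norm(pursuer_pos[i] - pursuer_pos[j]) <= radii[i]:
                L[i, j] = -1.0
        L[i, i] = -L[i].sum()
    # h_i = 1 iff the evader is within pursuer i's sensing radius
    h = np.array([1.0 if np.linalg.norm(p - evader_pos) <= r else 0.0
                  for p, r in zip(pursuer_pos, radii)])
    return L, h

# Scenario-2 configuration at t = 0: x1=(-3,0), x2=(3,0), x3=(5,1), xe=(0,1), r_i = 4
pursuers = np.array([[-3.0, 0.0], [3.0, 0.0], [5.0, 1.0]])
evader = np.array([0.0, 1.0])
L, h = sensing_topology(pursuers, evader, [4.0, 4.0, 4.0])
```

With these positions, only pursuers 2 and 3 are within each other's range (distance \sqrt{5} \approx 2.24), and pursuer 3 is too far from the evader (distance 5 > 4), reproducing the first Laplacian and the h vector reported for scenario 2 in Section V.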
In (10) and (11), \Lambda z_i(t) represents the coordinates of pursuer i relative to the position of the evader in the vector z_i(t) at time t. Further, we assume that if \frac{1}{N}\sum_{i=1}^{N} \|\Lambda z_i\| \le \epsilon, where \epsilon is a capture radius, then the game terminates and the evader is assumed to be captured. We assume that the strategy for pursuer i is chosen in the following form:

u_i = K_{i1} h_i z_i + K_{i2} \sum_{j=1}^{N} L_{ij} (z_i - z_j).   (12)

The first term in the above expression represents a control component to chase the evader directly if pursuer i observes the evader. The second term represents a feedback control of the difference between pursuer i's state and those of its neighboring pursuers. In general, pursuer i's control could be chosen in a more general structure than the one in (12); however, in this paper, we choose u_i as in (12) because of its simplicity and applicability in realistic situations. Control u_i can be rewritten as

u_i = K_{i1}[(h_i d_i^T) \otimes I_n] z + K_{i2}[(d_i^T L) \otimes I_n] z = [K_{i1} \ K_{i2}] \left( \begin{bmatrix} h_i d_i^T \\ d_i^T L \end{bmatrix} \otimes I_n \right) z \triangleq K_i C_i z,   (13)

where K_i = [K_{i1} \ K_{i2}] \in R^{m_i \times 2n} and

C_i = \begin{bmatrix} h_i d_i^T \\ d_i^T L \end{bmatrix} \otimes I_n \in R^{2n \times nN}.   (14)

Therefore, denoting

u_p = K_s z,   (15)

where

K_s = [(K_1 C_1)^T \cdots (K_N C_N)^T]^T,   (16)

the problem reduces to finding matrices \{K_1, \cdots, K_N\} such that u_p can still form a Nash equilibrium with the evader's strategy u_e in (7). However, according to Definition 1, given the performance indices (3) and (4), if the evader sticks to the Nash strategy u_e in (7) and the pursuers fail to implement the corresponding Nash strategies in (9), the Nash equilibrium will no longer hold. Since it is generally not possible for the two expressions of u_p in (9) and (15) to be the same, an intuitive approach is to modify the performance indices of both the evader and the pursuers, keeping them as close as possible to the original indices given in (3) and (4) but in such a way that the evader strategy (7) and the pursuers' strategy (15) form a Nash equilibrium with respect to the modified performance indices. To achieve this goal, we first find a family of performance indices with respect to which the pursuers' strategy u_p in (15) and the evader's strategy u_e in (7) form a Nash equilibrium. Then, we use a new concept of best achievable performance indices to find a pair of performance indices within this family that are closest to the original performance indices J_p and J_e in (3) and (4).

Lemma 1: For the pursuit-evasion game described in (2)-(4), given a set of matrices \{K_1, \cdots, K_N\}, the strategies u_e in (7) and u_p in (15) form a Nash equilibrium with respect to the performance indices

J_{ps} = \frac{1}{2}\|z(T)\|_{M_p}^2 + \frac{1}{2}\int_0^T (\|z\|_{Q_{ps}}^2 - u_p^T S z - z^T S^T u_p + \|u_p\|_{R_p}^2)\,dt,   (17)
J_{es} = \frac{1}{2}\|z(T)\|_{M_e}^2 + \frac{1}{2}\int_0^T (\|z\|_{Q_{es}}^2 + \|u_e\|_{R_e}^2)\,dt,   (18)

where

S = B_p^T P_p + R_p K_s,   (19)
Q_{ps} = Q_p - \Delta_p, \quad Q_{es} = Q_e - \Delta_e,   (20)
\Delta_p = P_p B_p R_p^{-1} B_p^T P_p - K_s^T R_p K_s, \quad \Delta_e = P_e B_p R_p^{-1} S + S^T R_p^{-1} B_p^T P_e,   (21)

K_s is as in (16), and P_p and P_e are the solutions of (8).

Proof: Consider the Lyapunov functions

V_p = \frac{1}{2} z^T P_p z \quad \text{and} \quad V_e = \frac{1}{2} z^T P_e z,   (22)

where P_p and P_e are the solutions to the coupled Riccati equations (8b), (8a). Differentiating V_p in (22) with respect to t and integrating from 0 to T yields

V_p(T) - V_p(0) = -\frac{1}{2} \int_0^T \left( \|z\|_{Q_{ps}}^2 + \|u_p\|_{R_p}^2 - u_p^T S z - z^T S^T u_p - 2 z^T P_p B_e (u_e + R_e^{-1} B_e^T P_e z) - \|u_p - K_s z\|_{R_p}^2 \right) dt.

Hence,

J_{ps} = V_p(0) + \int_0^T \left[ \frac{1}{2}\|u_p - K_s z\|_{R_p}^2 + z^T P_p B_e (u_e + R_e^{-1} B_e^T P_e z) \right] dt,   (23)

where J_{ps} is defined in (17). Similarly, we have

J_{es} = V_e(0) + \int_0^T \left[ \frac{1}{2}\|u_e + R_e^{-1} B_e^T P_e z\|_{R_e}^2 + z^T P_e B_p (u_p - K_s z) \right] dt,   (24)

where J_{es} is defined in (18). Since R_p and R_e are positive definite, it is obvious from (23) and (24) that u_p = K_s z and u_e = -R_e^{-1} B_e^T P_e z form a Nash equilibrium according to Definition 1.

Clearly, if K_s = [(K_1 C_1)^T \cdots (K_N C_N)^T]^T = -R_p^{-1} B_p^T P_p, then J_{ps} and J_{es} in (17) and (18) become identical to (3) and (4). We propose the concept of the best achievable performance indices as follows.

Definition 2: Given system (2) and performance indices (3) and (4), performance indices (17) and (18) are called the best achievable performance indices if there exists a set of matrices \{K_1, \cdots, K_N\} such that \|S\|, \|\Delta_p\|, and \|\Delta_e\| are minimized for all t, where S, \Delta_p, and \Delta_e are defined in (19)-(21), and \|\cdot\| is any user-defined matrix norm.
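The structure of (13)-(16) — each pursuer's gain K_i acting on the state only through its own selector C_i — can be checked mechanically. The sketch below uses helper names `build_Ci` and `build_Ks` of our own choosing, with random placeholder gains rather than optimized ones:

```python
import numpy as np

def build_Ci(i, h, L, n):
    """C_i of (14): the 2 x N stack [h_i d_i^T ; d_i^T L], Kronecker-expanded by I_n."""
    N = L.shape[0]
    d = np.zeros(N); d[i] = 1.0
    top = h[i] * d          # h_i d_i^T  (row of length N)
    bot = L[i]              # d_i^T L = i-th row of L
    return np.kron(np.vstack([top, bot]), np.eye(n))   # 2n x nN

def build_Ks(K_blocks, h, L, n):
    """K_s of (16): stack of K_i C_i, so that u_p = K_s z realizes (13) for every i."""
    return np.vstack([K @ build_Ci(i, h, L, n) for i, K in enumerate(K_blocks)])

# toy check with N = 3 planar pursuers (n = 2, m_i = 2)
rng = np.random.default_rng(0)
n, N = 2, 3
L = np.array([[0., 0., 0.], [0., 1., -1.], [0., -1., 1.]])
h = np.array([1., 1., 0.])
K_blocks = [rng.standard_normal((2, 2 * n)) for _ in range(N)]  # K_i = [K_i1  K_i2]
Ks = build_Ks(K_blocks, h, L, n)
z = rng.standard_normal(n * N)
up = Ks @ z
```

By construction, the ith block of u_p equals K_{i1} h_i z_i + K_{i2} \sum_j L_{ij} z_j, i.e., each pursuer's feedback uses only its own row of L(t) and its own entry of h(t).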
2713
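Definition 2 calls for minimizing \|S\|, \|\Delta_p\|, and \|\Delta_e\| over the structured gains \{K_1, \cdots, K_N\}. The sketch below minimizes a weighted sum of squared Frobenius norms of (19)-(21) by plain finite-difference descent with a shrinking step; it is a toy stand-in rather than the paper's gradient algorithm, and P_p, P_e are random symmetric placeholders rather than Riccati solutions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 2, 3
nN = n * N

# toy stand-ins at one time instant t: random symmetric P_p (positive)
# and P_e (negative) in place of actual solutions of (8)
Rp = np.eye(nN)
Bp = np.eye(nN)
M = rng.standard_normal((nN, nN)); Pp = M @ M.T / nN
M = rng.standard_normal((nN, nN)); Pe = -(M @ M.T) / nN

L = np.array([[0., 0., 0.], [0., 1., -1.], [0., -1., 1.]])
h = np.array([1., 1., 0.])

def Ci(i):
    # C_i of (14): rows [h_i d_i^T ; d_i^T L], Kronecker-expanded by I_n
    d = np.zeros(N); d[i] = 1.0
    return np.kron(np.vstack([h[i] * d, L[i]]), np.eye(n))

def H(K_blocks, a1=0.5, a2=0.25, a3=0.25):
    # weighted sum of squared Frobenius norms of S, Delta_p, Delta_e in (19)-(21)
    Ks = np.vstack([K @ Ci(i) for i, K in enumerate(K_blocks)])
    S = Bp.T @ Pp + Rp @ Ks
    Dp = Pp @ Bp @ np.linalg.inv(Rp) @ Bp.T @ Pp - Ks.T @ Rp @ Ks
    De = Pe @ Bp @ np.linalg.inv(Rp) @ S + S.T @ np.linalg.inv(Rp) @ Bp.T @ Pe
    return a1 * np.sum(S * S) + a2 * np.sum(Dp * Dp) + a3 * np.sum(De * De)

# finite-difference descent over the entries of K_1,...,K_N; the step
# shrinks whenever a trial fails to decrease H
K = [np.zeros((n, 2 * n)) for _ in range(N)]
eps, step = 1e-5, 0.05
H0 = H_cur = H(K)
for _ in range(60):
    grad = []
    for i in range(N):
        Gi = np.zeros_like(K[i])
        for idx in np.ndindex(*K[i].shape):
            Kp = [Kj.copy() for Kj in K]
            Kp[i][idx] += eps
            Gi[idx] = (H(Kp) - H_cur) / eps
        grad.append(Gi)
    trial = [Ki - step * Gi for Ki, Gi in zip(K, grad)]
    Ht = H(trial)
    if Ht < H_cur:
        K, H_cur = trial, Ht
    else:
        step *= 0.5
H1 = H_cur
```

The weights mirror the convex combination the paper uses for its example (\alpha_1 = 1/2, \alpha_2 = \alpha_3 = 1/4); a production implementation would instead use the closed-form gradient and a proper line search.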
Therefore, by finding matrices \{K_1, \cdots, K_N\} that correspond to the best achievable performance indices, we actually find a pair of performance indices for the pursuers and the evader such that they are the closest to the original ones (3) and (4) while admitting strategies (7) and (15) as their Nash equilibrium. In order to determine \{K_1, \cdots, K_N\}, we need to solve a multi-objective optimization problem of minimizing \|S\|, \|\Delta_p\|, and \|\Delta_e\| simultaneously. One way to accomplish this is to minimize at every t a function that consists of a convex combination of these terms. That is,

H(t) = \alpha_1 \|S\|_F^2 + \alpha_2 \|\Delta_p\|_F^2 + \alpha_3 \|\Delta_e\|_F^2 = \alpha_1 Tr(S^T S) + \alpha_2 Tr(\Delta_p^2) + \alpha_3 Tr(\Delta_e^2),   (25)

where \alpha_i > 0 and \sum_{i=1}^{3} \alpha_i = 1. In (25), \|\cdot\|_F denotes the Frobenius norm, defined as \|M\|_F^2 = Tr(M^T M). Note that the minimization of H(t) in (25) with respect to \{K_1, \cdots, K_N\} is generally quite difficult to carry out analytically. Since K_s = \sum_{j=1}^{N} D_j K_j C_j, where D_i is an (m_1 + \cdots + m_N) \times m_i block matrix with the ith block being the m_i \times m_i identity matrix and the other blocks zero matrices, the gradient of H(t) with respect to K_i can be obtained as

\nabla_{K_i} H = D_i^T R_p (2\alpha_1 S - 4\alpha_2 K_s \Delta_p + 4\alpha_3 R_p^{-1} B_p^T P_e \Delta_e) C_i^T, \quad i = 1, \cdots, N.   (26)

One possible approach to performing this minimization is to implement a gradient-based algorithm, such as steepest descent or conjugate gradient, to search for the optimal matrices \{K_1, \cdots, K_N\}. Note that these matrices will be independent of the initial state vector. By varying the coefficients \alpha_1, \cdots, \alpha_3, a noninferior set can be generated and an appropriate choice of coefficients can then be made. In the game process, we assume for simplicity that the pursuers' sensing abilities are updated at discrete instants of time t_0, t_1, \cdots, t_{\beta-1} at which the pursuers perform sensing, where t_0 = 0 and t_\beta = T. We assume that (t_i - t_{i-1}) is sufficiently small for all i = 1, \cdots, \beta so that the observations among the players can be regarded as constant within each such interval. Hence, whenever the information topology among the players changes at any of the time instants t_1, \cdots, t_\beta, the above algorithm is performed to update the matrices K_1, \cdots, K_N for the remainder of the game.

V. EXAMPLE AND SIMULATION RESULTS

Let us now consider a three-pursuer single-evader game taking place in a planar environment. Suppose that all players have the same dynamics given by (1), where x_i represents the two position coordinates on the plane, F = 0, G_i = I_2, and u_i represents the velocity control for i = e, 1, \cdots, N. Then the system dynamics (2) can be written with

A = 0, \quad B_p = I_3 \otimes I_2, \quad B_e = -1_3 \otimes I_2.   (27)

The performance indices are given by (3) and (4) with T = 10 and

M_p = Q_p = q(I_3 \otimes I_2), \quad R_p = I_3 \otimes I_2, \quad M_e = Q_e = -(I_3 \otimes I_2), \quad R_e = I_2,   (28)

where q > 0 is a scalar. The initial positions of pursuers 1, 2, 3 and the evader are x_1 = (-3, 0), x_2 = (3, 0), x_3 = (5, 1), and x_e = (0, 1). We derive and simulate the results under various scenarios. In all of these scenarios, we use \alpha_1 = 1/2 and \alpha_2 = \alpha_3 = 1/4 in (25). We also assume a capture radius \epsilon = 0.37.

Scenario 1: If, in addition to the evader, all the pursuers also have wide enough sensing ranges (i.e., \Lambda = I_2 and r_i \to \infty in (10) and (11)), the game is under global information. Solving the coupled differential Riccati equations (8a) and (8b) with q = 1 yields solutions P_p and P_e of the form

P_p = \begin{bmatrix} P_{p1} & P_{p2} & P_{p2} \\ P_{p2} & P_{p1} & P_{p2} \\ P_{p2} & P_{p2} & P_{p1} \end{bmatrix} \otimes I_2, \quad P_e = \begin{bmatrix} P_{e1} & P_{e2} & P_{e2} \\ P_{e2} & P_{e1} & P_{e2} \\ P_{e2} & P_{e2} & P_{e1} \end{bmatrix} \otimes I_2.

Plots of P_{p1}, P_{p2}, P_{e1}, and P_{e2} in this case are shown in Figure 1, and the pursuers' and evader's strategies can be obtained from (9) and (7). In this case, the motion trajectories of the players and the distances between the pursuers and the evader over time are shown in Figure 2. With a capture radius \epsilon = 0.37 (indicated as a black line in Figure 2), the evader is captured around t = 2.2.

[Fig. 1. Plots of P_{p1}, P_{p2}, P_{e1}, and P_{e2}.]

Scenario 2: Assume that the pursuers' sensing ranges are now reduced to r_i = 4 for i = 1, 2, 3. As shown in Figure 3, at t = 0, pursuer 1 can observe only the evader, pursuer 2 can observe the evader and pursuer 3, and pursuer 3 can observe only pursuer 2. For q = 1, the motion trajectories and the distances between the pursuers and the evader over time are shown in Figure 4. Clearly, in this scenario, the pursuers are not able to capture the evader (with the same capture radius \epsilon = 0.37) by the final time T = 10. The change in the topology of the game is reflected in the changes of the Laplacian matrix L(t) and the vector h(t) in (10) and (11). In this scenario, L(t) changed three times as follows:

L(t) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & -1 & 1 \end{bmatrix}, \ 0 \le t \le 0.75; \quad \begin{bmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{bmatrix}, \ 0.75 < t \le 2.6; \quad \begin{bmatrix} 1 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \ 2.6 < t \le 10,
[Fig. 2. Motion trajectories and distances between the pursuers and evader under global information. The average distance to the evader is plotted as a red dashed curve.]
[Fig. 3. Initial positions of the three pursuers and the single evader.]
[Fig. 4. Motion trajectories and distances between the pursuers and evader for scenario 2.]
and the vector h(t) remained unchanged during the entire game, i.e., h_1(t) = h_2(t) = 1 and h_3(t) = 0. This essentially means that while pursuer 3 was never able to observe pursuer 1 during the game, it was able to observe pursuer 2 for t \in [0, 2.6] and lost that observation after t = 2.6.

Scenario 3: Using the same setting as scenario 2 except with q = 5 (instead of q = 1), the motion trajectories and the distances between the pursuers and the evader are shown in Figure 5. Clearly, in this scenario (with \epsilon = 0.37), the evader is captured around t = 5.2. During the game, the Laplacian matrix again changed three times as follows:

L(t) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & -1 & 1 \end{bmatrix}, \ 0 \le t \le 0.5; \quad \begin{bmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{bmatrix}, \ 0.5 < t \le 2; \quad \begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}, \ 2 < t \le 10,

and the vector h(t) changed as follows: h_1(t) = h_2(t) = 1 for 0 \le t \le 10, h_3(t) = 0 for 0 \le t \le 1.4, and h_3(t) = 1 for 1.4 < t \le 10. In other words, after t = 1.4 all the pursuers are able to observe the evader, and after t = 2 all the pursuers are also able to observe each other. Note that although scenarios 1 and 2 have the same performance indices, the pursuers are not able to catch the evader in scenario 2 because of their limited observations. However, as q is increased from 1 to 5 between scenarios 2 and 3, the pursuers are able to catch the evader in scenario 3 because of the higher penalty on the distances in the pursuers' performance index.

Remark: It would be interesting to determine a critical value q_c of q which separates the escape and capture regions of the evader. That is,
• if q < q_c, the evader escapes, and
• if q \ge q_c, the evader is captured at a time t \in [0, T].

For this game, the critical value of q has been determined to be q_c = 1.36. Figure 6 shows plots of the distances between the pursuers and the evader when q = q_c = 1.36. Clearly, the capture time in this case occurs at t = 10. Figure 7 shows a plot of the capture time versus q. Clearly, as q increases from the critical value, the capture time decreases. The region 0 < q < q_c = 1.36 is where the evader escapes, i.e., the capture time is \infty.
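As a rough cross-check of Scenario 1, the coupled Riccati equations (8a)-(8b) and the closed-loop dynamics under (7) and (9) can be integrated numerically for the planar example (A = 0, B_p = I_6, B_e = -1_3 \otimes I_2, q = 1). This is a crude explicit-Euler sketch of our own, not the paper's solver, so the capture time it produces should only be read as approximating the reported t \approx 2.2:

```python
import numpy as np

# Planar example of Section V under global information (Scenario 1):
# A = 0, Bp = I6, Be = -(1_3 kron I2), Rp = I6, Re = I2,
# Mp = Qp = q*I6, Me = Qe = -I6, T = 10.
q, T, steps = 1.0, 10.0, 4000
dt = T / steps
Bp, Rp = np.eye(6), np.eye(6)
Be = -np.kron(np.ones((3, 1)), np.eye(2))          # 6 x 2
Re = np.eye(2)
Qp, Mp = q * np.eye(6), q * np.eye(6)
Qe, Me = -np.eye(6), -np.eye(6)
Sp = Bp @ np.linalg.inv(Rp) @ Bp.T                 # B_p R_p^{-1} B_p^T
Se = Be @ np.linalg.inv(Re) @ Be.T                 # B_e R_e^{-1} B_e^T

# integrate (8a)-(8b) backward from P_p(T) = M_p, P_e(T) = M_e  (A = 0 here)
Pp, Pe = [Mp], [Me]
for _ in range(steps):
    Pp_t, Pe_t = Pp[-1], Pe[-1]
    dPe = -Qe + Pe_t @ Sp @ Pp_t + Pp_t @ Sp @ Pe_t + Pe_t @ Se @ Pe_t
    dPp = -Qp + Pp_t @ Sp @ Pp_t + Pp_t @ Se @ Pe_t + Pe_t @ Se @ Pp_t
    Pp.append(Pp_t - dt * dPp)                     # step from t = T toward t = 0
    Pe.append(Pe_t - dt * dPe)
Pp.reverse(); Pe.reverse()                         # Pp[k] ~ P_p(k*dt)

# closed-loop run: z' = Be ue + Bp up with the Nash strategies (7) and (9)
z = np.array([-3., -1., 3., -1., 5., 0.])          # z_i(0) = x_i(0) - x_e(0)
eps_capture, t_capture = 0.37, None
for k in range(steps):
    ue = -np.linalg.inv(Re) @ Be.T @ Pe[k] @ z
    up = -np.linalg.inv(Rp) @ Bp.T @ Pp[k] @ z
    z = z + dt * (Be @ ue + Bp @ up)
    avg_dist = np.mean([np.linalg.norm(z[2*i:2*i+2]) for i in range(3)])
    if avg_dist <= eps_capture:
        t_capture = (k + 1) * dt
        break
```

The symmetric block structure of the example (identical players, all-ones coupling through B_e) is what produces the P_{p1}/P_{p2}, P_{e1}/P_{e2} form of P_p and P_e shown in Figure 1.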
[Fig. 5. Motion trajectories and distances between the pursuers and evader for scenario 3.]
[Fig. 6. Distances between the pursuers and evader when q = q_c = 1.36.]
[Fig. 7. Capture time versus q.]

VI. CONCLUSION

In this paper, closed-loop Nash strategies for a linear N-pursuer single-evader differential game with quadratic performance indices over a finite time horizon are considered. The evader is assumed to have unlimited observations, while each pursuer has limited observations determined by its own sensing radius. Since the evader is able to observe the states of all pursuers at all times, it implements a standard closed-loop Nash strategy using the well-known coupled Riccati differential equations approach. Under the limited-observations setting, however, each pursuer is able to observe the states of only those players who are within its sensing radius. Hence, the pursuers seek to implement a collective strategy that can form a Nash equilibrium with the evader's strategy. To accomplish this, the pursuers derive best achievable performance indices that are as close as possible to the original performance indices and such that the resulting pursuers' Nash strategy, based on the available observations, and the original evader's strategy form a Nash equilibrium. An illustrative example involving three pursuers and one evader is solved, and simulation results corresponding to different scenarios are presented.
REFERENCES
[1] R. Isaacs, Differential Games. John Wiley and Sons, 1965.
[2] Y. Ho, A. Bryson, Jr., and S. Baron, "Differential games and optimal pursuit-evasion strategies," IEEE Trans. Automatic Control, vol. AC-10, no. 4, pp. 385–389, 1965.
[3] A. Starr and Y. Ho, "Nonzero-sum differential games," Journal of Optimization Theory and Applications, vol. 3, pp. 184–206, 1969.
[4] M. Simaan and J. Cruz, "On the Stackelberg solution in nonzero-sum games," Journal of Optimization Theory and Applications, vol. 18, pp. 322–324, 1973.
[5] M. Foley and W. Schmitendorf, "A class of differential games with two pursuers versus one evader," IEEE Trans. Automatic Control, vol. 19, pp. 239–243, 1974.
[6] S. Bopardikar, F. Bullo, and J. Hespanha, "On discrete-time pursuit-evasion games with sensing limitations," IEEE Trans. Robotics, vol. 24, no. 6, pp. 1429–1439, 2008.
[7] S. Bhattacharya and T. Başar, "Differential game-theoretic approach to a spatial jamming problem," in Advances in Dynamic Games, ser. Annals of the International Society of Dynamic Games, P. Cardaliaguet and R. Cressman, Eds. Birkhäuser Boston, 2013, vol. 12, pp. 245–268.
[8] D. Li and J. Cruz, "Defending an asset: A linear quadratic game approach," IEEE Trans. Aerospace and Electronic Systems, vol. 47, no. 2, pp. 1026–1044, April 2011.
[9] H. Huang, W. Zhang, J. Ding, D. Stipanovic, and C. Tomlin, "Guaranteed decentralized pursuit-evasion in the plane with multiple pursuers," in 50th IEEE CDC, 2011, pp. 4835–4840.
[10] J. Guo, G. Yan, and Z. Lin, "Local control strategy for moving-target-enclosing under dynamically changing network topology," Systems & Control Letters, vol. 59, pp. 654–661, 2010.
[11] M. Wei, G. Chen, J. B. Cruz, L. S. Haynes, K. Pham, and E. Blasch, "Multi-pursuer multi-evader pursuit-evasion games with jamming confrontation," AIAA J. of Aerospace Computing, Information, and Communication, vol. 4, no. 3, pp. 693–706, 2007.
[12] W. Lin, Z. Qu, and M. Simaan, "A design of distributed nonzero-sum Nash strategies," in 49th IEEE CDC, Atlanta, USA, 2010, pp. 6305–6310.
[13] R. Olfati-Saber and R. Murray, "Consensus problems in networks of agents with switching topology and time-delays," IEEE Trans. Automatic Control, vol. 49, no. 9, pp. 1520–1533, 2004.