Sensor Management for Collision Alert in Orbital Object Tracking∗ Peiran Xu, Huimin Chen, D. Charalampidis University of New Orleans Dept. of Electrical Engineering New Orleans, LA 70148
Dan Shen, Genshe Chen DCM Research Resources, LLC 14163 Furlong Way Germantown, MD 20874
Erik Blasch AFRL/RYAA WPAFB OH 45433
Khanh Pham AFRL/RVSV Kirtland AFB NM, 87117
ABSTRACT
Given the increasingly dense environment in both low-earth orbit (LEO) and geostationary orbit (GEO), a sudden change in the trajectory of any existing resident space object (RSO) may cause potential collision damage to space assets. With a constellation of electro-optical/infrared (EO/IR) sensor platforms and ground radar surveillance systems, it is important to design optimal estimation algorithms for updating nonlinear object states and allocating sensing resources to effectively avoid collisions among many RSOs. Previous work on RSO collision avoidance often assumes that the maneuver onset time or maneuver motion of the space object is random, and the sensor management approach is designed to achieve efficient average coverage of the RSOs. Few attempts have included the inference of an object’s intent in the response to an RSO’s orbital change. We propose a game theoretic model for sensor selection that assumes the worst case, in which an object changes its orbit to cause an intentional collision. The intentional collision results from maximal exposure of an RSO’s path. The resulting sensor management scheme achieves robust and realistic collision assessment, raises alerts for impending collisions, and identifies early any RSO orbital change involving lethal maneuvers. We also consider information sharing among distributed sensors for collision alert and identification of an object’s intent once an orbital change has been declared. We compare our scheme with the conventional (non-game based) sensor management (SM) scheme using a LEO-to-LEO space surveillance scenario where both the observers and the unannounced and unplanned objects have complete information on the constellation of vulnerable assets. We demonstrate that, with adequate information sharing, the distributed SM method can achieve performance close to that of centralized SM in identifying unannounced objects and issuing early warnings to the RSO of a potential collision, so that a proper collision avoidance action can be selected.
1. INTRODUCTION
Over recent decades, the space environment has become more complex with a significant increase in space debris among densely populated satellites. Efficient and reliable space operations rely heavily on space situational awareness, where searching for and tracking space objects and identifying their intent are crucial in creating a consistent global picture. Orbit determination with measurements provided by a constellation of satellites has been studied extensively [16,32,18,34]. Unlike ground targets whose motion may contain frequent maneuvers, a space object usually follows its orbit so that long term prediction of its orbital trajectory is possible once the orbital elements are known [6,16]. However, a space object can also change its orbit, either because its mission requires it or to intentionally hide from the space borne observers. Existing maneuvering target tracking literature mainly focuses on modeling target maneuver motion at a random onset time (see, e.g., [3–5,7]). This may not be the case for those satellites which are unlikely to change their orbits due to the desired constellation. Nevertheless, an intelligent adversary can devise an evasive maneuver that takes advantage of the geometry, e.g., transferring to an orbit with maximum duration of Earth blockage to an observer with a known orbit. Such an orbital change by an unannounced and unplanned object poses an immediate threat of collision to the existing resident space objects. Accordingly, a sensor management technique has to optimally utilize the sensing resources to acquire and track space objects with sparse measurements, i.e., with a typically large sampling interval, in order to maintain a large number of tracks simultaneously. Sensor management (SM) is concerned with the sensor-to-object assignment and a schedule of sensing actions for each sensor in the near future given the currently available information on the space objects.
∗ This work was supported in part by NSF through grant REU Site 0851618, ARO through grant W911NF-08-1-0409 and ONR-DEPSCoR through grant N00014-09-1-1169.
Sensors and Systems for Space Applications IV, edited by Khanh D. Pham, Henry Zmuda, Joseph Lee Cox, Greg J. Meyer, Proc. of SPIE Vol. 8044, 80440E · © 2011 SPIE CCC code: 0277-786X/11/$18 · doi: 10.1117/12.883632 Proc. of SPIE Vol. 8044 80440E-1 Downloaded From: http://spiedigitallibrary.org/ on 09/05/2013 Terms of Use: http://spiedl.org/terms
Sensor assignment and scheduling usually aim to optimize a certain criterion under energy, data processing and communication constraints. One popularly used criterion is the total information gain for all the objects being tracked [26]. However, this criterion does not prioritize the objects with respect to their types or identities. Alternatively, covariance control optimizes the sensing resources to achieve the desired estimation error covariance for each object [24]. It has the flexibility to design the desired tracking accuracy according to the importance of each object. This may include the assessment of collision probability for the conjunctions between an object being tracked and a known resident space object (RSO). Many existing sensor management schemes assume that the sensing model upon which the optimization is based is known a priori [36]. The model can come from domain knowledge or calibrated data from a pilot deployment. In addition, a centralized optimization scheme has to be used in order to select the sensors which obtain the highest utility with respect to the sensing model. Finding the optimal solution is in general computationally prohibitive for large scale problems involving sensor selection and sensor placement. Thus it is desirable to develop a distributed sensor selection and placement scheme which is efficient and can learn the sensing model or utility function online. Moreover, the scheme should possess a quantifiable performance gap to the optimal solution, which, in general, is very difficult to find when the problem is NP-hard. Previous work has shown that many sensing tasks satisfy the submodular property [1], which intuitively says that activating a new sensor helps more when only a few sensors measuring the same object of interest have been activated so far. The submodular property of the objective function guarantees that an efficient algorithm, namely, greedy sensor selection, can achieve near optimal performance with a quantifiable performance gap. Note that the sensing model does not have to be specified in advance. We develop a distributed algorithm to learn the objective function in an online manner.
In the long run, the distributed approach achieves the best bound compared to the optimal centralized solution, making the sensor selection amenable to large scale space surveillance requirements. Since the tracking accuracy depends critically on the early detection of a target maneuver, we propose a game theoretic framework for solving the sensor scheduling problem that assumes the worst case, in which an object changes its orbit to collide intentionally with one of the known space objects. Note that a lethal maneuver can lead to a collision with an RSO within a few minutes. Thus collision detection has become an important issue in the increasingly dense low-earth-orbit (LEO) environment. In our earlier work, we compared space object tracking with various nonlinear filtering methods [10], considered the measurement delay in orbital state estimation [11], and obtained realistic estimates of collision probability and impact time using the orbital state [29]. Here we focus on the sensor management schemes that can improve the collision assessment, raise alerts for impending collisions, and identify early any orbital change involving lethal maneuvers. We emphasize the information sharing among distributed sensors for collision alert and identification of an object’s intent when an orbital change has been declared. We compare the proposed scheme with the conventional non-game based SM scheme using a LEO-to-LEO space surveillance scenario. We assume that both the observers and the unannounced and unplanned objects have complete information on the constellation of vulnerable assets. We demonstrate that with adequate information sharing, the distributed SM method can achieve performance close to that of the centralized SM in identifying unannounced objects and issuing early warnings to the RSO of a potential collision so that proper collision avoidance action can be taken.
2. FUNDAMENTALS OF SPACE OBJECT TRACKING
2.1. Time and Coordinate Systems
To track a space object, one needs to handle several time and coordinate systems when sharing information among different space borne observers. Satellite laser ranging measurements are usually time-tagged in coordinated universal time (UTC) while global positioning system (GPS) measurements are time-tagged in GPS system time (GPS-ST). Although both UTC and GPS-ST are based on atomic time standards, UTC is loosely tied to the rotation of the Earth through the application of “leap seconds” while GPS-ST is continuous, with the relation GPS-ST = UTC + n where n is the number of leap seconds since January 6, 1980. The orbital equation describing near-Earth satellite motion is typically tagged with terrestrial dynamical time (TDT). TDT is an abstract, uniform time scale implicitly defined by the motion equation and can be converted to UTC or GPS-ST for any given reference date. The Earth centered inertial (ECI) coordinate system used to link GPS-ST with UTC is a geocentric system defined by the mean equator and vernal equinox at Julian epoch 2000.0. Its XY plane coincides with the equatorial plane of the Earth and the X-axis points toward the vernal equinox direction. The Z-axis points toward the north pole and the Y-axis completes the right-handed coordinate system.
The Earth centered Earth fixed (ECEF) coordinate system has the same XY plane and Z-axis as the inertial coordinate system. However, its X-axis rotates with the Earth and points to the prime meridian, and the Y-axis completes the right-handed coordinate system. The local Cartesian system, commonly referred to as the east-north-up (ENU) coordinate system, has its origin at some point on or above the Earth’s surface (typically at the location of an observer). Its Z-axis is normal to the Earth’s reference ellipsoid, as defined by the geodetic latitude. The X-axis points toward the east while the Y-axis points toward the north.
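As an illustration of the ECEF and ENU definitions above, the following minimal sketch rotates an ECEF difference vector (object position minus observer position) into the observer's local ENU axes. The function name `ecef_to_enu` and its arguments are hypothetical and not part of the original paper; the observer's geodetic latitude and longitude are assumed to be known in degrees.

```python
import math

def ecef_to_enu(delta_ecef, lat_deg, lon_deg):
    """Rotate an ECEF difference vector into east-north-up components.

    delta_ecef: (dx, dy, dz), object minus observer, in ECEF.
    lat_deg, lon_deg: geodetic latitude and longitude of the observer
    (assumed inputs for this sketch).
    """
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    s_lat, c_lat = math.sin(lat), math.cos(lat)
    s_lon, c_lon = math.sin(lon), math.cos(lon)
    dx, dy, dz = delta_ecef
    east = -s_lon * dx + c_lon * dy
    north = -s_lat * c_lon * dx - s_lat * s_lon * dy + c_lat * dz
    up = c_lat * c_lon * dx + c_lat * s_lon * dy + s_lat * dz
    return east, north, up
```

At zero latitude and longitude, for example, the ECEF X direction maps to "up" and the ECEF Z direction maps to "north", which matches the axis definitions given in the text.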
2.2. Orbital Equation and Orbital State Estimation
Without any perturbing force, the position $\mathbf{r}$ of a space object relative to the center of the Earth in the ECI coordinate system should satisfy

$$\ddot{\mathbf{r}} = -\frac{\mu}{\|\mathbf{r}\|^{3}}\,\mathbf{r} \tag{1}$$

where $\mu$ is the Earth’s gravitational parameter. The object’s velocity is $\mathbf{v} \triangleq \dot{\mathbf{r}}$ and the radial velocity is $v_r \triangleq \mathbf{v}\cdot\mathbf{r}/r$, where $r = \|\mathbf{r}\|$ is the distance from the object to the center of the Earth. In order to determine the position and velocity of a satellite at any time, six parameters are needed, typically the three position components and three velocity components at a given time. Alternatively, the orbital trajectory can be conveniently described by the six components of the Keplerian elements [12,35]. In reality, a number of forces act on the satellite in addition to the Earth’s gravity. To distinguish them from the central gravitational force of the Earth, these forces are often referred to as perturbing forces. In a continuous time state space model, perturbing forces are often lumped into the noise term of the system dynamics. Denote by $\mathbf{x}(t)$ the continuous time target state given by

$$\mathbf{x}(t) \triangleq \begin{bmatrix} \mathbf{r}(t) \\ \dot{\mathbf{r}}(t) \end{bmatrix} = \begin{bmatrix} x(t) & y(t) & z(t) & v_x(t) & v_y(t) & v_z(t) \end{bmatrix}^{T} \tag{2}$$

For convenience, we omit the argument $t$ and write the nonlinear state equation as follows.

$$\dot{\mathbf{x}} = f(\mathbf{x}) + \mathbf{w} \tag{3}$$

where

$$f(\mathbf{x}) = \begin{bmatrix} v_x & v_y & v_z & -(\mu/r^{3})x & -(\mu/r^{3})y & -(\mu/r^{3})z \end{bmatrix}^{T} \tag{4}$$

and

$$\mathbf{w} = \begin{bmatrix} 0 & 0 & 0 & w_x & w_y & w_z \end{bmatrix}^{T} \tag{5}$$

is the acceleration resulting from perturbing forces. As opposed to treating the perturbing acceleration as noise, the simplified general perturbations (SGP) model maintains general perturbation element sets and finds an analytical solution to the satellite motion equation with time varying Keplerian elements [27]. For precise orbit determination, numerical integration of (3) is often a viable solution, where both the epoch state and the force model have to be periodically updated when a new measurement is available [35].
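The numerical integration of (3) mentioned above can be sketched as follows for the unperturbed case (w = 0). This is only an illustrative fixed-step Runge-Kutta integrator, not the propagator used in the paper; the constant `MU_EARTH`, the step size, and the function names are assumptions made for the example.

```python
import math

MU_EARTH = 3.986004418e14  # Earth's gravitational parameter in m^3/s^2

def f(state):
    """Right-hand side of (3)-(4) with the perturbation w set to zero."""
    x, y, z, vx, vy, vz = state
    r3 = (x * x + y * y + z * z) ** 1.5
    return [vx, vy, vz,
            -MU_EARTH * x / r3, -MU_EARTH * y / r3, -MU_EARTH * z / r3]

def rk4_step(state, dt):
    """Advance the state by dt seconds with one classical Runge-Kutta step."""
    def shifted(base, slope, h):
        return [b + h * s for b, s in zip(base, slope)]
    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def propagate(state, duration, dt=1.0):
    """Propagate the state for `duration` seconds using fixed steps of dt."""
    for _ in range(int(round(duration / dt))):
        state = rk4_step(state, dt)
    return state

# Hypothetical circular orbit of radius 7000 km in the equatorial plane.
r0 = 7.0e6
v0 = math.sqrt(MU_EARTH / r0)
final = propagate([r0, 0.0, 0.0, 0.0, v0, 0.0], duration=600.0)
radius = math.sqrt(sum(c * c for c in final[:3]))
```

For a circular orbit the radius should stay essentially constant during propagation, which gives a simple sanity check on the integrator.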
2.3. Tracking Maneuvering Space Object
2.3.1. Sensor Measurement Model
Space based visible (SBV) sensors have recently become a viable tool to collect measurements from space borne observers. A satellite with an EO/IR SBV sensor may obtain higher resolution measurements than a ground based radar when an object is at close range within its field of view. An SBV sensor can provide angle measurements with respect to the observer’s local coordinate system. Typically, a radar can provide the following types of measurements: range, azimuth, elevation and range rate. The range between the $i$-th observer located at $(x_i, y_i, z_i)$ and the space object located at $(x, y, z)$ is given by

$$d_r(i) = \sqrt{(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2}. \tag{6}$$

The azimuth is

$$d_a(i) = \tan^{-1}\!\left(\frac{y - y_i}{x - x_i}\right). \tag{7}$$

The elevation is

$$d_e(i) = \tan^{-1}\!\left(\frac{z - z_i}{\sqrt{(x - x_i)^2 + (y - y_i)^2}}\right). \tag{8}$$

The range rate is

$$d_{\dot r}(i) = \frac{(x - x_i)(\dot x - \dot x_i) + (y - y_i)(\dot y - \dot y_i) + (z - z_i)(\dot z - \dot z_i)}{d_r}. \tag{9}$$
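The range, azimuth, elevation and range-rate expressions (6)-(9) can be evaluated together as in the sketch below. The function name `radar_measurement` and the tuple layout are illustrative assumptions; the sketch also uses `atan2` as a four-quadrant form of the inverse tangent in (7) and (8), which is a choice made for the example rather than something stated in the paper.

```python
import math

def radar_measurement(obj, obs):
    """Noise-free range, azimuth, elevation and range rate per (6)-(9).

    obj and obs are (x, y, z, vx, vy, vz) tuples for the object and the
    i-th observer, expressed in one common Cartesian frame.
    """
    dx, dy, dz = obj[0] - obs[0], obj[1] - obs[1], obj[2] - obs[2]
    dvx, dvy, dvz = obj[3] - obs[3], obj[4] - obs[4], obj[5] - obs[5]
    d_r = math.sqrt(dx * dx + dy * dy + dz * dz)          # (6)
    d_a = math.atan2(dy, dx)                              # (7), four-quadrant
    d_e = math.atan2(dz, math.hypot(dx, dy))              # (8)
    d_rdot = (dx * dvx + dy * dvy + dz * dvz) / d_r       # (9)
    return d_r, d_a, d_e, d_rdot
```

An angle-only sensor such as an SBV sensor would use only the second and third outputs.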
Measurements from the $i$-th observer will be unavailable when the line-of-sight path between the observer and the object is blocked by the Earth. The condition of Earth blockage is examined assuming a spherical Earth with radius $R_E$. If there exists $\alpha \in [0, 1]$ such that $D_\alpha(i) < R_E$, where

$$D_\alpha(i) = \sqrt{[(1-\alpha)x_i + \alpha x]^2 + [(1-\alpha)y_i + \alpha y]^2 + [(1-\alpha)z_i + \alpha z]^2}, \tag{10}$$

then the measurement from the $i$-th observer to the object will be unavailable. The minimum of $D_\alpha(i)$ is achieved at $\alpha = \alpha^*$ given by

$$\alpha^* = -\frac{x_i(x - x_i) + y_i(y - y_i) + z_i(z - z_i)}{(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2}. \tag{11}$$

Thus we first examine whether $\alpha^* \in [0, 1]$ and then check the Earth blockage condition $D_{\alpha^*}(i) < R_E$.

2.3.2. Game Theoretic Formulation for Maneuvering Onset Time
When an object changes its orbit, the detection of the maneuvering onset time depends critically on the number of sensor measurements. Here we consider the case in which a single observer tracks a single object. Initially, the observer knows the object’s state and the object also knows the observer’s state. Assume that the object can only apply a $T$-second burn that produces a specific thrust $\mathbf{w}$ with a maximum acceleration of $a$ m/s$^2$. The goal of the object is to determine the maneuvering onset time and the direction of the thrust so that the resulting orbit will have the maximum duration of Earth blockage to the observer. The goal of the observer is to maintain the track with the best estimation accuracy. To achieve this, the observer has to determine the sensor revisit time and cuing region as well as notify other observers having better geometry when Earth blockage occurs. Without loss of generality, we assume that the object can transfer its orbit to the same plane as the observer. In this case, when the object is on the opposite side of the Earth with respect to the observer and rotating in the same direction as the observer, the duration of the Earth blockage will be the maximum compared with other orbits with the same orbital elements except the inclination. Note that in the pursuit-evasion game confined to a two dimensional plane, the minimax solution requires that the evasive object apply the same thrust angle as the observer [30]. Thus an intelligent adversary will start its maneuver as soon as the predicted orbital trajectory of the observer is subject to Earth blockage. The corresponding maneuvering thrust will follow the minimax solution to the pursuit-evasion game. When an object is tracked by multiple space borne observers, an observer can predict the object’s maneuvering motion based on its estimated orbital state and the corresponding response of the pursuit-evasion game from the object, where the terminal condition will lead to Earth blockage to the observer. Thus there is a need for the sensor manager to select the appropriate set of sensors that can persistently monitor all the objects, especially when adversarial objects perform lethal maneuvers that can result in a collision with asset satellites.
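The two-step Earth blockage test of Section 2.3.1, using (10) and (11), can be sketched as follows. The function name `earth_blocked` and the default radius `R_EARTH` are assumptions for illustration, and both endpoints are assumed to lie above the Earth's surface, so that a minimizer outside [0, 1] means the line of sight is clear.

```python
import math

R_EARTH = 6378.137e3  # assumed spherical-Earth radius in metres

def earth_blocked(obs, obj, radius=R_EARTH):
    """Return True if the line of sight from obs to obj is blocked.

    obs and obj are (x, y, z) positions in an Earth-centred frame.
    """
    d = [b - a for a, b in zip(obs, obj)]
    dd = sum(c * c for c in d)
    if dd == 0.0:
        return False  # coincident points, nothing to block
    # alpha* of (11): minimizer of the distance to the Earth's centre.
    alpha = -sum(a * c for a, c in zip(obs, d)) / dd
    if not 0.0 <= alpha <= 1.0:
        return False  # closest approach lies outside the segment
    closest = [a + alpha * c for a, c in zip(obs, d)]
    # D_alpha* of (10) compared with the Earth radius.
    return math.sqrt(sum(c * c for c in closest)) < radius
```

For instance, two satellites at 7000 km radius on opposite sides of the Earth are reported as blocked, while two nearby satellites on the same side are not.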
2.4. Nonlinear Filter Design for Space Object Tracking
When a space object has been detected, a tracking filter will predict the object’s state at any time in the future based on the available sensor measurements. Despite the abundant literature on nonlinear filter design [9,13,14,17,21,28,32], we chose the following tracking filter based on our earlier study [10]. Denote by $\hat{\mathbf{x}}_k^-$ the state prediction from time $t_{k-1}$ to time $t_k$ based on the state estimate $\hat{\mathbf{x}}_{k-1}^+$ at time $t_{k-1}$ with all measurements up to $t_{k-1}$. The prediction is made by numerically integrating the state equation given by

$$\dot{\hat{\mathbf{x}}}(t) = f(\hat{\mathbf{x}}(t)) \tag{12}$$

without process noise. The mean square error (MSE) of the state prediction is obtained by numerically integrating the following matrix equation

$$\dot{P}(t) = F(\hat{\mathbf{x}}_k^-)P(t) + P(t)F(\hat{\mathbf{x}}_k^-)^{T} + Q(t) \tag{13}$$

where $F(\hat{\mathbf{x}}_k^-)$ is the Jacobian matrix given by

$$F(\mathbf{x}) = \begin{bmatrix} 0_{3\times 3} & I_3 \\ F_0(\mathbf{x}) & 0_{3\times 3} \end{bmatrix}, \tag{14}$$

$$F_0(\mathbf{x}) = \mu \begin{bmatrix} \dfrac{3x^2}{r^5} - \dfrac{1}{r^3} & \dfrac{3xy}{r^5} & \dfrac{3xz}{r^5} \\ \dfrac{3xy}{r^5} & \dfrac{3y^2}{r^5} - \dfrac{1}{r^3} & \dfrac{3yz}{r^5} \\ \dfrac{3xz}{r^5} & \dfrac{3yz}{r^5} & \dfrac{3z^2}{r^5} - \dfrac{1}{r^3} \end{bmatrix}, \tag{15}$$

$$r = \sqrt{x^2 + y^2 + z^2} \tag{16}$$

and evaluated at $\mathbf{x} = \hat{\mathbf{x}}_k^-$.
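As a sketch of how (13)-(15) might be coded, the following builds the Jacobian F(x) and the right-hand side of the covariance equation using plain nested lists. The function names are hypothetical, `MU_EARTH` is an assumed constant, and `covariance_rate` assumes that P is symmetric.

```python
MU_EARTH = 3.986004418e14  # Earth's gravitational parameter in m^3/s^2

def accel(pos):
    """Two-body acceleration, i.e. the last three components of f(x) in (4)."""
    x, y, z = pos
    r3 = (x * x + y * y + z * z) ** 1.5
    return [-MU_EARTH * x / r3, -MU_EARTH * y / r3, -MU_EARTH * z / r3]

def gravity_gradient(pos):
    """The 3x3 block F0(x) of (15)."""
    r2 = sum(c * c for c in pos)
    r3, r5 = r2 ** 1.5, r2 ** 2.5
    return [[MU_EARTH * (3.0 * pos[i] * pos[j] / r5
                         - (1.0 / r3 if i == j else 0.0))
             for j in range(3)] for i in range(3)]

def jacobian(state):
    """The 6x6 matrix F(x) of (14): [[0, I3], [F0, 0]]."""
    F0 = gravity_gradient(state[:3])
    F = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        F[i][i + 3] = 1.0
        for j in range(3):
            F[i + 3][j] = F0[i][j]
    return F

def covariance_rate(F, P, Q):
    """Right-hand side of (13), F P + P F^T + Q, for symmetric P."""
    n = len(P)
    FP = [[sum(F[i][k] * P[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    # For symmetric P, P F^T is the transpose of F P.
    return [[FP[i][j] + FP[j][i] + Q[i][j] for j in range(n)]
            for i in range(n)]
```

A finite-difference comparison of `accel` against `gravity_gradient` is a convenient way to check the signs in (15).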
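The measurement $\mathbf{z}_k$ obtained at time $t_k$ is given by

$$\mathbf{z}_k = h(\mathbf{x}_k) + \mathbf{v}_k \tag{17}$$

where

$$\mathbf{v}_k \sim \mathcal{N}(0, R_k) \tag{18}$$

is the measurement noise, which is assumed to be mutually independent over time and independent of the initial state as well as the process noise. The recursive linear minimum mean square error (LMMSE) filter applies the following update equation [5]

$$\hat{\mathbf{x}}_{k|k} \triangleq E^*\!\left[\mathbf{x}_k \mid Z^k\right] = \hat{\mathbf{x}}_{k|k-1} + K_k \tilde{\mathbf{z}}_{k|k-1} \tag{19}$$

$$P_{k|k} = P_{k|k-1} - K_k S_k K_k^{T} \tag{20}$$

where

$$\begin{aligned}
\hat{\mathbf{x}}_{k|k-1} &= E^*\!\left[\mathbf{x}_k \mid Z^{k-1}\right] \\
\hat{\mathbf{z}}_{k|k-1} &= E^*\!\left[\mathbf{z}_k \mid Z^{k-1}\right] \\
\tilde{\mathbf{x}}_{k|k-1} &= \mathbf{x}_k - \hat{\mathbf{x}}_{k|k-1} \\
\tilde{\mathbf{z}}_{k|k-1} &= \mathbf{z}_k - \hat{\mathbf{z}}_{k|k-1} \\
P_{k|k-1} &= E\!\left[\tilde{\mathbf{x}}_{k|k-1}\tilde{\mathbf{x}}_{k|k-1}^{T}\right] \\
S_k &= E\!\left[\tilde{\mathbf{z}}_{k|k-1}\tilde{\mathbf{z}}_{k|k-1}^{T}\right] \\
K_k &= C_{\tilde{\mathbf{x}}_k \tilde{\mathbf{z}}_k} S_k^{-1} \\
C_{\tilde{\mathbf{x}}_k \tilde{\mathbf{z}}_k} &= E\!\left[\tilde{\mathbf{x}}_{k|k-1}\tilde{\mathbf{z}}_{k|k-1}^{T}\right].
\end{aligned}$$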
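To make the update (19)-(20) concrete, the sketch below implements it for the special case of a scalar measurement, so that the innovation covariance is a single number and no matrix inversion is needed. The function name `lmmse_update` and the argument layout are assumptions for illustration; the predicted moments are taken as given inputs.

```python
def lmmse_update(x_pred, P_pred, z, z_pred, S, C):
    """Scalar-measurement form of the LMMSE update (19)-(20).

    x_pred: predicted state (list of n floats)
    P_pred: predicted n x n error covariance (nested lists)
    z, z_pred: measurement and predicted measurement (floats)
    S: innovation variance (float)
    C: cross covariance between state and measurement errors (n floats)
    """
    n = len(x_pred)
    K = [c / S for c in C]                 # gain K = C S^-1
    innovation = z - z_pred                # measurement residual
    x_upd = [x + k * innovation for x, k in zip(x_pred, K)]       # (19)
    P_upd = [[P_pred[i][j] - K[i] * S * K[j] for j in range(n)]
             for i in range(n)]                                   # (20)
    return x_upd, P_upd
```

With a linear measurement of the first state component, prior variance 4 and noise variance 1, the call `lmmse_update([0.0, 0.0], [[4.0, 0.0], [0.0, 1.0]], 5.0, 0.0, 5.0, [4.0, 0.0])` reduces to the familiar Kalman update for that component.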
Note that $E^*[\cdot]$ becomes the conditional mean of the state for linear Gaussian dynamics and the above filtering equations become the traditional Kalman filter [5]. For a nonlinear dynamic system, (19) is optimal in the mean square error sense when the state estimate is constrained to be an affine function of the measurement. Given the state estimate $\hat{\mathbf{x}}_{k-1|k-1}$ and its error covariance $P_{k-1|k-1}$ at time $t_{k-1}$, if the state prediction $\hat{\mathbf{x}}_{k|k-1}$, the corresponding error covariance $P_{k|k-1}$, the measurement prediction $\hat{\mathbf{z}}_{k|k-1}$, the corresponding error covariance $S_k$, and the cross covariance $E[\tilde{\mathbf{x}}_{k|k-1}\tilde{\mathbf{z}}_{k|k-1}^{T}]$ in (19) and (20) can be expressed as a function only through $\hat{\mathbf{x}}_{k-1|k-1}$ and $P_{k-1|k-1}$, then the above formula is truly recursive. However, for general nonlinear system dynamics (3) and measurement equation (17), we have

$$\hat{\mathbf{x}}_{k|k-1} = E^*\!\left[\int_{t_{k-1}}^{t_k} f(\mathbf{x}(t), \mathbf{w}(t))\,dt + \mathbf{x}_{k-1} \,\Big|\, Z^{k-1}\right] \tag{21}$$

$$\hat{\mathbf{z}}_{k|k-1} = E^*\!\left[h(\mathbf{x}_k, \mathbf{v}_k) \mid Z^{k-1}\right]. \tag{22}$$

Both $\hat{\mathbf{x}}_{k|k-1}$ and $\hat{\mathbf{z}}_{k|k-1}$ will depend on the measurement history $Z^{k-1}$ and the corresponding moments in the LMMSE formula. In order to have a truly recursive filter, the required terms at time $t_k$ can be obtained approximately through $\hat{\mathbf{x}}_{k-1|k-1}$ and $P_{k-1|k-1}$, i.e.,

$$\left\{\hat{\mathbf{x}}_{k|k-1}, P_{k|k-1}\right\} \approx \mathrm{Pred}\!\left[f(\cdot), \hat{\mathbf{x}}_{k-1|k-1}, P_{k-1|k-1}\right]$$

$$\left\{\hat{\mathbf{z}}_{k|k-1}, S_k, C_{\tilde{\mathbf{x}}_k \tilde{\mathbf{z}}_k}\right\} \approx \mathrm{Pred}\!\left[h(\cdot), \hat{\mathbf{x}}_{k|k-1}, P_{k|k-1}\right]$$

where $\mathrm{Pred}[f(\cdot), \hat{\mathbf{x}}_{k-1|k-1}, P_{k-1|k-1}]$ denotes that $\{\hat{\mathbf{x}}_{k-1|k-1}, P_{k-1|k-1}\}$ propagates through the nonlinear function $f(\cdot)$ to approximate $E^*[f(\cdot) \mid Z^{k-1}]$ and the corresponding error covariance $P_{k|k-1}$. Similarly, $\mathrm{Pred}[h(\cdot), \hat{\mathbf{x}}_{k|k-1}, P_{k|k-1}]$ predicts the measurement and the corresponding error covariance only through the approximated state prediction. This poses difficulties for the implementation of the recursive LMMSE filter due to insufficient information. The prediction of a random variable going through a nonlinear function, most often, cannot be completely determined using only the first and second moments. Two remedies are often used. One is to approximate the system via the unscented transform such that the prediction based on the approximated system can be carried out only through $\{\hat{\mathbf{x}}_{k-1|k-1}, P_{k-1|k-1}\}$ [22,23]. Another is to approximate the density function with a set of particles and propagate those particles in the recursive Bayesian filtering framework, i.e., using a particle filter [15,20,19].
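One way to realize the Pred[·] operation with the unscented transform is sketched below. It uses the basic symmetric sigma-point set with a single scaling parameter `kappa`; the function names, the default `kappa`, and the pure-Python Cholesky helper are assumptions for illustration and not the authors' implementation. Any additive noise covariance (such as R_k) would have to be added to the returned output covariance separately.

```python
import math

def cholesky(A):
    """Lower-triangular L with L L^T = A for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def unscented_pred(g, mean, cov, kappa=1.0):
    """Approximate the mean and covariance of g(x), plus the cross
    covariance between x and g(x), from the mean and covariance of x."""
    n = len(mean)
    L = cholesky([[(n + kappa) * c for c in row] for row in cov])
    points = [list(mean)]
    for j in range(n):
        col = [L[i][j] for i in range(n)]
        points.append([m + c for m, c in zip(mean, col)])
        points.append([m - c for m, c in zip(mean, col)])
    weights = [kappa / (n + kappa)] + [0.5 / (n + kappa)] * (2 * n)
    ys = [g(p) for p in points]
    m_out = len(ys[0])
    y_mean = [sum(w * y[i] for w, y in zip(weights, ys)) for i in range(m_out)]
    y_cov = [[sum(w * (y[i] - y_mean[i]) * (y[j] - y_mean[j])
                  for w, y in zip(weights, ys))
              for j in range(m_out)] for i in range(m_out)]
    cross = [[sum(w * (p[i] - mean[i]) * (y[j] - y_mean[j])
                  for w, p, y in zip(weights, points, ys))
              for j in range(m_out)] for i in range(n)]
    return y_mean, y_cov, cross
```

Passing the measurement function h as `g` yields approximations of the predicted measurement, its covariance without the noise term, and the cross covariance used in the gain; for a linear `g` the transform is exact, which makes a convenient check.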
3. SENSOR MANAGEMENT FOR COLLISION ALERT
3.1. Sensor Selection with Known Sensor Model
We formalize the sensor management problem in a generic context. Suppose a set of sensors has been deployed at a set $V$ of locations with the task of monitoring potential collisions. Practical constraints on communication bandwidth or power consumption require us to select a subset $A$ of these sensors to perform the sensing actions according to a known sensing quality function $f(A)$. One possible choice of $f$ can be related to the prediction accuracy for tracking space objects. For sensor deployment, one can have a probabilistic model $P(x_s)$ for every location $s \in V$ so that the joint probability distribution $P(x_V)$ models the correlation among different sensing configurations. Note that $x_V$ denotes a random vector over all measurements. When some measurements $x_A = z_A$ are obtained from a subset $A$ of the locations in $V$, the conditional distribution $P(x_{V\setminus A} \mid x_A = z_A)$ allows one to make predictions using $E[x_{V\setminus A} \mid x_A = z_A]$. The mean square prediction error given by

$$\mathrm{MSE}(x_{V\setminus A} \mid z_A) = E\!\left[\left(x_{V\setminus A} - E[x_{V\setminus A} \mid z_A]\right)^2 \,\Big|\, z_A\right]$$

can be used to justify the selection of sensor subset $A$. However, the measurement $z_A$ is in general unavailable before we choose sensor subset $A$ to take the sensing action. If one has a probabilistic model for each sensing action, then the expected mean square prediction error is

$$\mathrm{EMSE}(A) = \int \mathrm{MSE}(x_{V\setminus A} \mid z_A)\, p(z_A)\, dz_A.$$

It can serve as the objective function for sensor selection. Specifically, one can maximize the reduction in mean square prediction error

$$f_{\mathrm{EMSE}}(A) = \mathrm{EMSE}(\emptyset) - \mathrm{EMSE}(A)$$

subject to practical constraints. Typically, $f_{\mathrm{EMSE}}$ is monotonic, i.e., $f_{\mathrm{EMSE}}(A) \le f_{\mathrm{EMSE}}(B)$ when $A \subseteq B \subseteq V$. However, if one can only choose at most $k$ sensors to maximize the average reduction of the mean square prediction error, then the optimization problem becomes

$$A^* = \arg\max_{A} f_{\mathrm{EMSE}}(A) \quad \text{s.t. } |A| \le k.$$

Unfortunately, this problem is NP-hard. Thus finding the optimal solution essentially requires an exhaustive search over all possible sensor combinations with a cardinality of at most $k$. Fortunately, when $f$ satisfies the submodular property, an efficient algorithm can achieve a near optimal solution with a quantifiable performance gap. A function $f : 2^V \to \mathbb{R}$ is called submodular if for all $A \subseteq B \subseteq V$ and $s \in V \setminus B$, we have

$$f(A \cup \{s\}) - f(A) \ge f(B \cup \{s\}) - f(B).$$

It means that adding more sensors will have a diminishing return on the objective function. For tracking a single object with sensors of the same type, one can show that $f_{\mathrm{EMSE}}$ is a submodular function for an arbitrary sensor constellation [1]. Surprisingly, a simple greedy algorithm starting with $A_0 = \emptyset$ and adding a new sensor $s_j$ iteratively by

$$s_j = \arg\max_{s \in V \setminus A_{j-1}} f(A_{j-1} \cup \{s\}), \qquad A_j = A_{j-1} \cup \{s_j\}, \qquad j = 1, \ldots, k$$

achieves at least 63% of the optimal value when $f$ is monotone and submodular [31].

Theorem 1: Let $A_k$ be the sensor subset selected using the above greedy algorithm. Then for any monotone submodular function $f$ with $f(\emptyset) = 0$, it holds that

$$f(A_k) \ge (1 - 1/e) \max_{|A| \le k} f(A).$$
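The greedy selection rule above can be sketched in a few lines. Since the EMSE-based objective requires a full probabilistic model, the example substitutes a toy coverage utility (the number of distinct objects seen by the selected sensors), which is also monotone and submodular; the sensor names and the `coverage` table are invented purely for illustration.

```python
def greedy_select(sensors, f, k):
    """Greedily pick at most k sensors, each time adding the sensor s
    that maximizes f(A_{j-1} + [s])."""
    selected = []
    for _ in range(min(k, len(sensors))):
        remaining = [s for s in sensors if s not in selected]
        best = max(remaining, key=lambda s: f(selected + [s]))
        selected.append(best)
    return selected

# Hypothetical data: which object identifiers each sensor can observe.
coverage = {
    "s1": {1, 2, 3},
    "s2": {3, 4},
    "s3": {4, 5, 6, 7},
    "s4": {1, 2},
}

def covered(subset):
    """Toy utility: number of distinct objects seen by the subset."""
    seen = set()
    for s in subset:
        seen |= coverage[s]
    return len(seen)

chosen = greedy_select(list(coverage), covered, k=2)
```

Each greedy step evaluates the objective once per remaining sensor, so selecting k sensors out of n costs on the order of n times k evaluations instead of a search over all subsets of size k.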
3.2. Online Sensor Selection Problem
In order to select the best sensor subset, the objective function $f$ has to be specified in advance. For maximal prediction error reduction in terms of $f_{\mathrm{EMSE}}$, one has to know $P(x_V)$ in advance. While for some applications the prior knowledge of $P(x_V)$ may be available, in realistic situations, sensors have to be activated to collect data in order to learn the sensing model. Note that the learned sensing model will be used for sensor management. As a concrete example, one needs to detect a space object and then activate sensors close to its predicted location. However, prior knowledge about the object’s location before obtaining any sensory data can be very questionable. We assume that the sensor selection takes $T$ time steps and we expect to achieve satisfactory performance at the end without knowing $f$ explicitly. At each time step $t$, if a set $S_t$ of sensors is selected, we will know $f_t(S_t)$ as the indicator of sensing quality. We seek to develop an algorithm for selecting $S_t$ at each time step $t$ so that the average performance of the algorithm is competitive with the solution that optimizes $f$ directly, e.g., using the greedy algorithm with a known objective function. Ideally, the best sensor selection strategy can achieve the total reward given by

$$\max_{S \subseteq V,\, |S| \le k} \sum_{t=1}^{T} f_t(S).$$

The difference between the ideal reward and the actual reward achieved by the sensor selection algorithm is called the regret $R_T$ of the sensor selection algorithm. An algorithm is called a no-regret algorithm if $\limsup_{T \to \infty} R_T/T \le 0$. When $k = 1$, the problem becomes the well known multi-armed bandit (MAB) problem [2]. In the classical MAB setting, there is a slot machine with multiple arms, where each arm generates a random payoff with unknown mean. The goal is to find a strategy for pulling arms to maximize the total reward. Known algorithms can achieve an average regret per time step in $O\!\left(\sqrt{n \log n / T}\right)$ where $n$ is the number of arms. Suppose that we want to select at most $k$ sensors at every time step. If every subset of at most $k$ sensors is treated as an arm, the regret bound will be exponential in $k$, thus making the MAB results not applicable to the online sensor selection problem. Fortunately, when $f$ is submodular, the greedy algorithm can obtain a $(1 - 1/e)$ approximation to the optimal solution with $f$ being known a priori, because the average of submodular functions remains submodular. Thus we cannot expect any efficient online sensor selection algorithm to be better than $(1 - 1/e) \max_{S \subseteq V,\, |S| \le k} \sum_{t=1}^{T} f_t(S)$ on average. Consider the sensor subsets $\{S_t\}_{t=1}^{T}$ generated by the online algorithm and the $(1 - 1/e)$-regret given by

$$R_T^* = (1 - 1/e) \max_{S \subseteq V,\, |S| \le k} \sum_{t=1}^{T} f_t(S) - \sum_{t=1}^{T} E[f_t(S_t)]$$

where the expectation is over the distribution for each $S_t$ since the algorithm may produce a random subset of sensors in each time step. We call an online algorithm a $(1 - 1/e)$-no-regret algorithm if $\limsup_{T \to \infty} R_T^*/T \le 0$.
3.3. Centralized Online Sensor Selection
We first consider the existence of an efficient algorithm that achieves $(1 - 1/e)$-no-regret for a submodular function $f$. In the case of $k = 1$, we only select one sensor in each time step. This is the classical MAB problem. At each time step $t$, a no-regret algorithm selects sensor $s$ with probability

$$p_s = (1 - \gamma)\frac{w_s}{\sum_{j=1}^{|V|} w_j} + \frac{\gamma}{n}$$

where $n = |V|$ and $\gamma \in (0, 1)$ controls the exploration probability for each sensor. The weights will be updated by

$$w_s(t+1) = w_s(t)\, e^{\eta f_t(\{s\})/p_s}$$

when sensor $s$ is selected at time step $t$ and the value $f_t(\{s\})$ becomes available. Note that $\eta$ controls the learning rate, i.e., how quickly the selection probability shifts toward sensors with better sensing quality. It can be shown that by appropriately choosing $\eta$ and $\gamma$, one can achieve a cumulative regret in $O(\sqrt{T n \log n})$ [2]. Thus in the long run, the best sensor will be selected with probability close to 1. In principle, we can interpret the sensor selection problem as an MAB problem and exhaustively enumerate all possible sensor subsets using the above no-regret algorithm. However, this approach does not scale well since the number of arms grows exponentially in $k$. When $f$ is submodular, we can exploit the greedy algorithm for no-regret learning in the MAB problem. Specifically, we apply $k$ no-regret algorithms $E_1, E_2, \ldots, E_k$ in parallel, each with action set $V$. At time step $t$, if $E_i$ chooses an action $v_i(t)$, then we can compute the marginal gain $f_t(\{v_j(t) : j \le i\}) - f_t(\{v_j(t) : j < i\})$ and feed it back to $E_i$. This indicates how much additional utility can be obtained by adding sensor $s_i = v_i(t)$ to the set of sensors already selected. It replaces the feedback $f_t(\{s\})$ in the weight update for $w_s$ in the standard no-regret algorithm. One can show that the modified $k$-no-regret learning algorithm has a $(1 - 1/e)$-regret bound of $O(kR)$ if each algorithm $E_i$ yields an expected regret of at most $R$ [2].
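The construction above, with k exponential-weight learners whose feedback is the marginal gain of their own choice, can be sketched as follows. The class and function names, the parameter values `gamma` and `eta`, and the toy coverage utility are all assumptions for illustration; rewards are assumed to lie in [0, 1], and the weights are rescaled after each update only to avoid numerical overflow, which leaves the selection probabilities unchanged.

```python
import math
import random

class Exp3:
    """Exponential-weight bandit learner over n sensors (arms)."""

    def __init__(self, n, gamma=0.1, eta=0.05, rng=None):
        self.n, self.gamma, self.eta = n, gamma, eta
        self.weights = [1.0] * n
        self.rng = rng or random.Random()

    def probabilities(self):
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n
                for w in self.weights]

    def choose(self):
        return self.rng.choices(range(self.n), weights=self.probabilities())[0]

    def update(self, arm, reward):
        p = self.probabilities()[arm]
        self.weights[arm] *= math.exp(self.eta * reward / p)
        top = max(self.weights)  # rescale; probabilities are unaffected
        self.weights = [w / top for w in self.weights]

def online_greedy_round(experts, f_t):
    """One time step: expert i picks a sensor and is rewarded with the
    marginal gain of that sensor given the picks of experts 1..i-1."""
    chosen, feedback = [], []
    for expert in experts:
        arm = expert.choose()
        gain = f_t(chosen + [arm]) - f_t(chosen)
        feedback.append((expert, arm, gain))
        chosen.append(arm)
    for expert, arm, gain in feedback:
        expert.update(arm, gain)
    return chosen

# Hypothetical utility: fraction of 10 objects covered by chosen sensors.
COVER = [{0, 1, 2, 3, 4, 5}, {6, 7, 8, 9}, {0, 1}, {6}]

def utility(subset):
    seen = set()
    for s in subset:
        seen |= COVER[s]
    return len(seen) / 10.0

rng = random.Random(0)
experts = [Exp3(len(COVER), rng=rng) for _ in range(2)]
for _ in range(5000):
    online_greedy_round(experts, utility)
favourites = []
for e in experts:
    probs = e.probabilities()
    favourites.append(probs.index(max(probs)))
```

In this toy setting the first learner should come to favour the sensor with the largest stand-alone coverage and the second the sensor with the largest marginal coverage, mirroring the offline greedy choice.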
3.4. Distributed Online Sensor Selection We assume that at any time step t, each sensor s ∈ V can compute its contribution to the utility ft (S ∪ {v}) − ft (S) where S is the subset of sensors which have already been selected. We also assume that each sensor can communicate to all other sensors, which can be achieved when sensors have calibrated clocks and unique identifiers. The goal is to develop a distributed sensor selection algorithm that is efficient and achieves close to (1 − 1/e)-no-regret in the online learning context. For detecting object’s maneuver, once the previously selected sensors have announced which objects they already detected, the new sensor s is able to compute the improvement in mean square prediction error over the previously collected data based on the game theoretic model. To extend the centralized algorithm to a distributed sensor management environment, we first consider the distributed implementation of the MAB problem where only one sensor is selected in each time step. This is equivalent of showing a way to sample n sensors with probability distribution {ps } in a distributed manner. A naive distributed algorithm would be to let each sensor keep track of all activation probabilities. Then one sensor with the lowest number of the identifier would broadcast a single random number u uniformly distributed s−1 s in [0, 1] and sensor s for which i=1 pi ≤ u < i=1 pi would be activated. However, this method requires that each sensor stores a large amount of global information, i.e., all activation probabilities p. On the other hand, if each sensor s stores only its or probability mass ps , then sensors would have to broadcast their {ps } according to their identifiers and stop when the sum of the probabilities exceeds u. 
Note that in order the maintain a constant amount of local information, the sensor selection procedure requires Θ(n) messages to be sent sequentially over Θ(n) time steps, which makes the distributed implementation impractical for large n. We consider an alternative sampling method where we select sensor s with probability pˆs . A simple distributed algorithm would activate each sensor s ∈ V independently with probability pˆs . Clearly, there is a nonzero probability that no sensor is activated or more than one sensor are activated. Using a synchronized clock, all sensors will know when there is no sensor being activated. In this case, one can simply repeat the procedure until one sensor is activated. When more than one sensor are activated, one can randomly pick one sensor with equal probability. One possible choice for sensor s is to draw a sample Xs from Poisson distribution with parameter αps for some α > 0. If Xs ≥ 1, then activate sensor s. This ensures that for ps > 0 and α > 0, with probability at least (1 − e−α )ps , sensor s will be selected while the expected messages required for selecting a single sensor is α. Now we can extend the distributed MAB algorithm to online sensor selection problem with efficient distributed implementation that exploits the greedy algorithm to maximize the submodular objective function. At each time step t, we need to select k sensors. Each sensor s has to maintain k weights denoted by ws,1 , ..., ws,k and k normalizing constants denoted by Zs,1 , ..., Zs,k . The algorithm passes messages in k stages synchronized using a common clock. At stage i, a single sensor is selected using random sampling method applied to sensor s with the weight distribution (1 − γ)ws,i /Zs,i + γ/n. Suppose the subset S = {s1 , ..., si− } contains the sensors
Proc. of SPIE Vol. 8044 80440E-8 Downloaded From: http://spiedigitallibrary.org/ on 09/05/2013 Terms of Use: http://spiedl.org/terms
selected in stages from 1 to $i - 1$. Then the sensor $s$ selected at stage $i$ needs to compute its local reward $\pi_{s,i}$ using $f_t(S \cup \{s\}) - f_t(S)$ and update its weight by

$$w_{s,i}(t) = w_{s,i}(t-1)\, e^{\eta \pi_{s,i}/p_{s,i}}$$

The sensor has to broadcast the difference between its new and old weights, given by

$$\delta_{s,i}(t) = w_{s,i}(t) - w_{s,i}(t-1)$$

so that all sensors can update their normalizing constants by

$$Z_{s,i}(t) = Z_{s,i}(t-1) + \delta_{s,i}(t), \quad s = 1, \ldots, n$$

Theorem 2: The distributed $k$-sensor selection algorithm described above achieves $(1 - 1/e)$-no-regret with $O(k)$ messages passed among sensors on average in each time step. Specifically, we can bound the $(1 - 1/e)$-regret by

$$\frac{1 - 1/e}{T} \max_{|S| \le k} \sum_{t=1}^{T} f_t(S) - \frac{1}{T}\, E\left[\sum_{t=1}^{T} f_t(S_t)\right] \le O\left(k \sqrt{n \log n / T}\right)$$

where $S_t$ is the set of $k$ sensors selected by the distributed algorithm at time step $t$.

In essence, the distributed algorithm can learn the objective function online and, in the long run, yields performance comparable to that of the optimal centralized sensor selection scheme with a known objective function (which is often computationally demanding for large $n$ and $k$).
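A minimal single-process sketch of the $k$-stage procedure is given below, under several assumptions of ours. The names `sample_one`, `select_k` and `coverage_reward`, the coverage utility, and all parameter values are hypothetical. Because only the event $X_s \ge 1$ matters, each sensor draws a Bernoulli variable with probability $1 - e^{-\alpha p_s}$ instead of a full Poisson sample. The nominal mixture probability is used as $p_{s,i}$ in the weight update, and one shared copy of each normalizing constant stands in for the local copies that the broadcast $\delta_{s,i}$ would keep synchronized.

```python
import math
import random

def sample_one(probs, alpha, rng):
    """Activate sensors independently, retry if none is active, break ties uniformly.

    Uses P(X_s >= 1) = 1 - exp(-alpha * p_s) for X_s ~ Poisson(alpha * p_s).
    Requires at least one positive entry in probs.
    """
    while True:
        active = [s for s, p in enumerate(probs)
                  if rng.random() < 1.0 - math.exp(-alpha * p)]
        if active:
            return rng.choice(active)

def select_k(weights, z, reward, eta, gamma, alpha, rng):
    """Run the k stages of one time step; weights[i][s] and z[i] are updated in place."""
    k, n = len(weights), len(weights[0])
    chosen = []
    for i in range(k):
        probs = [(1 - gamma) * w / z[i] + gamma / n for w in weights[i]]
        s = sample_one(probs, alpha, rng)
        prior = set(chosen)
        gain = reward(prior | {s}) - reward(prior)     # f_t(S u {s}) - f_t(S)
        new_w = weights[i][s] * math.exp(eta * gain / probs[s])
        z[i] += new_w - weights[i][s]                  # effect of broadcasting delta
        weights[i][s] = new_w
        chosen.append(s)     # a sensor may be picked twice, with zero gain
    return chosen

# Hypothetical utility: fraction of three objects covered by the chosen sensors.
COVERAGE = {0: {"a"}, 1: {"a", "b"}, 2: {"c"}, 3: set()}

def coverage_reward(sensors):
    return len(set().union(*(COVERAGE[s] for s in sensors))) / 3.0

rng = random.Random(0)
weights = [[1.0] * 4 for _ in range(2)]    # k = 2 stages, n = 4 sensors
z = [4.0, 4.0]                             # normalizing constants, one per stage
for _ in range(50):
    picked = select_k(weights, z, coverage_reward,
                      eta=0.05, gamma=0.1, alpha=1.0, rng=rng)
```

Only the selected sensor changes its weight at each stage, so a single broadcast of the weight difference per stage is enough to keep every normalizing constant current.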
4. SIMULATION STUDY

4.1. Scenario Description

We consider a small-scale space object tracking and collision alert scenario where 30 LEO observers collaboratively track 3 LEO satellites (called the red team) and monitor 5 LEO asset satellites (called the blue team). The orbital trajectories are created with altitudes similar to those of real satellites from the North American Aerospace Defense Command (NORAD) catalog, but we can change the orbital trajectories to generate a collision event between an object from the red team and an object from the blue team. The associated tracking errors for each object in the red team were obtained from the recursive LMMSE filter when sensors are assigned to objects according to some criterion based on non-maneuvering motion. We assume that the orbital trajectories of the LEO observers and the blue team are known to the red team. We also assume that each observer can schedule a sensor scan every 50 seconds. However, at any time, at most 5 observers can be activated. The sensor selection is based on weighted information gain, with weights proportional to the estimated collision probability over the impact time. The estimation of collision probability and impact time was presented in [29]. Note that the objective function does not satisfy the submodular property because of the state-dependent weights.

The red team may direct an unannounced object to perform an intelligent maneuver that changes the inclination of its orbit. In particular, at time t = 1000 s, object 1 performs a 1 s burn that produces a specific thrust leading to a collision with object 3 in the blue team in 785 seconds. At time t = 1523 s, object 2 performs a 1 s burn that produces a specific thrust leading to a collision with object 5 in the blue team in 524 seconds. Note that the maneuver onset time of object 2 is chosen so that the closest 3 LEO observers are blocked by the Earth for more than 200 seconds. The maneuver is also lethal because the collision path reaches the closest asset satellite in less than 9 minutes. Between t = 1000 s and t = 2000 s, object 3 performs a 1 s burn with a random maneuver onset time that does not lead to a collision. The goal of sensor selection is to improve the tracking accuracy and declare the collision event as early as possible with the false alarm rate below a desired level.

Each observer has range, bearing, elevation and range rate measurements with standard deviations of 100 m, 10 mrad, 10 mrad and 2 m/s, respectively. We applied the generalized Page’s test (GPT) for maneuver onset detection, while the filter update of the state estimate does not use the range rate measurement [33]. The use of the GPT is required because the nonlinear filter designed assuming non-maneuvering motion is sensitive to model mismatch in the range rate when a space object maneuvers. The threshold of the GPT was chosen to give a false alarm probability of $P_{FA} = 1\%$.
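The recursion behind a Page-type (CUSUM) detector can be sketched as follows. This is a generic illustration and not the generalized Page’s test of [33]: the squared normalized range rate residual with a constant drift term is a hypothetical choice of increment, the residual values are invented, and the threshold is arbitrary rather than tuned to a 1% false alarm probability.

```python
def page_test(increments, threshold):
    """Return the first index where the CUSUM statistic exceeds threshold, else None.

    The statistic is reset to zero whenever it would become negative, so
    evidence accumulates only from the most likely change point onward.
    """
    stat = 0.0
    for n, g in enumerate(increments):
        stat = max(0.0, stat + g)
        if stat > threshold:
            return n
    return None

SIGMA = 2.0     # range rate standard deviation in m/s, as in the scenario
DRIFT = 2.0     # hypothetical drift; above 1 so the statistic decays without a maneuver

# Hypothetical range rate residuals (m/s) with a jump starting at the fourth scan.
residuals = [0.5, -1.0, 1.5, 6.0, 7.0, 8.0]
increments = [(r / SIGMA) ** 2 - DRIFT for r in residuals]
onset_scan = page_test(increments, threshold=10.0)
```

With these values the statistic stays at zero for the first three scans and crosses the threshold one scan after the jump begins, which shows the usual trade-off between threshold height and detection delay.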
4.2. Performance Comparison

We studied three different sensor management (SM) configurations.
(i) Information based method: Sensors are selected with a uniform sampling interval of 50 s to maximize the total information gain.
(ii) Centralized game theoretic method: Sensors are selected to maximize the weighted information gain with respect to the worst case maneuver onset time determined by the pursuit-evasion game.
(iii) Distributed game theoretic method: Sensors are selected using the online distributed algorithm without knowing the objective function a priori.

We ran 200 Monte Carlo simulations of the tracking and collision alert scenario for each SM configuration and compared tracking and collision alert performance, rather than the criteria optimized by the SM schemes. Table 1 shows the peak errors in position and velocity for each object in the red team based on the centralized tracker with the three SM schemes. The average detection delays for each object are also shown in Table 1. We can see that, for object 2, both the maneuver detection delay and the average peak estimation error are larger with the conventional SM scheme (i) than with the game theoretic schemes (ii) and (iii), owing to its intelligent choice of the maneuver onset time. Interestingly, the performance degradation is quite mild for the distributed SM scheme compared with its centralized counterpart.

Table 1. Comparison of tracking accuracy and maneuver detection delay. For each SM configuration, the columns give the average delay (s), the average peak position error (km) and the average peak velocity error (km/s).

          Configuration (i)         Configuration (ii)        Configuration (iii)
Object    Delay   Pos.    Vel.      Delay   Pos.    Vel.      Delay   Pos.    Vel.
1         124     23.7    0.28      134     24.8    0.31      166     26.5    0.32
2         424     54.8    0.40      155     26.4    0.32      187     29.3    0.34
3          89     14.5    0.22       94     14.6    0.21      102     16.2    0.25
Next, we compare the collision detection performance as well as the average time between the collision alert and its occurrence. We also compute the average number of scans required to declare a collision event, starting from the maneuver onset time. A collision alert is declared when the closest encounter of two space objects is within 10 km with at least 99% probability based on the predicted orbital states. The false alarm probability is estimated from the collision declarations between object 3 and any of the asset satellites. The performance of collision alert with the three SM schemes is shown in Table 2. We can see that the non-game based method (configuration (i)) yields a much smaller collision detection probability for object 2. Among the collision declarations for object 2, the average duration between the collision alert and the actual encounter time is much shorter with configuration (i) than with configurations (ii) and (iii). Thus the blue team will have limited response time in choosing the appropriate collision avoidance action. This is mainly due to the long delay in detecting the target maneuver, which leads to the large tracking error seen in Table 1. In contrast, the centralized game-theoretic method (configuration (ii)) achieves much more accurate collision detection with a longer early warning time on average. It is worth noting that the distributed method (configuration (iii)) yields slightly worse detection probability than configuration (ii) because it lacks knowledge of the objective function. Nevertheless, configuration (iii) is computationally more efficient and yields satisfactory performance even when optimizing the non-submodular objective function.
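The alert rule can be illustrated with a simple Monte Carlo check. The sketch below is not the estimator of [29]; it assumes that the relative position of the two objects at the predicted time of closest approach is Gaussian with a diagonal covariance, and the function names, sample size and example values are hypothetical.

```python
import random

def miss_probability(mean_km, std_km, radius_km, n_samples, rng):
    """Monte Carlo estimate of P(||r|| < radius) for r ~ N(mean, diag(std^2))."""
    hits = 0
    for _ in range(n_samples):
        dist_sq = sum(rng.gauss(m, s) ** 2 for m, s in zip(mean_km, std_km))
        if dist_sq < radius_km ** 2:
            hits += 1
    return hits / n_samples

def collision_alert(mean_km, std_km, radius_km=10.0, level=0.99,
                    n_samples=20000, seed=0):
    """Declare an alert when the estimated probability reaches the required level."""
    rng = random.Random(seed)
    return miss_probability(mean_km, std_km, radius_km, n_samples, rng) >= level

# Hypothetical predicted relative positions (km) with 1 km uncertainty per axis.
near_pass = collision_alert((1.0, 0.0, 0.0), (1.0, 1.0, 1.0))
far_pass = collision_alert((50.0, 0.0, 0.0), (1.0, 1.0, 1.0))
```

A large tracking error inflates the spread of the predicted relative position, which lowers the estimated probability and delays the alert. This is consistent with the shorter warning time observed for object 2 under configuration (i).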
Table 2. Performance comparison of collision detection probability and average early-warning duration.

                              Configuration (i)     Configuration (ii)    Configuration (iii)
                              Object 1   Object 2   Object 1   Object 2   Object 1   Object 2
Detection probability         0.87       0.32       0.92       0.83       0.88       0.79
False alarm probability       0.04       0.05       0.04       0.02       0.04       0.03
Average duration (s)          564        132        578        328        542        337
Average scans                 2.8        2.4        2.6        2.2        2.5        2.4

5. SUMMARY AND CONCLUSIONS

We considered the sensor management problem for tracking space objects and issuing collision alerts based on the estimated orbital states and the known asset satellites. Since the orbital maneuver onset time and motion are important in the early warning of a potential collision, we modeled an intelligent adversary that can intentionally collide with one of the asset satellites, using evasive motion that exploits Earth blockage of the observers. We compared sensor management schemes using the conventional information gain based criterion and the game theoretic formulation for sensor selection. We also provided an online version of the distributed SM scheme for selecting the best sensor subset efficiently with a provable $(1 - 1/e)$-no-regret guarantee. Using a realistic LEO-to-LEO tracking scenario, we found that the game-theoretic SM scheme outperforms the non-game based scheme in terms of tracking accuracy, detection delay of maneuver onset, and the precision and timeliness of the collision alert. Interestingly, the distributed SM scheme yields only slight performance degradation compared with the centralized solution, even when the objective function is not strictly submodular. We expect that the distributed SM scheme will scale well to real-world space surveillance problems, where thousands of space objects have to be continuously monitored with many ground-based and space-borne sensing resources.
REFERENCES

[1] Z. Abrams, A. Goel, S. Plotkin, “Set k-Cover Algorithms for Energy Efficient Monitoring in Wireless Sensor Networks”, Proc. of Information Processing in Sensor Networks, pp. 424–432, 2004.
[2] P. Auer, N. Cesa-Bianchi, Y. Freund, R. E. Schapire, “The Nonstochastic Multiarmed Bandit Problem”, SIAM Journal on Computing, 32, pp. 48–77, 2002.
[3] Y. Bar-Shalom, and X. R. Li, Multitarget-Multisensor Tracking: Principles and Techniques, YBS Publishing, 1995.
[4] Y. Bar-Shalom, and W. D. Blair (editors), Multitarget-Multisensor Tracking: Applications and Advances, vol. III, Artech House, 2000.
[5] Y. Bar-Shalom, X. R. Li, and T. Kirubarajan, Estimation with Applications to Tracking and Navigation: Algorithms and Software for Information Extraction, Wiley, 2001.
[6] R. Bate, et al., Fundamentals of Astrodynamics, New York, Dover, 1971.
[7] S. Blackman, and R. Popoli, Design and Analysis of Modern Tracking Systems, Artech House, 1999.
[8] S. Bolognani, L. Tubiana, and M. Zigliotto, “Extended Kalman Filter Tuning in Sensorless PMSM Drives”, IEEE Trans. Industrial Applications, vol. 39, pp. 1741–1747, 2003.
[9] S. Carme, D.-T. Pham, and J. Verron, “Improving the Singular Evolutive Extended Kalman Filter for Strongly Nonlinear Models for Use in Ocean Data Assimilation”, Inverse Problems, vol. 17, pp. 1535–1559, 2001.
[10] H. Chen, G. Chen, E. Blasch, and K. Pham, “Comparison of several space target tracking filters”, Proc. SPIE, 7730, 2009.
[11] H. Chen, G. Chen, E. Blasch, and K. Pham, “Space Target Tracking with Delayed Measurements”, Proc. of SPIE, 7691, 2010.
[12] H. D. Curtis, Orbital Mechanics for Engineering Students, Amsterdam, The Netherlands: Elsevier, 2005.
[13] F. E. Daum, “Exact Finite-Dimensional Nonlinear Filters”, IEEE Trans. Automatic Control, vol. 31, pp. 616–622, 1986.
[14] F. E. Daum, “Nonlinear Filters: Beyond the Kalman Filter”, IEEE Aerospace and Electronic Systems Magazine, vol. 20, pp. 57–69, 2005.
[15] A. Doucet, N. de Freitas, and N. Gordon, editors, Sequential Monte Carlo Methods in Practice, Statistics for Engineering and Information Science, Springer-Verlag, New York, 2001.
[16] N. Duong, C. B. Winn, “Orbit Determination by Range-Only Data”, Journal of Spacecraft Rockets, 10, pp. 132–136, 1973.
[17] G. Evensen, Data Assimilation: The Ensemble Kalman Filter, New York: Springer-Verlag, 2006.
[18] J. L. Fowler, J. S. Lee, “Extended Kalman Filter in A Dynamic Spherical Coordinate System for Space based Satellite Tracking”, Proc. AIAA 23rd Aerospace Sciences Meeting, AIAA-85-0289, Reno, NV, 1985.
[19] W. R. Gilks, and C. Berzuini, “Following A Moving Target - Monte Carlo Inference for Dynamic Bayesian Models”, J. Royal Stat. Soc. B, 63, pp. 127–146, 2001.
[20] N. Gordon, D. Salmond, and A. Smith, “Novel Approach to Nonlinear/Non-Gaussian Bayesian State Estimation”, IEE Proceedings - F., 140(2), pp. 107–113, 1993.
[21] P. L. Houtekamer, and H. L. Mitchell, “Data Assimilation Using An Ensemble Kalman Filter Technique”, Monthly Weather Rev., vol. 126, pp. 796–811, 1998.
[22] S. Julier, J. Uhlmann, and H. F. Durrant-Whyte, “A New Method for the Nonlinear Transformation of Means and Covariances in Filters and Estimators”, IEEE Trans. Automatic Control, vol. 45, pp. 477–482, 2000.
[23] S. Julier, and J. Uhlmann, “Unscented Filtering and Nonlinear Estimation”, Proc. of the IEEE, 92(3), pp. 401–422, 2004.
[24] M. Kalandros, L. Y. Pao, “Covariance Control for Multisensor Systems”, IEEE Trans. Aerospace Electronic Systems, 38, pp. 1138–1157, 2002.
[25] A. J. Krener, and W. Respondek, “Nonlinear Observers with Linearizable Error Dynamics”, SIAM J. Control and Optimization, vol. 23, pp. 197–216, 1985.
[26] C. M. Kreucher, A. O. Hero, K. D. Kastella, M. R. Morelande, “An Information based Approach to Sensor Management in Large Dynamic Networks”, Proc. of IEEE, 95, pp. 978–999, 2007.
[27] M. H. Lane, F. R. Hoots, “General Perturbations Theories Derived from the 1965 Lane Drag Theory”, Project Space Track Report No. 2, Aerospace Defense Command, Peterson AFB, CO, 1979.
[28] X. R. Li and V. P. Jilkov, “A Survey of Maneuvering Target Tracking: Approximation Techniques for Nonlinear Filtering”, In Proc. of SPIE, 5428, 2004.
[29] A. Maus, H. Chen, A. Oduwole, D. Charalampidis, “Designing Collision Alert System for Space Situational Awareness”, 20th ANNIE Conf., St. Louis, MO, 2010.
[30] P. E. Moraal, J. W. Grizzle, “Observer Design for Nonlinear Systems with Discrete-Time Measurements”, IEEE Trans. Automatic Control, vol. 40, pp. 395–404, 1995.
[31] G. L. Nemhauser, L. A. Wolsey, M. L. Fisher, “An Analysis of Approximations for Maximizing Submodular Set Functions - I”, Mathematical Programming, 14, pp. 265–294, 1978.
[32] V. L. Pisacane, R. J. Mcconahy, L. L. Pryor, J. M. Whisnant, H. D. Black, “Orbit Determination from Passive Range Observations”, IEEE Trans. Aerospace Electronic Systems, 10, pp. 487–491, 1974.
[33] J. Ru, H. Chen, X. R. Li, and G. Chen, “A Range Rate Based Detection Technique for Tracking A Maneuvering Target”, Proc. of SPIE, 5913, 2005.
[34] B. O. S. Teixeira, M. A. Santillo, R. S. Erwin, and D. S. Bernstein, “Spacecraft Tracking Using Sampled-Data Kalman Filters - An Illustrative Application of Extended and Unscented Estimators”, IEEE Control Systems Magazine, pp. 78–94, 2008.
[35] D. A. Vallado, Fundamentals of Astrodynamics and Applications, 2nd Ed., Microcosm Press, El Segundo, CA, 2001.
[36] F. Zhao, J. Shin, G. Reich, “Information-Driven Dynamic Sensor Collaboration for Tracking Applications”, IEEE Trans. Signal Processing, 19, pp. 61–72, 2002.