Recursive Online EM Algorithm for Adaptive Sensor Deployment and Boundary Estimation in Sensor Networks

Zhen Guo, MengChu Zhou, Fellow, IEEE, and Guofei Jiang, Member, IEEE

Abstract—More and more sensor networks are required to monitor and track a large number of objects. Since the topology of mass objects is often dynamic in the real world, their boundary estimation and sensor deployment should be conducted in an adaptive manner. The "current" locations of objects detected by sensors are treated as new observations in a stochastic learning process through a recursive distributed EM (Expectation-Maximization) algorithm. This paper first builds a probabilistic Gaussian mixture model to estimate the mixture distribution of object locations, and then proposes a novel methodology to optimize the sensor deployment and estimate the boundary of object locations dynamically.

Key Words—Sensor deployment, EM algorithm, maximal likelihood, boundary estimation

I. INTRODUCTION

WIRELESS sensor networks have been under intensive research. Sensor networks have become a bridge between the physical world and information systems. Since the set covering problem is NP-hard, proper sensor placement approaches with maximal coverage probability on location estimates are highly desired. Object tracking in sensor networks has received much research attention recently. Most of the work focuses on identifying and tracking one or more individual objects [1][2][3]. Unfortunately, sufficient research effort has yet to be devoted to the issues of monitoring and tracking a large number of objects in sensor networks. It is in demand that a large number of objects be monitored and tracked concurrently by a sensor network. There are several examples of monitoring and tracking a large number of objects, such as a surveillance system for an airport to monitor many people in a public area, vehicles on a highway, and wild animals. These objects with a large population, referred to as mass objects in this paper, are usually distributed in a certain way because several points of interest attract them, around which they are densely located. In many real-world applications it is necessary to place sensors optimally to locate the mass objects with the maximal coverage probability and resolution while deploying only a limited number of sensors. Fig. 1 illustrates an example of a sensor network monitoring a large number of trees and vehicles in a wild area. In such cases, we focus on the locations of mass objects instead of individual objects. Even though the individual objects move frequently, the topology of the mass objects is less likely to change very fast.

With the recent advances in sensor technologies, we can make use of mobile sensors, which can move to the proper places to provide the required coverage. They are used to detect targets collaboratively and monitor environments across the area of deployment. As the mass objects' topology changes slowly, the sensors are capable of acknowledging the change and moving to the desired places; otherwise, the coverage may fall short of the application requirement. Object location estimation methods based on the signal strength received from sensors have been intensively proposed and implemented in [4][5][6]. The positions are calculated by modeling signal propagation, which requires adequate signal coverage. Hence, one of the most important issues in sensor networks for monitoring and tracking mass objects is the selection of sensor locations. Proper sensor deployment is needed to provide adequate signal coverage and also to maximize the probability of accurate detection and localization for the whole mass of objects. Sensor placement directly influences resource management and the type of back-end processing and exploitation that must be carried out with sensed data in distributed sensor networks. A key challenge in sensor resource management is to determine a sensor field architecture that optimizes cost [12] and provides high sensor coverage and resilience to sensor failures. The coverage optimization is inherently probabilistic due to the uncertainty associated with sensor detections. This paper proposes a methodology for maximizing the coverage probability under the constraints of a limited number of sensors and signal strength.
The rest of the paper is organized as follows. Section 2 formalizes the problem of optimal sensor placement and analyzes the distribution of mass objects; the detection coverage is modeled as a Gaussian mixture model. The EM and recursive EM solutions that optimize the coverage problem based on the Gaussian mixture model are presented in Section 3. Section 4 proposes a possible choice for the distributed implementation of EM algorithms in sensor networks. The details of the simulation and the coverage performance are discussed in Section 5. Section 6 concludes the paper with some discussions.

Zhen Guo is with the Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102, USA (phone: 973-642-7994; fax: 973-596-5680; email: [email protected]).
MengChu Zhou is with the Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102, USA (email: [email protected]).
Guofei Jiang is with NEC Laboratories America, Princeton, NJ 08540, USA (email: [email protected]).

1-4244-0065-1/06/$20.00 ©2006 IEEE
Based on the observations in Figure 1, there must be at least 3 points (regions) of interest, and each of them must be covered by at least one sensor.

Figure 1. An example of a sensor network for mass object monitoring.
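The scenario of Figure 1 can be imitated with synthetic data. In the sketch below (ours, not from the paper; the cluster centers, spread, and counts are invented for illustration), each point of interest attracts a Gaussian cloud of observations, matching the clustered-distribution assumption the paper develops later:

```python
import random

random.seed(0)

# Three hypothetical points (regions) of interest; values are illustrative.
centers = [(10.0, 10.0), (30.0, 15.0), (20.0, 30.0)]
spread = 2.0  # standard deviation of each Gaussian cluster

# Each object contributes several observations over time, so the number
# of observations exceeds the number of underlying objects.
observations = [
    (random.gauss(cx, spread), random.gauss(cy, spread))
    for cx, cy in centers
    for _ in range(50)
]
print(len(observations))  # 150
```

Fitting a three-component mixture to such data recovers the cluster centers, which is exactly where the paper argues the sensors should be placed.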
II. PROBLEM FORMULATION

Accurate and computationally feasible sensor detection models are required for optimal sensor deployment. In this work, we start with the assumption used in [7] that the probability of detecting a target decays exponentially with the distance between the target and the sensor. In other words, a target at distance d from a sensor is correctly detected with probability e^(−ρd), where ρ is a parameter used to model the sensor quality and the rate at which its detection probability diminishes with distance. Obviously, the detection probability equals 1 if the target is located exactly where the sensor is. Recognizing that the covered regions usually overlap, an object may be detected by several sensors. The probability of an individual object being precisely detected is therefore a mixture probability: the sum, over sensors, of each sensor's conditional detection probability multiplied by its mixture weight.
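To make the detection model concrete, the following minimal sketch (ours, not from the paper; the function names and the value of ρ are illustrative) computes the single-sensor detection probability e^(−ρd) and the weighted mixture detection probability for one object:

```python
import math

def p_detect(d, rho):
    """Single-sensor detection probability e^(-rho * d) for a target
    at distance d; rho models the sensor quality and decay rate (cf. [7])."""
    return math.exp(-rho * d)

def p_mixture_detect(target, sensors, weights, rho):
    """Mixture detection probability of one object: each sensor's
    conditional detection probability weighted by its mixture weight
    (weights are assumed to sum to 1)."""
    return sum(
        alpha * p_detect(math.hypot(target[0] - sx, target[1] - sy), rho)
        for (sx, sy), alpha in zip(sensors, weights)
    )

# A target co-located with its sensor is detected with probability 1.
print(p_detect(0.0, 0.5))  # 1.0
```

Note that the mixture probability is highest when sensors sit at the densest object locations, which motivates the placement criterion developed below.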
Figure 2. (a) Mass object locations. (b) Sensor deployment model.

Scatter points shown in Fig. 2(a) are observations of object locations collected by sensors. Note that the observations are the previous positions where individual objects were located. During an interval of information collection, each individual object may appear at several different locations. With the objective of monitoring mass objects, we focus on the distributions of clusters rather than on an individual object's movement. In this sense, one individual object may correspond to several observations because of the uncertainty in its movement. The observations, recording all locations where objects have been, are a good clue for building a probability model that describes how likely objects are to appear at a specific location. Here we may take the "electron cloud" as an analogy: an observed position does not necessarily mean that some object is currently there; rather, it describes how likely objects are to appear at that position. The observations are simply a combination of previous locations where objects have happened to be. For example, there may be 1000 points (observations) in Fig. 2(a), but in fact only 500 objects present in the area. Sensors do not have to identify each individual object and track how and when it moves. Instead, our objective is to analyze the historical observations of objects' locations, learn the most likely distributions of the groups, and find the centers and boundaries of each cluster. In summary, we consider the problem of optimizing coverage for mass objects' previous locations as a group in a statistical way, rather than identifying and analyzing individual objects' movement deterministically. Fig. 2(b) illustrates an example of optimal sensor deployment that maximizes the coverage performance: the central black points are the positions of the sensors, and their covered areas are enclosed by three circles.

In many practical instances, objects are symmetrically distributed around a point of interest. Such cases support the assumption that the probabilistic model of the locations where objects appear is Gaussian. To improve an individual object's detection and monitoring, a sensor has to be placed closer to the target. To monitor mass objects, we propose to reduce the sum of distances between the sensors and all the objects in the covered area. Based on the above assumptions and analysis, the sensors should be located at the positions of local maximal object density in order to maximize the detection and monitoring performance. The observations of objects' previous locations collected by sensors can therefore be used to estimate and learn the centers of the object clusters and their boundaries. As an example, consider 3 groups of mass objects distributed in a 2-D covered area, as illustrated in Fig. 2(a). In this section, we formulate this problem in statistics. Suppose that we have a set of observations
{z_i} = {(x_i, y_i)} ∈ S, i ≤ N,

where the z_i are the previous locations of objects. In real scenarios, they are distributed similarly to Fig. 2(a). Our task is to maximize the total detection probability for all objects, P(S). We assume that each object is independent and identically distributed (IID) in the covered area. Hence, we have

P(S) = ∏_{i=1}^{N} p(z_i)    (1)

p(z_i) = ∑_{j=1}^{M} α_j p_j(z_i | θ_j)

where p(z_i) is the detection probability of an individual object. The parameter θ_j comprises the mean value and variance (μ_j, σ_j) that describe the distribution of a single cluster of observations; Θ, to be mentioned below, is the whole set of parameters (α, μ, σ). α_j is the probability that object i is in the region covered by the j-th sensor, and the conditional probability p_j(z_i | θ_j) is the probability that z_i can be accurately detected by the j-th sensor, given that the location of sensor j is known and object i is covered by sensor j. M is the number of mixture components.

In order to model the location distribution of these objects, we can obviously take advantage of the limited observations previously collected by the sensors. However, these are incomplete data due to the limitations of the observation process. Based on this limited knowledge, the maximum likelihood estimate of the parameters of the underlying distribution is desired from the given data set. The incomplete-data log-likelihood expression is as follows:

Q(Θ) = log ζ(Θ | Z) = log ∏_{i=1}^{N} p(z_i | Θ) = ∑_{i=1}^{N} log ∑_{j=1}^{M} α_j p_j(z_i | θ_j)    (2)

where we define the likelihood function ζ(Θ | Z) = p(Z | Θ) = ∏_{i=1}^{N} p(z_i | Θ). This function is also called the likelihood of the parameters given the data. The likelihood function of parameter guesses is in fact a measure of the coverage likelihood for mass objects by limited sensors. In the above equations, the joint density arises from the marginal density function and the assumption of hidden variables and parameter value guesses [8]. The problem of coverage optimization now becomes the maximization of the likelihood function:

Θ = arg max_Θ Q(Θ)    (3)

By solving the above equation, we can find the positions with the local maximal density of each region of interest; they are the optimal placement of the sensors. To update the optimal sensor placement dynamically, we can use the maximum a posteriori (MAP) solution to estimate the dynamic distribution of objects. Similarly, model selection techniques are based on maximizing the following type of criterion:

J(M, θ(M)) = log ζ(Θ | Z) − P(M)    (4)

where log ζ(Θ | Z) is the log-likelihood of the available data. This part can be maximized using the maximum likelihood (ML) solution as mentioned above. However, introducing more sensors, and hence increasing the number of mixture components, always increases the log-likelihood, but it also introduces unnecessary redundant sensors. A penalty function P(M) is introduced to achieve a balance.

III. STANDARD AND RECURSIVE EM ALGORITHM FOR OPTIMAL COVERAGE

3.1 Standard EM Algorithm for Fixed Infrastructure

The EM algorithm is an iterative procedure that searches for a local maximum of the log-likelihood function. To apply the EM algorithm to coverage optimization in sensor networks, it starts with the initial observations and a parameter estimate θ^0; the estimate θ^k from the k-th iteration is obtained using the previous estimate θ^{k−1}:

Q(θ^k | θ^{k−1}) = E[ log p(Z_{i≤k−1}, z_k | θ) | Z_{i≤k−1}, θ^{k−1} ]    (5)

This step, referred to as the expectation step (E-step), finds the expected value of the "complete-data" log-likelihood with respect to the unknown parameters, given the observed positions and the current parameter estimates. The above E-step equation can be expanded as follows:
Q(Θ | Θ^g) = ∑_{z∈S} log(ζ(Θ | Z^g, z)) p(z | Z^g, Θ^g)
  = ∑_{l=1}^{M} ∑_{i=1}^{N} log(α_l p_l(z_i | θ_l)) p(l | z_i, Θ^g)    (6)
  = ∑_{l=1}^{M} ∑_{i=1}^{N} log(α_l) p(l | z_i, Θ^g) + ∑_{l=1}^{M} ∑_{i=1}^{N} log(p_l(z_i | θ_l)) p(l | z_i, Θ^g)

where we assume that l is a random variable labeling the region to which an individual object belongs, and the superscript g means that the referred parameter is available from the previous iteration [8]. Given Θ^g, we can easily compute p_j(z_i | θ_j) for each object i and region j. In addition, the mixing parameters α_j can be thought of as the prior probabilities of each mixture component, that is, α_j = p(component j), which are uncorrelated with the observations z_i. Therefore, by applying Bayes's rule, we can compute the "ownership function" [9]:

p(l | z_i, Θ^g) = α_l^g p_l(z_i | θ_l^g) / p(z_i | Θ^g) = α_l^g p_l(z_i | θ_l^g) / ∑_{k=1}^{M} α_k^g p_k(z_i | θ_k^g)    (7)

To maximize the expression Θ = arg max_Θ Q(Θ | Θ^g), we can maximize the term containing α_l and the term containing θ_l independently, since they are not related. This step is referred to as the maximization step (M-step). We introduce the Lagrange multiplier λ with the constraint ∑_l α_l = 1, and solve the equation:

∂/∂α_l [ ∑_{l=1}^{M} ∑_{i=1}^{N} log(α_l) p(l | z_i, Θ^g) + λ(∑_l α_l − 1) ] = 0

We obtain λ = −N, and thus

α_l^new = (1/N) ∑_{i=1}^{N} p(l | z_i, Θ^g)    (8)

Taking the derivative of the second term of equation (6) with respect to μ_l and setting it to zero, we have:

μ_l^new = ∑_{i=1}^{N} z_i p(l | z_i, Θ^g) / ∑_{i=1}^{N} p(l | z_i, Θ^g)    (9)

Similarly,

σ_l^new = ∑_{i=1}^{N} p(l | z_i, Θ^g)(z_i − μ_l^new)(z_i − μ_l^new)^T / ∑_{i=1}^{N} p(l | z_i, Θ^g)    (10)

Note that the new parameters (α^new, μ^new, σ^new) = θ^new calculated in the M-step are substituted as Θ^g into the E-step to compute p(l | z_i, Θ^g), and then p(l | z_i, Θ^g) is substituted into the M-step to get the next parameter θ. The EM algorithm usually converges to a local maximum of the log-likelihood function. Hence it is a good choice for mixture estimation, and especially for distributed (and unsupervised) applications like the mixture-distributed objects in sensor networks. Because of the distributed nature of sensor networks, practical and feasible sensor networks prefer distributed computation over a centralized process. A distributed implementation of the EM algorithm applied to sensor networks is described in Section 4.

3.2 Recursive EM Algorithm for Dynamic Topology

The recursive EM algorithm is an online discounting version of the EM algorithm. A stochastic discounting approximation procedure is conducted to estimate the parameters recursively and adaptively. In real-time monitoring and tracking of a dynamic objects' topology in sensor networks, the recursive EM algorithm is better in the sense that each new observation updates the parameter estimates of the mixture with a forgetting factor degrading the influence of out-of-date samples on the new estimates. Hence, sensor networks implementing recursive EM monitoring techniques are more capable of tracking the topological dynamics of mass objects. Unlike the standard EM, which uses the maximum likelihood estimate, recursive EM searches for the MAP solution, as mentioned in Section 2. Below we give a brief description of the recursive EM algorithm; a detailed description can be found in [9]. Recalling from Section 2, the following type of criterion should be maximized:

J(M, θ(M)) = log ζ(Θ | Z) − P(M) = log p(Z | θ(M)) + log p(θ(M))    (11)

where we introduce a prior log p(θ(M)) for the mixture parameters that penalizes redundant sensors and complex solutions. For the MAP solution, we have:

∂/∂α_m [ log p(Z | θ) + log p(θ) + λ(∑_{m=1}^{M} α_m − 1) ] = 0    (12)

where p(θ(M)) ∝ exp(∑_{m=1}^{M} c_m log α_m) = ∏_{m=1}^{M} α_m^{c_m} is a Dirichlet prior [10]. For t data samples, we get

α_m^{(t)} = (1/K) [ ∑_{i=1}^{t} p(l | z_i, Θ^g) − c ]    (13)

where K = t − Mc, and the parameters of the prior are c_m = −c = −N/2. If we assume that the parameter estimates do not change much when a new observation is added, so that the new ownership function p(l | z_i, Θ^{t+1}) can be approximated by p(l | z_i, Θ^t), we obtain the following recursive update equations:

o_m^t(z^{t+1}) = p(l_m | z^{t+1}, Θ^t) = α_m^t p_m(z^{t+1} | θ_m^t) / p(z^{t+1} | Θ^t)    (14)

α_m^{t+1} = α_m^t + χ [ (o_m^t(z^{t+1}) − c_T) / (1 − Mc_T) − α_m^t ]
  = χ (o_m^t(z^{t+1}) − c_T) / (1 − Mc_T) + (1 − χ) α_m^t    (15)

where χ = 1/T is a fixed forgetting factor that limits the influence of old data, with weight χ(1 − χ)^{t−i} being applied to the influence of the old observation z^{t−i}. After the new estimated mixture weight of each sensor is calculated, the online algorithm should check whether there are irrelevant components, to make sure no unnecessary redundant sensors are used: if the mixture weight α_m becomes negative, the corresponding component is regarded as irrelevant and can be discarded.

IV. DISTRIBUTED IMPLEMENTATION OF EM ALGORITHMS

The distributed implementation passes the sufficient statistics from one node to another based on a prescribed sequence through the nodes. Each node computes its local updates for the sufficient statistics. Note that these local updates are computed from local observations and estimates only available locally at each sensor node. In the forward path, in the prescribed order from 1 to M (assuming we have M sensor nodes to cover M regions), each node adds its local estimates to the old cumulated estimates and passes the new cumulated estimates to the next sensor node:

θ_j^t ← θ_j^t + θ_{m,j}^t

where θ_j^t is the cumulative estimate of cluster j at time t, and θ_{m,j}^t is the local estimate computed at node m.
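As a minimal sketch of the recursive update of Section 3.2, the code below implements the ownership computation of eq. (14) and the discounted weight update of eq. (15). It assumes isotropic 2-D Gaussian components for simplicity, and the values chosen for T and c_T (as well as all names) are illustrative, not from the paper:

```python
import math

def gauss2d(z, mu, var):
    """Isotropic 2-D Gaussian density (a simplifying assumption; the
    paper allows a full covariance sigma_l per component)."""
    dx, dy = z[0] - mu[0], z[1] - mu[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * var)) / (2.0 * math.pi * var)

def recursive_weight_update(alphas, mus, variances, z, T, c_T):
    """One online update of the mixture weights for a new observation z,
    following eqs. (14)-(15): chi = 1/T is the forgetting factor and
    c_T the prior-induced penalty term."""
    M = len(alphas)
    # Eq. (14): ownerships o_m(z) by Bayes' rule.
    numer = [a * gauss2d(z, mu, v) for a, mu, v in zip(alphas, mus, variances)]
    total = sum(numer)
    owners = [n / total for n in numer]
    # Eq. (15): discounted weight update; the weights still sum to 1.
    chi = 1.0 / T
    return [
        chi * (o - c_T) / (1.0 - M * c_T) + (1.0 - chi) * a
        for o, a in zip(owners, alphas)
    ]

alphas = [0.5, 0.5]
mus = [(0.0, 0.0), (10.0, 0.0)]
variances = [1.0, 1.0]
# An observation near the first center shifts weight toward that component.
new_alphas = recursive_weight_update(alphas, mus, variances, (0.1, 0.0), T=100, c_T=0.001)
```

Because the ownerships sum to 1, the update preserves the normalization of the weights; components whose weights are driven negative by the c_T penalty would be pruned as described above.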