Distributed Edge Detection with Composite Hypothesis Test in Wireless Sensor Networks
Pei-Kai Liao†, Min-Kuan Chang‡ and C.-C. Jay Kuo†
†Department of Electrical Engineering, University of Southern California, Los Angeles, CA, USA
‡Department of Electrical Engineering, National Chung-Hsing University, Tai-Chung, Taiwan
E-mails: [email protected], [email protected], [email protected]
Abstract: A statistical approach to distributed edge sensor detection with wireless sensor networks is proposed in this research. The objective is to determine whether or not a target sensor is located in the edge area of a certain phenomenon, based on data from its neighboring sensors. Due to the nature of wireless sensor networks, it is desirable to have a distributed algorithm that has a low computational complexity and a low data communication cost among sensors. With some reasonable assumptions and the aid of composite hypothesis testing, we propose data fusion as well as decision fusion methods for edge sensor detection that satisfy these constraints. Numerical experiments are used to demonstrate the efficiency of the proposed algorithm.
Keywords: Edge detection, edge sensor detection, data fusion, decision fusion, composite hypothesis testing, wireless sensor networks.
I. Introduction
With an emerging need in environmental monitoring, military surveillance and security protection, research on wireless sensor networks has drawn a great amount of attention recently. In these applications, attributes of the monitored event over a certain area should be collected and analyzed. Edge information is one of these important attributes. Edge detection is a well-developed technique in the traditional image processing context. However, existing edge detection algorithms in image processing are difficult to apply directly to the counterpart problems in wireless sensor networks due to the randomness of sensor locations within an event and the high noise level of the environment. Nowak and Mitra [1] proposed an edge estimation scheme for the case where a predefined hierarchical structure exists within the sensor network. This assumption seldom holds in a highly distributed wireless sensor network environment. Later, Chintalapudi and Govindan [2] proposed three approaches to distributed edge sensor detection, namely, the statistical-, the filter-, and the classifier-based approaches. Among them, the statistical approach is more robust with respect to environmental noise. Furthermore, unlike the other two approaches, it does not need information about the geographic locations of sensors. However, only a rough idea of the statistical approach was sketched in [2], and the treatment was largely heuristic. In this work, the definition of an edge sensor is modified to overcome the shortcoming of the original definition given in [2]. In addition, distributed detection and data fusion techniques and theory [3], [4] are used here to provide a detailed description of the processing algorithm and its performance analysis.
It is worthwhile to point out that a simple hypothesis test technique for edge detection was given in [5] to provide an optimal detection solution, which was further simplified to an algorithm of low complexity when the signal power and noise information is known. However, the signal power and noise information is difficult to obtain in a real-world environment. Thus, in contrast with the approach taken in [5], the composite hypothesis test technique is used in this work to deal with a more realistic and general environment. The rest of this paper is organized as follows. The edge definition and sensor model are described in Sec. II. The proposed edge detection methods are described in Sec. III. Simulation results are presented in Sec. IV. Finally, concluding remarks and future work are given in Sec. V.
IEEE Communications Society Globecom 2004
II. Edge Definition and Sensor Model
A. Edge Definition
In the system of our concern, sensors do not have to be distributed uniformly over the area of interest. (They are, however, assumed to be uniformly distributed in the simulation for convenience.) We first clarify the definition of an edge sensor and its relationship with the real edge of a phenomenon. According to [2], an edge sensor is defined as follows.
Definition 1: Definition of an edge sensor by Chintalapudi and Govindan [2]:
• The edge sensor is inside the monitored event.
• The distance between the real edge and the edge sensor is less than r.
The parameter r is called the tolerance range. This edge sensor definition is illustrated in Fig. 1, where we assume that the monitored region is the dangerous region while the other region is the safe region. Note that the edge region is only defined in the monitored (or dangerous) region, but not in the safe region.
Fig. 1. Illustration of edge sensors as defined in [2]. (Figure labels: sensors, safe region, monitored dangerous region, real edge, edge sensors.)
0-7803-8794-5/04/$20.00 © 2004 IEEE
This definition may cause a serious problem in some applications. For example, consider a person walking from the safe region toward the dangerous region. The person will not receive any warning signal until he enters the dangerous area, since all edge sensors are inside the dangerous region. As another example, when mobile sensors are used, the sensors outside the monitored region will not be aware that they should move closer to the real edge to increase the local sensor density and provide more detailed and reliable information about the real edge. Only the edge sensors, which are inside the monitored area, will move. To overcome these drawbacks of Definition 1, we modify the definition of an edge sensor in this work as follows.
Definition 2: Modified definition of an edge sensor: If the distance between the real edge and a sensor is less than r, then the sensor is an edge sensor no matter where it is located.
With the above modified definition, edge sensors both outside and inside the monitored event have to be detected. The real edge is therefore contained in the interior of the detected edge region, at the expense of increasing the width of the edge region from r to 2r.
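To make the difference between Definition 1 and Definition 2 concrete, the following minimal sketch labels a few sensors around a circular event. The circular event, its center and radius, the sensor coordinates, and all function names are illustrative assumptions and do not come from the paper; only the two definitions themselves are taken from the text.

```python
import math

def distance_to_circular_edge(x, y, cx, cy, radius):
    """Distance from (x, y) to the boundary of a circular event (assumed shape)."""
    return abs(math.hypot(x - cx, y - cy) - radius)

def is_inside(x, y, cx, cy, radius):
    """True if (x, y) lies inside the circular event."""
    return math.hypot(x - cx, y - cy) < radius

def is_edge_sensor_def1(x, y, cx, cy, radius, r):
    """Definition 1: inside the event and closer than r to the real edge."""
    return (is_inside(x, y, cx, cy, radius)
            and distance_to_circular_edge(x, y, cx, cy, radius) < r)

def is_edge_sensor_def2(x, y, cx, cy, radius, r):
    """Definition 2: closer than r to the real edge, on either side of it."""
    return distance_to_circular_edge(x, y, cx, cy, radius) < r

# Hypothetical event: circle of radius 2 centered at (5, 5); tolerance range r = 0.5.
sensors = [(5.0, 5.0), (6.8, 5.0), (7.2, 5.0), (9.0, 5.0)]
labels = [
    (is_edge_sensor_def1(x, y, 5.0, 5.0, 2.0, 0.5),
     is_edge_sensor_def2(x, y, 5.0, 5.0, 2.0, 0.5))
    for x, y in sensors
]
```

In this sketch the sensor at (7.2, 5.0) lies just outside the event: it is not an edge sensor under Definition 1 but is one under Definition 2, which is the case the modified definition is meant to capture.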
As shown in [4], the maximum likelihood (ML) ratio test provides the optimal decision result for an appropriately chosen threshold. Thus, our main concern here is how to choose the decision threshold.
(Fig. 2 labels: edge sensor, non-edge sensor, tolerance range r, radio range R, real edge, inside and outside the monitored event.)
B. Sensor Model
Each sensor collects data from its neighboring sensors, which are inside its radio range, and then makes a decision based on the received data. Let us assume that there are $N - 1$ neighboring sensors around a target sensor within radius R, which is called the radio range. The measurement of the ith neighboring sensor at a particular time instance k can be modeled as
$$y_i[k] = m_i[k] + n_i[k], \qquad (1)$$
where $m_i[k]$ is the signal strength and $n_i[k]$ is the measurement noise perceived by the ith neighboring sensor at time k. Furthermore, we assume that the signal takes a binary level, whose value is either $A_i$ or 0, where $A_i$ is the signal amplitude, which might vary from one sensor to another. Also, $P(m_i[k] = A_i) = P_{A_i}$ and $P(m_i[k] = 0) = 1 - P_{A_i}$. The variable $n_i[k]$ in (1) is assumed to be white Gaussian noise in both the space and the time domains. Thus, the probability density function (PDF) of the measurement $y_i[k]$ of the ith sensor can be written as
$$P(y_i) = P_{A_i} N(A_i, \sigma_i^2) + (1 - P_{A_i}) N(0, \sigma_i^2), \qquad (2)$$
where $N(A_i, \sigma_i^2)$ is the PDF of a Gaussian distribution with mean $A_i$ and variance $\sigma_i^2$. Equation (2) is valid for any value of k. Unlike the assumption made in [5], we do not have perfect knowledge about the values of $A_i$ and $\sigma_i^2$. Thus, Equation (2) gives a family of PDFs parameterized by $A_i$ and $\sigma_i^2$. In addition, since the signal amplitude is allowed to vary among different sensors, our detection scheme applies to binary and multiple signal levels.
III. Proposed Methods for Edge Region Detection
A. Basic Idea
The problem of edge region detection can be transformed to the problem of edge sensor detection, since detected edge sensors can be viewed as markers of the edge region. The latter problem can be formulated as a binary hypothesis test as follows.
• $H_{F0}$: The target sensor is not an edge sensor.
• $H_{F1}$: The target sensor is an edge sensor.
Fig. 2. Illustration of edge and non-edge sensors.
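As a hedged illustration of the measurement model of Sec. II-B, the sketch below draws samples from the mixture in (2). It assumes that every sample is drawn independently from the mixture, which is a simplification: in the paper's setting the signal level of a given sensor is determined by whether it lies inside the event. The function name, the noise level and the seed are illustrative choices, not values from the paper.

```python
import random

def simulate_measurements(amplitude, sigma, p_signal, num_samples, rng):
    """Draw y[k] = m[k] + n[k], with m[k] in {amplitude, 0} and Gaussian n[k]."""
    samples = []
    for _ in range(num_samples):
        # Signal is present with probability p_signal (P_A in the text).
        m = amplitude if rng.random() < p_signal else 0.0
        # Zero-mean white Gaussian noise with standard deviation sigma.
        samples.append(m + rng.gauss(0.0, sigma))
    return samples

rng = random.Random(0)  # fixed seed so the sketch is repeatable
y = simulate_measurements(amplitude=1.0, sigma=0.1, p_signal=0.5,
                          num_samples=20000, rng=rng)

# The mean of the mixture in (2) is P_A * A, so this should be close to 0.5.
sample_mean = sum(y) / len(y)
```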
Before we proceed to the formal mathematical derivation, let us give an intuitive argument first. Fig. 2 illustrates the difference between edge sensors and non-edge sensors. If the radio range R is equal to the tolerance range r, a non-edge sensor has its neighboring sensors, which are covered by a circle of radius R, either all inside the monitored event or all outside the monitored event. In contrast, an edge sensor has its neighbors distributed in both regions. This property can be used to develop a detection algorithm. If R > r, the above statement has to be modified. That is, for a non-edge sensor, the majority of its neighboring sensors are located either outside or inside the monitored region. This observation will be translated to a criterion to distinguish these two cases. In the following, two approaches are proposed to detect edge sensors. They are suitable for cases without and with a severe communication bandwidth constraint, respectively.
B. One-Level Decision Method
This approach is called the one-level decision method since only one hypothesis test is conducted in the processing. It is devised for wireless sensor networks that have a relatively large communication bandwidth.
Case I: R = r
For simplicity, the radio range is first assumed to be the same as the tolerance range. It was derived in [5] that the following fusion rule based on the maximum likelihood ratio test (MLRT) provides the optimal solution for edge sensor detection when the information about the PDFs is known:
$$\ln \Lambda(y(k)) = \ln\left( \frac{\prod_{i=1}^{N} \left[ 1 + \frac{P_{A_i}}{1 - P_{A_i}} \cdot \Lambda(y_i(k)) \right]}{1 + \left( \frac{P_{A_i}}{1 - P_{A_i}} \right)^{N} \cdot \prod_{i=1}^{N} \Lambda(y_i(k))} - 1 \right) \underset{H_{F0}}{\overset{H_{F1}}{\gtrless}} \ln \frac{P(H_{F0})}{P(H_{F1})} = T_F, \qquad (3)$$
where $\Lambda(y_i(k)) = \exp\left( \frac{A (2 y_i(k) - A)}{2 \sigma_i^2} \right)$.
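The left-hand side of the fusion rule (3) can be evaluated directly, as in the following minimal sketch. It assumes one time sample per sensor, a common amplitude, a common noise variance and a common prior for all sensors, and at least two sensors (with a single sensor the argument of the logarithm is zero). The function names and numerical values are illustrative; a log-domain implementation would be preferable for large N.

```python
import math

def local_likelihood_ratio(y, amplitude, sigma2):
    """Per-sensor likelihood ratio Lambda(y_i(k)) defined below (3)."""
    return math.exp(amplitude * (2.0 * y - amplitude) / (2.0 * sigma2))

def log_fusion_statistic(ys, amplitude, sigma2, p_a=0.5):
    """Left-hand side of (3) for a list of measurements, one per sensor."""
    odds = p_a / (1.0 - p_a)
    numerator = 1.0   # product of [1 + odds * Lambda_i]
    product = 1.0     # product of Lambda_i
    for y in ys:
        lam = local_likelihood_ratio(y, amplitude, sigma2)
        numerator *= 1.0 + odds * lam
        product *= lam
    denominator = 1.0 + odds ** len(ys) * product
    return math.log(numerator / denominator - 1.0)

# Hypothetical noiseless readings with amplitude 1 and noise variance 0.1.
mixed = log_fusion_statistic([1.0, 1.0, 0.0, 0.0], 1.0, 0.1)    # neighbors on both sides
inside = log_fusion_statistic([1.0, 1.0, 1.0, 1.0], 1.0, 0.1)   # all neighbors inside
outside = log_fusion_statistic([0.0, 0.0, 0.0, 0.0], 1.0, 0.1)  # all neighbors outside
```

The statistic is large when the neighbors split between the two regions and small when they all agree, which matches the intuitive argument given in Sec. III-A.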
Since the parameters $A$ and $\sigma_i^2$ are unknown in the current context, we have to determine estimates for these parameters first. The maximum likelihood estimation (MLE) technique is used to estimate $A$ and $\sigma_i^2$ as given below:
$$\hat{A} = \mathrm{Avg}_{i \in E}\, \hat{A}_i, \qquad (4)$$
$$\hat{\sigma}_i^2 = \frac{1}{K_i} \sum_{k=0}^{K_i - 1} (y_i(k) - \hat{A}_i)^2, \qquad (5)$$
where E represents the event that the ith sensor is inside the monitored event, $\hat{A}_i = \frac{1}{K_i} \sum_{k=0}^{K_i - 1} y_i(k)$, and $K_i$ is the number of temporal domain samples. Since $\hat{A}_i$ is a random variable with PDF equal to $N(A_i, \sigma_i^2 / K_i)$ and $K_i \hat{\sigma}_i^2 / \sigma_i^2$ is a random variable with PDF equal to $\chi^2(K_i - 1)$, the accuracy of these two estimates increases with the number $K_i$ of temporal samples. For the MLRT given by (3), it is reasonable to assume $P_{A_i} = 0.5$ in the initial state with k = 0, since we have no idea about which sensor is inside the monitored event in the beginning. As to $P(H_{F0})$ and $P(H_{F1})$, we do not know their exact values. However, we can vary the threshold value $T_F$ over a wide range and calculate the corresponding detection and false alarm probabilities.
Case II: R > r
Next, we proceed to the case where the radio range R is larger than the tolerance range r. We need to find a criterion to differentiate an edge sensor from a non-edge sensor. As shown in Fig. 2, the real edge crosses the radio range as well as the tolerance range for an edge sensor. Thus, the proposed detection algorithm consists of two steps.
Step 1 Detect whether the real edge crosses its radio range.
Step 2 Detect whether the real edge crosses its tolerance range.
In the first step, each sensor collects and processes data from its neighboring sensors within its radio range to decide if the real edge crosses it, using the same algorithm presented in Case I. If the answer is negative, we claim that the sensor is a non-edge sensor. If the answer is positive, we conduct the test in the second step. That is, the sensor only processes data collected from sensors within its tolerance range to decide if the real edge crosses it. In other words, the MLRT given by (3) will be used twice with two different thresholds, $T_R$ and $T_r$.
C. Two-Level Decision Method
This approach is called the two-level decision method since there are two hypothesis tests to be performed. It is developed for wireless sensor networks with very limited communication bandwidth. With this method, measured data are first processed and a binary decision is made by each local sensor, which is called the local test. The binary decision result is then exchanged among sensors for final decision making, which is called the global test.
Case I: R = r
Again, the radio range R is the same as the tolerance range r in this case. The two hypothesis tests are given below.
Local Hypothesis (the 1st-level decision):
• $H_{L0}$: The sensor is outside the monitored event.
• $H_{L1}$: The sensor is inside the monitored event.
Global Hypothesis (the 2nd-level decision):
• $H_{F0}$: The target sensor is not an edge sensor.
• $H_{F1}$: The target sensor is an edge sensor.
It was proved in [5] that the likelihood ratio test provides a nearly optimal solution when the PDFs are perfectly known. However, we do not have the exact signal and noise power information, but a family of parameterized PDFs. Then, the local hypothesis can be written as
$$H_{L0}: y_i(k) = n_i(k), \qquad H_{L1}: y_i(k) = A_i + n_i(k), \qquad (6)$$
where $y_i(k)$ is the measured data from the ith sensor, $A_i \neq 0$, $n_i(k)$ is zero-mean white Gaussian noise with variance $\sigma_i^2$ and k is the time index. By applying the generalized likelihood ratio test (GLRT), the local decision rule can be expressed as
$$L_G(y_i(k)) = \frac{f(y_i(k), \hat{A}_i, \hat{\sigma}_{1i}^2, H_{L1})}{f(y_i(k), \hat{\sigma}_{0i}^2, H_{L0})} \underset{H_{L0}}{\overset{H_{L1}}{\gtrless}} T_L, \qquad (7)$$
where $\hat{A}_i$ and $\hat{\sigma}_{1i}^2$ are chosen to maximize the numerator and $\hat{\sigma}_{0i}^2$ is chosen to maximize the denominator. After some algebraic manipulations, we can show that $\hat{A}_i$ and $\hat{\sigma}_{1i}^2$ can be calculated via $\hat{A}_i = \bar{y}_i = \frac{1}{K_i} \sum_{k=0}^{K_i - 1} y_i(k)$ and (5), respectively. Furthermore, the parameter $\hat{\sigma}_{0i}^2$ can be computed via $\hat{\sigma}_{0i}^2 = \frac{1}{K_i} \sum_{k=0}^{K_i - 1} y_i^2(k)$. Therefore, the local decision rule can be simplified as
$$\frac{\bar{y}_i^2}{\frac{1}{K_i} \sum_{k=0}^{K_i - 1} (y_i(k) - \bar{y}_i)^2} \underset{H_{L0}}{\overset{H_{L1}}{\gtrless}} T_L, \qquad (8)$$
where $K_i$ is the number of samples in the time domain for the ith sensor. By multiplying both the numerator and the denominator by $K_i / \sigma_i^2$, we can show that the new numerator is a random variable with the $\chi^2(1)$ distribution while the new denominator is a random variable with the $\chi^2(K_i - 1)$ distribution. These two random variables are independent of each other. The false alarm probability can be computed via
$$P_{FA} = P\left( \frac{X_1}{X_2 / (K_i - 1)} > (K_i - 1) \cdot T_L \,\middle|\, H_{L0} \right) = P\left( F > (K_i - 1) \cdot T_L \mid H_{L0} \right), \qquad (9)$$
where $X_1 = \frac{\left( \sum_{k=0}^{K_i - 1} y_i(k) \right)^2}{K_i \sigma_i^2}$, $X_2 = \frac{\sum_{k=0}^{K_i - 1} (y_i(k) - \bar{y}_i)^2}{\sigma_i^2}$, and F is a random variable of Fisher's F distribution with $r_1 = 1$ and $r_2 = K_i - 1$. Given a tolerable false alarm probability, we can compute the threshold $T_L$. The detection probability can be calculated via
$$P_D = P\left( \frac{X^2}{V} > \frac{\sigma_i^2 T_L}{K_i} \,\middle|\, H_{L1} \right) = \int_0^{\infty} \int_{\sigma_i \sqrt{v T_L / K_i}}^{\infty} h(x, v)\, dx\, dv + \int_0^{\infty} \int_{-\infty}^{-\sigma_i \sqrt{v T_L / K_i}} h(x, v)\, dx\, dv, \qquad (10)$$
where $X = \frac{1}{K_i} \sum_{k=0}^{K_i - 1} y_i(k)$, $V = \frac{\sum_{k=0}^{K_i - 1} (y_i(k) - \bar{y}_i)^2}{\sigma_i^2}$, and $h(x, v)$ is the joint PDF of a Gaussian random variable and a Chi-square random variable of the following form:
$$h(x, v) = \frac{\sqrt{K_i}}{\sqrt{2\pi} \cdot \sigma_i} \exp\left( -\frac{K_i (x - A_i)^2}{2 \sigma_i^2} \right) \times \frac{1}{\Gamma\left( \frac{K_i - 1}{2} \right) 2^{\frac{K_i - 1}{2}}}\, v^{\frac{K_i - 1}{2} - 1} \exp\left( -\frac{v}{2} \right). \qquad (11)$$
Fig. 3. The detection and false alarm rates as a function of the threshold for the one-level decision method: (a) Case I, (b) Case II. (Axes: probability versus threshold $T_F$; panel (b) uses the threshold $T_F$ in the second step. Curves: probability of detection, probability of false alarm.)
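The local test of (8) and the threshold selection described around (9) can be sketched as follows. Instead of reading $T_L$ from the F distribution, this sketch calibrates it by Monte Carlo simulation under $H_{L0}$, which is a substitute technique chosen here so that only the standard library is needed. The calibration can use unit-variance noise because the statistic in (8) does not change when all samples are scaled by a common factor. Function names, the seed and the number of trials are illustrative assumptions.

```python
import random

def local_statistic(samples):
    """Statistic in (8): squared sample mean over the biased sample variance."""
    k = len(samples)
    mean = sum(samples) / k
    var = sum((y - mean) ** 2 for y in samples) / k
    return mean * mean / var

def calibrate_threshold(num_samples, p_fa, trials, rng):
    """Monte Carlo estimate of T_L with P(statistic > T_L | H_L0) close to p_fa."""
    stats = sorted(
        local_statistic([rng.gauss(0.0, 1.0) for _ in range(num_samples)])
        for _ in range(trials)
    )
    # Empirical (1 - p_fa) quantile of the noise-only statistic.
    index = min(trials - 1, int((1.0 - p_fa) * trials))
    return stats[index]

rng = random.Random(1)  # fixed seed so the sketch is repeatable
t_l = calibrate_threshold(num_samples=5, p_fa=0.01, trials=20000, rng=rng)

# Scaling all samples by a constant leaves the statistic unchanged.
stat_small = local_statistic([1.0, 2.0, 4.0])
stat_large = local_statistic([10.0, 20.0, 40.0])

# Local decision for one sensor: True means "decide H_L1" (inside the event).
noise_only = [rng.gauss(0.0, 0.3) for _ in range(5)]
decide_inside = local_statistic(noise_only) > t_l
```

With $K_i = 5$ and a target false alarm probability of 0.01, as in the simulation settings of Sec. IV, the same threshold could instead be obtained from the quantile of the F distribution with 1 and $K_i - 1$ degrees of freedom, following (9).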
With $P_{FA}$ and $P_D$ obtained above, the following global decision rule can be derived:
$$\ln\left( \frac{(1 + B_1)^m \cdot (1 + B_2)^{N - m}}{1 + B_1^m \cdot B_2^{N - m}} - 1 \right) \underset{H_{F0}}{\overset{H_{F1}}{\gtrless}} T_F, \qquad (12)$$
where $B_1 = \frac{P(H_{L0}) \cdot P_{FA}}{P(H_{L1}) \cdot P_D}$, $B_2 = \frac{P(H_{L0}) \cdot (1 - P_{FA})}{P(H_{L1}) \cdot (1 - P_D)}$, N is the number of sensors inside the tolerance range of the target sensor and m is the number of sensors that are decided to be inside the monitored event. Decision rule (12) can be further simplified to the following form [5]:
$$\begin{aligned} \text{Decide } H_{F0} &: m < m_1 \text{ or } m > m_2, \\ \text{Decide } H_{F1} &: m_1 < m < m_2, \end{aligned} \qquad (13)$$
where $m_1$ and $m_2$ (with $m_1 < m_2$) are two parameters related to the values of the false alarm probability and the detection probability. The above rule greatly reduces the complexity of the decision fusion problem, and the fusion sensor can evaluate it quickly.
Case II: R > r
If the radio range R is larger than the tolerance range r, measurements from sensors located between the tolerance range and the radio range can help the target sensor improve the detection rate. As in the one-level decision method, the detection algorithm consists of the following two steps.
Step 1 Detect whether the real edge crosses its radio range.
Step 2 Detect whether the real edge crosses its tolerance range.
In each step, decision rule (13) is used, yet with different $m_1$ and $m_2$. If the target sensor passes the test in the first step, we proceed to the second step. If it passes the second test as well, the target sensor is claimed to be an edge sensor. If the target sensor fails either of the two tests, it is claimed to be a non-edge sensor.
IV. Simulation Results
We conducted computer simulations with the following settings. There are 1000 sensors randomly distributed over a 10 by 10 square so that the average sensor density is 10 sensors per unit square. The SNR value is 10 dB and the signal amplitude is set to 1. The tolerance range r is equal to 0.5 and the number of temporal samples, $K_i$, is 5 for each sensor. In addition, the signal amplitude and the noise power are assumed to be fixed over the sensor field although they are unknown in advance.
A. One-Level Decision Method
We plot the detection probability $P_D$ and the false alarm probability $P_{FA}$ as a function of the threshold value $T_F$ for Cases I and II in Fig. 3(a) and (b), respectively. For Case I, the radio range, R, is set to 0.5, which is the same as the tolerance range, r. The decision rule given by (3) is applied. For Case II, R is set to 2r = 1. The decision rule (3) is applied in each step with different thresholds. Fig. 3(b) shows the performance with respect to the threshold value in Step 2 while the threshold in Step 1 is fixed at 30. As shown in Fig. 3(a), the performance of Case I is quite poor. To keep the false alarm probability below 10%, the highest detection rate that can be achieved is around 65%. This poor performance can be explained as follows. The average number of sensors inside the tolerance range of a target sensor is around 7, so the amount of data received from the neighboring sensors is not large enough to combat the measurement noise. The performance can be improved by collecting more temporal samples, which implies a delayed decision. By comparing Fig. 3(a) and 3(b), we see that Case II has much better performance than Case I. For Case II, when the false alarm rate is around 10%, the detection rate can reach 85%. Thus, the detection probability is greatly enhanced for the same false alarm probability in Case II. The performance improvement is due to a larger value of R, which allows more spatial domain samples to be collected.
B. Two-Level Decision Method
In this subsection, we examine the performance of the two-level decision method, where decision rules (8) and (12) are applied. The false alarm rate of the local decision, $P_{FA}$, is set to 0.01 so that the error rate of sensors detected inside the monitored event is controlled to be around 1%. With highly reliable local decision results, the accuracy of the global decision can be largely improved. A scenario of edge sensor detection is given in Figs. 4 and 6, where there are two monitored events bounded by a square and a circle. Detected edge sensors, interior sensors, and exterior sensors are marked with “o”, “x” and a dot, respectively. The decision results for Case I (R = r) and Case II (R = 2r) using the two-level decision method are shown in Figs. 4 and 6, respectively. These figures show that the performance of the proposed method is not affected by the shape and the number of monitored events. The false alarm rate and the detection rate as a function of the threshold value are shown in Fig. 5 for Case I. We see that the detection rate can be kept above 60% when the false alarm rate is less than 10%. Note also that, even though an analytical formula relating the threshold value to the system performance (i.e., the detection probability and the false alarm probability) in the global decision rule is lacking, one can use Fig. 5 as a reference for the threshold value assignment. For Case II, the detection rate and the false alarm rate are plotted as a function of the threshold value in Fig. 7. In this case, the global decision rule (12) is applied in each step with different thresholds. The threshold $T_F$ in Step 1 is chosen to be 20 and the false alarm rate of the local decision is limited to be no larger than 0.01. As shown in Fig. 6, edges of both monitored events are well detected. Since R is equal to 2r = 1 in Case II, more data are collected from the neighboring sensors of a target sensor to provide more reliable information than in Case I. Consequently, a more accurate decision and better system performance can be achieved in Step 2. By comparing Figs. 5 and 7, we see that the false alarm rate is greatly reduced in Case II. This is consistent with the results of the one-level decision method.
Fig. 4. Edge sensor detection with the two-level decision method when $T_F = 3$ for Case I. (Legend: x, sensors detected inside the monitored event; *, sensors detected outside the monitored event; o, detected edge sensors; dashed line, real edge.)
Fig. 5. The detection and false alarm rates for the two-level decision method for Case I. (Axes: probability versus threshold $T_F$.)
Fig. 6. Edge sensor detection result with the two-level method for Case II. (Legend as in Fig. 4.)
Fig. 7. The detection and false alarm rates for the two-level decision method for Case II. (Axes: probability versus threshold $T_F$ in the second step.)
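The global fusion rule (12) used in the two-level experiments above, and its count-based form (13), can be sketched as follows. The local false alarm probability 0.01 and the threshold 3 echo values quoted in this section, and N = 7 echoes the average neighbor count quoted for Case I; the local detection probability 0.9 and the equal priors are hypothetical values chosen only for illustration, as are the function names.

```python
import math

def b_factors(p_fa, p_d, prior_inside=0.5):
    """B_1 and B_2 as defined below (12); equal priors are an assumption."""
    prior_outside = 1.0 - prior_inside
    b1 = prior_outside * p_fa / (prior_inside * p_d)
    b2 = prior_outside * (1.0 - p_fa) / (prior_inside * (1.0 - p_d))
    return b1, b2

def global_statistic(m, n, b1, b2):
    """Left-hand side of (12) for m 'inside' decisions out of n sensors (n >= 2)."""
    numerator = (1.0 + b1) ** m * (1.0 + b2) ** (n - m)
    denominator = 1.0 + b1 ** m * b2 ** (n - m)
    return math.log(numerator / denominator - 1.0)

def edge_counts(n, b1, b2, threshold):
    """All counts m for which rule (12) decides H_F1 (edge sensor)."""
    return [m for m in range(n + 1)
            if global_statistic(m, n, b1, b2) > threshold]

b1, b2 = b_factors(p_fa=0.01, p_d=0.9)          # p_d = 0.9 is hypothetical
counts = edge_counts(n=7, b1=b1, b2=b2, threshold=3.0)
```

In this sketch the accepted counts form a single contiguous range that excludes the unanimous cases m = 0 and m = N, which is the behavior that the two-sided rule (13) encodes with its parameters $m_1$ and $m_2$.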
V. Conclusion
Based on these simulation results, we conclude that it is advantageous to have a radio range larger than the tolerance range. However, the radio range is restricted by the low power consumption requirement of sensors. Generally speaking, the values of the radio range R and the tolerance range r should be assigned adaptively according to the sensor density and the power consumption requirement for each target sensor.
References
[1] R. Nowak and U. Mitra, “Boundary estimation in sensor networks: theory and methods,” in 2nd International Workshop on Information Processing in Sensor Networks, Apr. 2003, pp. 22–36.
[2] K. K. Chintalapudi and R. Govindan, “Localized edge detection in sensor fields,” in IEEE International Workshop on Sensor Network Protocols and Applications, May 2003, pp. 59–70.
[3] R. Viswanathan and P. K. Varshney, “Distributed detection with multiple sensors: Part I - fundamentals,” Proceedings of the IEEE, vol. 85, no. 1, pp. 54–69, Jan. 1997.
[4] P. K. Varshney, Distributed Detection and Data Fusion, New York: Springer-Verlag, 1996.
[5] P.-K. Liao, M.-K. Chang, and C.-C. Jay Kuo, “Distributed edge sensor detection with one- and two-level decisions,” in ICASSP, May 2004.