On the Optimality of Finite Level Quantizations for Distributed Signal Detection

Jun Hu and Rick S. Blum
EECS Department, Lehigh University, Bethlehem, PA 18015

Abstract
Distributed multisensor detection problems with quantized observation data are investigated for cases of nonbinary hypotheses. The observations available at each sensor are quantized to produce a multiple-bit sensor decision which is sent to a fusion center. At the fusion center, the quantized data are combined to form a final decision using a predetermined fusion rule. First, the necessary conditions for Bayes optimum sensor decision rules are described. Then it is demonstrated that there is a maximum number of bits that should be used to communicate the sensor decision from a given sensor to the fusion center. This maximum is based on the number of bits used to communicate the decisions from all the other sensors to the fusion center. If more than this maximum number of bits is used, the performance of the optimum scheme will not be improved. In some special cases of great interest, the bound on the number of bits that should be used can be made significantly smaller. Finally, the optimum way to distribute a fixed number of sensor decision bits across the sensors is described for two-sensor cases. Illustrative numerical examples are presented at the end of the paper.
Keywords: distributed signal detection, decentralized detection, quantization for detection, M-ary hypothesis, multiple bit sensor decisions.
This paper is based on work supported by the Office of Naval Research under Grant No. N00014-97-1-0774 and by the National Science Foundation under Grant No. MIP-9703730.
1 Introduction

Signal detection algorithms which process quantized data taken from multiple sensors continue to attract attention [1, 2, 3, 4, 5]. Such algorithms have been classified as distributed signal detection algorithms. The majority of distributed signal detection research has focused on cases with statistically independent observations, binary hypothesis testing problems and binary sensor decisions [6, 7]. Studies on nonbinary hypothesis testing problems have been lacking. An early paper on this topic [8] provided equations describing the necessary conditions for the optimum sensor processing. A more complete discussion, which includes a thorough treatment of the necessary conditions for the case of independent observations, is given in [9]. A nice discussion of the complexity of cases with dependent observations is also given in [9]. Neither [8] nor [9] gives any numerical examples. A numerical procedure for finding the optimum processing scheme was provided in [10] for cases with dependent observations and nonbinary hypotheses, and a few numerical examples are provided there. However, studies of the properties of optimum schemes have been lacking. In this paper, we demonstrate that no more than a certain number of bits should be used to communicate a sensor decision from a particular sensor to the fusion center. Using more than that number of bits at a given sensor is unnecessary and will not generally lead to increases in performance. The number of bits which should be used is related to the number of bits used to communicate the other sensor decisions to the fusion center.

The paper is organized as follows. In Section 2 we present the model for the observations and the distributed decision making. The necessary conditions for optimum sensor decision rules given a fixed fusion rule are produced in Section 3. In Section 4, we prove that a finite number of bits can be used for the sensor decision at a given sensor without sacrificing any performance. We also strengthen our results for some special cases of great interest. In Section 5, we consider how to allocate bits between two sensors in the case where the total number of bits used for sensor decisions is fixed. Section 6 provides several illustrative numerical examples. Finally, our conclusions appear in Section 7.
2 Problem Formulation

Consider a multiple hypothesis testing problem using L sensors, where H_0, H_1, ..., H_{K-1} are K hypotheses and P(H_i), i = 0, ..., K-1, are the prior probabilities of the hypotheses. Assume an m_k-dimensional vector of observations

    y_k = [y_{k,1}, y_{k,2}, ..., y_{k,m_k}],   y_{k,l} ∈ R,

is observed at the kth sensor, where k = 1, ..., L. Define y = [y_1, y_2, ..., y_L], and let p(y | H_i), i = 0, ..., K-1, denote the known joint conditional probability density functions (pdfs) under each hypothesis. Note that we do not assume y_1, y_2, ..., y_L are independent when conditioned on the hypothesis. Let P(D_j | H_i) represent the probability that a final decision for hypothesis H_j is made given hypothesis H_i is true. Here we consider the criterion of minimum probability of error, defined as
    P_e = 1 - P_c    (1)

where

    P_c = Σ_{i=0}^{K-1} P(H_i) P(D_i | H_i).
Our goal is to design a system minimizing P_e. A typical centralized detection system requires each sensor to transmit its observations to a fusion center; the fusion center then produces a decision U_0 as

    U_0 = d(y) = d(y_1, y_2, ..., y_L)    (2)

where d is the decision rule and U_0 = j corresponds to deciding that hypothesis H_j is true. A well-known decision rule for centralized detection systems is to compare the joint likelihood ratio to a threshold. For a variety of reasons, it may be difficult to realize such a centralized system in practice. In a distributed detection system, the observation data at the kth sensor are first quantized to a discrete variable U_k taking on only N_k possible values. For simplicity (see also Remark 1 following Theorem 1), it is common to choose N_k to be a power of K, or N_k = K^{n_k}. Thus U_k can be represented as an n_k-digit K-ary number produced by an n_k-dimensional vector of decisions

    U_k = [U_{k,0}, U_{k,1}, ..., U_{k,n_k-1}],   U_{k,l} ∈ {0, 1, ..., K-1},

made at the kth sensor, using a group of decision rules

    d_k(y_k) = [d_{k,0}(y_k), d_{k,1}(y_k), ..., d_{k,n_k-1}(y_k)],   U_{k,l} = d_{k,l}(y_k),   k = 1, ..., L,  l = 0, ..., n_k - 1.    (3)
Please note that each d_{k,l} denotes a scalar function, while d_k with only a single subscript denotes a vector of functions. To avoid confusion between N_k and n_k, N_k is called the number of quantization levels rather than the number of possible values in the sequel. Define the combination of all but the last sensor decisions as U = [U_1, U_2, ..., U_{L-1}]. All sensor decisions are sent to the fusion center to determine a final decision U_0 with a chosen fusion rule F:

    U_0 = F(U, U_L) = F(U_1, U_2, ..., U_L)
        = F(U_{1,0}, ..., U_{1,n_1-1}, ..., U_{L,0}, ..., U_{L,n_L-1})
        = F(d_{1,0}(y_1), ..., d_{1,n_1-1}(y_1), ..., d_{L,0}(y_L), ..., d_{L,n_L-1}(y_L))    (4)

where U_0 and all U_{k,l} can take on any of the values 0, ..., K-1.
There are three issues that need to be considered in distributed detection: how many quantization levels or bits should be used for the decision at each sensor, how these decisions should be made, and which fusion rule should be employed to combine these decisions. We will discuss each of these issues in the sequel.
3 Optimum Sensor Processors

Suppose all d_k(y_k) are fixed so that only the fusion rule can be optimized. Such cases are similar to centralized detection cases, so we will focus on the cases where both the sensor decision rules and the fusion rule must be optimized. We first assume that a fusion rule has been chosen. Under the fixed fusion rule, we can prove that the decision rules of the sensors must satisfy a group of necessary conditions to
minimize P_e. Let us consider the probability Prob(U_0 = i | y). Due to our configuration, Prob(U_0 = i | y) is an indicator function:

    Prob(U_0 = i | y) = 1 if d(y) = i, and 0 otherwise.
Using Bayes' rule, for any particular sensor decision U_{k,l},

    Prob(U_0 = i | y) = Σ_{j=0}^{K-1} Prob(U_{k,l} = j | y) Prob(U_0 = i | y, U_{k,l} = j)
                      = Σ_{j=0}^{K-1} Prob(U_{k,l} = j | y_k) Prob[U_0 = i | y, d_{k,l}(y_k) = j].    (5)
Definition 1 A class of functions L_{j,k,l}(y_k) is defined as follows:

    L_{j,k,l}(y_k) = Σ_{i=0}^{K-1} ∫···∫ Prob(U_0 = i | y, d_{k,l}(y_k) = j) P(H_i) p(y | H_i) dy/dy_k    (6)

where dy = ∏_{k=1}^{L} dy_k, so that the integration is over all observations except y_k.
Theorem 1 Suppose P_e is minimized. Then for all k = 1, ..., L and l = 0, ..., n_k - 1, d_{k,l}(y_k) must satisfy the following condition:

    ∀ y_k ∈ R^{m_k}:  d_{k,l}(y_k) = j_0  if  L_{j_0,k,l}(y_k) / L_{j,k,l}(y_k) > 1  for all j ≠ j_0.    (7)

Outline of the proof. From (1),
    P_e = 1 - Σ_{i=0}^{K-1} P(H_i) P(D_i | H_i)
        = 1 - Σ_{i=0}^{K-1} ∫···∫ Prob(U_0 = i | y) P(H_i) p(y | H_i) dy.    (8)

Using (5),

    P_e = 1 - Σ_{i=0}^{K-1} ∫···∫ P(H_i) p(y | H_i) Σ_{j=0}^{K-1} Prob(U_{k,l} = j | y_k) Prob[U_0 = i | y, d_{k,l}(y_k) = j] dy.
We can exchange the order of the sums and integrals and extract the factor that depends on d_{k,l}(y_k). Then we get

    P_e = 1 - Σ_{j=0}^{K-1} ∫ Prob(U_{k,l} = j | y_k) { Σ_{i=0}^{K-1} ∫···∫ Prob[U_0 = i | y, d_{k,l}(y_k) = j] P(H_i) p(y | H_i) dy/dy_k } dy_k
        = 1 - Σ_{j=0}^{K-1} ∫_{{y_k : d_{k,l}(y_k) = j}} L_{j,k,l}(y_k) dy_k.

To minimize P_e requires making the sum of the negative terms as large as possible. Thus the sets {y_k : d_{k,l}(y_k) = j} should only include those y_k that make the sum larger. In other words, each y_k is assigned to the set with the largest L_{j,k,l}(y_k). Moreover, each y_k ∈ R^{m_k} must belong to one set. Note that if all L_{j,k,l}(y_k) have continuous pdfs, we can assign y_k to any one of the sets when more than one function L_{j,k,l}(y_k) yields the largest value, without affecting performance. □
Remark 1 Note that the above proof does not depend on how many possible values d_{k,l}(y_k) can take on. The key is to determine which function L_{j,k,l}(y_k) is the largest, regardless of how many functions there are. Thus the result of Theorem 1 is valid under the more general condition where N_k = C_k^{n_k} and the C_k are not all the same. It is then obviously also valid for those cases where the number of quantization levels N_k cannot be represented as a power of K, since we can just set C_k = N_k and all n_k equal to 1.
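The assignment rule of Theorem 1 can be sketched numerically. The setup below is an illustrative choice of ours, not one of the paper's examples: L = 2, K = 2, equal priors, independent Gaussian observations y_k ~ N(∓1, σ_k²) under H0/H1, an OR fusion rule U_0 = U_1 + U_2, and a fixed threshold rule U_2 = 1{y_2 > 0} at sensor 2. It evaluates L_{j,1,0}(y_1) from Definition 1 on a grid and assigns each y_1 to the largest, which yields a single threshold test, as the monotone likelihood ratio predicts:

```python
from math import erf, exp, pi, sqrt

# Illustrative setup (our choice): H0/H1 equally likely, y_k ~ N(-1, s_k^2)
# under H0 and N(+1, s_k^2) under H1, independent sensors, OR fusion
# U0 = U1 + U2, and sensor 2 fixed to the threshold rule U2 = 1{y2 > 0}.
Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))                 # N(0,1) CDF
npdf = lambda x, m, s: exp(-0.5 * ((x - m) / s) ** 2) / (s * sqrt(2.0 * pi))
s1, s2 = sqrt(3.0), sqrt(2.0)
P_u2_is_0 = {0: Phi((0.0 + 1) / s2), 1: Phi((0.0 - 1) / s2)}     # P(U2 = 0 | Hi)

def L_func(j, y1):
    """L_{j,1,0}(y1) of Definition 1 for candidate decision j at sensor 1."""
    if j == 1:                       # OR rule: U0 = 1 regardless of U2
        return 0.5 * npdf(y1, +1, s1)
    # j = 0: U0 = U2, so integrating out y2 leaves P(U2 = i | Hi)
    return (0.5 * npdf(y1, +1, s1) * (1 - P_u2_is_0[1])
            + 0.5 * npdf(y1, -1, s1) * P_u2_is_0[0])

grid = [x / 20.0 for x in range(-100, 101)]
decisions = [max((0, 1), key=lambda j: L_func(j, y1)) for y1 in grid]
# A monotone likelihood ratio forces a threshold test: all 0s, then all 1s.
assert decisions == sorted(decisions) and set(decisions) == {0, 1}
```

The argmax over j is exactly the condition in (7); the final assertion checks that the resulting partition of the real line is a single threshold.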
4 Limits on Number of Bits Used in Sensor Decisions

Here we will show that the number of sensor decisions which should be used at one sensor is limited by the number of sensor decisions used at the other sensors. Furthermore,
due to the non-uniqueness of describing a distributed detection system by defining a fusion rule and a set of sensor decision rules, there are many combinations of fusion rules and sensor decision rules which are optimum. To see this, just consider a K = 2 case: complementing all the sensor decision rule outputs and the fusion rule inputs changes nothing. Using this idea, we present one specific fusion rule that can always be used to achieve optimum global performance in some special cases.
Theorem 2 If the combination of the first L-1 sensor decisions, U, consists of n components, where n = Σ_{k=1}^{L-1} n_k, then a scheme which obtains optimum global performance can be found in the case where the Lth sensor makes n_L = K^n decisions. Thus using an Lth sensor with n_L > K^n will not improve the optimum global performance.
Outline of the proof. U is a vector of K-valued integers. We can also view it as a number in the K-ary number system. Then U takes on the values 0, ..., K^n - 1, and we can view F(U, U_L) as a function of two variables, U and U_L. The theorem will be proved if we show that whenever n_L > K^n,

    ∃ i, j ∈ {0, ..., K^{n_L} - 1}, i ≠ j:  F(U, i) = F(U, j)  ∀ U = 0, ..., K^n - 1.    (9)

Thus it does not make sense to distinguish the decisions U_L = i and U_L = j, since either decision leads to exactly the same final decision. We can choose a new decision rule at the Lth sensor which uses just one value to represent these two values i, j without changing performance. If U_L is fixed, F(U, U_L) degenerates to a function of one variable which maps each possible value of U into a possible value of U_0. Alternately, we can say each value of U_L corresponds to a function from U to U_0. Since U takes on K^n different values and U_0 takes on K different values, there are only K^{K^n} different mapping patterns between the domain and the range. Hence there are in total K^{K^n} different functions from U to U_0. Whenever n_L > K^n, the number of possible values of U_L, K^{n_L}, is greater than the number of different functions. Thus at least two values of U_L will correspond to the same function. This statement is equivalent to (9). □
Remark 2 Suppose the numbers of quantization levels at the sensors, N_k, k = 1, ..., L, cannot be represented as powers of K; then a slightly more general result can be obtained. Using the same argument, we can prove that an optimum scheme can be found in the case of N_L = K^N, where N = ∏_{k=1}^{L-1} N_k.
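The pigeonhole count at the heart of Theorem 2 can be verified exhaustively for a tiny instance. The sketch below is our own check (K = 2, n = 1, n_L = 3 are chosen only for tractability): it enumerates every fusion rule F on (U, U_L) and confirms that each one distinguishes at most K^{K^n} = 4 values of U_L, so with K^{n_L} = 8 possible values at least two are always redundant:

```python
from itertools import product

K, n, n_L = 2, 1, 3              # U takes K**n = 2 values; U_L takes K**n_L = 8
U_vals, UL_vals = K ** n, K ** n_L

# Enumerate every fusion rule F : (U, U_L) -> {0, ..., K-1} as a flat table.
count = 0
for table in product(range(K), repeat=U_vals * UL_vals):
    # Column i is the map F(., i) induced by the value U_L = i.
    cols = {tuple(table[u * UL_vals + i] for u in range(U_vals))
            for i in range(UL_vals)}
    # At most K**(K**n) = 4 distinct maps exist, so two of the 8 columns coincide.
    assert len(cols) <= K ** (K ** n) < UL_vals
    count += 1
assert count == K ** (U_vals * UL_vals)      # all 65536 rules were checked
```

Since no fusion rule survives with eight distinct columns, the extra K-ary digit at the last sensor can never be exploited, which is exactly the statement (9) formalizes.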
Theorem 3 In the case of n_L = K^n, we can obtain optimum performance by employing the specific fusion rule

    U_0 = F(U, U_L) = U_{L,U},   U = 0, ..., K^n - 1,    (10)

where U_L = [U_{L,0}, U_{L,1}, ..., U_{L,K^n-1}].
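Rule (10) and the digit re-encoding that the proof below relies on can be sketched on a discretized observation space (the cell count and the randomly drawn scheme are our illustrative choices): any scheme (F̂, d̂_L) is reproduced by letting digit l of the new U_L be the output F̂ would give when U = l.

```python
import random

random.seed(3)
K, n = 2, 2
U_vals = K ** n                  # rule (10) needs n_L = K**n = 4 K-ary digits

# An arbitrary scheme on a discretized observation space: 16 cells stand in
# for y_L, with an old sensor-L quantizer dhat_L and a fusion rule Fhat.
cells = range(16)
old_levels = K ** 2              # the old U_L takes 4 values here (arbitrary)
dhat_L = {y: random.randrange(old_levels) for y in cells}
Fhat = {(u, ul): random.randrange(K)
        for u in range(U_vals) for ul in range(old_levels)}

# Re-encoded decision: digit u of the new U_L is Fhat(u, dhat_L(y)).
d_L = {y: [Fhat[(u, dhat_L[y])] for u in range(U_vals)] for y in cells}

rule10 = lambda u, ul_digits: ul_digits[u]   # U0 = U_{L,U}, as in (10)
for y in cells:
    for u in range(U_vals):
        assert rule10(u, d_L[y]) == Fhat[(u, dhat_L[y])]   # same overall output
```

The assertion holds for every cell and every U, so the fixed rule (10) paired with the re-encoded sensor rule implements the arbitrary scheme, which is the content of the proof that follows.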
Outline of the proof. We have shown that a scheme with n_L = K^n can be used to achieve optimum performance. If we can also show that the fusion rule in (10) can be used with some group of sensor decision rules to implement any scheme in this case, Theorem 3 is proved. Consider an arbitrary scheme in the case of n_L = K^n. It consists of a fusion rule F̂ and a group of decision rules d̂_1, ..., d̂_L. For a given K-ary integer B taking on any of the values 0, ..., K^{K^n} - 1, let [b_0, ..., b_{K^n-1}] denote its digits. Now define, for each B, the set

    S_B = {y_L : F̂(0, d̂_L(y_L)) = b_0, ..., F̂(K^n - 1, d̂_L(y_L)) = b_{K^n-1}}.

After determining the set S_B corresponding to each possible value of B, we can use these sets to define a new decision rule for the Lth sensor:

    d_{L,l}(y_L) = b_l  if  y_L ∈ S_B,   l = 0, ..., K^n - 1.

The fusion rule in (10), with the sensor decision rules d̂_1, ..., d̂_{L-1} and d_L, ensures the overall scheme produces the same output for any y_L. Thus the fusion rule in (10) will always allow us to obtain optimum performance. Further, generally we cannot decrease n_L any further without affecting the global performance. □

In Theorem 2, the upper bound on n_L grows rapidly as n increases. However, in some more specific cases we can obtain a stronger result. As an example, consider a hypothesis testing problem with statistically independent Gaussian observations having different mean vectors under each hypothesis:
    m_k = 1,  y_k ∈ R,  k = 1, ..., L,

    H_i : y = [y_1, y_2, ..., y_L] ~ N(M_i, C),   i = 0, ..., K - 1,    (11)

where C = diag[σ_1², σ_2², ..., σ_L²], with σ_j², j = 1, ..., L, the variances of the observations y_j, and M_i = [E{y_1 | H_i}, E{y_2 | H_i}, ..., E{y_L | H_i}]^T, i = 0, ..., K - 1, the mean vectors under each hypothesis.
Definition 2 A single-interval is a set of real numbers {x | a < x < b}. Such a single-interval could be a finite-length interval, the entire real line (a = -∞ and b = +∞), a semi-infinite interval (either a = -∞ or b = +∞ while the other is finite), or the empty set (a = b).
Theorem 4 For the Gaussian shift-in-mean problem in (11), a scheme which obtains optimum global performance can be found in the case of n_L = n + 1.
Outline of the proof. It is sufficient to prove that any optimum scheme used in this case can be implemented by a new scheme with n_L ≤ n + 1. Let us consider an arbitrary optimum scheme. For statistically independent observations, (7) reduces to a set of comparisons of a likelihood ratio to a corresponding threshold. Each comparison yields a semi-infinite interval, so the intersection of these comparisons is a single-interval. For fixed U, F(U, d_L(y_L)) is a multiple-step function that can be determined by the thresholds between neighboring single-intervals, the number of which is less than K. For each value of U we need fewer than K thresholds, t_{1,U}, ..., t_{K-1,U}, and U has K^n possible values, so fewer than K · K^n = K^{n+1} thresholds completely determine F(U, d_L(y_L)). We can put all the thresholds together and sort them in ascending order. Denote the sorted sequence of thresholds by t'_1 < ... < t'_{K^{n+1}-1}. These thresholds divide the entire real axis of y_L into not more than K^{n+1} intervals. For any value of U, all points in each interval {y_L : t'_k < y_L < t'_{k+1}} yield the same F(U, d_L(y_L)). If we assign a different value of U_L to each interval, the new scheme will satisfy n_L ≤ n + 1 and implement the considered arbitrary optimum scheme. □
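The threshold-pooling step of this proof can be sketched directly (the particular step functions below are random stand-ins for F(U, d_L(y_L)); K, n, and the grid are our choices):

```python
import bisect
import random

random.seed(7)
K, n = 2, 2                       # U takes K**n = 4 values
U_vals = K ** n

# Per value of U, F(U, d_L(y_L)) is a step function of y_L with fewer than K
# thresholds (here K - 1 = 1 each), as in the monotone likelihood ratio case.
thr = {u: sorted(random.uniform(-3, 3) for _ in range(K - 1))
       for u in range(U_vals)}

def F(u, yL):                     # value of the step function at yL
    return bisect.bisect(thr[u], yL) % K

# Pool and sort every threshold: the pooled list cuts the real line into at
# most K**(n+1) single-intervals.
pooled = sorted(t for ts in thr.values() for t in ts)
assert len(pooled) + 1 <= K ** (n + 1)

# Within each pooled interval, F(u, .) is constant for every u, so the
# interval index can serve as the new U_L with n_L <= n + 1 K-ary digits.
edges = [-4.0] + pooled + [4.0]
for lo, hi in zip(edges, edges[1:]):
    a, b = lo + (hi - lo) / 3, lo + 2 * (hi - lo) / 3   # two points inside
    for u in range(U_vals):
        assert F(u, a) == F(u, b)
```

Assigning one U_L value per pooled interval reproduces every F(U, ·) exactly, which is why n_L = n + 1 digits suffice here instead of Theorem 2's K^n.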
Remark 3 From the proof of Theorem 4, it is clear that this stronger result is only based on the optimum sensor test being a direct threshold test of the observation. In the case we consider, this is true due to the monotone likelihood ratio property of Gaussian pdfs. Hence the result can be applied to any problem with independent observations and monotone sensor likelihood ratios. Studies have shown that a large family of pdfs have monotone likelihood ratios [11]. Cases of dependent observations may also possess the required property. Some cases with Gaussian pdfs are shown to have this property in [12]. Theorem 4 can be used in all these cases.
5 A Case with Fixed Overall Number of Decisions

In a two-sensor system where the total number of decisions is fixed, we denote by (n_1, n_2) a scenario where the first sensor makes n_1 possible decisions and the second sensor makes n_2 possible decisions. Note that increasing both n_1 and n_2 will certainly improve the global performance, so the constraint is reasonable.
Theorem 5 Given that n_1 + n_2 is fixed, then regardless of the statistical characteristics of the observation data, a scheme which obtains optimum global performance and satisfies the following inequality can be found:

    log_K n_1 ≤ n_2 ≤ K^{n_1}.
Outline of the proof. Suppose we have found an (n_1, n_2) scheme which achieves optimum global performance but n_2 > K^{n_1}. From Theorem 2, we know the excess decisions used at the second sensor are wasted. We can allocate some of them to the first sensor to improve the overall performance. Thus we prove n_2 ≤ K^{n_1}. Now, switching n_1 and n_2 in the argument, we can also prove that n_1 ≤ K^{n_2}, or log_K n_1 ≤ n_2. □
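For a fixed decision-bit budget, Theorem 5 prunes the allocations worth searching. A small sketch (K = 2 and a budget of 6 are our example values):

```python
K = 2

def feasible(n1, n2):
    # Theorem 5's necessary region: log_K n1 <= n2 <= K**n1,
    # written symmetrically as n2 <= K**n1 and n1 <= K**n2.
    return n2 <= K ** n1 and n1 <= K ** n2

total = 6                         # fixed n1 + n2
survivors = [(n1, total - n1) for n1 in range(1, total)
             if feasible(n1, total - n1)]
print(survivors)                  # -> [(2, 4), (3, 3), (4, 2)]
```

The lopsided splits (1, 5) and (5, 1) are eliminated without evaluating any error probabilities, so a numerical search need only visit the near-balanced allocations.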
Remark 4 Suppose the numbers of quantization levels N_k cannot be represented as powers of K. Using the relationship between N_k and n_k, we can get an inequality for N_1 and N_2:

    log_K N_1 ≤ N_2 ≤ K^{N_1}.
Theorem 6 In those special cases with independent observations and known signal and noise pdfs which lead to monotone likelihood ratios, a scheme which obtains optimum global performance and satisfies the following inequality can be found:

    n_1 - 1 ≤ n_2 ≤ n_1 + 1.
Outline of the proof. The proof of this theorem is quite simple. All we need to do is substitute the stronger result of Theorem 4 in place of the result of Theorem 2 in the proof of Theorem 5. Then we can prove both n_2 ≤ n_1 + 1 and n_1 ≤ n_2 + 1, so the theorem is proved. □

This result implies the optimum scheme divides the number of decisions evenly or nearly evenly between the two sensors. If the sum n_1 + n_2 is even, then n_1 = n_2; otherwise n_1 = n_2 ± 1.
Remark 5 In the above cases, if the numbers of quantization levels N_k cannot be represented as powers of K, the inequality for N_1 and N_2 is

    N_1 / K ≤ N_2 ≤ K N_1.

As opposed to the result of Theorem 6, the values of N_1 and N_2 are still to some extent undetermined. Only numerical techniques can find the exact N_1, N_2 used by the optimum scheme.
We can demonstrate our results in the plane of (N_1, N_2) or the plane of (n_1, n_2), as shown in Figure 1. In each of the four parts of this figure we label the theorem or remark it illustrates. Only schemes in the region between the two curves can obtain optimum global performance, so we never need to search in the regions outside the two curves.
6 Numerical Results

In the following numerical investigations, we consider detecting a known signal in Gaussian noise with two sensors, for K = 2 and K = 4.

In the first case, U_1 and U_2 are binary variables and there are 2^{2(1+1)} = 16 possible nonrandomized fusion rules: two rules are trivial (U_0 = 1, U_0 = 0), four rules ignore one of the sensors (U_0 = U_1, U_0 = U̅_1, U_0 = U_2, U_0 = U̅_2), four are AND rules (U_0 = U_1 U_2, U_0 = U̅_1 U_2, U_0 = U_1 U̅_2, U_0 = U̅_1 U̅_2), four are OR rules (U_0 = U_1 + U_2, U_0 = U̅_1 + U_2, U_0 = U_1 + U̅_2, U_0 = U̅_1 + U̅_2), and two rules are exclusive-or operations (U_0 = U_1 ⊕ U_2, U_0 = U̅_1 ⊕ U_2). If either U_1 or U_2 takes on more than two values, i.e. in a nonbinary case, there will be even more fusion rules. We can use the following equation to calculate the total number of possible fusion rules (here we do not necessarily consider independent observations):

    M = K^{K(n_1 + n_2)}.    (12)
Of course, theoretically we could compute the performance of the best scheme using each possible nonrandomized fusion rule and pick the one with minimum probability of error. But as n_1 and n_2 grow, this requires significant computation. For instance, in a (1,3) or (2,2) scenario, the number of fusion rules is M = 2^{2(1+3)} = 256, which is 16 times that in a (1,1) scenario. For a fixed fusion rule, we employ a discretized Gauss-Seidel iterative algorithm [10, 12] and attempt to find all solutions to the necessary conditions in Section 3. With every set of initial decision rules we have tried, the algorithm always converges to the same result in a finite number of iteration steps. Due to this, we take the solution found as the optimum solution and use this group of sensor decision rules and the fusion rule to calculate P_e.

Assume the observation data at the sensors consist of a different constant signal for each hypothesis plus additive Gaussian noise, n_1 ~ N(0, 3) and n_2 ~ N(0, 2). For two hypotheses,
    H_0 : Y_1 = -1 + n_1,  Y_2 = -1 + n_2,
    H_1 : Y_1 = 1 + n_1,   Y_2 = 1 + n_2.
P (H ) = P (H ) = 21 02 3 2 B66 ?1 77 66 3 p(Y ; Y j H ) N B B@64 75 ; 64 ?1 0 0
1
2
1
0
14
31 0 77C 75CCA 2
02 3 B 66 1 77 B@ 64 75 ; p(Y ; Y j H ) N B 1
2
1
1
2 66 3 64
31 0 77C 75CCA
02
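As a cross-check on the error probabilities reported in Figures 2 and 3, this model is simple enough to evaluate with the Gaussian CDF. The sketch below is our own computation (the function names, the grid, and the restriction to an OR rule are our choices): it reproduces the centralized error probability of about 0.1807 in closed form and, via a grid search over sensor thresholds under an OR fusion rule, approaches the reported DIS(1,1) value of about 0.2151.

```python
from math import erf, sqrt

Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))     # standard normal CDF
Q = lambda x: 1.0 - Phi(x)
s1, s2 = sqrt(3.0), sqrt(2.0)                        # noise std devs

# Centralized likelihood ratio test: with mean shift dm = (2, 2) and
# C = diag(3, 2), d^2 = dm' C^{-1} dm and equal priors give Pe = Q(d / 2).
pe_cen = Q(sqrt(4.0 / 3.0 + 4.0 / 2.0) / 2.0)

def pe_or(t1, t2):
    """Pe of the OR rule U0 = U1 + U2 with Uk = 1{yk > tk}, equal priors."""
    a1, a2 = Phi((t1 + 1) / s1), Phi((t2 + 1) / s2)  # P(Uk = 0 | H0)
    b1, b2 = Phi((t1 - 1) / s1), Phi((t2 - 1) / s2)  # P(Uk = 0 | H1)
    return 0.5 * ((1.0 - a1 * a2) + b1 * b2)

grid = [i / 50.0 for i in range(-150, 151)]          # thresholds in [-3, 3]
pe_dis = min(pe_or(t1, t2) for t1 in grid for t2 in grid)
print(round(pe_cen, 4), round(pe_dis, 4))
```

The OR rule with all complement combinations covers the AND rules as well (complementing decisions swaps the roles of the hypotheses in this symmetric problem), so the grid minimum is a reasonable stand-in for the optimum (1,1) scheme.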
After trying all possible fusion rules for the (1,1), (1,2) and (1,3) scenarios, we find the best rule for each scenario. The resulting optimum decision regions and P_e for optimum centralized rules and optimum distributed rules with different numbers of decisions are provided in Figure 2. The notation "CEN" stands for centralized rules, and the notation "DIS" indicates distributed rules. In each part, a set of lines divides the entire plane into several regions, and each region is assigned to a hypothesis. The small "x"s mark the signal for each hypothesis. Observing these examples, we see that the optimum centralized regions for this problem have boundaries which are straight lines. All distributed fusion rules try to use a combination of horizontal and vertical lines to approximate these straight lines. The more decisions are used, the better the approximation, except when n_L > K^n (in this case, n_2 > 2^1 = 2): excess decisions at the second sensor cannot improve the optimum global performance. This is illustrated by parts (c) and (d) of Figure 2.

In Figure 3, we vary the SNR at the second sensor while fixing the SNR at the first sensor and compute P_e for centralized and various distributed schemes. The dotted lines represent the results obtained by using only the first or the second sensor. The figure indicates that when the ratio of SNRs is less than 0.1, all distributed schemes choose to let only the first sensor make a decision. This is not surprising, because the second sensor can provide little information. When the ratio of SNRs is greater than 10, all distributed schemes have performance similar to the centralized schemes. We can see that for nearly identical SNRs there is a distinguishable improvement in global performance when using two sensors instead of one. However, the difference in performance between the centralized scheme and the distributed scheme is much smaller. This suggests that only one or two binary decisions can give adequate performance, with a considerable decrease in complexity, in cases like this one.

We continue our investigation of this case for other distributed schemes. This time, instead of fixing the number of decisions at the first sensor, we fix the total number of decisions at the two sensors. Again we have tried all possible fusion rules; the best results we have found are shown in Figure 4. From Figure 4, we see that the (2,2) scheme implements the best approximate boundary, so it yields the best performance in this case. Figure 5 provides the probability of error plotted against the ratio of sensor SNRs for this case. We can see that the three distributed schemes perform quite well: they always perform much better than the scheme using only one sensor. Moreover, the (2,2) scheme always yields the best global performance of the three distributed schemes.

Now we add two more hypotheses and still assume all hypotheses have equal prior probabilities:
    P(H_0) = P(H_1) = P(H_2) = P(H_3) = 1/4,

    p(Y_1, Y_2 | H_0) ~ N( [-2, -2]^T, [[3, 0], [0, 2]] ),
    p(Y_1, Y_2 | H_1) ~ N( [-1, -1]^T, [[3, 0], [0, 2]] ),
    p(Y_1, Y_2 | H_2) ~ N( [1, 1]^T, [[3, 0], [0, 2]] ),
    p(Y_1, Y_2 | H_3) ~ N( [2, 2]^T, [[3, 0], [0, 2]] ).

In this case we cannot try all fusion rules due to the enormous computation. Thus we try only fusion rules in a small set which we consider most likely to be optimum. We have tried dozens of such fusion rules, and none of them gives a better result than the rule we suggest in (10). Figure 6 gives the optimum decision regions and P_e for centralized rules and several distributed rules, including two cases using (10) and another rule using the same number of decisions. The graphs labeled "OPT(1,4)" and "OPT(4,1)" are results obtained from (10); the different order of the two numbers in brackets indicates which sensor makes the extra decisions. Note that only making the extra decisions at the sensor with the greater SNR yields optimum performance. Part (d) is the same as part (c); however, the fusion rule used in (d) is
    U_0 = F(U_1, U_2) = min(U_{2,0}, U_{2,1})   if U_1 = 0,
                        min(U_{2,1}, U_{2,2})   if U_1 = 1,
                        min(U_{2,2}, U_{2,3})   if U_1 = 2,
                        U_{2,3}                 if U_1 = 3.

This illustrates that the combination of a fusion rule and its corresponding group of sensor decision rules which yields optimum global performance is not necessarily unique.
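One way to see such an equivalence concretely (a sketch of ours with randomly drawn digits; the particular re-encoding is our illustration, not from the paper): redefining sensor 2's digits as pairwise minima turns the min-based rule into an instance of rule (10) with identical outputs.

```python
import random

random.seed(0)
K = 4

rule10 = lambda u1, v: v[u1]                  # U0 = U_{2,U1}, as in (10)

def min_rule(u1, u2):                         # the min-based rule above
    return u2[u1] if u1 == K - 1 else min(u2[u1], u2[u1 + 1])

for _ in range(1000):
    u2 = [random.randrange(K) for _ in range(K)]            # sensor 2's digits
    # Re-encoded digits: v_l = min(U_{2,l}, U_{2,l+1}), v_{K-1} = U_{2,K-1}.
    v = [min(u2[l], u2[l + 1]) for l in range(K - 1)] + [u2[K - 1]]
    for u1 in range(K):
        assert min_rule(u1, u2) == rule10(u1, v)            # identical U0
```

Two different (fusion rule, decision rule) pairs thus realize the same map from sensor outputs to the final decision, which is the non-uniqueness being illustrated.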
7 Conclusion

We investigated distributed detection problems for cases with nonbinary hypotheses. The necessary conditions which describe the optimum sensor decision rules under a given fusion rule were provided. Further, we uncovered some interesting general properties of distributed detection schemes. In particular, we considered how much information must be transmitted from one sensor when the total amount of information transmitted from all the other sensors is fixed. We showed there is an upper bound on the number of bits that should be used for a given sensor decision; using more bits at that sensor will not generally lead to improvements in the performance of the optimum scheme. In some special cases, for example with independent observations and monotone likelihood ratios, we showed that a stronger result can be obtained, which provides a much smaller bound on the number of bits that should be used. These results give important guidance in the design of distributed detection schemes. Finally, we analyzed the issue of how to allocate bits across the sensors when the total number of bits used for all sensor decisions is fixed.
References

[1] R. R. Tenney and N. R. Sandell, Jr., "Detection with distributed sensors," IEEE Transactions on Aerospace and Electronic Systems, Vol. AES-17, pp. 501-510, April 1981.

[2] Z. Chair and P. K. Varshney, "Optimal data fusion in multiple sensor detection systems," IEEE Transactions on Aerospace and Electronic Systems, Vol. AES-22, pp. 98-101, Jan. 1986.

[3] I. Y. Hoballah and P. K. Varshney, "Distributed Bayesian signal detection," IEEE Transactions on Information Theory, Vol. IT-35, No. 5, pp. 995-1000, Sept. 1989.

[4] P. K. Willett and D. Warren, "The suboptimality of randomized tests in distributed and quantized detection systems," IEEE Transactions on Information Theory, Vol. IT-38, pp. 355-361, 1992.

[5] W. W. Irving and J. N. Tsitsiklis, "Some properties of optimal thresholds in decentralized detection," IEEE Transactions on Automatic Control, Vol. 39, No. 4, pp. 835-838, April 1994.

[6] R. Viswanathan and P. K. Varshney, "Distributed detection with multiple sensors: part I - fundamentals," Proceedings of the IEEE, pp. 54-63, Jan. 1997.

[7] R. S. Blum, S. A. Kassam, and H. V. Poor, "Distributed detection with multiple sensors: part II - advanced topics," Proceedings of the IEEE, pp. 64-79, Jan. 1997.

[8] F. A. Sadjadi, "Hypothesis testing in a distributed environment," IEEE Transactions on Aerospace and Electronic Systems, Vol. AES-22, No. 2, pp. 134-137, March 1986.

[9] J. N. Tsitsiklis, "Decentralized detection," in Advances in Statistical Signal Processing - Vol. 2: Signal Detection, H. V. Poor and J. B. Thomas, Eds. JAI Press: Greenwich, CT, 1993.

[10] Z. B. Tang, K. R. Pattipati, and D. L. Kleinman, "A distributed M-ary hypothesis testing problem with correlated observations," IEEE Transactions on Automatic Control, Vol. 37, No. 7, pp. 1042-1046, July 1992.

[11] E. L. Lehmann, Testing Statistical Hypotheses, New York: Wiley, 1986.

[12] P. F. Swaszek, P. Willett, and R. S. Blum, "Distributed detection of dependent data - the two sensor problem," 29th Annual Conference on Information Sciences and Systems, Princeton University, Princeton, NJ, pp. 1077-1082, March 1996.
List of Figures

Figure 1 - Possible regions for optimum schemes
Figure 2 - Optimum decision regions for one K=2 case
Figure 3 - Optimum global performance for one K=2 case
Figure 4 - Optimum decision regions for another K=2 case
Figure 5 - Optimum global performance for another K=2 case
Figure 6 - Optimum decision regions for one K=4 case
[Figure 1: Possible regions for optimum schemes. Four parts: (a) the case of Theorem 5 in the (n1, n2) plane; (b) the case of Remark 4 in the (N1, N2) plane; (c) the case of Theorem 6 in the (n1, n2) plane; (d) the case of Remark 5 in the (N1, N2) plane. Curves are shown for K = 2, 3, 4.]

[Figure 2: Optimum decision regions for one K=2 case. (a) CEN, Pe = 0.1807; (b) DIS(1,1), Pe = 0.2151; (c) DIS(1,2), Pe = 0.1974; (d) DIS(1,3), Pe = 0.1974.]

[Figure 3: Optimum global performance for one K=2 case. Probability of error Pe versus ratio of SNR (SNR2/SNR1) for FIRST, SECOND, DIS(1+1), DIS(1+2), and CEN.]

[Figure 4: Optimum decision regions for another K=2 case. (a) SECOND, Pe = 0.2398; (b) DIS(3,1), Pe = 0.2029; (c) DIS(2,2), Pe = 0.1932; (d) DIS(1,3), Pe = 0.1974.]

[Figure 5: Optimum global performance for another K=2 case. Probability of error Pe versus ratio of SNR (SNR2/SNR1) for FIRST, SECOND, DIS(3+1), DIS(2+2), and DIS(1+3).]

[Figure 6: Optimum decision regions for one K=4 case. (a) CEN, Pe = 0.4143; (b) OPT(4,1), Pe = 0.4277; (c) OPT(1,4), Pe = 0.4228; (d) DIS(1,4), Pe = 0.4228.]