Presented at the 2nd International Conference on Information Fusion, FUSION'99

On the Maximum Number of Sensor Decision Bits Needed for Optimum Distributed Signal Detection

Jun Hu and Rick S. Blum
EECS Department, Lehigh University, Bethlehem, PA 18015
Abstract

Distributed multisensor detection problems with quantized observation data are investigated for cases of nonbinary hypotheses. The observations available at each sensor are quantized to produce a multiple-bit sensor decision which is sent to a fusion center. At the fusion center, the quantized data are combined to form a final decision using a predetermined fusion rule. First, it is demonstrated that there is a maximum number of bits which should be used to communicate the sensor decision from a given sensor to the fusion center. This maximum is based on the number of bits used to communicate the decisions from all the other sensors to the fusion center. If more than this maximum number of bits is used, the performance of the optimum scheme will not be improved. Then, in some special cases of great interest, the bound on the number of bits that should be used can be made significantly smaller. Finally, the optimum way to distribute a fixed number of bits across the sensor decisions is described for two-sensor cases. Illustrative numerical examples are presented at the end of this paper.

Keywords: distributed signal detection, decentralized detection, quantization for detection, M-ary hypothesis, multiple-bit sensor decisions.

This paper is based on work supported by the Office of Naval Research under Grant No. N00014-97-1-0774 and by the National Science Foundation under Grant No. MIP-9703730.
1 Introduction
Signal detection algorithms which process quantized data taken from multiple sensors continue to attract attention [1, 2, 3, 4, 5]. Such algorithms have been classified as distributed signal detection algorithms. The majority of distributed signal detection research has focused on cases with statistically independent observations, binary hypothesis testing problems, and binary sensor decisions [6, 7]. Studies of nonbinary hypothesis testing problems have been lacking. An early paper on this topic [8] provided equations describing the necessary conditions for the optimum sensor processing. A more complete discussion, which includes a thorough treatment of the necessary conditions for the case of independent observations, is given in [9]. A nice discussion of the complexity of cases with dependent observations is also given in [9]. Neither [8] nor [9] gives any numerical examples. A numerical procedure for finding the optimum processing scheme was provided in [10] for cases with dependent observations and nonbinary hypotheses, and a few numerical examples are provided there. However, studies of the properties of optimum schemes have been lacking. In this paper, we demonstrate that no more than a certain number of bits should be used to communicate a sensor decision from a particular sensor to the fusion center. Using more than that number of bits at a given sensor is unnecessary and will not generally lead to increases in performance. The number of bits which should be used is limited by the number of bits used to communicate all the other sensor decisions to the fusion center.

The paper is organized as follows. In Section 2 we present the model for the observations and the distributed decision making. In Section 3, we prove that a finite number of bits can be used for the sensor decision at a given sensor without sacrificing any performance. In Section 4, we strengthen our results for some special cases of great interest. In Section 5, we consider how to allocate bits between two sensors in the case where the total number of bits used for sensor decisions is fixed. Section 6 provides some illustrative numerical examples. Finally, our conclusions appear in Section 7.
2 Problem Formulation
Consider a multiple hypothesis testing problem using L sensors, where $H_0, H_1, \ldots, H_{K-1}$ are K hypotheses, and $P(H_i), i = 0, \ldots, K-1$, are the prior probabilities for each hypothesis. Assume an $m_k$-dimensional vector of observations

$$y_k = [y_{k,1}, y_{k,2}, \ldots, y_{k,m_k}], \quad y_{k,l} \in R,$$

is observed at the kth sensor, $k = 1, \ldots, L$. Define $y = [y_1, y_2, \ldots, y_L]$, and let $p(y \mid H_i), i = 0, \ldots, K-1$, denote the known joint conditional probability density functions (pdfs) under each hypothesis. Note that we do not assume $y_1, y_2, \ldots, y_L$ are independent when conditioned on the hypothesis. Let $P(D_j \mid H_i)$ represent the probability that a final decision for hypothesis $H_j$ is made given hypothesis $H_i$ is true. Here we consider the criterion of minimum probability of error,

$$P_e = 1 - P_c, \qquad (1)$$

where

$$P_c = \sum_{i=0}^{K-1} P(H_i) P(D_i \mid H_i).$$
Our goal is to design a system minimizing $P_e$. A typical centralized detection system requires each sensor to transmit its observations to a fusion center; the fusion center then produces a decision $U_0$ as

$$U_0 = d(y) = d(y_1, y_2, \ldots, y_L), \qquad (2)$$
where d is the decision rule and $U_0 = j$ corresponds to deciding that hypothesis $H_j$ is true. A well-known decision rule for centralized detection systems is to compare the joint likelihood ratio to a threshold. For a variety of reasons, it may be difficult to realize such a centralized system in practice. In a distributed detection system, the observation data at the kth sensor are first quantized to a discrete variable $U_k$ taking on only $N_k$ possible values. For simplicity(1), it is common to choose $N_k$ to be a power of K, or $N_k = K^{n_k}$. Thus $U_k$ can be represented as an $n_k$-digit K-ary number produced by an $n_k$-dimensional vector of decisions

$$U_k = [U_{k,0}, U_{k,1}, \ldots, U_{k,n_k-1}], \quad U_{k,l} \in \{0, 1, \ldots, K-1\},$$

made at the kth sensor, using a group of decision rules

$$d_k = [d_{k,0}, d_{k,1}, \ldots, d_{k,n_k-1}], \quad U_{k,l} = d_{k,l}(y_k), \quad k = 1, \ldots, L, \; l = 0, \ldots, n_k - 1. \qquad (3)$$

Please note that each $d_{k,l}$ denotes a scalar function, while $d_k$ with only a single subscript denotes a vector of functions. To avoid confusion between $N_k$ and $n_k$, $N_k$ is called the number of quantization levels in the sequel. Define the combination of all but the last sensor decision as $U = [U_1, U_2, \ldots, U_{L-1}]$. All sensor decisions are sent to the fusion center to determine a final decision $U_0$ with a chosen fusion rule F,

$$U_0 = F(U, U_L) = F(U_1, U_2, \ldots, U_L) = F(d_{1,0}(y_1), \ldots, d_{1,n_1-1}(y_1), \ldots, d_{L,0}(y_L), \ldots, d_{L,n_L-1}(y_L)), \qquad (4)$$

where $U_0$ and all $U_{k,l}$ can take on any of the values $0, \ldots, K-1$.

(1) Also see Remark 1 following Theorem 1.
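To make the K-ary digit representation above concrete, here is a minimal Python sketch (our illustration; the function names are ours, not the paper's) converting between a sensor decision $U_k \in \{0, \ldots, K^{n_k}-1\}$ and its digit vector $[U_{k,0}, \ldots, U_{k,n_k-1}]$.

```python
def to_kary_digits(u: int, K: int, n: int) -> list[int]:
    """Decompose u in {0, ..., K**n - 1} into n K-ary digits,
    most significant first (U_{k,0}, ..., U_{k,n-1})."""
    assert 0 <= u < K**n
    digits = []
    for _ in range(n):
        digits.append(u % K)
        u //= K
    return digits[::-1]

def from_kary_digits(digits: list[int], K: int) -> int:
    """Inverse mapping: digit vector back to the integer decision."""
    u = 0
    for d in digits:
        u = u * K + d
    return u

# Example: a 2-digit ternary (K = 3) sensor decision.
assert to_kary_digits(7, K=3, n=2) == [2, 1]   # 7 = 2*3 + 1
assert from_kary_digits([2, 1], K=3) == 7
```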
3 Limits on Number of Bits Used in Sensor Decisions
Here we will show that the number of bits which should be used to represent the sensor decision at one given sensor is limited by the number of bits used at all the other sensors. Furthermore, due to the non-uniqueness of describing a distributed detection system by defining a fusion rule and a set of sensor decision rules, there are many combinations of fusion rules and sensor decision rules which are optimum. To see this, just consider a K = 2 case: complementing all the sensor decision rule outputs and the fusion rule inputs changes nothing. We also present one specific fusion rule that can always be used to achieve optimum global performance in some special cases.
Theorem 1 If the combination of the first $L-1$ sensor decisions, U, consists of n components, where $n = \sum_{k=1}^{L-1} n_k$, then a scheme which obtains optimum global performance can be found in the case where the Lth sensor makes $n_L = K^n$ decisions. Thus using an Lth sensor with $n_L > K^n$ will not improve the optimum global performance.
Outline of the proof. U is a vector of K-valued integers. We can also see it as a number in the K-ary number system. Then U takes on the values $0, \ldots, K^n - 1$, and we can view $F(U, U_L)$ as a function of two variables, U and $U_L$. The theorem will be proved if we show that whenever $n_L > K^n$,

$$\exists\, i, j = 0, \ldots, K^{n_L} - 1, \; i \neq j: \quad F(U, i) = F(U, j) \quad \forall\, U = 0, \ldots, K^n - 1. \qquad (5)$$

Then it does not make sense to distinguish the decision $U_L = i$ from $U_L = j$, since either decision leads to exactly the same final decision, and we can choose a new decision rule at the Lth sensor which uses just one value to represent the two values i, j without changing performance. If $U_L$ is fixed, $F(U, U_L)$ degenerates to a function of one variable which maps each possible value of U into a possible value of $U_0$. Alternately, we can say each value of $U_L$ corresponds to a function from U to $U_0$. Since U takes on $K^n$ different values and $U_0$ takes on K different values, there are only $K^{K^n}$ different mapping patterns between the domain and the range. Hence there are in total $K^{K^n}$ different functions from U to $U_0$. Whenever $n_L > K^n$, the number of possible values of $U_L$ is greater than the number of different functions, so at least two values of $U_L$ must correspond to the same function. This statement is equivalent to (5). □

Remark 1 Suppose the number of quantization levels at the kth sensor, $N_k$, cannot be represented as a power of K; then a slightly more general result can be obtained. Using the same argument, we can prove that an optimum scheme can be found in the case of $N_L = K^N$, where $N = \prod_{k=1}^{L-1} N_k$.

Theorem 2 In the case of $n_L = K^n$, we can obtain optimum performance by employing the specific fusion rule

$$U_0 = F(U, U_L) = U_{L,U}, \qquad (6)$$

where $U_L = [U_{L,0}, U_{L,1}, \ldots, U_{L,K^n-1}]$ and $U = 0, \ldots, K^n - 1$.

Outline of the proof. We have shown that a scheme with $n_L = K^n$ can be used to achieve optimum performance. If we can also show that the fusion rule in (6), used with some group of sensor decision rules, can implement any scheme in this case, Theorem 2 is proved. Consider an arbitrary scheme for the case of $n_L = K^n$. It consists of a fusion rule $\hat{F}$ and a group of decision rules $\hat{d}_1, \ldots, \hat{d}_L$. For a given K-ary integer B taking on any of the values $0, \ldots, K^{K^n} - 1$, let $[b_0, \ldots, b_{K^n-1}]$ denote its digits. Now define a group of sets

$$\mathcal{B}_B = \{y_L : \hat{F}(0, \hat{d}_L(y_L)) = b_0, \; \ldots, \; \hat{F}(K^n - 1, \hat{d}_L(y_L)) = b_{K^n-1}\}.$$

After determining the set $\mathcal{B}_B$ corresponding to each possible value of B, we can use these sets to define a new decision rule for the Lth sensor:

$$d_{L,l}(y_L) = b_l \;\;\text{if}\;\; y_L \in \mathcal{B}_B, \quad l = 0, \ldots, K^n - 1.$$

The fusion rule in (6), with the sensor decision rules $\hat{d}_1, \ldots, \hat{d}_{L-1}$ and $d_L$, ensures the overall scheme produces the same output for any $y_L$. Thus the fusion rule in (6) will always allow us to obtain optimum performance. Further, in general we cannot decrease $n_L$ any further without affecting the global performance. □
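The pigeonhole counting behind Theorems 1 and 2 is easy to verify numerically. The following minimal Python sketch (our illustration, not from the paper) takes K = 2 and n = 2 peer decision digits, gives the last sensor a wastefully large alphabet, and confirms that any fusion rule can induce at most $K^{K^n} = 16$ distinct functions from U to $U_0$, so the extra $U_L$ values are necessarily redundant.

```python
import random

K = 2          # binary hypotheses / binary final decision
n = 2          # total K-ary digits from the other sensors
U_vals = K**n  # U takes values 0 .. K**n - 1

# Pretend the last sensor wastefully uses 6 bits: 2**6 = 64 values.
NL = 2**6
random.seed(0)
# A random fusion rule: F[u][uL] in {0, ..., K-1}.
F = [[random.randrange(K) for _ in range(NL)] for _ in range(U_vals)]

# Each value uL of U_L induces a function U -> U_0, i.e. a tuple.
induced = {tuple(F[u][uL] for u in range(U_vals)) for uL in range(NL)}

print(len(induced), "distinct functions out of", NL, "U_L values")
assert len(induced) <= K**(K**n)   # at most 16; extra values are redundant
```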
4 A Stronger Result
In Theorem 1, the upper bound on $n_L$ grows rapidly as n increases. However, in some more specific cases we can obtain a stronger result. As an example, consider a hypothesis testing problem with statistically independent Gaussian data having different mean vectors under each hypothesis:

$$m_k = 1, \quad y_k \in R, \quad k = 1, \ldots, L,$$
$$H_i: \; y = [y_1, y_2, \ldots, y_L] \sim N(M_i, C), \quad i = 0, \ldots, K-1, \qquad (7)$$

where $C = \mathrm{diag}[\sigma_1^2, \sigma_2^2, \ldots, \sigma_L^2]$, $\sigma_j^2$ is the variance of the observation data $y_j$, and

$$M_i = [E\{y_1 \mid H_i\}, E\{y_2 \mid H_i\}, \ldots, E\{y_L \mid H_i\}]^T, \quad i = 0, \ldots, K-1,$$

are the mean vectors under each hypothesis.
Definition 1 A single-interval is a set of real numbers $\{x \mid a < x < b\}$. Such a single-interval could be a finite-length interval, the entire real line ($a = -\infty$ and $b = +\infty$), a semi-infinite interval (either $a = -\infty$ or $b = +\infty$ while the other is finite), or the empty set ($a = b$).
Theorem 3 For the Gaussian shift-in-mean problem in (7), a scheme which obtains optimum global performance can be found in the case of $n_L = n + 1$.
Outline of the proof. It is sufficient to prove that every optimum scheme used in this case can be implemented by a new scheme with $n_L \le n + 1$. Consider an arbitrary optimum scheme. For statistically independent observations, its sensor processing is equivalent to a set of comparisons of a likelihood ratio to corresponding thresholds. Each comparison yields a semi-infinite interval, so the intersection of these comparisons is a single-interval. For fixed U, $F(U, d_L(y_L))$ is a multiple-step function that can be determined by the thresholds between neighboring single-intervals, the number of which is less than K. For each value of U we need fewer than K thresholds, $t_{1,U}, \ldots, t_{K-1,U}$, and U has $K^n$ possible values, so fewer than $K \cdot K^n = K^{n+1}$ thresholds completely determine $F(U, d_L(y_L))$. We can put all the thresholds together and sort them in ascending order; denote the sorted sequence by $t'_1 < \ldots < t'_{K^{n+1}-1}$. These thresholds divide the entire real axis of $y_L$ into no more than $K^{n+1}$ intervals. For any value of U, all points in an interval $\{y_L : t'_k < y_L < t'_{k+1}\}$ yield the same $F(U, d_L(y_L))$. If we assign a different value of $U_L$ to each interval, the new scheme will satisfy $n_L \le n + 1$ and implement the considered arbitrary optimum scheme. □
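The interval-merging construction in this proof can be sketched directly in code. The following Python fragment (our illustration; the per-U thresholds shown are hypothetical) merges the per-U step-function thresholds on $y_L$ and relabels the resulting intervals as the new sensor decision, which needs at most $K^{n+1}$ values, i.e., $n_L \le n + 1$ K-ary digits.

```python
import bisect

def merge_thresholds(per_U_thresholds):
    """per_U_thresholds: list over U-values; each entry is the sorted list
    of thresholds of the step function y_L -> F(U, d_L(y_L)).
    Returns the merged, sorted, de-duplicated thresholds t'_1 < t'_2 < ..."""
    return sorted({t for ts in per_U_thresholds for t in ts})

def new_sensor_rule(merged, y_L):
    """New decision at the last sensor: the index of the interval of y_L.
    There are len(merged) + 1 intervals, at most K**(n+1) of them."""
    return bisect.bisect_right(merged, y_L)

# Example: K = 2, n = 2 -> U has 4 values, fewer than K thresholds each.
per_U = [[-0.5], [0.3], [-0.5], [1.1]]   # hypothetical per-U thresholds
merged = merge_thresholds(per_U)         # [-0.5, 0.3, 1.1] -> 4 intervals
assert len(merged) + 1 <= 2**(2 + 1)     # at most K**(n+1) = 8 intervals
print(new_sensor_rule(merged, 0.0))      # y_L = 0.0 falls in interval 1
```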
Remark 2 From the proof of Theorem 3, it is
clear that this stronger result is only based on the optimum sensor test being a direct threshold test of the observation. In the case we consider, this is true due to the monotone likelihood ratio property of Gaussian pdfs. Hence the result can be applied to any problem with independent observations and monotone sensor likelihood ratios. Studies have shown that a large family of pdfs have monotone likelihood ratios [11]. Cases of dependent observations may also possess the required property. Some cases with Gaussian pdfs are shown to have this property in [12]. Theorem 3 can be used in all these cases.
5 A Case with Fixed Overall Number of Sensor Decisions
In a two-sensor system where the total number of bits is fixed, we denote a scheme by $(n_1, n_2)$, where $n_1$ and $n_2$ are the numbers of bits used for the decisions from the first sensor and the second sensor, respectively. Note that increasing both $n_1$ and $n_2$ would improve the performance, so the constraint is reasonable.
Theorem 4 Given that $n_1 + n_2$ is fixed, then regardless of the statistical characteristics of the observation data, a scheme which obtains optimum global performance and satisfies the following inequality can be found:

$$\log_K n_1 \le n_2 \le K^{n_1}.$$
Outline of the proof. Suppose we have found a scheme $(n_1, n_2)$ which achieves optimum global performance but $n_2 > K^{n_1}$. From Theorem 1, we know the excess decisions used at the second sensor are wasted, so we can allocate some of them to the first sensor to improve the overall performance. Thus we prove $n_2 \le K^{n_1}$. Now, switching $n_1$ and $n_2$ in the argument, we can also prove that $n_1 \le K^{n_2}$, or $\log_K n_1 \le n_2$. □
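A small script (ours, for illustration) makes the admissible region of Theorem 4 concrete by listing, for a fixed bit budget, the allocations $(n_1, n_2)$ that satisfy $\log_K n_1 \le n_2 \le K^{n_1}$; only these allocations need to be searched.

```python
import math

def admissible(n1: int, n2: int, K: int) -> bool:
    """Theorem 4: log_K(n1) <= n2 <= K**n1."""
    return math.log(n1, K) <= n2 <= K**n1

K, total = 2, 8
pairs = [(n1, total - n1) for n1 in range(1, total)]
worth_searching = [(n1, n2) for (n1, n2) in pairs if admissible(n1, n2, K)]
print(worth_searching)   # [(3, 5), (4, 4), (5, 3)]
# (1, 7) is discarded since n2 = 7 > K**n1 = 2: by Theorem 1, a 7-bit
# second sensor is wasted next to a 1-bit first sensor.
```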
Remark 3 Suppose the number of quantization levels $N_k$ cannot be represented as a power of K. Using the relationship between $N_k$ and $n_k$, we can get an inequality for $N_1$ and $N_2$:

$$\log_K N_1 \le N_2 \le K^{N_1}.$$

As opposed to the result of Theorem 5, the values of $N_1$ and $N_2$ are still to some extent undetermined; only numerical techniques can find the exact $N_1, N_2$ used by the optimum scheme.

We can demonstrate our results in the plane of $(N_1, N_2)$ or the plane of $(n_1, n_2)$, as shown in Figure 1. Each of the four parts of this figure is labeled with the Theorem or Remark it illustrates. Since only schemes in the region between the two curves can obtain optimum global performance, we never need to search in the regions outside the two curves.

[Figure 1: Regions for optimum scheme, plotted for K = 2, 3, 4. (a) The case of Theorem 4, in the $(n_1, n_2)$ plane. (b) The case of Remark 3, in the $(N_1, N_2)$ plane. (c) The case of Theorem 5, in the $(n_1, n_2)$ plane. (d) The case of Remark 4, in the $(N_1, N_2)$ plane.]

Theorem 5 In those special cases with independent observations, known signals, and noise pdfs which lead to monotone likelihood ratios, a scheme which obtains optimum global performance and satisfies the following inequality can be found:

$$n_1 - 1 \le n_2 \le n_1 + 1.$$
Outline of the proof. The proof of this theorem is quite simple. All we need to do is substitute the stronger result from Theorem 3 in place of the result from Theorem 1 in the proof of Theorem 4. Then we can prove both $n_2 \le n_1 + 1$ and $n_1 \le n_2 + 1$, so the theorem is proved. This result implies the optimum scheme divides the number of decisions evenly, or nearly evenly, between the two sensors: if the sum $n_1 + n_2$ is even, then $n_1 = n_2$; otherwise $n_1 = n_2 \pm 1$. □
Remark 4 In the above cases, if the number of quantization levels $N_k$ cannot be represented as a power of K, the inequality for $N_1$ and $N_2$ becomes

$$\frac{N_1}{K} \le N_2 \le K N_1.$$
6 Numerical Results
In the following numerical investigations, we consider detecting a known signal in Gaussian noise with two sensors. If both $U_1$ and $U_2$ are binary variables, there are $2^{2^{1+1}} = 16$ possible nonrandomized fusion rules: two rules are trivial ($U_0 = 1$; $U_0 = 0$), four rules ignore one of the sensors ($U_0 = U_1$; $U_0 = \bar{U}_1$; $U_0 = U_2$; $U_0 = \bar{U}_2$), four are AND rules ($U_0 = U_1 U_2$; $U_0 = \bar{U}_1 U_2$; $U_0 = U_1 \bar{U}_2$; $U_0 = \bar{U}_1 \bar{U}_2$), four are OR rules ($U_0 = U_1 + U_2$; $U_0 = \bar{U}_1 + U_2$; $U_0 = U_1 + \bar{U}_2$; $U_0 = \bar{U}_1 + \bar{U}_2$), and two are exclusive-or operations ($U_0 = U_1 \oplus U_2$; $U_0 = \overline{U_1 \oplus U_2}$). If either $U_1$ or $U_2$ takes on more than two values (more than one bit), there will be more fusion rules. The total number of possible fusion rules is

$$M = K^{K^{n_1+n_2}}. \qquad (8)$$

Of course, we could in theory compute the performance of the best scheme under each possible nonrandomized fusion rule and pick the one with minimum probability of error, but as $n_1$ and $n_2$ grow this requires significant computation. For instance, in a scheme of (1,3) or (2,2), the number of fusion rules is $M = 2^{2^4} = 65536$, which is 4096 times that of a scheme of (1,1). For a fixed fusion rule, we employ a discretized Gauss-Seidel iterative algorithm [10, 12] and attempt to find all solutions to the necessary conditions in Appendix A. With every set of initial decision rules we have tried, the algorithm always converges to the same result in a finite number of iteration steps. Due to this, we take the solution we found as the optimum solution and use this group of sensor decision rules and the fusion rule to calculate $P_e$.

Assume the observation data at the sensors consist of a different constant signal for each hypothesis plus additive Gaussian noise, $n_1 \sim N(0,3)$ and $n_2 \sim N(0,2)$:

$$H_0: \; Y_1 = -1 + n_1, \quad Y_2 = -1 + n_2,$$
$$H_1: \; Y_1 = 1 + n_1, \quad Y_2 = 1 + n_2.$$

We also assume all hypotheses have equal prior probabilities and the noise samples at the two sensors are statistically independent. Therefore,

$$P(H_0) = P(H_1) = \tfrac{1}{2},$$
$$p(Y_1, Y_2 \mid H_0) \sim N\!\left(\begin{bmatrix} -1 \\ -1 \end{bmatrix}, \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}\right), \qquad p(Y_1, Y_2 \mid H_1) \sim N\!\left(\begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}\right).$$

After trying all possible fusion rules for the schemes (1,1), (1,2), and (1,3), we find the best rule for each scheme. The resulting optimum decision regions and $P_e$ for the optimum centralized rule and the optimum distributed rules with different numbers of bits used in the sensor decisions are provided in Figure 2.
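To illustrate the exhaustive search over fusion rules for the (1,1) scheme, the following self-contained Python sketch (our reconstruction, not the authors' code) assumes, per Remark 2, that each one-bit sensor rule is a threshold test on its observation; it grid-searches the two thresholds for each of the 16 fusion rules and reports the minimum probability of error.

```python
import itertools, math

def Phi(x):                      # standard normal CDF
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

means = {0: (-1.0, -1.0), 1: (1.0, 1.0)}     # signal means under H0, H1
sigmas = (math.sqrt(3.0), math.sqrt(2.0))    # noise std devs at sensors 1, 2
prior = 0.5

def pe_for_rule(table, t1, t2):
    """table[u1][u2] = U0 for a fusion rule; sensor rules U_k = 1{Y_k > t_k}."""
    pe = 0.0
    for h in (0, 1):
        # P(U_k = 1 | H_h) for the assumed threshold sensor rules
        p1 = 1.0 - Phi((t1 - means[h][0]) / sigmas[0])
        p2 = 1.0 - Phi((t2 - means[h][1]) / sigmas[1])
        p_err = 0.0
        for u1, u2 in itertools.product((0, 1), repeat=2):
            if table[u1][u2] != h:
                p_err += (p1 if u1 else 1 - p1) * (p2 if u2 else 1 - p2)
        pe += prior * p_err
    return pe

grid = [i * 0.1 for i in range(-30, 31)]     # thresholds in [-3, 3]
best = min(
    (pe_for_rule(((a, b), (c, d)), t1, t2), (a, b, c, d), t1, t2)
    for (a, b, c, d) in itertools.product((0, 1), repeat=4)  # 16 rules
    for t1 in grid for t2 in grid
)
print(best)   # the minimum Pe should land near the paper's DIS(1,1) value, 0.2151
```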
[Figure 2: Optimum decision regions (I), in the $(Y_1, Y_2)$ plane. (a) CEN, $P_e = 0.1807$; (b) DIS(1,1), $P_e = 0.2151$; (c) DIS(1,2), $P_e = 0.1974$; (d) DIS(1,3), $P_e = 0.1974$.]

[Figure 3: Optimum global performance (I). Probability of error $P_e$ versus the ratio of SNRs (SNR2/SNR1) for the schemes FIRST, SECOND, DIS(1+1), DIS(1+2), and CEN.]
The notation "CEN" stands for centralized rules, and the notation "DIS" indicates distributed rules. In each part of Figure 2, a line divides the entire plane into two regions, each assigned to a hypothesis; the little "x"s represent the signal for each hypothesis. Observing these examples, we see that the optimum centralized regions for this problem have a boundary which is a straight line. All distributed fusion rules try to use a combination of horizontal and vertical lines to approximate this straight line. The more bits are used, the better the approximation. But when $n_L > K^n$, in this case $n_2 > 2^1$, the excess decisions at the second sensor cannot improve the optimum global performance. This is illustrated by parts (c) and (d) of Figure 2.
[Figure 4: Optimum decision regions (II), in the $(Y_1, Y_2)$ plane. (a) SECOND, $P_e = 0.2398$; (b) DIS(3,1), $P_e = 0.2029$; (c) DIS(2,2), $P_e = 0.1932$; (d) DIS(1,3), $P_e = 0.1974$.]
In Figure 3, we vary the SNR at the second sensor while fixing the SNR at the first sensor and compute $P_e$ for the centralized and various distributed schemes. The dotted lines represent the results obtained by using only the first or the second sensor. The figure indicates that when the ratio of SNRs is less than 0.1, all distributed schemes choose only the first sensor to make a decision. This is not surprising, because the second sensor can provide little information. When the ratio of SNRs is greater than 10, all distributed schemes have performance similar to the centralized scheme. We can see that for nearly identical SNRs there is a distinguishable improvement in global performance when using two sensors instead of one. However, the difference in performance between the centralized scheme and the distributed schemes is much smaller. This suggests that only one or two binary decisions can give adequate performance, with a considerable decrease in complexity, in cases like this one.

We continue our investigation of this example with other distributed schemes. This time, instead of fixing the number of decisions at the first sensor, we fix the total number of decisions at the two sensors. Again we have tried all possible fusion rules, and we show the best results we have found in Figure 4. From Figure 4, we see that the scheme (2,2) implements the best approximate boundary, so it yields the best performance in this case. Figure 5 provides the curves of probability of error plotted against the ratio of sensor SNRs for this case. We can see that the three distributed schemes perform quite well; they always perform much better than the schemes using only one sensor. Moreover, the scheme (2,2) always yields the best global performance of the three distributed schemes.
[Figure 5: Optimum global performance (II). Probability of error $P_e$ versus the ratio of SNRs (SNR2/SNR1) for the schemes FIRST, SECOND, DIS(3+1), DIS(2+2), and DIS(1+3).]
7 Conclusion

We investigated distributed detection problems for cases with nonbinary hypotheses and uncovered some interesting general properties of distributed detection schemes. In particular, we considered how much information must be transmitted from one sensor when the total amount of information transmitted from all the other sensors is fixed. We showed there is an upper bound on the number of bits that should be used for a given sensor decision; using more bits at that sensor will not generally lead to improvements in the performance of the optimum scheme. Further, in some special cases, for example with independent observations and monotone likelihood ratios, we showed that a stronger result can be obtained, which provides a much smaller bound on the number of bits that should be used. These results give important guidance in the design of distributed detection schemes. Finally, we analyzed the issue of how to allocate bits across the sensors when the total number of bits used for all sensor decisions is fixed.
Appendix A Necessary Conditions for the Optimum Sensor Rules
Under a fixed fusion rule, the sensor decision rules must satisfy a group of necessary conditions to minimize $P_e$. Due to our configuration, $\mathrm{Prob}(U_0 = i \mid y)$ is an indicator function:

$$\mathrm{Prob}(U_0 = i \mid y) = \begin{cases} 1 & \text{if } d(y) = i, \\ 0 & \text{otherwise.} \end{cases}$$
Definition 2 A class of functions $L_{j,k,l}(y_k)$ is defined as follows:

$$L_{j,k,l}(y_k) = \int \cdots \int \sum_{i=0}^{K-1} \mathrm{Prob}(U_0 = i \mid y, d_{k,l}(y_k) = j)\, P(H_i)\, p(y \mid H_i)\, \frac{dy}{dy_k}, \qquad (9)$$

where $dy = \prod_{k=1}^{L} dy_k$.
Theorem 6 Suppose $P_e$ is minimized. Then for all $k = 1, \ldots, L$ and $l = 0, \ldots, n_k - 1$, $d_{k,l}(y_k)$ must satisfy the following conditions: for all $y_k \in R^{m_k}$,

$$d_{k,l}(y_k) = j \quad \text{if} \quad \frac{L_{j,k,l}(y_k)}{L_{j',k,l}(y_k)} > 1 \;\; \forall\, j' \neq j. \qquad (10)$$
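For the two-sensor binary example of Section 6, conditions (10) suggest the discretized Gauss-Seidel iteration used in [10, 12]: fix one sensor's rule on a grid, recompute the other sensor's rule pointwise as the maximizer of $L_{j,k,l}(y_k)$, and repeat until the rules stop changing. The following Python sketch is our own illustration under an assumed AND fusion rule; it is not the authors' implementation.

```python
import numpy as np

# Two sensors, K = 2, one-bit decisions, AND fusion rule (an assumption).
F = lambda u1, u2: u1 & u2
prior = np.array([0.5, 0.5])
means = np.array([[-1.0, -1.0], [1.0, 1.0]])         # rows: H0, H1
sig = np.array([np.sqrt(3.0), np.sqrt(2.0)])

y = np.linspace(-8.0, 8.0, 2001)                     # common grid
dy = y[1] - y[0]
pdf = lambda k, h: np.exp(-(y - means[h, k])**2 / (2*sig[k]**2)) \
                   / (sig[k]*np.sqrt(2*np.pi))

d = [np.zeros_like(y, dtype=int), np.zeros_like(y, dtype=int)]
d[0][y > 0] = 1; d[1][y > 0] = 1                     # initial threshold rules

for _ in range(20):                                  # Gauss-Seidel sweeps
    for k in (0, 1):                                 # update sensor k's rule
        other = 1 - k
        L = np.zeros((2, y.size))                    # L_j(y_k), j = 0, 1
        for j in (0, 1):
            for h in (0, 1):
                # indicator that U0 equals h, given U_k = j (discretized (9))
                u0 = F(j, d[other]) if k == 0 else F(d[other], j)
                mass = np.sum((u0 == h) * pdf(other, h)) * dy
                L[j] += prior[h] * pdf(k, h) * mass
        d[k] = np.argmax(L, axis=0)                  # pointwise condition (10)

# Resulting Pe for the converged rules:
pe = 1.0 - sum(prior[h] * np.sum((F(d[0][:, None], d[1][None, :]) == h)
               * pdf(0, h)[:, None] * pdf(1, h)[None, :]) * dy * dy
               for h in (0, 1))
print(pe)   # should be close to the paper's DIS(1,1) figure of 0.2151
```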
References
[1] R. R. Tenney and N. R. Sandell, Jr., "Detection with distributed sensors," IEEE Transactions on Aerospace and Electronic Systems, Vol. AES-17, pp. 501-510, April 1981.

[2] Z. Chair and P. K. Varshney, "Optimal data fusion in multiple sensor detection systems," IEEE Transactions on Aerospace and Electronic Systems, Vol. AES-22, pp. 98-101, Jan. 1986.

[3] I. Y. Hoballah and P. K. Varshney, "Distributed Bayesian signal detection," IEEE Transactions on Information Theory, Vol. IT-35, No. 5, pp. 995-1000, Sept. 1989.

[4] P. K. Willett and D. Warren, "The suboptimality of randomized tests in distributed and quantized detection systems," IEEE Transactions on Information Theory, Vol. IT-38, pp. 355-361, 1992.

[5] W. W. Irving and J. N. Tsitsiklis, "Some properties of optimal thresholds in decentralized detection," IEEE Transactions on Automatic Control, Vol. 39, No. 4, pp. 835-838, April 1994.

[6] R. Viswanathan and P. K. Varshney, "Distributed detection with multiple sensors: part I - fundamentals," Proceedings of the IEEE, pp. 54-63, Jan. 1997.

[7] R. S. Blum, S. A. Kassam, and H. V. Poor, "Distributed detection with multiple sensors: part II - advanced topics," Proceedings of the IEEE, pp. 64-79, Jan. 1997.

[8] F. A. Sadjadi, "Hypothesis testing in a distributed environment," IEEE Transactions on Aerospace and Electronic Systems, Vol. AES-22, No. 2, pp. 134-137, March 1986.

[9] J. N. Tsitsiklis, "Decentralized detection," in Advances in Statistical Signal Processing, Vol. 2: Signal Detection, H. V. Poor and J. B. Thomas, Eds. Greenwich, CT: JAI Press, 1993.

[10] Z. B. Tang, K. R. Pattipati, and D. L. Kleinman, "A distributed M-ary hypothesis testing problem with correlated observations," IEEE Transactions on Automatic Control, Vol. 37, No. 7, pp. 1042-1046, July 1992.

[11] E. L. Lehmann, Testing Statistical Hypotheses, New York: Wiley, 1986.

[12] P. F. Swaszek, P. Willett, and R. S. Blum, "Distributed detection of dependent data - the two sensor problem," 29th Annual Conference on Information Sciences and Systems, Princeton University, Princeton, NJ, pp. 1077-1082, March 1996.