
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 1, JANUARY 2002

A Sharp Upper Bound for the Probability of Error of the Likelihood Ratio Test for Detecting Signals in White Gaussian Noise

Dominique Pastor, Roger Gay, and Albert Groenenboom, Member, IEEE

Abstract—A new sharp upper bound for the probability of error of the likelihood ratio test is given for the detection in white Gaussian noise of any random vector whose norm is greater than, or equal to, a given value and whose probability of presence is less than, or equal to, one half. Also, a new test for the detection of such vectors is described. This test does not depend on the distribution of the signal vector but nevertheless its probability of error is less than, or equal to, the given upper bound.

Index Terms—Least favorable distribution, least favorable prior, likelihood ratio test, minimax test, modified Bessel functions, minimum probability of error (MPE) test, multivariate normal distributions, nonparametric detection, robust detection, threshold test, uniform distribution on spheres.

I. INTRODUCTION

Decision theory provides various tests, or decision rules, that make it possible to decide, on the basis of a measurement or observation, whether a real random signal of finite dimension is present or absent in a background of additive and independent noise [7], [11], [12]. Basically, each test is optimum with respect to a specific criterion. The Bayes, minimax and Neyman–Pearson criteria are suitable for problems where the probability distribution under each possible hypothesis is known, perhaps up to a scalar or vector parameter [11]. In practice, however, it may occur that “even if the model is reasonably good, our knowledge of the parameters in it […] may not be enough to justify a direct numerical evaluation of formulas derived from the model.” [7, Sec. I, p. 2232]. When such a situation is due to small deviations from the reasonably good nominal model, an uncertainty model can be used for deriving a robust test [11, Sec. III.E.2]. Unfortunately, the degree of uncertainty in some binary hypothesis testing problems is so great that no uncertainty model can sensibly be used. It may happen that one (or both) of the two possible distributions is unknown or cannot be parametrized. For example, the echo a radar receives from a target is a kind of convolution of a known transmitted pulse and an unknown environment. Nonparametric decision addresses such situations [11, Sec. III.E.1]. Basically, nonparametric tests are designed for keeping invariant some “performance characteristic” over a wide range of possible distributions. A good example is that of the constant false alarm rate (CFAR) detectors [9] widely used in radar processing. These nonparametric tests aim at keeping the probability of false alarm at a predefined value regardless of the environment in which the radar is operating.

Manuscript received June 1, 1999; revised March 15, 2001. D. Pastor was with Hollandse Signaalapparaten (part of Thales). He is now with Altran Technologies Netherlands B. V., 2132 NZ Hoofddorp, The Netherlands (e-mail: [email protected]). R. Gay is with the University of Bordeaux, Bordeaux, France (e-mail: Roger. [email protected]). A. Groenenboom is with Thales Nederland, 7550 GD Hengelo, The Netherlands (e-mail: [email protected]). Communicated by U. Madhow, Associate Editor for Detection and Estimation. Publisher Item Identifier S 0018-9448(02)00005-6.

A possible approach in robust and nonparametric detection consists of considering, as in [13], a class of signals constrained by some appropriate bounds. This is the approach adopted in this paper. The sequel describes the detection of random signals that are constrained by only two conditions. The norm must be greater than or equal to a given bound, and the probability of presence, or prior, must be less than or equal to one half. In other words, the signal must be relatively big and no more often present than absent. More precisely, under the assumption that the noise is white, Gaussian, and of known variance, a sharp upper bound is given for the probability of error of the minimum probability of error (MPE) test [11, Sec. II.B]. It then turns out that this bound is also a sharp upper bound for the probability of error of a threshold test derived below for detecting the signals of the class under consideration. The height of this threshold does not depend on the signal distribution. Therefore, the new upper bound on the probability of error is actually the invariant performance characteristic of the threshold test.

The results described hereafter are derived in four main steps.
1) First, simple tests are considered. They consist of applying a threshold to the norm of the measurement vector, which is either noise only or a superposition of noise and signal.
It is shown that for a given prior of the signal, the probability of error of such a test is bounded by that obtained when the signal is of constant norm. For this result, the assumption that the noise is white and Gaussian can be relaxed somewhat: spherical invariance and a form of monotonicity are sufficient.
2) The test that minimizes the probability of error among all possible tests is the MPE decision scheme. For any prior, it is shown that the least favorable signal distribution is the uniform distribution on the sphere whose radius is the minimum signal norm. This means that for any signal distribution, the probability of

0018–9448/02$17.00 © 2002 IEEE

PASTOR et al.: A SHARP UPPER BOUND FOR THE PROBABILITY OF ERROR OF THE LIKELIHOOD RATIO TEST

error of the associated MPE test is less than or equal to the probability of error of the best test in the least favorable case. The latter test is shown to be a threshold test and an explicit expression is given for the height of the threshold as a function of the prior and the minimum norm. Like the previous one, this is a rather intuitive result: regardless of whether all possible tests or only threshold tests can be used, the most difficult cases occur when the signal is as weak as allowed under the assumptions.
3) Next, the situation is considered in which the signal distribution is the least favorable one described above, but in which the prior is unknown. It turns out that the least favorable prior is greater than one half. Moreover, the probability of error is a concave function of the prior with a unique maximum at the least favorable prior. Hence, for priors less than or equal to one half, the said function is increasing. The proof of these results is the most technical part of the paper, involving tight inequalities related to the Bessel functions.
4) For any minimum signal norm, an upper bound for the probability of error of a specified threshold test, and hence for the MPE test, can then be found by computing the probability of error in the least favorable case with a prior of one half and a signal of constant norm equal to this minimum. The height of the threshold depends only on the minimum norm.

Summarizing these four steps, it can be said that, as in [13], the class of hypothesis testing problems considered in this paper has the property that the MPE test for the least favorable problem is also a minimax test for the entire class. The four steps correspond to Sections IV–VII. In Section II, the detection problem is described, following [11], as a binary hypothesis testing problem and all the main results are described. Section III presents the material used in the subsequent sections on the basis of concepts from measure theory, along the lines of [2]. In particular, this material is helpful for describing the class of all possible tests.

II. MAIN RESULTS

This section gives an overview of the paper by summarizing the main mathematical results, after a brief problem statement. The overview is complete, except for some measure theoretical details, which are discussed in the next section. The problem is to decide on the presence or absence of a signal on the basis of a measurement or observation. This observation is a single real vector of finite dimension. If the signal is absent, this vector consists of noise only. Except in Section IV, it will be assumed that this noise has a standard normal, or Gaussian, distribution. The presence of the signal means that the observation is the superposition of noise and the signal, which is also a random vector of the same dimension. Throughout this paper, it is assumed that the norm of the signal is greater than or equal to a known positive number. A suitable framework for the description of detection problems of this type is that of binary hypothesis testing, see for instance [7], [11], and [12]. The null hypothesis is that only the noise is present, and the alternative hypothesis is that the observation is the sum of noise and a signal. All the above assumptions can be summarized by

(1)

A decision scheme or test is a map that assigns to the observation the number zero or one. The so-called threshold test that plays the crucial role in this paper is defined by if if

.

(2)
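As an illustration of definition (2), a threshold test only needs the Euclidean norm of the observation and the threshold height. The following is a minimal sketch; the function and argument names are hypothetical, and the use of a strict inequality is an assumption (ties have probability zero when the noise has a density, so the convention does not change the probability of error).

```python
import math
from typing import Sequence


def threshold_test(observation: Sequence[float], height: float) -> int:
    """Return 1 (signal declared present) if the norm of the observation
    exceeds the threshold height, and 0 (noise only) otherwise."""
    norm = math.sqrt(sum(component * component for component in observation))
    # Assumed convention: strict inequality. Equality is a null event
    # for noise with a density, so either convention gives the same risk.
    return 1 if norm > height else 0


if __name__ == "__main__":
    print(threshold_test([3.0, 4.0], 4.5))   # norm is 5.0
    print(threshold_test([0.3, -0.4], 4.5))  # norm is 0.5
```

The decision depends on the observation only through its norm, which is why such a test can be applied without knowing how the signal is distributed in direction.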

The events that the test returns one when the null hypothesis is true, or zero when the alternative hypothesis is true, are called the errors of the first and second kind, respectively. As a criterion for the quality of the test, we will use the probability of error, which is defined as a weighted average of the conditional risks (3) where the coefficient can be considered as the a priori probability that the alternative hypothesis is true. When the parameters of the problem are fixed, the probability of error of the threshold test still depends on the distribution of the signal. However, in Section IV we will show that, among all distributions satisfying the constraint on the norm, the ones that maximize the probability of error of the threshold test are the distributions on the sphere whose radius is the minimum norm, that is, those for which the norm of the signal is constant. In other words, for all problems described by (1) we have

where is the probability of error of the threshold test for signals that always have norm when present. This probability is given by

where

is the cumulative distribution function of the noncentral chi-square distribution, given by (19). This result in a sense justifies studying the case of a signal of constant norm, but it does not justify the use of a threshold test, nor does it say which threshold to use. Therefore, the extra assumption is considered that the signal, when present, has a uniform distribution on the sphere of radius . In Section V, we show that, under this assumption, the MPE test, which minimizes the probability of error, is a threshold test. Moreover, if , where

the height of the threshold in this test is the solution for of the equation (4) For the probability of error of the test with threshold that occurs when the signal indeed has a uniform distribution


on the sphere of radius , we will use the notation . This function is studied in Section VI. It is shown that, for each , is a smooth, concave function on , with a single maximum at . This number is called the least favorable prior, because it is the prior for which the best possible test has the highest probability of error. The fact that the least favorable prior is always greater than one half has only limited relevance for applications, because it is a property of tests that require knowing the prior and that are optimal only under the condition that the signal has a uniform distribution on the sphere. A more practical result can be obtained by exploiting the related properties of the function , in particular the facts that is increasing on and that

because of (4). This is done in Section VII, where we show that, for any signal distribution satisfying the basic assumptions (1)

The fact that some test has a probability of error less than the bound implies that the MPE test also has this property. However, the latter test cannot be used if only the bounds and are known because it depends on the prior and the precise distribution of . On the other hand, the proposed threshold test depends only on the bound on the norm, and its performance is always at least as good as that of the MPE test in the least favorable case satisfying the bounds. As an example, we will now work out the situation where the measurement vector has two components (see Remark V.3 for the one-dimensional case). In this case we obtain

where $I_0$ is the zeroth-order modified Bessel function of the first kind, and (4) for the optimal height of the threshold reduces to
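For the two-component case, the threshold and the resulting probability of error can be evaluated numerically. The sketch below is illustrative only. It assumes that, for a prior of one half, the threshold equation takes the classical noncoherent-detection form $I_0(\rho\lambda) = e^{\rho^2/2}$, that the false alarm probability of the threshold test is $e^{-\lambda^2/2}$, and that the miss probability is the Rice distribution function evaluated at the threshold. The symbols `rho` (signal norm) and `lam` (threshold height) and all function names are hypothetical choices made for this example.

```python
import math


def bessel_i0(x: float) -> float:
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    quarter_x2 = 0.25 * x * x
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= quarter_x2 / (k * k)  # ratio of consecutive series terms
        total += term
    return total


def threshold_two_components(rho: float) -> float:
    """Solve I0(rho * lam) = exp(rho**2 / 2) for lam by bisection (rho > 0)."""
    target = math.exp(0.5 * rho * rho)
    # I0(x) < exp(x) for x > 0, so the root is larger than rho / 2.
    lo, hi = 0.5 * rho, 0.5 * rho + 2.0
    while bessel_i0(rho * hi) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bessel_i0(rho * mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def miss_probability(rho: float, lam: float, steps: int = 2000) -> float:
    """P(norm of signal plus noise <= lam) for a signal of norm rho in the
    plane, by Simpson's rule applied to the Rice density (steps must be even)."""
    def rice_density(r: float) -> float:
        return r * math.exp(-0.5 * (r * r + rho * rho)) * bessel_i0(rho * r)

    h = lam / steps
    total = rice_density(0.0) + rice_density(lam)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * rice_density(i * h)
    return total * h / 3.0


def error_bound_two_components(rho: float) -> float:
    """Probability of error of the threshold test for prior 1/2 and a
    signal uniformly distributed on the circle of radius rho."""
    lam = threshold_two_components(rho)
    false_alarm = math.exp(-0.5 * lam * lam)
    return 0.5 * (miss_probability(rho, lam) + false_alarm)


if __name__ == "__main__":
    for rho in (1.0, 2.0, 3.0):
        print(rho, threshold_two_components(rho), error_bound_two_components(rho))
```

The power series and the fixed-step Simpson rule are chosen to keep the sketch dependency-free; they are adequate for moderate values of `rho` but are not meant for very large arguments, where the exponentials overflow.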


The two components of the signal can be considered as the in-phase and quadrature components of a sinusoidal carrier. The detection of such a carrier in a background of noise is a well-known problem in detection theory, especially relevant for telecommunication and radar applications. A typical situation is that the carrier has a known amplitude but an unknown phase, uniformly distributed on $[0, 2\pi)$. The translation of these assumptions in our terminology is that the signal to be detected has a uniform distribution on a sphere, which here is a circle, with known radius. In his treatment of the “Noncoherent Detection of a Modulated Sinusoidal Carrier,” Poor [11, Example III.B.5, p. 65] makes the further assumption that the probability of presence of the carrier is one half and , shows that the ideal threshold is the solution of see [11, Example II.E.1, p. 34], corresponds to . Our main Theorem VII.1 extends the usefulness of this well-known solution. It shows that the same threshold test can be used in situations where the probability of presence might also be less than one half, where the amplitude might also sometimes be greater than the given value, and where the distribution of the phase is not necessarily uniform.

III. NOTATIONS AND DEFINITIONS

The detection problem is integrated in a framework where all the main objects are random vectors in the same probability space. This approach requires the introduction of a priori probabilities of presence and absence, in contrast with the Neyman–Pearson approach which avoids this. (Later, in Section VII, we will reduce the importance of the choice of these probabilities.) Let be three random vectors defined on a probability space such that

where is also a random variable. We assume that and are independent. The real number is the probability of presence of the random vector and denotes the corrupting noise. The presence of the signal corresponds to the event , and absence to . Hence, corresponds to the index of the true hypothesis in the binary hypothesis testing problem (1). For any random vector , denotes the distribution of , namely, the positive measure such that

for any measurable subset of . For the same random vector, stands for the cumulative distribution function of . If is absolutely continuous with respect to the Lebesgue measure on , we will denote by the corresponding density. We assume that the distribution of the noise is absolutely continuous with respect to the Lebesgue measure on . It follows from this assumption that the signal plus noise has the same property. Its density is (5)

These notations will be kept throughout this paper with the same meaning everywhere. We now introduce a few notions for describing tests.

Definition III.1: Any measurable map , such that has Lebesgue measure zero for any measurable set with Lebesgue measure zero, is called an observation (in ).

The technical condition in this definition (based on the theorem of Lebesgue–Radon–Nikodym) makes sure that and have densities. Two important examples for are the identity in and the norm . Given an observation in , deciding whether is or can be achieved using a binary decision and a binary test defined as follows.

Definition III.2: A binary decision (on ) is any measurable function . If is an observation in , and a binary decision on , the composite function is called a binary (hypothesis) test.

The probability of error of a binary test is defined by . This definition is intuitively very clear. The value of the random variable is the index of the accepted hypothesis whereas the value of is the index of the true hypothesis. Therefore, an error occurs when these two values do not match. The notation makes sense mathematically, as the measure of an event in , because only one probability space has been introduced to describe the two hypotheses. In the present framework it is straightforward to compute that, for a binary test

that of threshold

. Hence, in combination with (8), we have, for any and any signal (11)

This basic inequality will be analyzed and made more precise throughout the following sections.

IV. DETECTION IN SPHERICALLY INVARIANT NOISE

From now on, the distribution of the noise is assumed to be spherically invariant. We introduce an auxiliary function whose properties (see Lemmas IV.1 and IV.2) are instrumental in making (11) more precise. We have, for any

(6)

In an alternative approach, in which the two hypotheses are interpreted as different stochastic models that could underlie the observation, this expression cannot be derived but an expression like (6) or (3) has to be taken as the definition of the probability of error. For a given observation in , the likelihood ratio test that will be considered in this paper is , where the binary decision on is defined by if if

(7)

To avoid some technical complications, we will assume that the functions and only coincide on a set of measure zero. Then, the handling of equality in the likelihood ratio test does not matter. Sufficient conditions for this assumption are that or and that is Gaussian. Later, both conditions will be assumed. The main property of the likelihood ratio test is that it minimizes the probability of error among all possible tests based on the observation . Moreover, , which is computed on the basis of the density of given in (5), is the likelihood ratio test which minimizes the probability of error among all possible observations (8) For any threshold , we also define the threshold test as the composition of an indicator function and the norm , which is equivalent to (2). The probability of error of such a test is


(12) where

is the total surface measure of the sphere . It follows from (12) that the origin in

of radius about

only depends on the norm of . Hence, we can define as the with the property unique function

For later use, we first remark that the probability of error of any threshold test can be expressed in terms of . According to Fubini’s theorem and (9), we have

(13) These two quantities are, respectively, the miss probability and the false alarm probability of the threshold test with threshold height . From (9) we now get the following lemma. Lemma IV.1: The probability of error of any threshold test equals

(14) where is the ball with radius and center . and exist, we have Moreover, since the densities (9) (10) is based on the observation of Because the threshold test , it cannot have a probability of error less than the norm

The fact that the noise is spherically invariant means that the density in any point only depends on the norm of . From now on, we make the stronger assumption that is a nonincreasing function of . Then the function has the same monotonicity property.

Lemma IV.2: Let with . If is a nonincreasing function then the function is also nonincreasing for any .


Proof: Let Then

such that

for some

.

Proposition V.1: Suppose , the density of

and let

. If

is

(17) is a generalized hypergeometric function [8, where , p. 275]. Moreover, if has a uniform distribution on is the density of because

for

.

Henceforth, we assume that the norm of the signal is always greater than or equal to a given value . By always, we mean for or for almost every , and we will write every . This assumption, together with the earlier assumptions on the noise, allows us to refine the bounds on the probability of error given in (11) at the end of the previous section. ,

Proposition IV.3: For any , we have that

, and any

such

(18) Proof: Because of the spherical invariance of the noise, follows the same distribution as where is a fixed vector. Since , has a so-called noncentral distribution with degrees of freedom and noncentrality parameter . From [10], [9, p. 22, Theorem 1.3.4], its density is

(15) . A sufficient condition for equality is implies Proof: From lemma IV.2, we get that and (15) follows from (11) and (14). that , (13) reduces to Moreover, if (16)

The density of

follows since

Assuming now a uniform distribution of the signal on the , the distribution of is spherically invariant sphere is given by a formula similar to (12) and

and the second statement of the proposition is straightforward. This result, in combination with (11), provides upper bounds for the probability of error of the likelihood ratio test and we immediately have the following. Corollary IV.4: For any , we have that

,

, and any .

which allows us to compute (18) from (17). In combination with (16), this result allows us to compute the and conditional risks

such

V. DETECTING UNKNOWN DISTRIBUTIONS IN WHITE GAUSSIAN NOISE: THE CASE OF A KNOWN PRIOR

From now on we assume, in addition to the previous assumptions, that the noise is Gaussian and white. More formally, we suppose that the random vector has a standard normal distribution . Under this assumption, this section gives a theoretical foundation to the heuristic idea that the probability of error of the MPE test for is less than or equal to the probability of error of the MPE test for the case (see Corollary V.4). First, we compute the explicit expansion of . Then, we prove that, in case of a uniform distribution of the signal on the sphere , the likelihood ratio test is exactly a threshold test (see Proposition V.2). The conclusion of this section (Corollary V.4) will then simply be a consequence of Proposition V.2 and Corollary IV.4.
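The reason why the likelihood ratio test becomes a threshold test for a signal uniformly distributed on a sphere can be checked numerically. The sketch below rests on a standard fact that is stated here as an assumption about the exact form used in the paper: for standard normal noise in dimension `dim` and a signal uniform on the sphere of radius `rho`, the likelihood ratio at an observation of norm `r` is $e^{-\rho^2/2}\,{}_0F_1(\,;\mathrm{dim}/2;\rho^2 r^2/4)$, which is increasing in `r`. The MPE test then compares the norm to the root of the equation obtained by setting the prior-weighted ratio equal to one. All names are hypothetical.

```python
import math


def hyp0f1(b: float, z: float) -> float:
    """Generalized hypergeometric function 0F1(; b; z) for z >= 0 (power series)."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        term *= z / ((b + k) * (k + 1))  # ratio of consecutive series terms
        k += 1
        total += term
    return total


def likelihood_ratio(r: float, rho: float, dim: int) -> float:
    """Likelihood ratio at an observation of norm r (assumed closed form)."""
    return math.exp(-0.5 * rho * rho) * hyp0f1(0.5 * dim, 0.25 * (rho * r) ** 2)


def mpe_threshold(rho: float, dim: int, prior: float) -> float:
    """Threshold height lam solving prior * L(lam) = 1 - prior, or 0.0 when
    the test always decides that the signal is present."""
    target = (1.0 - prior) / prior * math.exp(0.5 * rho * rho)
    if target <= 1.0:
        return 0.0

    def lhs(lam: float) -> float:
        return hyp0f1(0.5 * dim, 0.25 * (rho * lam) ** 2)

    lo, hi = 0.0, 1.0
    while lhs(hi) < target:
        hi *= 2.0
    for _ in range(200):  # bisection on an increasing function
        mid = 0.5 * (lo + hi)
        if lhs(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for dim in (1, 2, 3):
        print(dim, mpe_threshold(rho=2.0, dim=dim, prior=0.5))
```

In dimension one the series reduces to a hyperbolic cosine and in dimension two to the zeroth-order modified Bessel function, which gives two simple consistency checks for the implementation.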

(19) and, by setting

, it follows that (20)

The bound given in Proposition IV.3 can be made more precise by substituting these identities in (15). In order to compute a sharp upper bound for the probability of the likelihood ratio test when is fixed, we of error establish the following. Proposition V.2: If the distribution of the signal with radius and center on the sphere is the threshold test likelihood ratio test is defined as follows . threshold i)

If

is uniform , the where the


then

We now obtain a sharp upper bound for bining Corollary IV.4 and Proposition V.2.

is the unique solution of (21)

ii) iii)

If If

then then

. .

The probability of error of the test is


by com-

Corollary V.4: For a given probability of presence and for any signal such that , is an upper bound for the probability of error of the likelihood ratio test which is reached when the signal is uniformly distributed on the sphere .

VI. AN ESTIMATION OF THE LEAST FAVORABLE PRIOR

hence (22) , if , and if for some Conversely, if the likelihood ratio test is the threshold test , , then the distribution of is uniform on the sphere , and the probability of error is . , the likeliProof: If is uniformly distributed on hood ratio test defined by (7) is computed according to the exgiven in Proposition V.1 (18). pression of , the likelihood ratio test is given by Assuming first

(23)

if if

.

is increasing. Hence, if , the likelihood ratio test is the threshold test where is the unique solution of (21).

The function

If

, we have

To obtain a bound that, unlike in Corollary V.4, does not depend on the probability of presence , we now compute the maximum of . It would be possible to use the general results on minimax testing described in [11, Ch. II, Sec. C]. However, it is quite simple to compute this maximum directly from the analytic expression of , and that is what we will do in this section. It is straightforward to verify that is a strictly concave function on . As a matter of fact, differentiating with respect to leads to

. Hence

which is equivalent to the trivial condition . Therefore, the likelihood ratio test is the threshold test and we can . choose , we immediately get that the likelihood ratio test can If . be seen as the threshold test , the expression (22) for the probability of When or error is a direct consequence of (19) and holds even if . The proof of the converse statement is given in Appendix A for this result is not needed for the proof of the main Theorem VII.1. Remark V.3: If , the results described in Proposition V.2 . can take a simple form since for In particular, we will be interested later in the case which it turns out that

and that

where is the cumulative distribution function of the standard . normal distribution

verifies (21). It follows that the since the threshold for is exactly that of . sign of , obtaining We can differentiate (21) with respect to since is increasing and is deis decreasing and is strictly creasing. Thus, . Therefore, there exists only one value of concave in in , the so-called least favorable prior , that . This least favorable prior is maximizes the function the solution of the equation

which says that the conditional risks are equal. We now consider the hypothesis testing problem in which the signal is known to have a uniform distribution on the sphere , but in which the a priori probability of presence is unknown. The so-called minimax hypothesis test for this situation is the threshold test with . The probability of error of this test is less than or equal to , also when . This follows from [11, Proposition II.C.1]. According to Proposition IV.3, the bound remains valid whenever . Under circumstances where this is the only thing known about the distribution of , and in which the prior determining the distribution of is unknown, it is therefore reasonable to use the minimax test as a good suboptimal test. However, even if the computation of is not required at all to carry out the minimax hypothesis test, since is simply the solution of the equation , the bound for the probability of error still depends on . This is why we will now show that for all . On the basis of this estimation we will be able to give another bound, and another test, in Section VII. This bound and this test are valid for a wide class of signals, and do not require the computation of .
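The behaviour of the least favorable prior can be explored numerically in the one-dimensional case, where everything has a closed form. The sketch below is an illustration under stated assumptions, not the computation carried out in the paper: the signal takes the values `+rho` and `-rho` with equal probability, the optimal threshold solves $\cosh(\rho\lambda) = \frac{1-p}{p}e^{\rho^2/2}$ when the right-hand side exceeds one (and is zero otherwise), and the prior maximizing the resulting probability of error is located by a plain grid search. All names are hypothetical.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mpe_error_1d(rho: float, prior: float) -> float:
    """Probability of error of the MPE test in dimension one for a signal
    equal to +rho or -rho with equal probability (assumed closed form)."""
    target = (1.0 - prior) / prior * math.exp(0.5 * rho * rho)
    # Optimal threshold on |observation|; 0.0 means "always decide present".
    lam = math.acosh(target) / rho if target > 1.0 else 0.0
    miss = std_normal_cdf(lam - rho) - std_normal_cdf(-lam - rho)
    false_alarm = 2.0 * (1.0 - std_normal_cdf(lam))
    return prior * miss + (1.0 - prior) * false_alarm


def least_favorable_prior_1d(rho: float, grid: int = 999) -> float:
    """Grid-search approximation of the prior that maximizes the MPE error."""
    candidates = [(i + 1) / (grid + 1) for i in range(grid)]
    return max(candidates, key=lambda p: mpe_error_1d(rho, p))


if __name__ == "__main__":
    for rho in (0.5, 1.0, 2.0, 3.0):
        p_star = least_favorable_prior_1d(rho)
        print(rho, p_star, mpe_error_1d(rho, p_star), mpe_error_1d(rho, 0.5))
```

The grid search only approximates the maximizer to the grid resolution. Because the function is concave in the prior, a finer grid or a one-dimensional root search on the difference of the two conditional risks would sharpen the estimate.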


Since

is the point where the strictly concave function defined in (22) reaches its maximum in and itself is greater than , a necessary and sufficient since is that condition for

with

and for

It follows that

(24) In our proof, this inequality will be a consequence of the following facts: i) ii)

Hence, since , we have of

, is a decreasing function of

.

in view of the definition

, which is the solution of (21) with Setting , the only result from the previous sections we really need for the proof of i) and ii) is the expression (27) imTaking into account that the power series expansion of , (27) gives the approximation mediately shows that (28) (25)

where

is a function such that

tuting (28) back into (27) and dividing by

. Substileads to

which is a direct consequence of (23), (19), (20), and the definition of the complementary incomplete gamma function We obtain in this way that [1, p. 260, eq. 6.5.3]. We begin by proving i) which is an easy consequence of the , the optimal value of the threshold asymptotic behavior of for the detection of a signal with norm always equal to and . probability of presence has the following asympLemma VI.1: The function totic expressions for tends to infinity:

where to get

is a constant which is independent of . It is now easy

Substituting this result in (28) gives (26). Inequality i) is a straightforward consequence of this lemma because it follows from (25) that

(26) with Proof: For Remark V.3. For [8, p. 275], that

, for any . , the result is a trivial consequence of , we have, from [1, p. 377, eq. 9.6.47] or

where . According to [1, p. 377, eq. 9.7.1] or [8, p. 123], the modified Bessel function of the first kind has the asymptotic expansion

and that

As a matter of fact, equality in i) can be proved as shown in Appendix C by using the asymptotic behavior of the optimal . threshold height We now prove ii) by showing that the derivative of is negative. We introduce


for any positive real numbers , . We then have


[1, p. 377, eq. 9.6.47], that (29)

For the sake of convenience, we will use the notations (30) in the remainder of this section. It follows from (29) and (30) is proving the monothat proving the monotonicity of . The derivative of this function tonicity of is given by (31) where the partial derivatives can be obtained by differentiation under the integral sign. First, we have (32) , using [8, p. 241, eq. 9.2.2] for the In the same way, for derivative of the hypergeometric function, we obtain

With the substitution factor in (36) is now given by

, the sign-determining

It follows that a sufficient condition for the negativeness of is that and for all The proofs of these inequalities, given in Appendix B, are based on a technique from the theory of ordinary differential equations, see [5]. We can now state the main result of this section. Proposition VI.2: The least favorable prior

satisfies

(33) where the second step is obtained by developing the function in a power series, integrating by parts all terms and summing again. From (31)–(33) and the fact that (34) which follows from (21) with

and (30), we get

(35) and differenIntroducing the function , which follows from (34), to obtain the tiating , (35) can be written as ratio

VII. A SHARP UPPER BOUND FOR THE PROBABILITY OF ERROR IN DETECTING UNKNOWN DISTRIBUTIONS

We now conclude by giving an upper bound for the probability of error of the MPE likelihood ratio test for detecting a random vector whose norm is always greater than, or equal to, a given minimum value and whose probability of presence is less than, or equal to, one half. This upper bound is associated with a threshold test. The threshold and the upper bound only depend on and the dimension of the vector of observations. The practical relevance of the threshold test is that, unlike the likelihood ratio test, it can be applied when the precise distribution of the signal and the probability of presence are not known.

Theorem VII.1: Let be three random vectors and let be a random variable defined on the same probability space such that , , and are independent, , and . Let be the function of the positive real variable

(36) For proving ii), it is now sufficient to show that the last factor in . the right-hand side of (36) is negative for all This fact follows from the properties of the ratio of two Bessel functions. In fact, if we set and it follows from

where

is the unique positive solution for in the equation . Then, for any such that , and for any probability of presence , is an upper bound for the probability of error of both the likelihood ratio test and the threshold test . This bound is reached by both tests when has a uniform distribution on the sphere and .

Proof: Since the likelihood ratio test is always at least as good as any threshold test, see (11), it is sufficient to show that . The bound (15) given in Proposition IV.3 can be written as

for every continuous function we get tuting

APPENDIX A
PROOF OF THE CONVERSE STATEMENT IN PROPOSITION V.2

We begin with a lemma of analytic nature and afterwards, we will go back to the probabilistic aspect of the problem, that is the proof of the converse part in Proposition V.2. Recall that is another way of saying and .

Lemma A.1: Let be a strictly positive real number. The normalized Lebesgue measure on the unit sphere is the only complex measure of total measure equal to on which verifies (37)

. Substi-

According to [4, p. 321, eq. 3.387], [6, p. 8, eq. 1.3], [7, p. 275], we have

According to (23) and (24), the coefficient of in the right-hand side is positive. Hence, with the definition of , (22) with (19) and (20), it follows that

If the random vector has a uniform distribution on the sphere and if then, according to Proposition V.2, the likelihood ratio test coincides with the threshold test .
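As a minimal numerical sketch of the threshold test in this least favorable case, assume unit-variance white Gaussian noise in dimension `d`, a signal uniform on the sphere of radius `rho`, and probability of presence `p`. Under these assumptions the likelihood ratio depends on the observation only through its norm and can be written with the hypergeometric function 0F1, so the test reduces to comparing the norm with a threshold, found here by bisection. The names `hyp0f1`, `likelihood_ratio`, and `norm_threshold` are illustrative and do not come from the paper; the closed form used is the standard spherical-mean identity, not a transcription of the paper's own equations.

```python
import math

def hyp0f1(b, z, tol=1e-15, max_terms=500):
    """Power series of 0F1(; b; z) = sum_k z^k / (k! * (b)_k), for z >= 0."""
    term, total = 1.0, 1.0
    for k in range(max_terms):
        term *= z / ((k + 1) * (b + k))
        total += term
        if term < tol * total:
            break
    return total

def likelihood_ratio(norm_y, rho, d):
    """Likelihood ratio for a signal uniform on the sphere of radius rho in R^d.

    Assumption: additive standard white Gaussian noise (identity covariance).
    """
    return math.exp(-0.5 * rho ** 2) * hyp0f1(0.5 * d, (rho * norm_y) ** 2 / 4.0)

def norm_threshold(rho, d, p=0.5):
    """Norm at which the likelihood ratio equals (1 - p) / p (rho > 0, p <= 1/2)."""
    target = (1.0 - p) / p
    lo, hi = 0.0, 1.0
    # The ratio is increasing in the norm, so grow the bracket, then bisect.
    while likelihood_ratio(hi, rho, d) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if likelihood_ratio(mid, rho, d) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Decide "signal present" when the observed norm exceeds the threshold.
tau = norm_threshold(rho=2.0, d=3)
signal_declared = 2.5 > tau
```

For `d = 3` the series reduces to `sinh(t) / t`, which gives a quick sanity check of `hyp0f1`.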

on

The relation between this function and the hypergeometric function [1, p. 377, eq. 9.6.47] completes the proof that satisfies (37).

We now prove that the normalized Lebesgue measure on is the only complex measure of total measure equal to on this sphere which satisfies (37). Let us assume that there exists another measure on such that, for every

where

is the Laplace transform of a measure on . Denoting , it follows that for all . Let now be a spherical harmonic of degree . We have the identity [3, Definition 2.6.1, Lemma 2.6.2, eq. 2.6.5, p. 37]

where for and .

Proof: To show that satisfies (37) we will distinguish two cases: and .

i) If we have whereas any measure of total measure equal to on can be written as with . The power series expansion of gives , and the identity

gives .

ii) In the case , , where is the Lebesgue measure on . The following identity can be found in [6, Ch. 1, eq. (1.2)]:

where and is the Bessel function of order . For , let us denote where . We have

According to the analytic continuation principle this equality is valid for every . In particular, if , we obtain, for every

Hence, Fubini’s theorem gives

and, since for every such that , we obtain that the integral on the right-hand side is zero because the other factors on the right-hand side are nonzero. The identity

for every spherical harmonic of arbitrary degree implies that , hence .

We now prove the converse part of Proposition V.2. Let us suppose that for some random signal vector whose norm is almost everywhere equal to , a probability of presence , and a threshold . From this assumption and (11) it follows that . The test can be derived by applying (7) to and using (17). Thus, we find that , where is the function defined in Proposition V.2 i). Once again, applying (7) but now to , using (5) and , we find that if if Because

Since

. it follows from this that


were nonempty, the real number would be positive. Then whereas . This would imply that , and that is increasing in some neighborhood of . Hence, since is positive in , the contradiction follows.

We also have , according to the asymptotic development of the modified Bessel function of first kind ([1, p. 377, eq. 9.7.1] or [8, p. 123, eq. 5.11.10]).

Let us remark that, although we think the above lemma is probably a classical result, we were unable to find a precise reference. So we have provided a proof.

It follows from the lemma that the map has an inverse. We conclude that it is an analytic diffeomorphism. This means that it is an analytic, bijective function whose inverse is also analytic. In fact, we know even more about this inverse.

Lemma B.2: The inverse function of satisfies the following inequality:

verifies (21), the right-hand side is equal to . The change of variables , , and gives

According to Lemma A.1, it follows that is the normalized Lebesgue measure on . We conclude that the distribution of is uniform on .

APPENDIX B
PROPERTIES OF THE FUNCTION

Lemma B.1: For , the function is an onto increasing map from .

Proof: For , the result holds since . For , it follows from the differentiation properties of Bessel functions ([1, p. 376, eq. 9.6.26] or [8, p. 110, eq. 5.7.9]) that the function satisfies the following Riccati equation in :

(39)

Proof: Let us set and let be the right-hand side of (39). For small , (39) follows from the power series developments

hence it suffices to show that there are no with . Suppose the contrary. From the continuity of and it follows that there is some such that for and . This implies that . But from the fact that satisfies the differential equation (38) it follows that

(38)

The function is holomorphic in a neighborhood of the origin, and is positive on . Furthermore, the function is locally Lipschitzian in the variable in the open set (this follows, for instance, from the fact that ). Furthermore, if is small enough, we have . It is also not hard to see that the function is a lower fence and that the function is an upper fence on every set , since and for . It then follows from the classical theory of differential equations [5] that the unique maximal solution of (38) (roughly speaking, the longest possible integral curve), passing through , is defined on and coincides with .

The function is increasing. We argue by contradiction. In fact, . If the closed set

Thus, (39) is proven by contradiction.

We will now prove that for . From [1, p. 375, eq. 9.6.7] or [8, p. 108, eq. 5.7.1], respectively, [1, p. 377, eq. 9.7.1] or [8, p. 123, eq. 5.11.10], it follows that

(40)

(41)
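The monotonicity and limit statements used in this appendix can be checked numerically. The sketch below assumes that the function studied here is the ratio r(x) = I_{nu+1}(x) / I_nu(x) of modified Bessel functions of the first kind, with nu = d/2 - 1; the exact symbol and index are an assumption, as are the helper names `bessel_i`, `ratio`, and `riccati_rhs`. With that reading, the standard recurrences give the Riccati equation r' = 1 - ((2 nu + 1) / x) r - r^2, and for nu = -1/2 the ratio is tanh.

```python
import math

def bessel_i(nu, x, max_terms=500):
    """Modified Bessel function of the first kind I_nu(x), x > 0, by power series."""
    term = (0.5 * x) ** nu / math.gamma(nu + 1.0)
    total = term
    for k in range(1, max_terms):
        term *= (0.25 * x * x) / (k * (k + nu))
        total += term
        if term < 1e-17 * total:
            break
    return total

def ratio(nu, x):
    """Assumed form of the function under study: I_{nu+1}(x) / I_nu(x)."""
    return bessel_i(nu + 1.0, x) / bessel_i(nu, x)

def riccati_rhs(nu, x, r):
    """Right-hand side of r' = 1 - ((2 nu + 1) / x) r - r^2."""
    return 1.0 - (2.0 * nu + 1.0) / x * r - r * r

# Compare a central finite difference of the ratio with the Riccati right-hand side.
nu, x, h = 1.0, 1.5, 1e-5
finite_diff = (ratio(nu, x + h) - ratio(nu, x - h)) / (2.0 * h)
residual = finite_diff - riccati_rhs(nu, x, ratio(nu, x))

# Sample the ratio on a grid to observe that it increases and stays inside (0, 1).
samples = [ratio(nu, 0.1 * k) for k in range(1, 200)]
```

The power series is adequate for moderate arguments only; for large `x` a scaled or asymptotic evaluation would be the safer choice.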

Let be a stationary point of . In other words, suppose that . In addition to the differential equation (38) we have , which allows us to compute

[1, p. 376, eq. 9.6.18], valid here since , allow us to write

where is independent of . It follows that, for a new constant

According to (39) we have

Hence, we have shown that implies . From this property combined with (40) and (41), it follows that for all .

APPENDIX C
A LIMIT PROPERTY FOR THE PROBABILITY OF ERROR FOR SIGNALS WITH BIG NORM

where, in the last step, we have used for . From (42) and (43), it follows that the integral in (25) vanishes.

Lemma C.1: We have the limit

Proof: For , we use Remark V.3 and Lemma VI.1. For , Lemma VI.1 proves that

Hence, we only have to show that the first term in (25) tends to as tends to . Since is decreasing for large enough, it follows from Lemma VI.1 and (21) that

(42) The identity [1, p. 377, eq. 9.6.47]

and the integral representation

(43)

ACKNOWLEDGMENT

The authors wish to thank the reviewers for their keen analyses of the paper and their hints that were especially useful for writing the Introduction.

REFERENCES

[1] M. Abramowitz and I. Stegun, Handbook of Mathematical Functions. New York: Dover, 9th printing, 1972.
[2] M. Basseville and I. V. Nikiforov, Detection of Abrupt Changes: Theory and Application. Englewood Cliffs, NJ: PTR Prentice-Hall, 1993.
[3] S. Bochner, Harmonic Analysis and the Theory of Probability. Berkeley and Los Angeles, CA: Univ. Calif., 1960.
[4] I. S. Gradshtein and I. M. Ryshik, Table of Integrals, Series and Products. New York: Academic, 1980.
[5] J. H. Hubbard and B. H. West, Differential Equations: A Dynamical Systems Approach. Ordinary Differential Equations (Texts in Applied Mathematics). Berlin, Germany: Springer-Verlag, 1995, vol. 5.
[6] F. John, Plane Waves and Spherical Means Applied to Partial Differential Equations. New York: Interscience, 1955.
[7] T. Kailath and H. V. Poor, “Detection of stochastic processes,” IEEE Trans. Inform. Theory, vol. 44, pp. 2230–2259, Oct. 1998.
[8] N. N. Lebedev, Special Functions and Their Applications. Englewood Cliffs, NJ: Prentice-Hall, 1965.
[9] G. Minkler and J. Minkler, The Principles of Automatic Radar Detection in Clutter, CFAR. Baltimore, MD: Magellan, 1990.
[10] R. J. Muirhead, Aspects of Multivariate Statistical Theory. New York: Wiley, 1982.
[11] H. V. Poor, An Introduction to Signal Detection and Estimation, 2nd ed. New York: Springer-Verlag, 1994.
[12] H. L. Van Trees, Detection, Estimation and Modulation Theory, Part I. New York: Wiley, 1968.
[13] P. Willett and B. Chen, “Robust detection of small stochastic signals,” IEEE Trans. Aerosp. Electron. Syst., vol. 35, pp. 15–30, Jan. 1999.
