On Logarithmically Asymptotically Optimal Hypothesis Testing for Arbitrarily Varying Sources with Side Information

R. Ahlswede, Ella Aloyan, and E. Haroutunian
Abstract. The asymptotic interdependence of the error probability exponents (reliabilities) in optimal hypothesis testing is studied for arbitrarily varying sources with the state sequence known to the statistician. The case when the states are not known to the decision maker was studied by Fu and Shen.
1 Introduction
At the open problems session of the Conference in Bielefeld (August 2003), Ahlswede formulated, among others, the problem of investigating "Statistics for not completely specified distributions" in the spirit of his paper [1]. In that paper, in particular, coding problems are solved for arbitrarily varying sources with side information at the decoder. Ahlswede proposed to consider problems of inference for similar statistical models. It turned out that the problem of "Hypothesis testing for arbitrarily varying sources with exponential constraint" had already been solved by Fu and Shen [2]. This situation corresponds to the case when side information at the decoder (in statistics this is the statistician, the decision maker) is absent. The present paper is devoted to the same problem when the statistician is able to make decisions after receiving the complete sequence of states of the source. This still simple problem may be considered a first step in the realization of the program proposed by Ahlswede. The investigation develops results from [3]-[9] and may be continued in various directions: the cases of many hypotheses, incomplete side information, sources of more general classes (Markov chains, general distributions), and identification of hypotheses in the sense of [10].
2 Formulation of Results
An arbitrarily varying source is a generalization of a discrete memoryless source whose distribution may vary independently at each time instant within a certain set. Let X and S be finite sets: X the source alphabet and S the state alphabet. Let P(S) be the set of all probability distributions P on S. Suppose a statistician must decide between two conditional probability distributions
Work was partially supported by INTAS grant 00738.
of the source: $G_1 = \{G_1(x|s),\ x \in X,\ s \in S\}$ and $G_2 = \{G_2(x|s),\ x \in X,\ s \in S\}$; thus there are two alternative hypotheses $H_1 : G = G_1$ and $H_2 : G = G_2$. A sequence $\mathbf{x} = (x_1, \dots, x_N)$, $\mathbf{x} \in X^N$, $N = 1, 2, \dots$, is emitted by the source, and a sequence $\mathbf{s} = (s_1, \dots, s_N)$ is created by the source of states. We consider the situation when the source of states is connected with the statistician, who must decide which hypothesis is correct on the basis of the data $\mathbf{x}$ and the state sequence $\mathbf{s}$. Every test $\varphi^{(N)}$ is a partition of the set $X^N$ into two disjoint subsets $X^N = A_{\mathbf{s}}^{(N)} \cup \overline{A}_{\mathbf{s}}^{(N)}$, where the set $A_{\mathbf{s}}^{(N)}$ consists of all vectors $\mathbf{x}$ for which the first hypothesis is adopted using the state sequence $\mathbf{s}$. In making decisions about these hypotheses one can commit the following errors: the hypothesis $H_1$ is rejected although it is correct; the corresponding error probability is
$$\alpha_{1|2}^{(N)}(\varphi^{(N)}) = \max_{\mathbf{s} \in S^N} G_1^N\bigl(\overline{A}_{\mathbf{s}}^{(N)} \mid \mathbf{s}\bigr);$$
if the hypothesis $H_1$ is adopted while $H_2$ is correct, we make an error with probability
$$\alpha_{2|1}^{(N)}(\varphi^{(N)}) = \max_{\mathbf{s} \in S^N} G_2^N\bigl(A_{\mathbf{s}}^{(N)} \mid \mathbf{s}\bigr).$$
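To make these definitions concrete, here is a minimal Python sketch that computes both error probabilities for small $N$ by enumerating all pairs $(\mathbf{x}, \mathbf{s})$. The alphabets, the distributions G1 and G2, and the acceptance rule are toy assumptions, not taken from the paper; the rule is a plain likelihood comparison and is not the LAO test constructed later.

```python
import itertools
import math

# Toy alphabets and hypothetical conditional distributions G_m(x|s), indexed as G[s][x].
X, S = (0, 1), (0, 1)
G1 = {0: (0.9, 0.1), 1: (0.6, 0.4)}
G2 = {0: (0.4, 0.6), 1: (0.2, 0.8)}

def prob(G, x_seq, s_seq):
    """G^N(x|s) = prod_n G(x_n|s_n)."""
    return math.prod(G[s][x] for x, s in zip(x_seq, s_seq))

def accept_H1(x_seq, s_seq):
    """A toy decision rule standing in for A_s^(N): accept H1 when the
    sequence is more likely under G1 than under G2 (not the LAO test)."""
    return prob(G1, x_seq, s_seq) >= prob(G2, x_seq, s_seq)

def error_probabilities(N):
    """alpha_{1|2} = max_s G1^N(complement of A_s | s), alpha_{2|1} = max_s G2^N(A_s | s)."""
    a12 = a21 = 0.0
    for s_seq in itertools.product(S, repeat=N):
        reject = sum(prob(G1, x, s_seq) for x in itertools.product(X, repeat=N)
                     if not accept_H1(x, s_seq))
        accept = sum(prob(G2, x, s_seq) for x in itertools.product(X, repeat=N)
                     if accept_H1(x, s_seq))
        a12, a21 = max(a12, reject), max(a21, accept)
    return a12, a21

for N in (2, 4, 6):
    a12, a21 = error_probabilities(N)
    print(N, a12, a21, -math.log(a12) / N, -math.log(a21) / N)
```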
Let us introduce the following error probability exponents or "reliabilities" $E_{1|2}$ and $E_{2|1}$, using logarithms and exponentials to base $e$:
$$\lim_{N \to \infty} -\frac{1}{N} \ln \alpha_{1|2}^{(N)}(\varphi^{(N)}) = E_{1|2}, \qquad (1)$$
$$\lim_{N \to \infty} -\frac{1}{N} \ln \alpha_{2|1}^{(N)}(\varphi^{(N)}) = E_{2|1}. \qquad (2)$$
The test is called logarithmically asymptotically optimal (LAO) if for given $E_{1|2}$ it provides the largest value of $E_{2|1}$. The problem is to establish the existence of such tests and to determine the optimal dependence of $E_{2|1}$ on $E_{1|2}$.

Now we collect the necessary basic concepts and definitions. For $\mathbf{s} = (s_1, \dots, s_N)$, $\mathbf{s} \in S^N$, let $N(s \mid \mathbf{s})$ be the number of occurrences of $s \in S$ in the vector $\mathbf{s}$. The type of $\mathbf{s}$ is the distribution $P_{\mathbf{s}} = \{P_{\mathbf{s}}(s),\ s \in S\}$ defined by
$$P_{\mathbf{s}}(s) = \frac{1}{N} N(s \mid \mathbf{s}), \quad s \in S.$$

For a pair of sequences $\mathbf{x} \in X^N$ and $\mathbf{s} \in S^N$, let $N(x, s \mid \mathbf{x}, \mathbf{s})$ be the number of occurrences of the pair $(x, s) \in X \times S$ in the pair of vectors $(\mathbf{x}, \mathbf{s})$. The joint type of the pair $(\mathbf{x}, \mathbf{s})$ is the distribution $Q_{\mathbf{x},\mathbf{s}} = \{Q_{\mathbf{x},\mathbf{s}}(x, s),\ x \in X,\ s \in S\}$ defined by
$$Q_{\mathbf{x},\mathbf{s}}(x, s) = \frac{1}{N} N(x, s \mid \mathbf{x}, \mathbf{s}), \quad x \in X,\ s \in S.$$
The conditional type of $\mathbf{x}$ for given $\mathbf{s}$ is the conditional distribution $Q_{\mathbf{x}|\mathbf{s}} = \{Q_{\mathbf{x}|\mathbf{s}}(x|s),\ x \in X,\ s \in S\}$ defined by
$$Q_{\mathbf{x}|\mathbf{s}}(x|s) = \frac{Q_{\mathbf{x},\mathbf{s}}(x, s)}{P_{\mathbf{s}}(s)}, \quad x \in X,\ s \in S.$$
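The following short sketch illustrates the type, joint type, and conditional type for one toy pair of sequences; the alphabets and sequences are illustrative assumptions.

```python
from collections import Counter

# Toy sequences; the alphabets and values are illustrative assumptions.
x = ('a', 'b', 'a', 'a', 'b', 'a')
s = (0, 1, 0, 1, 1, 0)
N = len(x)

# Type of s: P_s(s) = N(s|s)/N.
P_s = {v: c / N for v, c in Counter(s).items()}

# Joint type of (x, s): Q_{x,s}(x, s) = N(x, s|x, s)/N.
Q_joint = {pair: c / N for pair, c in Counter(zip(x, s)).items()}

# Conditional type: Q_{x|s}(x|s) = Q_{x,s}(x, s) / P_s(s).
Q_cond = {(a, v): q / P_s[v] for (a, v), q in Q_joint.items()}

print(P_s)      # e.g. {0: 0.5, 1: 0.5}
print(Q_joint)
print(Q_cond)   # relative frequencies of x-symbols within each state
```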
Let $X$ and $S$ be random variables with probability distributions $P = \{P(s),\ s \in S\}$ and $Q = \{Q(x|s),\ x \in X,\ s \in S\}$. The conditional entropy of $X$ with respect to $S$ is
$$H_{P,Q}(X \mid S) = -\sum_{x,s} P(s) Q(x|s) \ln Q(x|s).$$

The conditional divergence of the distribution $P \circ Q = \{P(s)Q(x|s),\ x \in X,\ s \in S\}$ with respect to $P \circ G_m = \{P(s)G_m(x|s),\ x \in X,\ s \in S\}$ is defined by
$$D(P \circ Q \,\|\, P \circ G_m) = D(Q \| G_m \mid P) = \sum_{x,s} P(s) Q(x|s) \ln \frac{Q(x|s)}{G_m(x|s)}, \quad m = 1, 2.$$

The conditional divergence of the distribution $P \circ G_2 = \{P(s)G_2(x|s),\ x \in X,\ s \in S\}$ with respect to $P \circ G_1 = \{P(s)G_1(x|s),\ x \in X,\ s \in S\}$ is defined by
$$D(P \circ G_2 \,\|\, P \circ G_1) = D(G_2 \| G_1 \mid P) = \sum_{x,s} P(s) G_2(x|s) \ln \frac{G_2(x|s)}{G_1(x|s)}.$$
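A small sketch of these quantities for toy distributions P, Q, G1, G2 (all numerical values below are assumptions chosen only for illustration):

```python
import math

# Toy distributions on binary alphabets; all numbers are assumptions for illustration.
S, X = (0, 1), (0, 1)
P  = {0: 0.5, 1: 0.5}                  # a distribution on the states
Q  = {0: (0.7, 0.3), 1: (0.5, 0.5)}    # Q(x|s)
G1 = {0: (0.9, 0.1), 1: (0.6, 0.4)}    # G1(x|s)
G2 = {0: (0.4, 0.6), 1: (0.2, 0.8)}    # G2(x|s)

def cond_entropy(P, Q):
    """H_{P,Q}(X|S) = -sum_{x,s} P(s) Q(x|s) ln Q(x|s)."""
    return -sum(P[s] * Q[s][x] * math.log(Q[s][x]) for s in S for x in X if Q[s][x] > 0)

def cond_divergence(Q, G, P):
    """D(Q||G|P) = sum_{x,s} P(s) Q(x|s) ln(Q(x|s)/G(x|s))."""
    return sum(P[s] * Q[s][x] * math.log(Q[s][x] / G[s][x]) for s in S for x in X if Q[s][x] > 0)

print(cond_entropy(P, Q))
print(cond_divergence(Q, G1, P), cond_divergence(Q, G2, P))
print(cond_divergence(G2, G1, P))      # D(G_2||G_1|P)
```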
Similarly we define $D(G_1 \| G_2 \mid P)$.

Denote by $P^N(S)$ the space of all types on $S$ for given $N$, and by $Q^N(X, \mathbf{s})$ the set of all possible conditional types on $X$ for given $\mathbf{s}$. Let $T_{P_{\mathbf{s}},Q}^{(N)}(X \mid \mathbf{s})$ be the set of vectors $\mathbf{x}$ of conditional type $Q$ for given $\mathbf{s}$ of type $P_{\mathbf{s}}$. It is known [3] that
$$|Q^N(X, \mathbf{s})| \le (N+1)^{|X||S|}, \qquad (3)$$
$$(N+1)^{-|X||S|} \exp\{N H_{P_{\mathbf{s}},Q}(X|S)\} \le |T_{P_{\mathbf{s}},Q}^{(N)}(X \mid \mathbf{s})| \le \exp\{N H_{P_{\mathbf{s}},Q}(X|S)\}. \qquad (4)$$
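The bounds (3) and (4) can be checked numerically in a small example by enumerating $X^N$ and grouping vectors by conditional type; the state sequence and alphabets below are toy assumptions.

```python
import itertools
import math
from collections import Counter, defaultdict

# Toy setting: binary alphabets, a fixed state sequence s of length N.
X, S = (0, 1), (0, 1)
N = 8
s = (0, 0, 0, 1, 1, 1, 1, 0)
P_s = {v: c / N for v, c in Counter(s).items()}

# Group all x in X^N by their conditional type given s.
classes = defaultdict(list)
for x in itertools.product(X, repeat=N):
    joint = Counter(zip(x, s))
    cond_type = tuple(sorted((k, c / (N * P_s[k[1]])) for k, c in joint.items()))
    classes[cond_type].append(x)

# Check (4) for every conditional type class, and (3) for the number of classes.
for cond_type, members in classes.items():
    H = -sum(P_s[sv] * q * math.log(q) for (xv, sv), q in cond_type if q > 0)
    lower = (N + 1) ** (-len(X) * len(S)) * math.exp(N * H)
    upper = math.exp(N * H)
    assert lower <= len(members) <= upper
print(len(classes), "conditional type classes; bound (3):", (N + 1) ** (len(X) * len(S)))
```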
Theorem. For every given $E_{1|2}$ from $\Bigl(0,\ \min_{P \in P(S)} D(G_2 \| G_1 \mid P)\Bigr)$,
$$E_{2|1}(E_{1|2}) = \min_{P \in P(S)} \ \min_{Q:\, D(Q \| G_1 \mid P) \le E_{1|2}} D(Q \| G_2 \mid P). \qquad (5)$$
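The right-hand side of (5) can be approximated by a brute-force grid search over $P$ and over the conditional distributions $Q(\cdot|s)$. The sketch below does this for hypothetical binary alphabets, the toy distributions G1 and G2 from the earlier sketches, and an illustrative value $E_{1|2} = 0.1$ chosen inside the admissible interval of the theorem.

```python
import math

# Toy distributions; the grid search only illustrates formula (5).
G1 = {0: (0.9, 0.1), 1: (0.6, 0.4)}   # G1(x|s), binary X and S
G2 = {0: (0.4, 0.6), 1: (0.2, 0.8)}

def cond_div(Q, G, P):
    """D(Q||G|P) = sum_s P[s] * sum_x Q(x|s) ln(Q(x|s)/G(x|s))."""
    return sum(P[s] * sum(q * math.log(q / g) for q, g in zip(Q[s], G[s]) if q > 0)
               for s in (0, 1))

grid = [i / 40 for i in range(1, 40)]          # open grid avoiding 0 and 1

def E21(E12):
    """Grid-search approximation of min_P min_{Q: D(Q||G1|P)<=E12} D(Q||G2|P)."""
    best = math.inf
    for p in grid:                              # P = (p, 1 - p) on the states
        P = (p, 1 - p)
        for q0 in grid:                         # Q(.|s = 0) = (q0, 1 - q0)
            for q1 in grid:                     # Q(.|s = 1) = (q1, 1 - q1)
                Q = {0: (q0, 1 - q0), 1: (q1, 1 - q1)}
                if cond_div(Q, G1, P) <= E12:
                    best = min(best, cond_div(Q, G2, P))
    return best

# E12 = 0.1 lies in (0, min_P D(G2||G1|P)) for these toy distributions.
print(E21(0.1))
```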
We can easily infer the following

Corollary (Generalized lemma of Stein). When $\alpha_{1|2}^{(N)}(\varphi^{(N)}) = \varepsilon > 0$, then for $N$ large enough
$$\alpha_{2|1}^{(N)}\bigl(\alpha_{1|2}^{(N)}(\varphi^{(N)}) = \varepsilon\bigr) = \min_{P \in P(S)} \exp\{-N D(G_1 \| G_2 \mid P)\}.$$
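For the corollary's exponent, note that $D(G_1 \| G_2 \mid P) = \sum_s P(s)\, D(G_1(\cdot|s) \| G_2(\cdot|s))$ is linear in $P$, so the minimum over $P(S)$ is attained at a point mass on some state. The sketch below exploits this for the same toy distributions as above.

```python
import math

# Same toy distributions as in the earlier sketches.
G1 = {0: (0.9, 0.1), 1: (0.6, 0.4)}
G2 = {0: (0.4, 0.6), 1: (0.2, 0.8)}

def div(p, q):
    """Ordinary divergence D(p||q) between two distributions on X."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# D(G_1||G_2|P) = sum_s P(s) D(G_1(.|s)||G_2(.|s)) is linear in P,
# so its minimum over P(S) is the smallest per-state divergence.
per_state = {s: div(G1[s], G2[s]) for s in G1}
exponent = min(per_state.values())
print(per_state, exponent)
# For alpha_{1|2} held at a constant eps > 0, alpha_{2|1} then decays
# roughly like exp(-N * exponent), as the corollary states.
```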
3 Proof of the Theorem
The proof consists of two parts. We begin with the demonstration of the inequality
$$E_{2|1}(E_{1|2}) \ge \min_{P \in P(S)} \ \min_{Q:\, D(Q \| G_1 \mid P) \le E_{1|2}} D(Q \| G_2 \mid P). \qquad (6)$$

For $\mathbf{x} \in T_{P_{\mathbf{s}},Q}^{(N)}(X|\mathbf{s})$, $\mathbf{s} \in T_{P_{\mathbf{s}}}^{(N)}(S)$, $m = 1, 2$, we have
$$G_m^N(\mathbf{x} \mid \mathbf{s}) = \prod_{n=1}^{N} G_m(x_n|s_n) = \prod_{x,s} G_m(x|s)^{N(x,s|\mathbf{x},\mathbf{s})} = \prod_{x,s} G_m(x|s)^{N P_{\mathbf{s}}(s) Q_{\mathbf{x}|\mathbf{s}}(x|s)}$$
$$= \exp\Bigl\{N \sum_{x,s} P_{\mathbf{s}}(s) Q_{\mathbf{x}|\mathbf{s}}(x|s) \ln G_m(x|s)\Bigr\}$$
$$= \exp\Bigl\{N \sum_{x,s} \bigl[P_{\mathbf{s}}(s) Q_{\mathbf{x}|\mathbf{s}}(x|s) \ln G_m(x|s) - P_{\mathbf{s}}(s) Q_{\mathbf{x}|\mathbf{s}}(x|s) \ln Q_{\mathbf{x}|\mathbf{s}}(x|s) + P_{\mathbf{s}}(s) Q_{\mathbf{x}|\mathbf{s}}(x|s) \ln Q_{\mathbf{x}|\mathbf{s}}(x|s)\bigr]\Bigr\}$$
$$= \exp\Bigl\{N \sum_{x,s} \Bigl(-P_{\mathbf{s}}(s) Q_{\mathbf{x}|\mathbf{s}}(x|s) \ln \frac{Q_{\mathbf{x}|\mathbf{s}}(x|s)}{G_m(x|s)} + P_{\mathbf{s}}(s) Q_{\mathbf{x}|\mathbf{s}}(x|s) \ln Q_{\mathbf{x}|\mathbf{s}}(x|s)\Bigr)\Bigr\}$$
$$= \exp\bigl\{-N \bigl[D(Q \| G_m \mid P_{\mathbf{s}}) + H_{P_{\mathbf{s}},Q}(X \mid S)\bigr]\bigr\}. \qquad (7)$$
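Identity (7) is easy to verify numerically: for any pair $(\mathbf{x}, \mathbf{s})$ the product of the conditional probabilities equals $\exp\{-N[D(Q_{\mathbf{x}|\mathbf{s}} \| G_m \mid P_{\mathbf{s}}) + H_{P_{\mathbf{s}},Q}(X|S)]\}$. The sketch below checks this for one toy pair and the hypothetical distribution G1 used in the earlier sketches.

```python
import math
from collections import Counter

# One toy pair (x, s) and the hypothetical distribution G1 from the earlier sketches.
G1 = {0: (0.9, 0.1), 1: (0.6, 0.4)}
x = (0, 1, 0, 0, 1, 0, 0, 1)
s = (0, 0, 1, 1, 0, 1, 0, 1)
N = len(x)

# Left-hand side of (7): G_1^N(x|s) = prod_n G_1(x_n|s_n).
lhs = math.prod(G1[sv][xv] for xv, sv in zip(x, s))

# Right-hand side of (7): exp{-N [D(Q_{x|s}||G_1|P_s) + H_{P_s,Q}(X|S)]}.
P_s = {v: c / N for v, c in Counter(s).items()}
joint = {k: c / N for k, c in Counter(zip(x, s)).items()}
Q = {(xv, sv): joint[(xv, sv)] / P_s[sv] for (xv, sv) in joint}
D = sum(P_s[sv] * q * math.log(q / G1[sv][xv]) for (xv, sv), q in Q.items())
H = -sum(P_s[sv] * q * math.log(q) for (xv, sv), q in Q.items())
rhs = math.exp(-N * (D + H))

print(lhs, rhs)   # the two values agree up to floating-point error
```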
Let us show that the optimal sequence of tests $\varphi^{(N)}$ is given for every $\mathbf{s}$ by the sets
$$A_{\mathbf{s}}^{(N)} = \bigcup_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) \le E_{1|2}} T_{P_{\mathbf{s}},Q}^{(N)}(X \mid \mathbf{s}). \qquad (8)$$

Using (4) and (7) we see that
$$G_m^N\bigl(T_{P_{\mathbf{s}},Q}^{(N)}(X|\mathbf{s}) \mid \mathbf{s}\bigr) = |T_{P_{\mathbf{s}},Q}^{(N)}(X \mid \mathbf{s})| \, G_m^N(\mathbf{x}|\mathbf{s}) \le \exp\{-N D(Q \| G_m \mid P_{\mathbf{s}})\}.$$
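The construction (8) can be made explicit for one fixed state sequence by grouping the vectors $\mathbf{x}$ according to their conditional type and keeping exactly the classes with $D(Q \| G_1 \mid P_{\mathbf{s}}) \le E_{1|2}$. The sketch below does this with the same toy distributions and the illustrative value $E_{1|2} = 0.1$.

```python
import itertools
import math
from collections import Counter

# Toy distributions, a fixed state sequence s, and an illustrative E_{1|2}.
X = (0, 1)
G1 = {0: (0.9, 0.1), 1: (0.6, 0.4)}
G2 = {0: (0.4, 0.6), 1: (0.2, 0.8)}
E12 = 0.1
s = (0, 0, 0, 0, 1, 1, 1, 1)
N = len(s)
P_s = {v: c / N for v, c in Counter(s).items()}

def cond_type(x):
    """Conditional type Q_{x|s} as a tuple of ((x, s), Q(x|s)) pairs."""
    joint = Counter(zip(x, s))
    return tuple(sorted((k, c / (N * P_s[k[1]])) for k, c in joint.items()))

def cond_div(Q, G):
    """D(Q||G|P_s) for a conditional type Q given in the format above."""
    return sum(P_s[sv] * q * math.log(q / G[sv][xv]) for (xv, sv), q in Q if q > 0)

def prob(G, x):
    return math.prod(G[sv][xv] for xv, sv in zip(x, s))

# The acceptance region (8): union of the type classes with D(Q||G_1|P_s) <= E_{1|2}.
A_s = [x for x in itertools.product(X, repeat=N) if cond_div(cond_type(x), G1) <= E12]

alpha12 = 1.0 - sum(prob(G1, x) for x in A_s)    # G_1^N(complement of A_s | s)
alpha21 = sum(prob(G2, x) for x in A_s)          # G_2^N(A_s | s)
print(len(A_s), alpha12, alpha21)
```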
We can estimate both error probabilities. First,
$$\alpha_{1|2}^{(N)}(\varphi^{(N)}) = \max_{\mathbf{s} \in S^N} G_1^N\bigl(\overline{A}_{\mathbf{s}}^{(N)} \mid \mathbf{s}\bigr) = \max_{\mathbf{s} \in S^N} G_1^N\Bigl(\bigcup_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) > E_{1|2}} T_{P_{\mathbf{s}},Q}^{(N)}(X \mid \mathbf{s}) \;\Big|\; \mathbf{s}\Bigr) \le$$
$$\le \max_{\mathbf{s} \in S^N} (N+1)^{|X||S|} \max_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) > E_{1|2}} G_1^N\bigl(T_{P_{\mathbf{s}},Q}^{(N)}(X \mid \mathbf{s}) \mid \mathbf{s}\bigr) \le$$
$$\le (N+1)^{|X||S|} \max_{P_{\mathbf{s}} \in P^N(S)} \ \max_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) > E_{1|2}} \exp\{-N D(Q \| G_1 \mid P_{\mathbf{s}})\} =$$
$$= \max_{P_{\mathbf{s}} \in P^N(S)} \ \max_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) > E_{1|2}} \exp\{|X||S| \ln(N+1) - N D(Q \| G_1 \mid P_{\mathbf{s}})\} =$$
$$= \max_{P_{\mathbf{s}} \in P^N(S)} \ \max_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) > E_{1|2}} \exp\{-N [D(Q \| G_1 \mid P_{\mathbf{s}}) - o(1)]\} \le \exp\{-N (E_{1|2} - o(1))\},$$
where $o(1) = N^{-1} |X||S| \ln(N+1) \to 0$ as $N \to \infty$. Then
$$\alpha_{2|1}^{(N)}(\varphi^{(N)}) = \max_{\mathbf{s} \in S^N} G_2^N\bigl(A_{\mathbf{s}}^{(N)} \mid \mathbf{s}\bigr) = \max_{\mathbf{s} \in S^N} G_2^N\Bigl(\bigcup_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) \le E_{1|2}} T_{P_{\mathbf{s}},Q}^{(N)}(X|\mathbf{s}) \;\Big|\; \mathbf{s}\Bigr) \le$$
$$\le \max_{\mathbf{s} \in S^N} (N+1)^{|X||S|} \max_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) \le E_{1|2}} G_2^N\bigl(T_{P_{\mathbf{s}},Q}^{(N)}(X|\mathbf{s}) \mid \mathbf{s}\bigr) \le$$
$$\le (N+1)^{|X||S|} \max_{P_{\mathbf{s}} \in P^N(S)} \ \max_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) \le E_{1|2}} \exp\{-N D(Q \| G_2 \mid P_{\mathbf{s}})\} =$$
$$= \max_{P_{\mathbf{s}} \in P^N(S)} \ \max_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) \le E_{1|2}} \exp\{-N [D(Q \| G_2 \mid P_{\mathbf{s}}) - o(1)]\} =$$
$$= \exp\Bigl\{-N \Bigl(\min_{P_{\mathbf{s}} \in P^N(S)} \ \min_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) \le E_{1|2}} D(Q \| G_2 \mid P_{\mathbf{s}}) - o(1)\Bigr)\Bigr\}.$$

Letting $N \to \infty$ we obtain (6).
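As a sanity check on both exponent bounds, the following brute-force sketch computes $\alpha_{1|2}$ and $\alpha_{2|1}$ of the test (8) exactly for very small $N$ (maximizing over all state sequences), together with the type-restricted minimum appearing in the bound above. At such small $N$ the $o(1)$ terms dominate, so the numbers only illustrate that every quantity in the chain is computable, not that the asymptotic inequalities are already met. All distributions and the value $E_{1|2} = 0.1$ are toy assumptions.

```python
import itertools
import math
from collections import Counter

# Toy distributions and an illustrative E_{1|2}, as in the earlier sketches.
X, S = (0, 1), (0, 1)
G1 = {0: (0.9, 0.1), 1: (0.6, 0.4)}
G2 = {0: (0.4, 0.6), 1: (0.2, 0.8)}
E12 = 0.1

def run(N):
    def types(x, s):
        Ps = {v: c / N for v, c in Counter(s).items()}
        joint = Counter(zip(x, s))
        return Ps, {k: c / (N * Ps[k[1]]) for k, c in joint.items()}

    def cdiv(Ps, Q, G):
        return sum(Ps[sv] * q * math.log(q / G[sv][xv]) for (xv, sv), q in Q.items())

    a12 = a21 = 0.0
    type_min = math.inf            # finite-N analogue of the minimum in (5)
    for s in itertools.product(S, repeat=N):
        reject = accept = 0.0
        for x in itertools.product(X, repeat=N):
            Ps, Q = types(x, s)
            p1 = math.prod(G1[sv][xv] for xv, sv in zip(x, s))
            p2 = math.prod(G2[sv][xv] for xv, sv in zip(x, s))
            if cdiv(Ps, Q, G1) <= E12:          # x lies in A_s^(N) of (8)
                accept += p2
                type_min = min(type_min, cdiv(Ps, Q, G2))
            else:
                reject += p1
        a12, a21 = max(a12, reject), max(a21, accept)
    return -math.log(a12) / N, -math.log(a21) / N, type_min

for N in (4, 6, 8):
    print(N, run(N))   # alpha_{1|2}-exponent vs E12; alpha_{2|1}-exponent vs type_min
```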
Now we pass to the proof of the second part of the theorem. We shall prove the inequality inverse to (6). First we show that this inverse inequality is valid for the test $\varphi^{(N)}$ defined by (8). Using (4) and (7) we obtain
$$\alpha_{2|1}^{(N)}(\varphi^{(N)}) = \max_{\mathbf{s} \in S^N} G_2^N\bigl(A_{\mathbf{s}}^{(N)} \mid \mathbf{s}\bigr) = \max_{\mathbf{s} \in S^N} G_2^N\Bigl(\bigcup_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) \le E_{1|2}} T_{P_{\mathbf{s}},Q}^{(N)}(X|\mathbf{s}) \;\Big|\; \mathbf{s}\Bigr) \ge$$
$$\ge \max_{\mathbf{s} \in S^N} \ \max_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) \le E_{1|2}} G_2^N\bigl(T_{P_{\mathbf{s}},Q}^{(N)}(X|\mathbf{s}) \mid \mathbf{s}\bigr) \ge$$
$$\ge \max_{P_{\mathbf{s}} \in P^N(S)} (N+1)^{-|X||S|} \max_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) \le E_{1|2}} \exp\{-N D(Q \| G_2 \mid P_{\mathbf{s}})\} =$$
$$= \exp\Bigl\{-N \Bigl(\min_{P_{\mathbf{s}} \in P^N(S)} \ \min_{Q:\, D(Q \| G_1 \mid P_{\mathbf{s}}) \le E_{1|2}} D(Q \| G_2 \mid P_{\mathbf{s}}) + o(1)\Bigr)\Bigr\}.$$

Letting $N \to \infty$ we get that for this test
$$E_{2|1}(E_{1|2}) \le \min_{P \in P(S)} \ \min_{Q:\, D(Q \| G_1 \mid P) \le E_{1|2}} D(Q \| G_2 \mid P). \qquad (9)$$
It remains to verify that any other sequence of tests $\tilde{\varphi}^{(N)}$, defined for every $\mathbf{s} \in S^N$ by sets $\tilde{A}_{\mathbf{s}}^{(N)}$ such that
$$\alpha_{1|2}^{(N)}(\tilde{\varphi}^{(N)}) \le \exp\{-N E_{1|2}\} \qquad (10)$$
and
$$\alpha_{2|1}^{(N)}(\tilde{\varphi}^{(N)}) \le \alpha_{2|1}^{(N)}(\varphi^{(N)}),$$
in fact coincides with $\varphi^{(N)}$ defined in (8). Let us consider the sets $\tilde{A}_{\mathbf{s}}^{(N)} \cap A_{\mathbf{s}}^{(N)}$, $\mathbf{s} \in S^N$. This intersection cannot be void, because in that case $\overline{A}_{\mathbf{s}}^{(N)} \cup \overline{\tilde{A}}_{\mathbf{s}}^{(N)}$ would be equal to $X^N = A_{\mathbf{s}}^{(N)} \cup \overline{A}_{\mathbf{s}}^{(N)}$, and the probabilities $G_1^N(\overline{A}_{\mathbf{s}}^{(N)} \mid \mathbf{s})$ and $G_1^N(\overline{\tilde{A}}_{\mathbf{s}}^{(N)} \mid \mathbf{s})$ could not be small simultaneously. Now because
$$G_1^N\Bigl(\overline{A_{\mathbf{s}}^{(N)} \cap \tilde{A}_{\mathbf{s}}^{(N)}} \,\Big|\, \mathbf{s}\Bigr) \le G_1^N\bigl(\overline{A}_{\mathbf{s}}^{(N)} \mid \mathbf{s}\bigr) + G_1^N\bigl(\overline{\tilde{A}}_{\mathbf{s}}^{(N)} \mid \mathbf{s}\bigr) \le 2 \exp\{-N E_{1|2}\} = \exp\{-N (E_{1|2} - o(1))\},$$
from (6) we obtain
$$G_2^N\bigl(A_{\mathbf{s}}^{(N)} \cap \tilde{A}_{\mathbf{s}}^{(N)} \,\big|\, \mathbf{s}\bigr) \le G_2^N\bigl(A_{\mathbf{s}}^{(N)} \mid \mathbf{s}\bigr) \le \exp\Bigl\{-N \Bigl(\min_{P \in P(S)} \ \min_{Q:\, D(Q \| G_1 \mid P) \le E_{1|2}} D(Q \| G_2 \mid P)\Bigr)\Bigr\},$$
and so we conclude that if we exclude from $\tilde{A}_{\mathbf{s}}^{(N)}$ the vectors $\mathbf{x}$ of types $T_{P,Q}^{(N)}(X|\mathbf{s})$ with $D(Q \| G_1 \mid P) > E_{1|2}$, we do not make the reliabilities of the test $\tilde{\varphi}^{(N)}$ worse. It remains to remark that when we add to $\tilde{A}_{\mathbf{s}}^{(N)}$ all types $T_{P,Q}^{(N)}(X|\mathbf{s})$ with $D(Q \| G_1 \mid P) \le E_{1|2}$, that is, when we take $\tilde{A}_{\mathbf{s}}^{(N)} = A_{\mathbf{s}}^{(N)}$, both (6) and (9) are valid, that is, the test $\varphi^{(N)}$ is optimal.
References

1. R. Ahlswede, Coloring hypergraphs: a new approach to multi-user source coding, I and II, J. Combin. Inform. System Sciences, Vol. 4, No. 1, 75-115, 1979, and Vol. 5, No. 2, 220-269, 1980.
2. F.-W. Fu and S.-Y. Shen, Hypothesis testing for arbitrarily varying sources with exponential-type constraint, IEEE Trans. Inform. Theory, Vol. 44, No. 2, 892-895, 1998.
3. I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Academic Press, New York, 1981.
4. W. Hoeffding, Asymptotically optimal tests for multinomial distributions, Ann. Math. Stat., Vol. 36, 369-401, 1965.
5. I. Csiszár and G. Longo, On the error exponent for source coding and for testing simple statistical hypotheses, Studia Sc. Math. Hungarica, Vol. 6, 181-191, 1971.
6. G. Tusnády, On asymptotically optimal tests, Ann. Statist., Vol. 5, No. 2, 385-393, 1977.
7. G. Longo and A. Sgarro, The error exponent for the testing of simple statistical hypotheses, a combinatorial approach, J. Combin. Inform. System Sciences, Vol. 5, No. 1, 58-67, 1980.
8. L. Birgé, Vitesses maximales de décroissance des erreurs et tests optimaux associés, Z. Wahrsch. verw. Gebiete, Vol. 55, 261-273, 1981.
9. E. A. Haroutunian, Logarithmically asymptotically optimal testing of multiple statistical hypotheses, Probl. Control and Inform. Theory, Vol. 19, No. 5-6, 413-421, 1990.
10. R. Ahlswede and E. Haroutunian, On logarithmically asymptotically optimal testing of hypotheses and identification: research program, examples and partial results, preprint.