On the Deterministic-Code Capacity of the Two-User Discrete ...

On the Deterministic-Code Capacity of the Two-User Discrete Memoryless Arbitrarily Varying General Broadcast Channel with Degraded Message Sets

Eran Hof∗ and Shraga I. Bross†

May 29, 2006

Abstract

An inner bound on the deterministic-code capacity region of the two-user discrete memoryless arbitrarily varying general broadcast channel (AVGBC) was characterized by Jahn, assuming that the common message capacity is nonzero; however, he did not indicate how one could decide whether the latter capacity is positive. Csiszár and Narayan's result for the single-user AVC establishes the missing part in Jahn's characterization. Nevertheless, being based on Ahlswede's elimination technique, Jahn's characterization is not applicable to symmetrizable channels under a state constraint. Here, the various notions of symmetrizability for the two-user broadcast AVC are defined. A sufficient nonsymmetrizability condition that renders the common message capacity of the AVGBC positive is identified using an approach different from Jahn's. The decoding rules we use establish an achievable region under state and input constraints for the family of degraded message sets codes over the AVGBC.

∗ Department of Electrical Engineering, Technion, Haifa 32000, Israel, email: [email protected]
† Department of Electrical Engineering, Technion, Haifa 32000, Israel, email: [email protected]

I. Introduction

The classic paper by Shannon [1] studies the fundamental laws of reliable communication when both the transmitter and the receiver have full knowledge of the law governing the channel. However, various situations exist in which either the encoder or the decoder must be chosen without complete knowledge of the channel law. In subsequent studies, a variety of channel models for such situations have been proposed, and the corresponding ultimate limits of communication have been studied extensively (see [6] for a comprehensive review). The arbitrarily varying channel (AVC) was introduced by Blackwell, Breiman and Thomasian [2] to model a memoryless channel whose law may vary with time in an arbitrary and unknown manner during the transmission of a codeword.

A two-user discrete memoryless arbitrarily varying general broadcast channel (AVGBC) is a pair of transition probabilities $(V, W)$ from $\mathcal{X} \times \mathcal{S}$ into $\mathcal{Y}$ and from $\mathcal{X} \times \mathcal{S}$ into $\mathcal{Z}$, respectively, where $\mathcal{X}$, $\mathcal{S}$, $\mathcal{Y}$ and $\mathcal{Z}$ are finite sets, each containing at least two elements. We interpret $V(y \mid x, s)$ and $W(z \mid x, s)$ as the conditional probabilities that the channel output for user 1 is $y \in \mathcal{Y}$ and that the channel output for user 2 is $z \in \mathcal{Z}$, given that the channel input symbol is $x \in \mathcal{X}$ and that the channel state is $s \in \mathcal{S}$. Notice that $V$ and $W$ are each a single-user arbitrarily varying channel (AVC), with a common input alphabet $\mathcal{X}$, a common state set $\mathcal{S}$, and output alphabets $\mathcal{Y}$ and $\mathcal{Z}$, respectively. The channel operation on $n$-tuples $\mathbf{x} = (x_1, \ldots, x_n) \in \mathcal{X}^n$, $\mathbf{s} = (s_1, \ldots, s_n) \in \mathcal{S}^n$, $\mathbf{y} = (y_1, \ldots, y_n) \in \mathcal{Y}^n$ and $\mathbf{z} = (z_1, \ldots, z_n) \in \mathcal{Z}^n$ is given by

$$V^n(\mathbf{y} \mid \mathbf{x}, \mathbf{s}) \triangleq \prod_{k=1}^{n} V(y_k \mid x_k, s_k), \qquad W^n(\mathbf{z} \mid \mathbf{x}, \mathbf{s}) \triangleq \prod_{k=1}^{n} W(z_k \mid x_k, s_k).$$
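As an illustration of the product law above, the following minimal Python sketch evaluates $V^n(\mathbf{y} \mid \mathbf{x}, \mathbf{s})$ for short sequences. The function name `memoryless_law`, the dictionary layout `V[(x, s)][y]` and the numerical channel are assumptions made for this example only; they do not come from the paper.

```python
from math import prod

def memoryless_law(channel, out_seq, x_seq, s_seq):
    """Return V^n(y | x, s) = prod_k V(y_k | x_k, s_k) for one AVC."""
    if not (len(out_seq) == len(x_seq) == len(s_seq)):
        raise ValueError("output, input and state sequences must share one length n")
    return prod(channel[(x, s)][y] for y, x, s in zip(out_seq, x_seq, s_seq))

# Hypothetical binary AVC (illustration only): V[(x, s)][y] = V(y | x, s).
V = {
    (0, 0): [0.9, 0.1],
    (0, 1): [0.6, 0.4],
    (1, 0): [0.2, 0.8],
    (1, 1): [0.5, 0.5],
}

# V(0|0,0) * V(1|1,0) * V(1|0,1) = 0.9 * 0.8 * 0.4
p = memoryless_law(V, out_seq=(0, 1, 1), x_seq=(0, 1, 0), s_seq=(0, 0, 1))
```

The same function applies to $W^n$ by passing the second user's channel table.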

Using the average probability of error criterion with deterministic codes, Jahn characterized in [3] an achievable region for the AVGBC assuming that a common message with non-vanishing rate can be communicated reliably to both terminals. This region is obtained via the “elimination” technique due to Ahlswede [4], namely by concatenating short prefixes to the broadcast deterministic code in order to inform the decoder which of the “exponentially few” broadcast deterministic codes is actually used, thereby eliminating the correlation in the original code. However, no computable characterization of the AVGBCs for which such a prefix code exists was given in [3]. For the single-user AVC, it was shown in [11] that if the deterministic code capacity of an AVC is positive, then the AVC must be nonsymmetrizable (see Definition 8 below). A computable characterization of AVCs with positive deterministic code capacity was completed by Csiszár and Narayan [5], who showed that nonsymmetrizability is also a sufficient condition for a positive deterministic code capacity. Consequently, Csiszár and Narayan's result establishes the missing part in Jahn's characterization of an achievable region for the AVGBC. Nevertheless, Csiszár and Narayan observed that, under a state constraint, the deterministic-code capacity of the single-user AVC may be positive even for a symmetrizable AVC. However, the elimination technique used by Jahn can no longer be applied when the channel is symmetrizable [5, 6]. Hence the motivation for studying alternative coding and decoding techniques that are suitable for a symmetrizable AVGBC (in a sense to be defined later), wherein the state sequence is subject to a given constraint, is apparent.

In this work we consider achievable regions for the AVGBC with no reliance on the elimination technique; hence the techniques developed here may be essential when analyzing a symmetrizable AVGBC subject to a state constraint. Our work on the AVGBC was motivated by the work of Csiszár and Narayan [5] for the single-user AVC and the work of Gubner [7, 8] and of Ahlswede and Cai [10] for the multiple-access AVC. The main contribution of this work is a characterization of an inner bound on the region of achievable rates, under state and input constraints, for the family of degraded message sets codes over the AVGBC.

The paper is organized as follows. The definitions of randomized and deterministic codes, the definition of the family of degraded message sets codes, and the relevant achievable regions are given in Section III. Section IV provides the various definitions of symmetrizability for the AVGBC, while in Section V we present the decoding rules we use to derive our main results. Our main results are stated in Section VI, while the proofs are relegated to the Appendices (for further details the reader is referred to [15, 16]). Section VII demonstrates the implication of the main results via a simple example.

II. Preliminaries

A. Notation

Henceforth, we adopt the following notation conventions. Random variables will be denoted by capital letters, while their realizations will be denoted by the respective lower case letters. Whenever the dimension of a random vector is clear from the context, the random vector will be denoted by a boldface letter; that is, $\mathbf{X}$ denotes the random vector $(X_1, X_2, \ldots, X_n)$, and $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ will designate a specific sample value of $\mathbf{X}$. The alphabet of a scalar random variable $X$ will be designated by a calligraphic letter $\mathcal{X}$. The $n$-fold Cartesian power of a generic alphabet $\mathcal{V}$, that is, the set of all $n$-vectors over $\mathcal{V}$, will be denoted $\mathcal{V}^n$. Let $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$ be finite sets; $\mathcal{D}(\mathcal{A})$ denotes the set of all probability laws on $\mathcal{A}$, and $\mathcal{D}(\mathcal{A} \mid \mathcal{B})$ denotes the set of all channels from $\mathcal{B}$ to $\mathcal{A}$.
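For $Q_{ABC} \in \mathcal{D}(\mathcal{A} \times \mathcal{B} \times \mathcal{C})$, we define $I_{A;B}(Q_{ABC}) \triangleq I(A; B)$,¹ where the defining expression on the right is the usual average mutual information computed with the law indicated on the left. Recall that the type [14] of $\mathbf{x} \in \mathcal{X}^n$ is defined to be the law $P_{\mathbf{x}}$ given by $P_{\mathbf{x}}(a) \triangleq N(a \mid \mathbf{x})/n$ for $a \in \mathcal{X}$, where $N(a \mid \mathbf{x})$ denotes the number of occurrences of $a$ in $\mathbf{x}$. Similarly, the joint type of a pair $(\mathbf{x}, \mathbf{y})$ is defined by the joint law $P_{\mathbf{x}\mathbf{y}}$ given by $P_{\mathbf{x}\mathbf{y}}(a, b) \triangleq N(a, b \mid \mathbf{x}, \mathbf{y})/n$ for $a \in \mathcal{X}$ and $b \in \mathcal{Y}$, where $N(a, b \mid \mathbf{x}, \mathbf{y})$ denotes the number of occurrences of the pair $(a, b)$ in $((x_1, y_1), \ldots, (x_n, y_n))$.

¹ We choose to write as the argument on the l.h.s. the joint probability law of all the variables involved in the problem at hand, though the specific mutual information on the r.h.s. may be restricted to a subset of the variables.

We write type classes as

$$T_{P_X} \triangleq \{\mathbf{x} : \mathbf{x} \in \mathcal{X}^n,\ P_{\mathbf{x}} = P_X\}, \qquad T_{P_{XY}} \triangleq \{(\mathbf{x}, \mathbf{y}) : \mathbf{x} \in \mathcal{X}^n,\ \mathbf{y} \in \mathcal{Y}^n,\ P_{\mathbf{x}\mathbf{y}} = P_{XY}\}.$$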
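The definitions of the type, the joint type and the type class can be made concrete with a short Python sketch. The helper names `type_of`, `joint_type` and `in_type_class` are hypothetical and chosen for this illustration; exact rational arithmetic is used so that types can be compared for equality.

```python
from collections import Counter
from fractions import Fraction

def type_of(x_seq):
    """Type P_x: relative frequency N(a | x) / n of each letter a in x."""
    n = len(x_seq)
    return {a: Fraction(count, n) for a, count in Counter(x_seq).items()}

def joint_type(x_seq, y_seq):
    """Joint type P_xy: relative frequency of each pair (a, b) in ((x_1, y_1), ...)."""
    if len(x_seq) != len(y_seq):
        raise ValueError("sequences must have the same length n")
    n = len(x_seq)
    return {ab: Fraction(count, n) for ab, count in Counter(zip(x_seq, y_seq)).items()}

def in_type_class(x_seq, law):
    """True iff x lies in the type class of `law` (zero-probability letters are ignored)."""
    return type_of(x_seq) == {a: p for a, p in law.items() if p != 0}

x = (1, 2, 1, 3)
y = (0, 0, 1, 1)
P_x = type_of(x)          # letter 1 has frequency 1/2, letters 2 and 3 have 1/4
P_xy = joint_type(x, y)   # each of the four observed pairs has frequency 1/4
```

Any permutation of `x` has the same type and therefore lies in the same type class.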

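Similarly, we write conditional type classes as

$$T_{P_{X|Y}}(\mathbf{y}) \triangleq \{\mathbf{x} : \mathbf{x} \in \mathcal{X}^n,\ (\mathbf{x}, \mathbf{y}) \in T_{P_{XY}}\}, \qquad T_{P_{X|YZ}}(\mathbf{y}, \mathbf{z}) \triangleq \{\mathbf{x} : \mathbf{x} \in \mathcal{X}^n,\ (\mathbf{x}, \mathbf{y}, \mathbf{z}) \in T_{P_{XYZ}}\}.$$

B. Useful bounds

We state below a few basic bounds, the proofs of which can be found in [14, Ch. 1.2]:

1. The number of possible joint types of sequences of length $n$ is polynomial in $n$.

2. We have the following bounds on the size of type classes:

$$(n + 1)^{-|\mathcal{X}|} \exp\{n H(X)\} \le |T_{P_X}| \le \exp\{n H(X)\}, \quad \text{if } T_{P_X} \ne \emptyset, \tag{1}$$

$$(n + 1)^{-|\mathcal{X}||\mathcal{Y}|} \exp\{n H(Y \mid X)\} \le |T_{P_{Y|X}}(\mathbf{x})| \le \exp\{n H(Y \mid X)\}, \quad \text{if } T_{P_{Y|X}}(\mathbf{x}) \ne \emptyset. \tag{2}$$

3. For any channel $f$ from $\mathcal{X}$ to $\mathcal{Y}$,

$$\sum_{\mathbf{y} \in T_{P_{Y|X}}(\mathbf{x})} f^n(\mathbf{y} \mid \mathbf{x}) \le \exp\{-n D(P_{XY} \,\|\, P_X \times f)\}, \tag{3}$$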

where $P_X \times f$ denotes the distribution on $\mathcal{X} \times \mathcal{Y}$ with probability mass function $P_X(x) f(y \mid x)$.

III. The AVGBC

We begin with the definition of a code for the AVGBC.

Definition 1: Let $M_y$, $M_c$, $M_z$ and $n$ be positive integers. A deterministic $(n, M_y, M_c, M_z)$-code for the AVGBC is a triple of mappings $(f, \phi, \psi)$ where

$$f : \{1, \ldots, M_y\} \times \{1, \ldots, M_c\} \times \{1, \ldots, M_z\} \to \mathcal{X}^n,$$
$$\phi : \mathcal{Y}^n \to \{1, \ldots, M_y\} \times \{1, \ldots, M_c\},$$
$$\psi : \mathcal{Z}^n \to \{1, \ldots, M_c\} \times \{1, \ldots, M_z\}.$$

The mapping $f$ is called an encoder, the mapping $\phi$ is called a decoder for user 1, the mapping $\psi$ is called a decoder for user 2, the integer $n$ is the block length, and the triple of nonnegative real numbers $(\frac{1}{n} \log M_y, \frac{1}{n} \log M_c, \frac{1}{n} \log M_z)$ is called the rate triple of the code (private user 1 rate, common rate, and private user 2 rate, respectively). Here log, and later on exp, are understood to be to the base 2. Setting $\mathbf{x}_{jik} \triangleq f(j, i, k)$, $j = 1, \ldots, M_y$, $i = 1, \ldots, M_c$ and $k = 1, \ldots, M_z$, we call $\{\mathbf{x}_{jik}\}$ the set of codewords or the codebook.

The average error probabilities, given the state sequence $\mathbf{s} \in \mathcal{S}^n$, when using a deterministic $(n, M_y, M_c, M_z)$-code on an AVGBC are defined by

$$e_\phi(\mathbf{s}) \triangleq \frac{1}{M_y M_c M_z} \sum_{j=1}^{M_y} \sum_{i=1}^{M_c} \sum_{k=1}^{M_z} \sum_{\mathbf{y} : \phi(\mathbf{y}) \ne (j, i)} V^n(\mathbf{y} \mid \mathbf{x}_{jik}, \mathbf{s}),$$

$$e_\psi(\mathbf{s}) \triangleq \frac{1}{M_y M_c M_z} \sum_{j=1}^{M_y} \sum_{i=1}^{M_c} \sum_{k=1}^{M_z} \sum_{\mathbf{z} : \psi(\mathbf{z}) \ne (i, k)} W^n(\mathbf{z} \mid \mathbf{x}_{jik}, \mathbf{s}).$$

Definition 2: A triple of nonnegative real numbers $(R_y, R_c, R_z)$ is said to be achievable for the AVGBC if for every $\epsilon > 0$, $\delta > 0$ and sufficiently large $n$, a deterministic $(n, M_y, M_c, M_z)$-code exists with

$$\frac{1}{n} \log M_y > R_y - \delta, \qquad \frac{1}{n} \log M_c > R_c - \delta, \qquad \frac{1}{n} \log M_z > R_z - \delta$$

and with probabilities of decoding error

$$e_\phi(\mathbf{s}) \le \epsilon \quad \text{and} \quad e_\psi(\mathbf{s}) \le \epsilon, \qquad \forall \mathbf{s} \in \mathcal{S}^n.$$

Definition 3: The deterministic code capacity region of the AVGBC, denoted by $C_D(V, W)$, is defined by

$$C_D(V, W) \triangleq \{(R_y, R_c, R_z) : (R_y, R_c, R_z) \text{ is achievable}\}.$$

Definition 4: A randomized code $(F, \Phi, \Psi)$ is a random variable with values in the set of all deterministic $(n, M_y, M_c, M_z)$-codes $(f, \phi, \psi)$. While the law of the r.v. $(F, \Phi, \Psi)$ may depend on the set of channels indexed by $\mathbf{s} \in \mathcal{S}^n$, it is not allowed to depend on $\mathbf{s}$ or on the transmitted message $(j, i, k)$.

The expected average error probabilities, given the state sequence $\mathbf{s} \in \mathcal{S}^n$, when using a randomized code on an AVGBC are defined by

$$e_\Phi(\mathbf{s}) \triangleq E\left[\frac{1}{M_y M_c M_z} \sum_{j=1}^{M_y} \sum_{i=1}^{M_c} \sum_{k=1}^{M_z} \sum_{\mathbf{y} : \Phi(\mathbf{y}) \ne (j, i)} V^n(\mathbf{y} \mid \mathbf{x}_{jik}, \mathbf{s})\right],$$

$$e_\Psi(\mathbf{s}) \triangleq E\left[\frac{1}{M_y M_c M_z} \sum_{j=1}^{M_y} \sum_{i=1}^{M_c} \sum_{k=1}^{M_z} \sum_{\mathbf{z} : \Psi(\mathbf{z}) \ne (i, k)} W^n(\mathbf{z} \mid \mathbf{x}_{jik}, \mathbf{s})\right],$$

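and the randomized code capacity region $C_R(V, W)$ is defined analogously to $C_D(V, W)$.

Next, let $\mathcal{U} \triangleq \mathcal{U}_y \times \mathcal{U}_c \times \mathcal{U}_z$, where $\mathcal{U}_y$, $\mathcal{U}_c$ and $\mathcal{U}_z$ are finite sets, and suppose $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$ and $r \in \mathcal{D}(\mathcal{S})$. We define probability laws on $\mathcal{U} \times \mathcal{X} \times \mathcal{S} \times \mathcal{Y}$ and on $\mathcal{U} \times \mathcal{X} \times \mathcal{S} \times \mathcal{Z}$ as follows:

$$(Q \times r \times V)(u, x, s, y) \triangleq Q(u, x)\, r(s)\, V(y \mid x, s),$$
$$(Q \times r \times W)(u, x, s, z) \triangleq Q(u, x)\, r(s)\, W(z \mid x, s).$$

After setting

$$(Q \times rV)(u, x, y) \triangleq \sum_{s \in \mathcal{S}} (Q \times r \times V)(u, x, s, y),$$
$$(Q \times rW)(u, x, z) \triangleq \sum_{s \in \mathcal{S}} (Q \times r \times W)(u, x, s, z),$$

define

$$I^*_{Y;U_y|U_c}(Q, V) \triangleq \inf_{r \in \mathcal{D}(\mathcal{S})} I_{Y;U_y|U_c}(Q \times rV).$$

The quantities $I^*_{Y,U_y;U_c}(Q, V)$, $I^*_{Y;U_y,U_c}(Q, V)$, $I^*_{Z;U_z|U_c}(Q, W)$, $I^*_{Z,U_z;U_c}(Q, W)$ and $I^*_{Z;U_c,U_z}(Q, W)$ are defined similarly.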

Let

$$\mathcal{R}^*(\mathcal{U}, Q, V, W) \triangleq \{(R_y, R_c, R_z) :\ 0 \le R_y < I^*_{Y;U_y|U_c}(Q, V),$$
$$0 \le R_c < I^*_{Y,U_y;U_c}(Q, V),$$
$$0 \le R_y + R_c < I^*_{Y;U_y,U_c}(Q, V),$$
$$0 \le R_z < I^*_{Z;U_z|U_c}(Q, W),$$
$$0 \le R_c < I^*_{Z,U_z;U_c}(Q, W),$$
$$0 \le R_z + R_c < I^*_{Z;U_c,U_z}(Q, W)\}$$

and denote by $\mathcal{R}^*(V, W)$ the closed convex hull of

$$\bigcup_{\mathcal{U}, Q} \mathcal{R}^*(\mathcal{U}, Q, V, W),$$

where the union is taken over all finite sets $\mathcal{U} = \mathcal{U}_y \times \mathcal{U}_c \times \mathcal{U}_z$ and laws $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$ satisfying $Q_{U_y U_z | U_c}(u_y, u_z \mid u_c) = Q_{U_y | U_c}(u_y \mid u_c)\, Q_{U_z | U_c}(u_z \mid u_c)$. Setting

$$\bar{R}_C(V, W) \triangleq \{R_c \in \mathbb{R}_+ : (0, R_c, 0) \in C_D(V, W)\},$$

the following theorem, due to Jahn [3], provides an achievable region for the AVGBC subject to a condition on $\bar{R}_C(V, W)$.

Theorem 1 [3]: For every AVGBC, $\mathcal{R}^*(V, W) \subset C_R(V, W)$, and $C_D(V, W) = C_R(V, W)$ if $\mathrm{int}\{\bar{R}_C(V, W)\} \ne \emptyset$.

Obviously, one would like to know exactly when $\bar{R}_C(V, W)$ has a nonempty interior. Evidently, Csiszár and Narayan's result completes the missing characterization for Theorem 1 in the sense that $\mathcal{R}^*(V, W) \subset C_D(V, W)$ if both AVCs $V$ and $W$ are nonsymmetrizable; otherwise, $\mathrm{int}\{C_D(V, W)\} = \emptyset$. However, it was shown in [5] that for a single-user AVC under a state constraint, the deterministic code capacity may be positive even for a symmetrizable channel (see [8] for a similar result for the AVMAC). Recall that Jahn's result relies on Ahlswede's “elimination” technique [4] which, as explained in [5, 6], is no longer applicable in the presence of a state constraint when the channel is symmetrizable. For that reason, alternative coding techniques which do not rely on Ahlswede's “elimination” technique need to be considered.

Our results in this work concern a restricted family of codes for the AVGBC, namely the class of codes with degraded message sets originally defined in [12, 13]. The following definitions regard the AVGBC when user 2 has a degraded message set. Similar definitions apply when user 1 has a degraded message set.

Definition 5: Let $M$, $M'$ and $n$ be positive integers. A deterministic $(n, M, M')$-code for the AVGBC with degraded message sets is a triple of mappings $(f, \phi, \psi)$ where

$$f : \{1, \ldots, M\} \times \{1, \ldots, M'\} \to \mathcal{X}^n, \qquad \phi : \mathcal{Y}^n \to \{1, \ldots, M\} \times \{1, \ldots, M'\}, \qquad \psi : \mathcal{Z}^n \to \{1, \ldots, M\}.$$

The mapping $f$ is called an encoder, the mapping $\phi$ is called a decoder for user 1, the mapping $\psi$ is called a decoder for user 2, the integer $n$ is the block length, and the pair of nonnegative real numbers $(\frac{1}{n} \log M, \frac{1}{n} \log M')$ is the rate pair of the code. The sets $\{1, \ldots, M\}$ and $\{1, \ldots, M'\}$ are called the common message set and the private message set, respectively. Notice that the decoder for user 2 decides only on the common message that was sent, while the decoder for user 1 decides on both the common and private messages. Setting $\mathbf{x}_{ij} \triangleq f(i, j)$, $i = 1, \ldots, M$, $j = 1, \ldots, M'$, we call $\{\mathbf{x}_{ij}\}$ the set of codewords.

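Using a deterministic $(n, M, M')$-code on the AVGBC with degraded message sets, the average error probabilities for a state sequence $\mathbf{s} \in \mathcal{S}^n$ are

$$e_\phi(\mathbf{s}) \triangleq \frac{1}{M M'} \sum_{i=1}^{M} \sum_{j=1}^{M'} \sum_{\mathbf{y} : \phi(\mathbf{y}) \ne (i, j)} V^n(\mathbf{y} \mid \mathbf{x}_{ij}, \mathbf{s}), \tag{4}$$

$$e_\psi(\mathbf{s}) \triangleq \frac{1}{M M'} \sum_{i=1}^{M} \sum_{j=1}^{M'} \sum_{\mathbf{z} : \psi(\mathbf{z}) \ne i} W^n(\mathbf{z} \mid \mathbf{x}_{ij}, \mathbf{s}). \tag{5}$$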
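For very short block lengths, (4) and (5) can be evaluated by brute force. The sketch below does so for a toy code; the codebook, the two decoders and the state-dependent binary symmetric channels are all hypothetical choices made for this illustration and are not taken from the paper.

```python
from itertools import product
from math import prod

def seq_law(channel, out_seq, x_seq, s_seq):
    """Memoryless law: product of per-letter probabilities channel[(x, s)][out]."""
    return prod(channel[(x, s)][o] for o, x, s in zip(out_seq, x_seq, s_seq))

def average_errors(codebook, phi, psi, V, W, s_seq, y_alphabet, z_alphabet):
    """Evaluate (4) and (5) by summing over all output sequences of length n."""
    n = len(s_seq)
    e_phi = e_psi = 0.0
    for (i, j), x_seq in codebook.items():
        e_phi += sum(seq_law(V, y, x_seq, s_seq)
                     for y in product(y_alphabet, repeat=n) if phi(y) != (i, j))
        e_psi += sum(seq_law(W, z, x_seq, s_seq)
                     for z in product(z_alphabet, repeat=n) if psi(z) != i)
    return e_phi / len(codebook), e_psi / len(codebook)

def binary_avc(crossover_by_state):
    """Hypothetical AVC: a binary symmetric channel whose crossover depends on s."""
    return {(x, s): [1 - p if y == x else p for y in (0, 1)]
            for s, p in crossover_by_state.items() for x in (0, 1)}

# Toy (n = 2, M = 2, M' = 2) code: the first letter carries i, the second carries j.
codebook = {(i, j): (i - 1, j - 1) for i in (1, 2) for j in (1, 2)}
phi = lambda y: (y[0] + 1, y[1] + 1)   # user 1 decodes the pair (i, j)
psi = lambda z: z[0] + 1               # user 2 decodes only the common index i
V = binary_avc({0: 0.1, 1: 0.2})
W = binary_avc({0: 0.2, 1: 0.3})

e_phi, e_psi = average_errors(codebook, phi, psi, V, W, (0, 1), (0, 1), (0, 1))
```

For the state sequence (0, 1), user 1 errs unless both letters arrive intact, while user 2 only needs the first letter; the cost of the enumeration grows exponentially in $n$, so this is only a sanity check for tiny examples.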

Definition 6: A pair of nonnegative real numbers $(R, R')$ is said to be achievable for the AVGBC with degraded message sets if for every $\epsilon > 0$, $\delta > 0$, and sufficiently large $n$, a deterministic $(n, M, M')$-code exists with

$$\frac{1}{n} \log M > R - \delta, \qquad \frac{1}{n} \log M' > R' - \delta$$

and with probabilities of error $e_\phi(\mathbf{s}) \le \epsilon$ and $e_\psi(\mathbf{s}) \le \epsilon$, $\forall \mathbf{s} \in \mathcal{S}^n$.

We denote by $C_D^{(2)}(V, W)$ the set of achievable rate triples for the AVGBC when user 2 has a degraded message set, that is, $C_D^{(2)}(V, W) = \{(R', R, 0) \in C_D(V, W)\}$.

We now consider the AVGBC with degraded message sets subject to input and/or state constraints. Let $g(x)$ and $l(s)$ be given nonnegative functions on $\mathcal{X}$ and $\mathcal{S}$, respectively. For $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{s} = (s_1, \ldots, s_n)$, we define

$$g(\mathbf{x}) \triangleq \frac{1}{n} \sum_{i=1}^{n} g(x_i), \qquad l(\mathbf{s}) \triangleq \frac{1}{n} \sum_{i=1}^{n} l(s_i).$$

Definition 7: A pair of nonnegative real numbers $(R, R')$ is said to be achievable under input constraint $\Gamma$ and state constraint $\Lambda$, over the AVGBC with degraded message sets, if for every $\epsilon > 0$, $\delta > 0$, and sufficiently large $n$, a deterministic $(n, M, M')$-code exists with codewords $\{\mathbf{x}_{ij}\}$, $i = 1, \ldots, M$, $j = 1, \ldots, M'$, each satisfying $g(\mathbf{x}_{ij}) \le \Gamma$, code rates satisfying

$$\frac{1}{n} \log M > R - \delta, \qquad \frac{1}{n} \log M' > R' - \delta$$

and with probabilities of error $e_\phi(\mathbf{s}) \le \epsilon$ and $e_\psi(\mathbf{s}) \le \epsilon$, $\forall \mathbf{s} \in \mathcal{S}^n : l(\mathbf{s}) \le \Lambda$.

We denote by $C_D^{(2)}(V, W, \Lambda, \Gamma)$ the set of achievable rate triples for the AVGBC, when user 2 has a degraded message set, under state and input constraints $(\Lambda, \Gamma)$. Setting $g_m = \max_{x \in \mathcal{X}} g(x)$ and $l_m = \max_{s \in \mathcal{S}} l(s)$, $C_D^{(2)}(V, W, l_m, \Gamma)$ denotes the set of achievable rate triples for the AVGBC with degraded message sets under input constraint $\Gamma$ and no state constraint, while $C_D^{(2)}(V, W, \Lambda, g_m)$ denotes the set of achievable rate triples for the AVGBC with degraded message sets under state constraint $\Lambda$ and no input constraint. The sets $C_D^{(1)}(V, W)$, $C_D^{(1)}(V, W, \Lambda, \Gamma)$, $C_D^{(1)}(V, W, l_m, \Gamma)$ and $C_D^{(1)}(V, W, \Lambda, g_m)$, for the case when user 1 has a degraded message set, are defined similarly.

The main result of this work is a characterization of inner bounds on $C_D^{(2)}(V, W, \Lambda, \Gamma)$ and $C_D^{(1)}(V, W, \Lambda, \Gamma)$, as well as sufficient conditions under which these inner bounds are nonempty.

IV. Symmetrizability

The following definitions of symmetrizability will play a crucial role in determining whether $C_D(V, W)$ has an empty interior.

Definition 8 [11]: The AVC $V$ is said to be symmetrizable-$\mathcal{X}$ if there exists a transition probability $\tilde{U}$ from $\mathcal{X}$ into $\mathcal{S}$ such that

$$\sum_{s \in \mathcal{S}} V(y \mid x, s)\, \tilde{U}(s \mid x') = \sum_{s \in \mathcal{S}} V(y \mid x', s)\, \tilde{U}(s \mid x), \qquad \forall x \in \mathcal{X},\ x' \in \mathcal{X},\ y \in \mathcal{Y}. \tag{6}$$

If no such $\tilde{U}$ exists, we say that $V$ is nonsymmetrizable-$\mathcal{X}$.

Definition 9: For a given set $\mathcal{U}$ and a law $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$, the AVC $V$ is said to be symmetrizable-$\mathcal{X}|\mathcal{U}$ if there exists a transition probability $\tilde{U}$ from $\mathcal{X} \times \mathcal{U}$ into $\mathcal{S}$ such that

$$\sum_{s \in \mathcal{S}} V(y \mid x, s)\, \tilde{U}(s \mid u, x') = \sum_{s \in \mathcal{S}} V(y \mid x', s)\, \tilde{U}(s \mid u, x),$$
$$\forall u \in \mathcal{U},\ x \in \mathcal{X},\ x' \in \mathcal{X},\ y \in \mathcal{Y} \text{ such that } Q(u, x) > 0 \text{ and } Q(u, x') > 0. \tag{7}$$

If no such $\tilde{U}$ exists, we say that $V$ is nonsymmetrizable-$\mathcal{X}|\mathcal{U}$.

For any finite set $\mathcal{U}$ and a law $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$, define

$$(Q_{X|U} V)(y \mid u, s) \triangleq \sum_{x \in \mathcal{X}} Q_{X|U}(x \mid u)\, V(y \mid x, s),$$
$$(Q_{X|U} W)(z \mid u, s) \triangleq \sum_{x \in \mathcal{X}} Q_{X|U}(x \mid u)\, W(z \mid x, s).$$
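Condition (6) of Definition 8 is easy to verify numerically for a given candidate $\tilde{U}$. The following sketch, with the hypothetical helper `symmetrizes` and a toy "adder" channel that is not one of the paper's examples, checks the identity entry by entry. Note that a failed check for one candidate does not establish nonsymmetrizability, which requires ruling out every transition probability (for instance via a linear feasibility problem).

```python
def symmetrizes(V, U_tilde, inputs, states, outputs, tol=1e-12):
    """Check (6): sum_s V(y|x,s) U~(s|x') == sum_s V(y|x',s) U~(s|x) for all x, x', y."""
    for x in inputs:
        for x2 in inputs:
            for y in outputs:
                lhs = sum(V[(x, s)][y] * U_tilde[x2][s] for s in states)
                rhs = sum(V[(x2, s)][y] * U_tilde[x][s] for s in states)
                if abs(lhs - rhs) > tol:
                    return False
    return True

# Hypothetical deterministic "adder" AVC: y = x + s with binary x and s.
adder = {(x, s): [1.0 if y == x + s else 0.0 for y in range(3)]
         for x in (0, 1) for s in (0, 1)}

# Candidate U~(s | x) that copies the input: U_tilde[x][s] = 1 iff s == x.
copy_input = {0: [1.0, 0.0], 1: [0.0, 1.0]}
# A second candidate that ignores the input and picks the state uniformly.
uniform = {0: [0.5, 0.5], 1: [0.5, 0.5]}

is_symmetrized_by_copy = symmetrizes(adder, copy_input, (0, 1), (0, 1), range(3))
is_symmetrized_by_uniform = symmetrizes(adder, uniform, (0, 1), (0, 1), range(3))
```

With the copying candidate both sides of (6) reduce to the indicator of $y = x + x'$, which is symmetric in $x$ and $x'$, so the adder channel is symmetrizable-$\mathcal{X}$; the uniform candidate fails the check.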

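Definition 10: For a given set $\mathcal{U}$ and a law $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$, the AVC $Q_{X|U} V$ is said to be symmetrizable-$\mathcal{U}$ if there exists a transition probability $\tilde{U}$ from $\mathcal{U}$ into $\mathcal{S}$ such that

$$\sum_{s \in \mathcal{S}} (Q_{X|U} V)(y \mid u, s)\, \tilde{U}(s \mid u') = \sum_{s \in \mathcal{S}} (Q_{X|U} V)(y \mid u', s)\, \tilde{U}(s \mid u),$$
$$\forall u \in \mathcal{U},\ u' \in \mathcal{U},\ y \in \mathcal{Y} \text{ such that } Q_U(u) > 0 \text{ and } Q_U(u') > 0. \tag{8}$$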
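The averaged channel $Q_{X|U} V$ that appears in Definition 10 can be computed directly from $Q$ and $V$. The sketch below is one way to do it; the function name `averaged_channel`, the data layout, and the numerical law and channel are assumptions made for illustration only.

```python
def averaged_channel(Q, V, inputs, states, n_outputs):
    """Return {(u, s): [(Q_{X|U} V)(y | u, s) for each y]} for every u with Q_U(u) > 0."""
    avg = {}
    for u in {u for (u, _x) in Q}:
        q_u = sum(Q.get((u, x), 0.0) for x in inputs)   # marginal Q_U(u)
        if q_u == 0:
            continue                                     # Q_{X|U}(. | u) is undefined
        for s in states:
            avg[(u, s)] = [
                sum(Q.get((u, x), 0.0) / q_u * V[(x, s)][y] for x in inputs)
                for y in range(n_outputs)
            ]
    return avg

# Hypothetical law Q on U x X and hypothetical AVC V, with V[(x, s)][y] = V(y | x, s).
Q = {(1, 1): 0.25, (1, 2): 0.25, (1, 3): 0.0,
     (2, 1): 0.25, (2, 2): 0.0, (2, 3): 0.25}
V = {(1, 1): [1.0, 0.0], (1, 2): [1.0, 0.0],
     (2, 1): [0.0, 1.0], (2, 2): [0.0, 1.0],
     (3, 1): [0.5, 0.5], (3, 2): [0.0, 1.0]}

QV = averaged_channel(Q, V, inputs=(1, 2, 3), states=(1, 2), n_outputs=2)
```

Each row `QV[(u, s)]` is a probability vector over the outputs, so the result can itself be treated as a single-user AVC with input alphabet $\mathcal{U}$ and tested against (8).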

If no such $\tilde{U}$ exists, we say that $Q_{X|U} V$ is nonsymmetrizable-$\mathcal{U}$.

We observe the following relations between the various symmetrizability definitions. If $V$ is symmetrizable-$\mathcal{X}$, i.e., $\tilde{U}$ satisfies (6), then $\tilde{U}$ satisfies (7) for any $\mathcal{U}$ and $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$. In addition, if $V$ is symmetrizable-$\mathcal{X}|\mathcal{U}$ for every set $\mathcal{U}$ and every law $Q$, then it must be symmetrizable-$\mathcal{X}$. Hence, the AVC $V$ is symmetrizable-$\mathcal{X}$ iff it is symmetrizable-$\mathcal{X}|\mathcal{U}$ for every $\mathcal{U}$ and $Q$. Multiplying (6) by any $Q_{X|U}(x \mid u)\, Q_{X|U}(x' \mid u')$ and summing over all $x$, $x'$ shows that $Q_{X|U} V$ is also symmetrizable-$\mathcal{U}$. In addition, if $Q_{X|U} V$ is symmetrizable-$\mathcal{U}$ for every set $\mathcal{U}$ and every law $Q$, then $V$ must be symmetrizable-$\mathcal{X}$. Hence, the AVC $V$ is symmetrizable-$\mathcal{X}$ iff $Q_{X|U} V$ is symmetrizable-$\mathcal{U}$ for every set $\mathcal{U}$ and every law $Q$. Next, let the set $\mathcal{U}$ and the law $Q$ be such that $V$ is symmetrizable-$\mathcal{X}|\mathcal{U}$, i.e., $\tilde{U}$ satisfies (7); then multiplying (7) by $Q_{X|U}(x' \mid u)\, Q_{X|U}(x \mid u')$ and summing over all $x$, $x'$ shows that $Q_{X|U} V$ is symmetrizable-$\mathcal{U}$ (notice that the other direction does not necessarily follow). Consequently, setting $S_{\mathcal{X}|\mathcal{U}}$ and $S_{\mathcal{U}}$ to be the sets of all pairs $(\mathcal{U}, Q)$ of a set $\mathcal{U}$ and a law $Q$ such that $V$ is nonsymmetrizable-$\mathcal{X}|\mathcal{U}$ and $Q_{X|U} V$ is nonsymmetrizable-$\mathcal{U}$, respectively, then $S_{\mathcal{U}} \subset S_{\mathcal{X}|\mathcal{U}}$, and $S_{\mathcal{U}} \ne \emptyset$ iff $V$ is nonsymmetrizable-$\mathcal{X}$. The following example illustrates how the choice of the set $\mathcal{U}$ and the law $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$ affects the various nonsymmetrizability conditions.

Example: Let $\mathcal{X} = \{1, 2, 3\}$, $\mathcal{S} = \{1, 2\}$, $\mathcal{Y} = \{1, 2\}$ and $\mathcal{Z} = \{1, 2\}$, and let $V(y \mid x, s)$ and $W(z \mid x, s)$ be the AVCs from $\mathcal{X} \times \mathcal{S}$ to $\mathcal{Y}$ and from $\mathcal{X} \times \mathcal{S}$ to $\mathcal{Z}$, respectively, as depicted in Figure 1. Let $\mathcal{U} = \{1, 2\}$ and consider first the following law $Q_1 \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$:

$$Q_1(u, x) = \begin{cases} 0.25, & u = 1,\ x = 1 \\ 0.25, & u = 1,\ x = 2 \\ 0, & u = 1,\ x = 3 \\ 0.25, & u = 2,\ x = 1 \\ 0, & u = 2,\ x = 2 \\ 0.25, & u = 2,\ x = 3. \end{cases}$$

Both $V$ and $W$ are nonsymmetrizable-$\mathcal{X}$ and nonsymmetrizable-$\mathcal{X}|\mathcal{U}$ (assuming the existence of a transition probability satisfying either (6) or (7) results in a contradiction). However, as observed above, this does not imply that the channels $Q_{X|U} V$ and $Q_{X|U} W$ are nonsymmetrizable-$\mathcal{U}$. Indeed, both channels $Q_{X|U} V$ and $Q_{X|U} W$ are symmetrizable-$\mathcal{U}$.

[Figure 1: AVCs $V(y \mid x, s)$ and $W(z \mid x, s)$, shown for the states $s = 1$ and $s = 2$.]

Now, consider the law $Q_2 \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$ defined by

$$Q_2(u, x) = \begin{cases} \tfrac{1}{60}, & u = 1,\ x = 1 \\ \tfrac{1}{6}, & u = 1,\ x = 2 \\ \tfrac{19}{60}, & u = 1,\ x = 3 \\ \tfrac{19}{60}, & u = 2,\ x = 1 \\ \tfrac{1}{6}, & u = 2,\ x = 2 \\ \tfrac{1}{60}, & u = 2,\ x = 3. \end{cases}$$

It can be easily verified that for this choice, both $Q_{X|U} V$ and $Q_{X|U} W$ are nonsymmetrizable-$\mathcal{U}$ (hence, both channels $V$ and $W$ are also nonsymmetrizable-$\mathcal{X}|\mathcal{U}$ and nonsymmetrizable-$\mathcal{X}$).

V. Decoding

To define the decoding rules that we use in Propositions 1-2 and Theorems 2-3 for the AVGBC with degraded message sets, we proceed as follows. For $\eta \ge 0$, we define the following sets of laws $P_{USZ}$ and $P_{UXSY}$ for the random variables $U$, $X$, $S$, $Y$ and $Z$ with values in $\mathcal{U}$, $\mathcal{X}$, $\mathcal{S}$, $\mathcal{Y}$ and $\mathcal{Z}$, respectively:

$$C_\eta^\psi \triangleq \{P_{USZ} : D(P_{USZ} \,\|\, P_U \times P_S \times (Q_{X|U} W)) \le \eta\},$$
$$C_\eta^\phi \triangleq \{P_{UXSY} : D(P_{UXSY} \,\|\, P_{UX} \times P_S \times V) \le \eta\}. \tag{9}$$

Here $D$ denotes (Kullback-Leibler) information divergence, $P_U \times P_S \times (Q_{X|U} W)$ denotes a law on $\mathcal{U} \times \mathcal{S} \times \mathcal{Z}$ with probability mass function

$$(P_U \times P_S \times (Q_{X|U} W))(u, s, z) = P_U(u)\, P_S(s)\, (Q_{X|U} W)(z \mid u, s),$$

and $P_{UX} \times P_S \times V$ denotes a law on $\mathcal{U} \times \mathcal{X} \times \mathcal{S} \times \mathcal{Y}$ with probability mass function

$$(P_{UX} \times P_S \times V)(u, x, s, y) = P_{UX}(u, x)\, P_S(s)\, V(y \mid x, s).$$

The decoding rules we use are extensions of the single-user rule developed in [5]; extensions to the AVMAC can be found in [7] and [10]. For a given nonempty finite set $\mathcal{U}$ and a law $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$, we consider sets of sequences $\mathbf{u}_i \in \mathcal{U}^n$ and $\mathbf{x}_{ij} \in \mathcal{X}^n$, $i = 1, \ldots, M$ and $j = 1, \ldots, M'$, referred to as the sub-codewords and the codewords, respectively, that are chosen in a manner defined in Appendix III. The decoding rules for users 1 and 2, when user 2 has a degraded message set, are defined as follows.

Definition 11: Terminal 1 decoding rule: Let $\phi(\mathbf{y}) = (i, j)$ iff an $\mathbf{s} \in \mathcal{S}^n$ exists such that

1. The joint type $P_{\mathbf{u}_i, \mathbf{x}_{ij}, \mathbf{s}, \mathbf{y}}$ belongs to $C_\eta^\phi$.

2. For each competitor $(i', j')$, $i' \ne i$, $1 \le i' \le M$, $1 \le j' \le M'$, such that $P_{\mathbf{u}_{i'}, \mathbf{x}_{i'j'}, \mathbf{s}', \mathbf{y}} \in C_\eta^\phi$ for some $\mathbf{s}' \in \mathcal{S}^n$, we have $I(UXY; U'X' \mid S) \le \eta$, where $U, U', X, X', S, Y$ denote dummy random variables such that the joint type of $(\mathbf{u}_i, \mathbf{u}_{i'}, \mathbf{x}_{ij}, \mathbf{x}_{i'j'}, \mathbf{s}, \mathbf{y})$ equals $P_{UU'XX'SY}$.

3. For each competitor $(i, j')$, $j' \ne j$, $1 \le j' \le M'$, such that $P_{\mathbf{u}_i, \mathbf{x}_{ij'}, \mathbf{s}', \mathbf{y}} \in C_\eta^\phi$ for some $\mathbf{s}' \in \mathcal{S}^n$, we have $I(XY; X' \mid US) \le \eta$, where $U, X, X', S, Y$ denote dummy random variables such that the joint type of $(\mathbf{u}_i, \mathbf{x}_{ij}, \mathbf{x}_{ij'}, \mathbf{s}, \mathbf{y})$ equals $P_{UXX'SY}$.

If no such $(i, j)$ exists, we declare an error.

Akin to [10], this decoder considers all possible message pairs $(i, j)$, and screens the remaining spurious codewords after the joint typicality test, as follows.

• For any spurious codeword $(i, j')$ which belongs to the “cloud” identified by the index $i$, we additionally require that $I(XY; X' \mid US) \le \eta$.

• For any spurious codeword $(i', j')$ which belongs to another “cloud”, i.e., not the one identified by the index $i$, we additionally require that $I(UXY; U'X' \mid S) \le \eta$.

As a result, the complexity of this decoder is similar to that of an ML decoder. However, the decoding rule for the second user (whose aim is to decode $i$) may consider just all possible common messages. Hence, we consider the following decoder for Terminal 2, which decodes the common message $i$ while considering the private messages to be part of the channel noise; i.e., we implement a decoding rule for the channel $(Q_{X|U} W)$. The consequence of this implementation is a more stringent nonsymmetrizability requirement for the AVC $(Q_{X|U} W)$ than the necessary (and sufficient) nonsymmetrizability-$\mathcal{X}$ requirement for communicating over the AVC $W$.

Definition 12: Terminal 2 decoding rule: Let $\psi(\mathbf{z}) = i$ iff an $\mathbf{s} \in \mathcal{S}^n$ exists such that

1. The joint type $P_{\mathbf{u}_i, \mathbf{s}, \mathbf{z}}$ belongs to $C_\eta^\psi$.

2. For each competitor $i' \ne i$, $1 \le i' \le M$, such that $P_{\mathbf{u}_{i'}, \mathbf{s}', \mathbf{z}} \in C_\eta^\psi$ for some $\mathbf{s}' \in \mathcal{S}^n$, we have $I(UZ; U' \mid S) \le \eta$, where $U, U', S, Z$ denote dummy random variables such that the joint type of $(\mathbf{u}_i, \mathbf{u}_{i'}, \mathbf{s}, \mathbf{z})$ equals $P_{UU'SZ}$.

If no such $i$ exists, we declare an error.
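The first step of both decoding rules is a divergence test against the sets in (9). The sketch below shows one way to test membership in $C_\eta^\phi$ for a joint law on $(u, x, s, y)$, with the divergence measured in bits to match the base-2 convention of Section III. The helper names and the toy channel are hypothetical; the test for $C_\eta^\psi$ would have the same shape with $(Q_{X|U} W)$ in place of $V$.

```python
from collections import defaultdict
from math import log2

def divergence_from_product(P, V):
    """D(P_UXSY || P_UX x P_S x V) in bits, for P given as {(u, x, s, y): prob}."""
    P_ux, P_s = defaultdict(float), defaultdict(float)
    for (u, x, s, _y), p in P.items():
        P_ux[(u, x)] += p
        P_s[s] += p
    d = 0.0
    for (u, x, s, y), p in P.items():
        if p == 0:
            continue
        q = P_ux[(u, x)] * P_s[s] * V[(x, s)][y]
        if q == 0:
            return float("inf")   # P puts mass where the reference law has none
        d += p * log2(p / q)
    return d

def in_c_eta_phi(P, V, eta):
    """Membership test for the set C_eta^phi of (9)."""
    return divergence_from_product(P, V) <= eta

# Hypothetical binary AVC, V[(x, s)][y] = V(y | x, s), and two laws with trivial u = 0.
V = {(0, 0): [0.9, 0.1], (0, 1): [0.6, 0.4],
     (1, 0): [0.2, 0.8], (1, 1): [0.5, 0.5]}
# State independent of the input: the law equals P_UX x P_S x V, so the divergence is 0.
P_indep = {(0, x, s, y): 0.25 * V[(x, s)][y]
           for x in (0, 1) for s in (0, 1) for y in (0, 1)}
# State equal to the input: the divergence equals I(UX; S) = 1 bit.
P_copy = {(0, x, x, y): 0.5 * V[(x, x)][y] for x in (0, 1) for y in (0, 1)}
```

In a decoder the argument `P` would be the joint type of a candidate tuple of sequences; the second example shows that a state sequence strongly correlated with the codeword is rejected for small $\eta$.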

A main step of the proof of Proposition 1 will consist of showing that these decoding rules are unambiguous if $\eta$ is sufficiently small. The main ingredient in our proof is a technical lemma (Lemma A.7 in Appendix III) which establishes the existence and the properties of a deterministic broadcast code with degraded message sets for the case at hand; it is an extension of a similar lemma for the single-user case [5, Lemma 4]. The reason for studying only degraded message sets is that, so far, we have been unable to establish a similar technical result for the broadcast channel with non-degraded message sets.

Notice that the Terminal 2 decoding rule is a restricted version of the Terminal 1 decoding rule. In particular, when the communication involves just a common message, both users can implement the restricted decoding rule with $\mathcal{U} = \mathcal{X}$ and $Q_{X|U}(x \mid u) = 1$ iff $u = x$. In that case, as one would expect, this decoding rule reduces to the single-user decoding rule in [5, Definition 3].

When the AVGBC is subject to a state constraint $\Lambda$, the same decoding rules are used, except that instead of the sets $C_\eta^\psi$ and $C_\eta^\phi$ we use the sets $C_\eta^\psi(\Lambda)$ and $C_\eta^\phi(\Lambda)$, respectively, defined by

$$C_\eta^\psi(\Lambda) \triangleq \{P_{USZ} : P_{USZ} \in C_\eta^\psi,\ E[l(S)] \le \Lambda\},$$
$$C_\eta^\phi(\Lambda) \triangleq \{P_{UXSY} : P_{UXSY} \in C_\eta^\phi,\ E[l(S)] \le \Lambda\}.$$

VI. Main Results

The characterization of the deterministic code capacity region for the multiple-access AVC was provided by Jahn [3], assuming that the capacity region has a nonempty interior. It is shown in [7, 8, 10] that the same nonsymmetrizability conditions are necessary either to first show that the above region has a nonempty interior and then apply the elimination technique (as Jahn did) or to directly apply Gubner's approach in order to obtain an achievable region subject to a state constraint. This is due to the fact that in order to apply the elimination technique, a pair of indices, indicating which pair of codebooks is used by the encoders, needs to be sent to the receiver. In sharp contrast, for the AVGBC, if one wishes to establish a non-vanishing common message rate in order to apply the elimination technique, the nonsymmetrizability requirements are just nonsymmetrizability-$\mathcal{X}$ for both channels $V$ and $W$. However, as we dispense with the elimination approach, it is conceivable that a receiver opting to decode both the common and the private message would face a more stringent nonsymmetrizability requirement.

In order to state our main results we begin with auxiliary results which characterize achievable regions, in the absence of a state constraint, via coding/decoding techniques that are suitable for an AVGBC subject to a state constraint.

A. Deterministic coding in the absence of a state constraint

Let $\mathcal{U}$ be a finite set and, for a law $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$, define

$$I^*_{Y;X|U}(Q, V) \triangleq \inf_{r \in \mathcal{D}(\mathcal{S})} I_{Y;X|U}(Q \times rV), \tag{10}$$

$$I^*_{Z;U}(Q, W) \triangleq \inf_{r \in \mathcal{D}(\mathcal{S})} I_{Z;U}(Q \times rW), \tag{11}$$

$$I^*_{Y;X}(Q, V) \triangleq \inf_{r \in \mathcal{D}(\mathcal{S})} I_{Y;X}(Q \times rV). \tag{12}$$
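For a binary state set, the infimum in (12) can be approximated by a one-dimensional grid search over $r = (1 - t, t)$. The sketch below does this for a toy channel; the function names, the grid resolution and the channel are assumptions of this example. Since mutual information is convex in the channel for a fixed input law, and the averaged channel is affine in $t$, the objective is convex in $t$, so a fine grid gives a reasonable approximation (not an exact value in general).

```python
from math import log2

def mutual_information(p_x, channel):
    """I(X; Y) in bits for an input law p_x[x] and a channel given as channel[x][y]."""
    n_out = len(next(iter(channel.values())))
    p_y = [sum(p_x[x] * channel[x][y] for x in p_x) for y in range(n_out)]
    return sum(p_x[x] * channel[x][y] * log2(channel[x][y] / p_y[y])
               for x in p_x for y in range(n_out)
               if p_x[x] > 0 and channel[x][y] > 0)

def i_star_binary_state(p_x, V, grid=1000):
    """Approximate inf_r I_{Y;X}(Q x rV) for states {0, 1}; returns (value, t)."""
    best, best_t = float("inf"), None
    for k in range(grid + 1):
        t = k / grid                                   # r = (1 - t, t)
        averaged = {x: [(1 - t) * a + t * b
                        for a, b in zip(V[(x, 0)], V[(x, 1)])] for x in p_x}
        value = mutual_information(p_x, averaged)
        if value < best:
            best, best_t = value, t
    return best, best_t

# Hypothetical AVC: state 0 passes the input bit unchanged, state 1 flips it.
V = {(0, 0): [1.0, 0.0], (1, 0): [0.0, 1.0],
     (0, 1): [0.0, 1.0], (1, 1): [1.0, 0.0]}
value, t = i_star_binary_state({0: 0.5, 1: 0.5}, V)
```

In this toy case the uniform state law turns the averaged channel into pure noise, so the search returns zero at $t = 0.5$. The quantities (10) and (11) follow the same pattern with the conditional and the $U$-based mutual informations, respectively.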

Next, let

$$\mathcal{R}_d^{(2)}(\mathcal{U}, Q, V, W) \triangleq \{(R', R, 0) :\ 0 \le R' < I^*_{Y;X|U}(Q, V),\ \ 0 \le R < I^*_{Z;U}(Q, W),\ \ 0 \le R + R' < I^*_{Y;X}(Q, V)\}. \tag{13}$$

This region is identical to the region $\mathcal{R}_{inn}^{yc}(B)$, defined by Jahn [3, Remark IIB6] and proved, as a consequence of Theorem 1, to be achievable if $\mathrm{int}\{\bar{R}_C(V, W)\} \ne \emptyset$.
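Once the three bounds in (13) are known, membership of a rate pair in the region is a matter of three strict inequalities. A minimal sketch follows; the function name and the numerical bounds are hypothetical values chosen for illustration, not results from the paper.

```python
def in_region_rd2(r_private, r_common, i_y_x_given_u, i_z_u, i_y_x):
    """True iff (R', R, 0) satisfies the three strict inequalities of (13)."""
    return (0 <= r_private < i_y_x_given_u      # private rate R' of user 1
            and 0 <= r_common < i_z_u           # common rate R, limited by user 2
            and r_private + r_common < i_y_x)   # sum-rate at user 1

# Hypothetical values of I*_{Y;X|U}, I*_{Z;U} and I*_{Y;X}, in bits.
bounds = dict(i_y_x_given_u=0.5, i_z_u=0.3, i_y_x=0.6)

inside = in_region_rd2(0.4, 0.1, **bounds)      # all three constraints hold
outside = in_region_rd2(0.4, 0.25, **bounds)    # sum-rate 0.65 exceeds 0.6
```

Because the inequalities are strict, a pair that meets a bound with equality is not in the region as defined.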

The following proposition provides sufficient conditions under which there exists a nonempty achievable region in $C_D^{(2)}(V, W) \subset C_D(V, W)$ using deterministic coding for the AVGBC with degraded message sets.

Proposition 1: Suppose there exist a finite nonempty set $\mathcal{U}$ and a law $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$, satisfying $Q_X(x) > 0$ for all $x \in \mathcal{X}$ and $Q_U(u) > 0$ for all $u \in \mathcal{U}$, such that $V$ is nonsymmetrizable-$\mathcal{X}|\mathcal{U}$ and $Q_{X|U} W$ is nonsymmetrizable-$\mathcal{U}$. Then the region $\mathcal{R}_d^{(2)}(\mathcal{U}, Q, V, W)$ is achievable using deterministic coding for the AVGBC with degraded message sets.

A proof is given in Appendix I.

We observe that for any law $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$ satisfying $Q_X(x) > 0$ for all $x \in \mathcal{X}$ and $Q_U(u) > 0$ for all $u \in \mathcal{U}$, if $V$ is nonsymmetrizable-$\mathcal{X}|\mathcal{U}$ and $Q_{X|U} W$ is nonsymmetrizable-$\mathcal{U}$, then $I^*_{Y;X|U}(Q, V)$, $I^*_{Z;U}(Q, W)$, and $I^*_{Y;X}(Q, V)$ are strictly positive. Consequently, $\mathcal{R}_d^{(2)}(\mathcal{U}, Q, V, W)$ has a nonempty interior. To see this, suppose $I^*_{Y;X|U}(Q, V) = 0$. Then there exists some $r \in \mathcal{D}(\mathcal{S})$ with $I_{Y;X|U}(Q \times rV) = 0$. This implies that for each $u$ and $x$ such that $Q(u, x) > 0$, $\sum_s V(y \mid x, s)\, r(s)$ does not depend on $x$. But then taking $\tilde{U}(s \mid u, x) = r(s)$ symmetrizes-$\mathcal{X}|\mathcal{U}$ the AVC $V$, contradicting the assumption. Similarly, if $I^*_{Y;X}(Q, V) = 0$ then there exists some $r \in \mathcal{D}(\mathcal{S})$ with $I_{Y;X}(Q \times rV) = 0$. This implies that $\sum_s V(y \mid x, s)\, r(s)$ does not depend on $x$, and taking $\tilde{U}(s \mid x) = r(s)$ symmetrizes-$\mathcal{X}$ the AVC $V$. The same argument shows that $I^*_{Z;U}(Q, W)$ is strictly positive as well.

Define, similarly to (13), the region

$$\mathcal{R}_d^{(1)}(\mathcal{U}, Q, V, W) \triangleq \{(0, R, R') :\ 0 \le R' < I^*_{Z;X|U}(Q, W),\ \ 0 \le R < I^*_{Y;U}(Q, V),\ \ 0 \le R + R' < I^*_{Z;X}(Q, W)\}.$$

Suppose there exist a finite nonempty set $\mathcal{U}$ and a law $Q \in \mathcal{D}(\mathcal{U} \times \mathcal{X})$, satisfying $Q_X(x) > 0$ for all $x \in \mathcal{X}$ and $Q_U(u) > 0$ for all $u \in \mathcal{U}$, such that $Q_{X|U} V$ is nonsymmetrizable-$\mathcal{U}$ and $W$ is nonsymmetrizable-$\mathcal{X}|\mathcal{U}$.

Then the region Rd (U , Q, V, W ) is an achievable region in CD (V, W ) ⊂ CD (V, W ) using deterministic coding. Using similar arguments as in the observation following Proposition 1, (1)

the region Rd (U , Q, V, W ) has a nonempty interior. As a result, if both QX|U V and QX|U W (1) are nonsymmetrizable-U , then any convex combination of the regions Rd (U , Q, V, W ) (2)

and Rd (U , Q, V, W ), both of which are nonempty, is achievable for the AVGBC under deterministic coding. As observed in section IV, the nonsymmetrizability-X |U is a more stringent condition than nonsymmetrizability-X . Nevertheless, as already mentioned, the nonsymmetrizabilityX |U requirement on V is a consequence of our Terminal 1 decoder, while the nonsymmetrizabilityU requirement on QX|U W is a consequence of our choice of a decoder for Terminal 2 (as mentioned in section V). Furthermore, as observed in section IV, the nonsymmetrizabilityU requirement on the channel QX|U W in Proposition 1 is the most stringent condition compared to all other nonsymmetrizability conditions. It is possible, however, that for a given AVGBC (V, W ), a set U and a law Q ∈ D (U × X ) will not satisfy the nonsymmetrizability-U condition for the channel QX|U W , whereas the channel V is nonsymmetrizable-X |U and the channel W is nonsymmetrizableX (this means that reliable communication should be possible for the channel W ). In that case, one could implement at Terminal 2, the first two stages of Terminal 1 decoding rule (Definition 11) 2 . The trade-off, however, is that instead of bounding the achievable common ∗ rate R by IZ;U (Q, W ) (see eq. 13), the suggested decoding rule imposes an upper bound on

the achievable sum-rate, as follows.

Proposition 2: Suppose there exist a finite nonempty set $\mathcal{U}$ and a law $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$, satisfying $Q_X(x) > 0$ for all $x \in \mathcal{X}$ and $Q_U(u) > 0$ for all $u \in \mathcal{U}$, such that $V$ is nonsymmetrizable-$\mathcal{X}|\mathcal{U}$ and $W$ is nonsymmetrizable-$\mathcal{X}$. Then the region $\tilde{\mathcal{R}}_d^{(2)}(\mathcal{U},Q,V,W)$ defined by
\[
\tilde{\mathcal{R}}_d^{(2)}(\mathcal{U},Q,V,W) \triangleq \Big\{ (R',R,0) :\ 0 \le R' < I^*_{Y;X|U}(Q,V),\ \ 0 \le R+R' < \min\big\{ I^*_{Y;X}(Q,V),\ I^*_{Z;X}(Q,W) \big\} \Big\}
\]
is achievable using deterministic coding for the AVGBC with degraded message sets.

Footnote 2: Notice that stage 3 of Definition 11 is not needed because Terminal 2 need not decide between various competitors for the private message sent to user 1; that is why the channel $W$ need only be nonsymmetrizable-$\mathcal{X}$.

The proof of Proposition 2 follows by arguments similar to those used in the proof of Proposition 1. Similarly, when the channel $W$ is nonsymmetrizable-$\mathcal{X}|\mathcal{U}$ and the channel $V$ is nonsymmetrizable-$\mathcal{X}$, we obtain, using arguments similar to those of Proposition 2, the following achievable region in $\mathcal{C}_D^{(1)}(V,W)$:
\[
\tilde{\mathcal{R}}_d^{(1)}(\mathcal{U},Q,V,W) \triangleq \Big\{ (0,R,R') :\ 0 \le R' < I^*_{Z;X|U}(Q,W),\ \ 0 \le R+R' < \min\big\{ I^*_{Y;X}(Q,V),\ I^*_{Z;X}(Q,W) \big\} \Big\}.
\]

When the communication involves just a common message, both users can implement the restricted decoding rule of Terminal 2 as specified in section V. Consequently, the sufficient conditions of Proposition 1 can be reduced to nonsymmetrizability-$\mathcal{X}$ of both $V$ and $W$, which according to [11] is also a necessary condition. As mentioned in section I, this result could be obtained using Theorem 1 (due to Jahn [3]) together with the Csiszár-Narayan characterization of the single-user AVC [5]. The main contribution of Proposition 1 is therefore the much stronger result and its proof, which are valid for less restrictive communication situations (in the sense that the channel might be symmetrizable) that involve state constraints, for which Ahlswede's elimination technique, used by Jahn, is no longer applicable.

B. Deterministic coding in the presence of state and input constraints

For notational convenience, we define
\[
g(Q_X) = \sum_{x \in \mathcal{X}} Q_X(x)\, g(x), \qquad l(r) = \sum_{s \in \mathcal{S}} r(s)\, l(s).
\]

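Next, define
\[
I^*_{Y;X|U}(Q,V,\Lambda) \triangleq \inf_{r \in \mathcal{D}(\mathcal{S}):\ l(r) \le \Lambda} I_{Y;X|U}(Q \times rV), \tag{14}
\]
\[
I^*_{Z;U}(Q,W,\Lambda) \triangleq \inf_{r \in \mathcal{D}(\mathcal{S}):\ l(r) \le \Lambda} I_{Z;U}(Q \times rW), \tag{15}
\]
\[
I^*_{Y;X}(Q,V,\Lambda) \triangleq \inf_{r \in \mathcal{D}(\mathcal{S}):\ l(r) \le \Lambda} I_{Y;X}(Q \times rV), \tag{16}
\]
and let
\[
\mathcal{R}_d^{(2)}(\mathcal{U},Q,V,W,\Lambda) \triangleq \Big\{ (R',R,0) :\ 0 \le R' < I^*_{Y;X|U}(Q,V,\Lambda),\ \ 0 \le R < I^*_{Z;U}(Q,W,\Lambda),\ \ 0 \le R+R' < I^*_{Y;X}(Q,V,\Lambda) \Big\}. \tag{17}
\]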

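Next, define the functionals
\[
\Lambda_V(Q_X) \triangleq \min_{\tilde{U} \in \tilde{\mathcal{D}}(\mathcal{S}|\mathcal{X})} \sum_{x \in \mathcal{X}} \sum_{s \in \mathcal{S}} Q_X(x)\, \tilde{U}(s \mid x)\, l(s),
\]
where $\tilde{\mathcal{D}}(\mathcal{S}|\mathcal{X})$ denotes the set of all channels $\tilde{U} \in \mathcal{D}(\mathcal{S}|\mathcal{X})$ satisfying (6),
\[
\Lambda_V(Q,\mathcal{U}) \triangleq \min_{\tilde{U} \in \tilde{\mathcal{D}}(\mathcal{S}|\mathcal{U}\times\mathcal{X})} \sum_{u \in \mathcal{U}} \sum_{x \in \mathcal{X}} \sum_{s \in \mathcal{S}} Q(u,x)\, \tilde{U}(s \mid u,x)\, l(s),
\]
where $\tilde{\mathcal{D}}(\mathcal{S}|\mathcal{U}\times\mathcal{X})$ denotes the set of all channels $\tilde{U} \in \mathcal{D}(\mathcal{S}|\mathcal{U}\times\mathcal{X})$ satisfying (7), and finally
\[
\Lambda_V(Q_U) \triangleq \min_{\tilde{U} \in \tilde{\mathcal{D}}(\mathcal{S}|\mathcal{U})} \sum_{u \in \mathcal{U}} \sum_{s \in \mathcal{S}} Q_U(u)\, \tilde{U}(s \mid u)\, l(s),
\]
where $\tilde{\mathcal{D}}(\mathcal{S}|\mathcal{U})$ denotes the set of all channels $\tilde{U} \in \mathcal{D}(\mathcal{S}|\mathcal{U})$ satisfying (8). The functionals $\Lambda_W(Q_X)$, $\Lambda_W(Q,\mathcal{U})$ and $\Lambda_W(Q_U)$ are defined similarly. The functionals $\Lambda_V(\cdot)$ are extensions of the single-user functional $\Lambda_0(P)$ defined in [5, equation (2.13)]. As in the single-user case, $\Lambda_V(Q,\mathcal{U})$ is a continuous function of $Q$ if $\tilde{\mathcal{D}}(\mathcal{S}|\mathcal{U}\times\mathcal{X}) \neq \emptyset$, i.e., if $V$ is symmetrizable-$\mathcal{X}|\mathcal{U}$, whereas $\Lambda_V(Q,\mathcal{U}) \triangleq \infty$ if $V$ is nonsymmetrizable-$\mathcal{X}|\mathcal{U}$. The following theorem provides sufficient conditions under which there exists a nonempty achievable region in $\mathcal{C}_D^{(2)}(V,W,\Lambda,g_m)$ for the AVGBC with degraded message sets subject to a state constraint.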

Theorem 2: Given $\Lambda > 0$ and arbitrarily small $\alpha > 0$, for any sufficiently large block length $n$ and for any nonempty set $\mathcal{U}$ and a law $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$ satisfying
\[
\Lambda_V(Q,\mathcal{U}) \ge \Lambda + \alpha, \qquad \Lambda_W(Q_U) \ge \Lambda + \alpha, \qquad Q_X(x) > 0 \ \ \forall x \in \mathcal{X}, \qquad Q_U(u) > 0 \ \ \forall u \in \mathcal{U}, \tag{18}
\]
the region $\mathcal{R}_d^{(2)}(\mathcal{U},Q,V,W,\Lambda)$ is achievable using deterministic coding for the AVGBC with degraded message sets.

Analogously to the relations between the various symmetrizability definitions, observed in section IV, we have for any set $\mathcal{U}$ and law $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$ the following relations between the functionals $\Lambda_V(Q_U)$, $\Lambda_V(Q,\mathcal{U})$ and $\Lambda_V(Q_X)$:
\[
\Lambda_V(Q_U) \le \Lambda_V(Q,\mathcal{U}) \le \Lambda_V(Q_X).
\]
Observe that when $\tilde{U}$ satisfies (6), then $\tilde{U}$ satisfies (7); however, it may happen that $\tilde{U}$ satisfies (7) but not (6). Therefore $\Lambda_V(Q,\mathcal{U}) \le \Lambda_V(Q_X)$. A similar argument establishes the first inequality. Hence, it is possible that a given AVGBC $(V,W)$, a set $\mathcal{U}$ and a law $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$ fail to satisfy $\Lambda_W(Q_U) \ge \Lambda + \alpha$ in (18), while $\Lambda_V(Q,\mathcal{U}) \ge \Lambda + \alpha$ and $\Lambda_W(Q_X) \ge \Lambda + \alpha$. In that case, one could implement, as already suggested, stages 1 and 2 of the Terminal 1 decoding rule (Definition 11) instead of the regular Terminal 2 decoding rule (Definition 12). As a result, we have the following.

Theorem 3: Given $\Lambda > 0$ and arbitrarily small $\alpha > 0$, for any sufficiently large block length $n$ and for any nonempty set $\mathcal{U}$ and a law $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$ satisfying
\[
\Lambda_V(Q,\mathcal{U}) \ge \Lambda + \alpha, \qquad \Lambda_W(Q_X) \ge \Lambda + \alpha, \qquad Q_X(x) > 0 \ \ \forall x \in \mathcal{X}, \qquad Q_U(u) > 0 \ \ \forall u \in \mathcal{U},
\]
the region $\tilde{\mathcal{R}}_d^{(2)}(\mathcal{U},Q,V,W,\Lambda)$ defined by
\[
\tilde{\mathcal{R}}_d^{(2)}(\mathcal{U},Q,V,W,\Lambda) \triangleq \Big\{ (R',R,0) :\ 0 \le R' < I^*_{Y;X|U}(Q,V,\Lambda),\ \ 0 \le R+R' < \min\big\{ I^*_{Y;X}(Q,V,\Lambda),\ I^*_{Z;X}(Q,W,\Lambda) \big\} \Big\}
\]

is achievable using deterministic coding for the AVGBC with degraded message sets. The proof of this result follows via arguments similar to those used in the proof of Theorem 2.

When the transmission is subject to an input constraint $\Gamma$, the addition of the constraint $g(Q_X) \le \Gamma$ in Propositions 1-2 and Theorems 2-3 establishes achievable regions in $\mathcal{C}_D^{(2)}(V,W,l_m,\Gamma)$, $\mathcal{C}_D^{(1)}(V,W,l_m,\Gamma)$ and $\mathcal{C}_D^{(2)}(V,W,\Lambda,\Gamma)$. Similarly, a proper change of the conditions in (18) yields achievable regions in $\mathcal{C}_D^{(1)}(V,W,\Lambda,g_m)$ and $\mathcal{C}_D^{(1)}(V,W,\Lambda,\Gamma)$.

We consider next the case where the communication involves just a common message. Letting $\mathcal{U} = \mathcal{X}$ and assuming that both users implement the restricted decoding rule as specified in section V, the conditions (18) of Theorem 2 can be reduced to
\[
\Lambda_V(Q_X) \ge \Lambda + \alpha, \qquad \Lambda_W(Q_X) \ge \Lambda + \alpha.
\]
Hence, we have the following corollary.

Corollary: The common message capacity region of the AVGBC under state constraint has a nonempty interior if, for an arbitrarily small $\alpha > 0$, a type $P$ on $\mathcal{X}$ exists such that
\[
\Lambda_V(P) \ge \Lambda + \alpha, \qquad \Lambda_W(P) \ge \Lambda + \alpha. \tag{19}
\]

Notice that, according to [5, Lemma 1], setting $\alpha = 0$ in (19) yields necessary conditions for the common message capacity region to have a nonempty interior.

In the absence of a state constraint, we can establish using Proposition 1 that the convex hull of the regions $\mathcal{R}_d^{(2)}(\mathcal{U},Q,V,W)$ and $\mathcal{R}_d^{(1)}(\mathcal{U},Q,V,W)$ is achievable via the time-sharing principle, hence establishing an achievable region in $\mathcal{C}_D(V,W)$ (a similar result could be obtained using Proposition 2 for the regions $\tilde{\mathcal{R}}_d^{(2)}(\mathcal{U},Q,V,W)$ and $\tilde{\mathcal{R}}_d^{(1)}(\mathcal{U},Q,V,W)$). This, however, is not the case in the presence of a state constraint. Defining the region $\mathcal{R}_d^{(1)}(\mathcal{U},Q,V,W,\Lambda)$ in a similar way to (17), all that can be guaranteed is that the union of the regions $\mathcal{R}_d^{(2)}(\mathcal{U},Q,V,W,\Lambda)$ and $\mathcal{R}_d^{(1)}(\mathcal{U},Q,V,W,\Lambda)$ is achievable. The reason is that time-sharing is no longer permissible in the presence of a state constraint (see also [9] for the AVMAC subject to a state constraint).

C. AVBC with independent state sets

Certain AVBCs may have the property that the channels $V$ and $W$ are ruled by independent state sets $\mathcal{S}_V$ and $\mathcal{S}_W$, respectively; i.e., $(V,W)$ is a pair of transition probabilities from $\mathcal{X}\times\mathcal{S}_V$ into $\mathcal{Y}$ and from $\mathcal{X}\times\mathcal{S}_W$ into $\mathcal{Z}$, while the channel's arbitrarily varying states $s_v \in \mathcal{S}_V$ and $s_w \in \mathcal{S}_W$ are chosen independently. If we set $\mathcal{S} = \mathcal{S}_V \times \mathcal{S}_W$, then these AVBCs are special cases of our model (this restrictive model was first suggested by Jahn [3, Remark IIB8]). Nonetheless, this case can be handled via our foregoing analysis as follows. Definitions 2 and 6 for achievable rates should require the existence of a deterministic $(n, M_y, M_c, M_z)$-code with probabilities of decoding error
\[
e_\phi(\mathbf{s}_v) \le \epsilon \quad \forall\, \mathbf{s}_v \in \mathcal{S}_V^n, \qquad e_\psi(\mathbf{s}_w) \le \epsilon \quad \forall\, \mathbf{s}_w \in \mathcal{S}_W^n.
\]

We now consider the state constraint case. Let $l_v(s_v)$ and $l_w(s_w)$ be given nonnegative functions on $\mathcal{S}_V$ and $\mathcal{S}_W$, respectively, and define $l_v(\mathbf{s}_v)$ and $l_w(\mathbf{s}_w)$ for sequences similarly to $l(\mathbf{s})$. We extend Definition 7 as follows.

Definition 13: A pair of nonnegative real numbers $(R,R')$ is said to be achievable under input constraint $\Gamma$ and state constraints $(\Lambda_v,\Lambda_w)$, over the AVBC with independent state sets and degraded message sets, if for every $\epsilon > 0$, $\delta > 0$, and sufficiently large $n$, a deterministic $(n,M,M')$-code exists with codewords $\{\mathbf{x}_{ij}\}$, $i = 1,\ldots,M$, $j = 1,\ldots,M'$, each satisfying $g(\mathbf{x}_{ij}) \le \Gamma$, code rates satisfying
\[
\frac{1}{n}\log M > R - \delta, \qquad \frac{1}{n}\log M' > R' - \delta,
\]
and with probabilities of error $e_\phi(\mathbf{s}_v) \le \epsilon$ and $e_\psi(\mathbf{s}_w) \le \epsilon$ for all $(\mathbf{s}_v,\mathbf{s}_w) \in \mathcal{S}_V^n \times \mathcal{S}_W^n$ with $l_v(\mathbf{s}_v) \le \Lambda_v$ and $l_w(\mathbf{s}_w) \le \Lambda_w$.

Setting $\Lambda = (\Lambda_v,\Lambda_w)$, no change is necessary in the various notations for the sets of achievable rates. Using the above definitions for achievable rates, we need only consider the respective marginal state sets while checking the required nonsymmetrizability conditions and the conditions on the functionals $\Lambda_V(\cdot)$ and $\Lambda_W(\cdot)$ in Propositions 1-2 and Theorems 2-3. For example, in Definition 8, the AVC $V$ is said to be symmetrizable-$\mathcal{X}$ if there exists a transition probability $\tilde{U}$ from $\mathcal{X}$ into $\mathcal{S}_V$ such that
\[
\sum_{s_v \in \mathcal{S}_V} V(y \mid x, s_v)\, \tilde{U}(s_v \mid x') = \sum_{s_v \in \mathcal{S}_V} V(y \mid x', s_v)\, \tilde{U}(s_v \mid x), \qquad \forall\, x \in \mathcal{X},\ x' \in \mathcal{X},\ y \in \mathcal{Y}.
\]
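The symmetrizability-$\mathcal{X}$ identity above can be checked mechanically for small alphabets. The following is a minimal sketch, not part of the original analysis: it tests whether a candidate channel $\tilde{U}$ satisfies the identity, using the two binary channels of the example in Section VII as test cases. The function and variable names (`symmetrizes`, `identity_jammer`, `constant_jammer`) are hypothetical and chosen only for illustration.

```python
from itertools import product


def symmetrizes(channel, u_tilde, inputs, states, outputs, tol=1e-12):
    """Return True if u_tilde(s | x) satisfies, for every x, x' and y,
    sum_s channel(y|x,s) u_tilde(s|x') == sum_s channel(y|x',s) u_tilde(s|x)."""
    for x, x_alt, y in product(inputs, inputs, outputs):
        lhs = sum(channel(y, x, s) * u_tilde(s, x_alt) for s in states)
        rhs = sum(channel(y, x_alt, s) * u_tilde(s, x) for s in states)
        if abs(lhs - rhs) > tol:
            return False
    return True


BITS = (0, 1)


def v_adder(y, x, s):
    # V(y | x, s_v) = 1 if y = x + s_v (real addition), as in Section VII
    return 1.0 if y == x + s else 0.0


def w_xor(z, x, s):
    # W(z | x, s_w) = 1 if z = x + s_w modulo 2, as in Section VII
    return 1.0 if z == (x + s) % 2 else 0.0


def identity_jammer(s, x):
    # hypothetical candidate: the state copies the input symbol
    return 1.0 if s == x else 0.0


def constant_jammer(s, x):
    # hypothetical candidate: the state is always 0, regardless of x
    return 1.0 if s == 0 else 0.0


if __name__ == "__main__":
    for name, ch, outs in (("V", v_adder, (0, 1, 2)), ("W", w_xor, BITS)):
        print(name,
              symmetrizes(ch, identity_jammer, BITS, BITS, outs),
              symmetrizes(ch, constant_jammer, BITS, BITS, outs))
```

For both channels the identity candidate passes the test, because $x + x'$ and $x \oplus x'$ are symmetric in $(x, x')$, whereas the constant candidate fails. The check only verifies a given $\tilde{U}$; it does not search for one.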

Similar modifications can be made in the Terminal 1 and Terminal 2 decoding rules (Definitions 11 and 12), i.e., narrowing the search for the state sequence $\mathbf{s} \in \mathcal{S}^n$ only to the respective marginal states. Setting $\mathcal{S} = \mathcal{S}_V \times \mathcal{S}_W$, no change should be made in the sub-codewords and codewords Lemma (Appendix III).

VII. Example

Let $\mathcal{X} = \mathcal{S}_V = \mathcal{S}_W = \mathcal{Z} = \{0,1\}$, $\mathcal{Y} = \{0,1,2\}$. Let $V(y \mid x, s_v) = 1$ if $y = x + s_v$, and 0 otherwise, and let $W(z \mid x, s_w) = 1$ if $z = x + s_w$ modulo 2, and 0 otherwise. The channel's arbitrary states $s_v \in \mathcal{S}_V$ and $s_w \in \mathcal{S}_W$ are chosen independently. Both $V$ and $W$ are symmetrizable-$\mathcal{X}$; consequently, the previous results by Jahn [3] cannot establish any achievable region for the case at hand (see [5, Example 1, 2] for their characterization as a single-user AVC). We will demonstrate using Theorem 2 an achievable region in $\mathcal{C}_D^{(2)}(V,W,\Lambda_{(v,w)},\Gamma)$ for the AVGBC $(V,W)$ under state constraints $\Lambda_{(v,w)} = (\Lambda_v,\Lambda_w)$ and input constraint $\Gamma$, where the functions $g(x) = x$, $l_v(s_v) = s_v$ and $l_w(s_w) = s_w$ are used for the input and state constraints, i.e., $g(\mathbf{x})$, $l_v(\mathbf{s}_v)$ and $l_w(\mathbf{s}_w)$ are the respective normalized Hamming weights of the binary $n$-sequences $\mathbf{x} \in \mathcal{X}^n$, $\mathbf{s}_v \in \mathcal{S}_V^n$ and $\mathbf{s}_w \in \mathcal{S}_W^n$.

Let $\mathcal{U} = \{0,1\}$ and let $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$. We first evaluate $\Lambda_W(Q_U)$. Notice that $Q_{X|U}W$ is symmetrizable-$\mathcal{U}$ iff the channel $\tilde{U} \in \mathcal{D}(\mathcal{S}_W|\mathcal{U})$ satisfying (8) satisfies
\[
\tilde{U}(0|1)\big[ 2Q_{X|U}(0|0) - 1 \big] = \tilde{U}(0|0)\big[ 1 - 2Q_{X|U}(1|1) \big] + Q_{X|U}(0|0) + Q_{X|U}(1|1) - 1.
\]
For convenience, we choose $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$ such that $Q_{X|U}(0|0) = Q_{X|U}(1|1)$. With this choice, and provided $Q_{X|U}(0|0) \neq 1/2$, the condition above reduces to $\tilde{U}(0|1) = 1 - \tilde{U}(0|0)$. Hence,
\[
\Lambda_W(Q_U) = \min_{\tilde{U} \in \tilde{\mathcal{D}}(\mathcal{S}_W|\mathcal{U})} \sum_{u \in \mathcal{U}} Q_U(u)\, \tilde{U}(1 \mid u) = \min_{\tilde{U} \in \tilde{\mathcal{D}}(\mathcal{S}_W|\mathcal{U})} \Big\{ Q_U(0)\big( 1 - \tilde{U}(0|0) \big) + \big(1 - Q_U(0)\big)\, \tilde{U}(0|0) \Big\}.
\]
It follows that $\Lambda_W(Q_U) \ge \Lambda_w + \alpha$ iff $\min\{Q_U(0), 1 - Q_U(0)\} \ge \Lambda_w + \alpha$. Therefore, (18) is satisfied only if $\Lambda_w < 1/2$. We proceed to evaluate $\Lambda_V(Q,\mathcal{U})$. Notice that $V$ is symmetrizable-$\mathcal{X}|\mathcal{U}$ iff $\tilde{U} \in \mathcal{D}(\mathcal{S}_V|\mathcal{U}\times\mathcal{X})$ satisfying (7) is an identity channel from $\mathcal{X}$ to $\mathcal{S}_V$. Hence,
\[
\Lambda_V(Q,\mathcal{U}) = \sum_{u \in \{0,1\}} \sum_{x \in \{0,1\}} Q(u,x)\, \tilde{U}(1 \mid u,x) = Q(0,1) + Q(1,1) = Q_X(1).
\]
Since $E\,g(X) = Q_X(1) \le \Gamma$, (18) is satisfied only if $\Lambda_v < \Gamma$. We continue assuming $\Lambda_w = \Lambda_v = \Lambda < \min\{\Gamma, 1/2\}$. Under this assumption it can be shown that $Q_U(0)$ and $Q_{X|U}(0|0)$, which under the assumption $Q_{X|U}(0|0) = Q_{X|U}(1|1)$ completely characterize the law $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$, can be chosen such that (18) is satisfied.
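The closed forms just derived make the conditions in (18), together with the input constraint, easy to test for a candidate law $Q$. The sketch below is illustrative only. It assumes the symmetric choice $Q_{X|U}(0|0) = Q_{X|U}(1|1) = a$ with $a \neq 1/2$, evaluates $\Lambda_W(Q_U)$ by a brute-force grid over $\tilde{U}(0|0)$ instead of using the closed form, and uses hypothetical function names throughout.

```python
def lambda_w_qu(qu0, grid=1001):
    """Lambda_W(Q_U) for the XOR channel under Q_{X|U}(0|0) = Q_{X|U}(1|1):
    minimise Q_U(0)(1 - t) + (1 - Q_U(0)) t over t = U~(0|0) in [0, 1]."""
    return min(
        qu0 * (1 - k / (grid - 1)) + (1 - qu0) * (k / (grid - 1))
        for k in range(grid)
    )


def lambda_v_q(qu0, a):
    """Lambda_V(Q, U) = Q_X(1), with a = Q_{X|U}(0|0) = Q_{X|U}(1|1)."""
    return qu0 * (1 - a) + (1 - qu0) * a


def satisfies_18(qu0, a, lam, gamma, alpha):
    """Conditions (18) for the example, plus the input constraint Q_X(1) <= Gamma."""
    qx1 = lambda_v_q(qu0, a)
    return (
        0 < qu0 < 1                            # Q_U(u) > 0 for both u
        and 0 < qx1 < 1                        # Q_X(x) > 0 for both x
        and qx1 <= gamma                       # input constraint
        and qx1 >= lam + alpha                 # Lambda_V(Q, U) >= Lambda + alpha
        and lambda_w_qu(qu0) >= lam + alpha    # Lambda_W(Q_U) >= Lambda + alpha
    )


if __name__ == "__main__":
    # Q_U(0) = 0.375 and Q_{X|U}(0|1) = 0.65, i.e. a = 0.35, with Lambda = 1/8, Gamma = 1/2
    print(satisfies_18(0.375, 0.35, 0.125, 0.5, 1e-3))
```

Because the objective is linear in $\tilde{U}(0|0)$, the grid minimum is attained at an endpoint and agrees with $\min\{Q_U(0), 1 - Q_U(0)\}$.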

Next, we calculate the achievable region $\mathcal{R}_d^{(2)}\big(\mathcal{U},Q,V,W,\Lambda_{(v,w)}\big) \subset \mathcal{C}_D^{(2)}(V,W,\Lambda_{(v,w)},\Gamma)$ for the AVGBC $(V,W)$ under state constraints $\Lambda_{(v,w)}$ and input constraint $\Gamma$. Let $P_{UXS_WZ} = Q \times rW$, where $r \in \mathcal{D}(\mathcal{S}_W)$. We have
\[
\begin{aligned}
I(U;Z) &= H(Z) - H(Z|U) \\
&= h\Big[ Q_U(0)\big( 1 - P_{S_W}(1) - Q_{X|U}(0|0) + 2P_{S_W}(1)Q_{X|U}(0|0) \big) + \big(1 - Q_U(0)\big)\big( P_{S_W}(1) + Q_{X|U}(0|0) - 2P_{S_W}(1)Q_{X|U}(0|0) \big) \Big] \\
&\quad - h\big[ P_{S_W}(1) + Q_{X|U}(0|0) - 2P_{S_W}(1)Q_{X|U}(0|0) \big],
\end{aligned} \tag{20}
\]
where $h(t) \triangleq -t\log t - (1-t)\log(1-t)$ is the binary entropy function. Define $I_1(p,q) \triangleq h(p * q) - h(q)$, where $p * q = pq + (1-p)(1-q)$. By standard properties of mutual information, $I_1(p,q)$ is concave in $p$, convex in $q$ and minimized when $q = 1/2$. Since $E\,l_w(S_W) = P_{S_W}(1) \le \Lambda_w = \Lambda < 1/2$, it follows from (20) that
\[
\inf_{r \in \mathcal{D}(\mathcal{S}_W):\ l(r) \le \Lambda_w} I_{Z;U}(Q \times rW) = I_1\big( 1 - Q_U(0),\ \Lambda + Q_{X|U}(0|0) - 2\Lambda Q_{X|U}(0|0) \big). \tag{21}
\]
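As a numerical sanity check of the step from (20) to (21), one can scan $P_{S_W}(1)$ over $[0,\Lambda]$ and confirm that the mutual information in (20) is smallest at $P_{S_W}(1) = \Lambda$. The sketch below is illustrative only; it assumes base-2 logarithms (the text does not fix the base) and the symmetric choice $Q_{X|U}(0|0) = Q_{X|U}(1|1) = a$, and its function names are hypothetical.

```python
import math


def h(t):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)


def i_uz(qu0, a, ps1):
    """I(U;Z) as in (20), with a = Q_{X|U}(0|0) = Q_{X|U}(1|1)."""
    q = ps1 + a - 2 * ps1 * a              # P(Z = 1 | U = 1)
    pz1 = qu0 * (1 - q) + (1 - qu0) * q    # P(Z = 1)
    return h(pz1) - h(q)


def i1(p, q):
    """I_1(p, q) = h(p * q) - h(q) with p * q = pq + (1-p)(1-q)."""
    return h(p * q + (1 - p) * (1 - q)) - h(q)


def worst_case_state(qu0, a, lam, steps=2000):
    """Grid point P_{S_W}(1) in [0, lam] that minimises I(U;Z)."""
    grid = [lam * k / steps for k in range(steps + 1)]
    return min(grid, key=lambda ps1: i_uz(qu0, a, ps1))


if __name__ == "__main__":
    qu0, a, lam = 0.375, 0.35, 0.125
    print(worst_case_state(qu0, a, lam), i_uz(qu0, a, lam))
```

For the parameters used here the minimiser on the grid is the endpoint $\Lambda$, and the minimum coincides with the right-hand side of (21). This is a spot check for one law $Q$, not a proof of the convexity claim.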

Let $P_{UXS_VY} = Q \times rV$, where $r \in \mathcal{D}(\mathcal{S}_V)$. We have
\[
\begin{aligned}
I(X;Y) &= H(Y) - H(Y|X) \\
&= H\Big[ Q_X(1)P_{S_V}(1),\ \big(1 - Q_X(1)\big)\big(1 - P_{S_V}(1)\big),\ Q_X(1) + P_{S_V}(1) - 2Q_X(1)P_{S_V}(1) \Big] - h\big( P_{S_V}(1) \big),
\end{aligned} \tag{22}
\]
where $H(t,u,v) \triangleq -t\log t - u\log u - v\log v$. Defining
\[
I_2(p,q) \triangleq H\big( pq,\ (1-p)(1-q),\ p + q - 2pq \big) - h(q),
\]
it can be verified that $I_2(p,q)$ is concave in $p$, convex in $q$ and minimized when $q = 1/2$. Since $E\,l_v(S_V) = P_{S_V}(1) \le \Lambda_v = \Lambda < 1/2$, it follows from (22) that
\[
\inf_{r \in \mathcal{D}(\mathcal{S}_V):\ l(r) \le \Lambda_v} I_{Y;X}(Q \times rV) = I_2\big( Q_X(1), \Lambda \big). \tag{23}
\]
Using arguments similar to those leading to (23), we have
\[
\inf_{r \in \mathcal{D}(\mathcal{S}_V):\ l(r) \le \Lambda_v} I_{Y;X|U}(Q \times rV) = Q_U(0)\, I_2\big( 1 - Q_{X|U}(0|0), \Lambda \big) + \big(1 - Q_U(0)\big)\, I_2\big( Q_{X|U}(0|0), \Lambda \big). \tag{24}
\]

Hence, according to (14)-(17), (21)-(24) and Theorem 2, assuming that $\Lambda_v = \Lambda_w = \Lambda < \min\{\Gamma, \tfrac{1}{2}\}$, the region $\mathcal{R}_d^{(2)}(Q) \subset \mathcal{C}_D^{(2)}\big(V,W,\Lambda_{(v,w)},\Gamma\big)$ defined by
\[
\begin{aligned}
\mathcal{R}_d^{(2)}(Q) \triangleq \Big\{ (R',R,0) :\ & 0 \le R' < Q_U(0)\, I_2\big( 1 - Q_{X|U}(0|0), \Lambda \big) + \big(1 - Q_U(0)\big)\, I_2\big( Q_{X|U}(0|0), \Lambda \big), \\
& 0 \le R < I_1\big( 1 - Q_U(0),\ \Lambda + Q_{X|U}(0|0) - 2\Lambda Q_{X|U}(0|0) \big), \\
& 0 \le R + R' < I_2\big( Q_X(1), \Lambda \big) \Big\}
\end{aligned} \tag{25}
\]
is achievable for the given AVGBC under state and input constraints for suitable choices of $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$. Figure 2 sketches three achievable regions according to (25) for $\Lambda = 1/8$ and $\Gamma = 1/2$. As mentioned, $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$ has been chosen such that $Q_{X|U}(0|0) = Q_{X|U}(1|1)$. Setting $Q_U(0) = 0.375$ and $Q_{X|U}(0|1) = 0.65$, $0.75$ and $0.9$ satisfies the conditions in (18) for the case at hand (all three regions are rectangles, since the sum-rate bound is redundant for these specific choices of $\mathcal{U}$ and $Q$).

Using arguments analogous to those of Theorem 2 and steps similar to those leading to (25), assuming that $\Lambda_v = \Lambda_w = \Lambda < \min\{\Gamma, \tfrac{1}{2}\}$, the region $\mathcal{R}_d^{(1)}(Q) \subset \mathcal{C}_D^{(1)}\big(V,W,\Lambda_{(v,w)},\Gamma\big)$ defined by
\[
\begin{aligned}
\mathcal{R}_d^{(1)}(Q) \triangleq \Big\{ (R,R') :\ & 0 \le R' < I_1\big( Q_{X|U}(0|0), \Lambda \big), \\
& 0 \le R < \min_{0 \le \lambda \le \Lambda} \Big\{ I_2\big( Q_X(1), \lambda \big) - Q_U(0)\, I_2\big( 1 - Q_{X|U}(0|0), \lambda \big) - \big(1 - Q_U(0)\big)\, I_2\big( Q_{X|U}(0|0), \lambda \big) \Big\}, \\
& 0 \le R + R' < I_1\big( Q_X(1), \Lambda \big) \Big\}
\end{aligned}
\]
is achievable for the given AVGBC under state and input constraints for suitable choices of $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$ ($Q$ must satisfy the input constraint and inequalities analogous to (18)).

Any union of sets $\mathcal{R}_d^{(2)}(Q)$ and $\mathcal{R}_d^{(1)}(Q)$ over suitable laws $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$ is also achievable (the convex hull is not allowed here, since the communication is subject to a state constraint). Notice that Theorem 2 may provide a wider achievable region than the one presented in this example, due to the specific choice of the set $\mathcal{U}$ and the specific choice of the law $Q \in \mathcal{D}(\mathcal{U}\times\mathcal{X})$.
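The three bounds in (25) are simple to evaluate. The following is a minimal sketch, not the authors' code, that computes them for the parameter choices reported for Figure 2. It assumes base-2 logarithms, which the text does not specify, and all function names are hypothetical.

```python
import math


def h(t):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)


def entropy3(t, u, v):
    """H(t, u, v) = -t log t - u log u - v log v, skipping zero masses."""
    return -sum(p * math.log2(p) for p in (t, u, v) if p > 0)


def i1(p, q):
    """I_1(p, q) = h(p * q) - h(q) with p * q = pq + (1-p)(1-q)."""
    return h(p * q + (1 - p) * (1 - q)) - h(q)


def i2(p, q):
    """I_2(p, q) = H(pq, (1-p)(1-q), p + q - 2pq) - h(q)."""
    return entropy3(p * q, (1 - p) * (1 - q), p + q - 2 * p * q) - h(q)


def region_25(qu0, a, lam):
    """Bounds of (25) on R', R and R + R', with a = Q_{X|U}(0|0) = Q_{X|U}(1|1)."""
    qx1 = qu0 * (1 - a) + (1 - qu0) * a
    r_private = qu0 * i2(1 - a, lam) + (1 - qu0) * i2(a, lam)
    r_common = i1(1 - qu0, lam + a - 2 * lam * a)
    r_sum = i2(qx1, lam)
    return r_private, r_common, r_sum


if __name__ == "__main__":
    for q01 in (0.65, 0.75, 0.9):  # Q_{X|U}(0|1) = 1 - a
        rp, rc, rs = region_25(0.375, 1 - q01, 0.125)
        print(f"Q(0|1)={q01}: R' < {rp:.4f}, R < {rc:.4f}, R + R' < {rs:.4f}")
```

For all three choices the individual bounds on $R'$ and $R$ add up to less than the sum-rate bound, in line with the remark that the regions are rectangles.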

[Figure 2: plot of $R'$ (vertical axis, 0 to 0.7) versus $R$ (horizontal axis, 0 to 0.35), with three regions labeled $Q_U(0) = 0.375$, $Q_{X|U}(0|1) = 0.65$; $Q_U(0) = 0.375$, $Q_{X|U}(0|1) = 0.75$; and $Q_U(0) = 0.375$, $Q_{X|U}(0|1) = 0.9$.]

Figure 2: Achievable regions for several specific choices of $Q$ for $\Lambda = 1/8$ and $\Gamma = 1/2$. The regions are calculated according to eq. (25) for $Q_{X|U}(0|1) = 0.65$, $0.75$ and $0.9$. For all three regions $Q_U(0) = 0.375$ and $Q_{X|U}(0|0) = Q_{X|U}(1|1)$.

Acknowledgment

The authors thank the reviewers for their careful reading of the manuscript and their constructive comments.

Appendix I
Proofs of Proposition 1 and Theorem 2

Proof of Proposition 1: Let $(V,W)$ be any AVGBC with degraded message sets, $\delta > 0$ be an arbitrary real number, $n > n_0$ be any block length (all bounds in this proof are valid for $n$ larger than a suitable threshold $n_0$, which depends on $\epsilon$, to be specified later, and on the respective channel alphabets), $\mathcal{U}$ be any nonempty set and $Q$ be any joint type on $\mathcal{U}\times\mathcal{X}$ such that $\min_{x \in \mathcal{X}} Q_X(x) > 0$, $\min_{u \in \mathcal{U}} Q_U(u) > 0$, $V$ is nonsymmetrizable-$\mathcal{X}|\mathcal{U}$ and $Q_{X|U}W$ is nonsymmetrizable-$\mathcal{U}$. We will show that there exists a code with sub-codewords $\{\mathbf{u}_i\}$ and codewords $\{\mathbf{x}_{ij}\}$, $i = 1,\ldots,M$, $j = 1,\ldots,M'$, satisfying $(\mathbf{u}_i,\mathbf{x}_{ij}) \in \mathcal{T}_Q$, such that
\[
\frac{1}{n}\log M + \frac{1}{n}\log M' > I^*_{Y;X}(Q,V) - \delta, \tag{26}
\]
\[
\frac{1}{n}\log M' > I^*_{Y;X|U}(Q,V) - \delta, \tag{27}
\]
\[
\frac{1}{n}\log M > I^*_{Z;U}(Q,W) - \delta, \tag{28}
\]
and with probabilities of decoding error
\[
e_\phi(\mathbf{s}) \le \exp(-n\gamma_\phi) \quad \text{and} \quad e_\psi(\mathbf{s}) \le \exp(-n\gamma_\psi) \qquad \forall\, \mathbf{s} \in \mathcal{S}^n, \tag{29}
\]
where $\gamma_\phi$ and $\gamma_\psi$ are both positive real numbers. Let $\{\mathbf{u}_i\}$ and $\{\mathbf{x}_{ij}\}$, $i = 1,\ldots,M$, $j = 1,\ldots,M'$, be sub-codewords and codewords, respectively, with block length $n$ as in Appendix III, with $R = \frac{1}{n}\log M$ and $R' = \frac{1}{n}\log M'$ satisfying
\[
I^*_{Y;X|U}(Q,V) - \delta \le R' < I^*_{Y;X|U}(Q,V) - \tfrac{2}{3}\delta, \tag{30}
\]
\[
I^*_{Z;U}(Q,W) - \delta \le R < I^*_{Z;U}(Q,W) - \tfrac{2}{3}\delta, \tag{31}
\]
\[
I^*_{Y;X}(Q,V) - \delta \le R + R' < I^*_{Y;X}(Q,V) - \tfrac{2}{3}\delta, \tag{32}
\]
and with $\epsilon$ (from Lemma A.7) to be specified later. Let the decoders $\phi$ and $\psi$ be as described in Definitions 11 and 12, respectively. The decoders $\phi$ and $\psi$ are unambiguous if $\eta$ is chosen sufficiently small. Indeed, if for some $\mathbf{z} \in \mathcal{Z}^n$ and some $i' \neq i$, both $\mathbf{u}_i$ and $\mathbf{u}_{i'}$

satisfy conditions 1 and 2 in Definition 12, then a pair $(\mathbf{s},\mathbf{s}')$ exists with the joint type of $(\mathbf{u}_i,\mathbf{u}_{i'},\mathbf{s},\mathbf{s}',\mathbf{z})$ being represented by the dummy random variables $U, U', S, S', Z$ that satisfy (60), (61) and (62) simultaneously. This contradicts Lemma A.1 and establishes the unambiguity of the decoder $\psi$. If for some $\mathbf{y} \in \mathcal{Y}^n$ and some $i' \neq i$, $j$ and $j'$, both $(\mathbf{u}_i,\mathbf{x}_{ij})$ and $(\mathbf{u}_{i'},\mathbf{x}_{i'j'})$ satisfy conditions 1 and 2 in Definition 11, then a pair $(\mathbf{s},\mathbf{s}')$ exists with the joint type of $(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{u}_{i'},\mathbf{x}_{i'j'},\mathbf{s},\mathbf{s}',\mathbf{y})$ being represented by the dummy random variables $U, X, U', X', S, S', Y$ that satisfy (63), (64) and (65) simultaneously. This contradicts Lemma A.2. And if for some $\mathbf{y} \in \mathcal{Y}^n$ and some $i$ and $j' \neq j$, both $(\mathbf{u}_i,\mathbf{x}_{ij})$ and $\mathbf{x}_{ij'}$ satisfy conditions 1 and 3 in Definition 11, then a pair $(\mathbf{s},\mathbf{s}')$ exists with the joint type of $(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{x}_{ij'},\mathbf{s},\mathbf{s}',\mathbf{y})$ being represented by the dummy random variables $U, X, X', S, S', Y$ that satisfy (66), (67) and (68) simultaneously. This contradicts Lemma A.3 and establishes the unambiguity of the decoder $\phi$.

To establish (26)-(29), fix any $\mathbf{s} \in \mathcal{S}^n$. We will first prove (29) for the Terminal 1 decoding rule (i.e., the $\phi$ decoding rule). By (87) we have
\[
\frac{1}{MM'} \Bigg| \Bigg\{ (i,j) : (\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \in \bigcup_{I(UX;S) > \epsilon} \mathcal{T}_{P_{UXS}} \Bigg\} \Bigg| \le (\text{number of joint types}) \cdot \exp(-n\epsilon/2) \le \exp(-n\epsilon/3), \tag{33}
\]

where the last inequality follows since the number of possible joint types of sequences of length $n$ is polynomial in $n$. Hence, to obtain an exponentially decreasing upper bound on (4), it suffices to deal with sub-codewords and codewords $(\mathbf{u}_i,\mathbf{x}_{ij})$ for which $(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \in \mathcal{T}_{P_{UXS}}$ with $I(UX;S) \le \epsilon$. Then, for $P_{UXSY} \notin \mathcal{C}_\eta^\phi$ (cf. (9)), we have
\[
D(P_{UXSY} \| P_{UXS} \times V) = D(P_{UXSY} \| P_{UX} \times P_S \times V) - I(UX;S) > \eta - \epsilon. \tag{34}
\]
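The last inequality in (33) relies on the standard fact that the number of types of length-$n$ sequences over an alphabet of size $k$ is at most $(n+1)^k$. The sketch below is illustrative only: it finds the smallest block length from which $(n+1)^k \exp(-n\epsilon/2) \le \exp(-n\epsilon/3)$ holds. The alphabet size $k = 8$ (three binary alphabets) and the value $\epsilon = 0.1$ are assumptions made for the illustration, and the function names are hypothetical.

```python
import math


def dominated(n, k, eps):
    """True if (n+1)^k * exp(-n*eps/2) <= exp(-n*eps/3), compared in the log domain."""
    return k * math.log(n + 1) <= n * eps / 6


def threshold(k, eps, n_max=10**7):
    """Smallest n >= 1 for which the polynomial type count is absorbed.

    g(n) = n*eps/6 - k*log(n+1) is convex with g(0) = 0, so once g(n) >= 0
    for some n >= 1 it stays nonnegative for every larger n.
    """
    for n in range(1, n_max + 1):
        if dominated(n, k, eps):
            return n
    raise ValueError("no threshold found below n_max")


if __name__ == "__main__":
    # assumed sizes: |U| = |X| = |S| = 2, hence k = 8 joint symbols
    print(threshold(8, 0.1))
```

The threshold grows as $\epsilon$ shrinks, which is one reason the block length $n_0$ in the proof depends on $\epsilon$ and on the alphabet sizes.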

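Thus, with a slight abuse of notation, denoting by the pair $(V,W)$ also the AVGBC $(\tilde{V},\tilde{W})$, $\tilde{V} : \mathcal{U}\times\mathcal{X}\times\mathcal{S} \to \mathcal{Y}$, $\tilde{W} : \mathcal{U}\times\mathcal{X}\times\mathcal{S} \to \mathcal{Z}$, defined by
\[
\tilde{V}(y \mid u,x,s) = V(y \mid x,s), \qquad \tilde{W}(z \mid u,x,s) = W(z \mid x,s), \qquad u \in \mathcal{U},\ x \in \mathcal{X},\ y \in \mathcal{Y},\ z \in \mathcal{Z},
\]
by (3) and (34) we have
\[
\begin{aligned}
\sum_{\mathbf{y}:\ P_{\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s},\mathbf{y}} \notin \mathcal{C}_\eta^\phi} V^n(\mathbf{y} \mid \mathbf{x}_{ij},\mathbf{s})
&= \sum_{\mathbf{y}:\ P_{\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s},\mathbf{y}} \notin \mathcal{C}_\eta^\phi} V^n(\mathbf{y} \mid \mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \\
&\le \sum_{P_{UXSY}:\ P_{UXSY} \notin \mathcal{C}_\eta^\phi}\ \sum_{\mathbf{y} \in \mathcal{T}_{P_{Y|UXS}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s})} V^n(\mathbf{y} \mid \mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \\
&\le \sum_{P_{UXSY}:\ P_{UXSY} \notin \mathcal{C}_\eta^\phi} \exp\{ -nD(P_{UXSY} \| P_{UXS} \times V) \} \\
&\le \sum_{P_{UXSY}:\ P_{UXSY} \notin \mathcal{C}_\eta^\phi} \exp\{ -n(\eta - \epsilon) \} \\
&\le \exp\{ -n(\eta - 2\epsilon) \}.
\end{aligned} \tag{35}
\]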

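Next, notice that if $P_{\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s},\mathbf{y}} \in \mathcal{C}_\eta^\phi$ and $\phi(\mathbf{y}) \neq (i,j)$, then at least one of the conditions 2-3 of the Terminal 1 decoding rule (Definition 11) must be violated. Let us define
\[
\mathcal{E}_\eta^\phi \triangleq \Big\{ P_{UXU'X'SY} :\ P_{UXSY} \in \mathcal{C}_\eta^\phi,\ P_{U'X'S'Y} \in \mathcal{C}_\eta^\phi \text{ for some } S',\ I(UXY;U'X' \mid S) > \eta \Big\}
\]
and
\[
\mathcal{F}_\eta^\phi \triangleq \Big\{ P_{UXX'SY} :\ P_{UXSY} \in \mathcal{C}_\eta^\phi,\ P_{UX'S'Y} \in \mathcal{C}_\eta^\phi \text{ for some } S',\ I(XY;X' \mid US) > \eta \Big\}.
\]
Then it follows that
\[
\sum_{\substack{\mathbf{y}:\ P_{\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s},\mathbf{y}} \in \mathcal{C}_\eta^\phi \\ \phi(\mathbf{y}) \neq (i,j)}} V^n(\mathbf{y} \mid \mathbf{x}_{ij},\mathbf{s}) \le \sum_{P_{UXU'X'SY} \in \mathcal{E}_\eta^\phi} e_E(i,j,\mathbf{s}) + \sum_{P_{UXX'SY} \in \mathcal{F}_\eta^\phi} e_F(i,j,\mathbf{s}). \tag{36}
\]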

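Here
\[
e_E(i,j,\mathbf{s}) = \sum_{\substack{\mathbf{y}:\ (\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{u}_{i'},\mathbf{x}_{i'j'},\mathbf{s},\mathbf{y}) \in \mathcal{T}_{P_{UXU'X'SY}} \\ \text{for some } i' \neq i \text{ and } j'}} V^n(\mathbf{y} \mid \mathbf{x}_{ij},\mathbf{s}) \tag{37}
\]
and
\[
e_F(i,j,\mathbf{s}) = \sum_{\substack{\mathbf{y}:\ (\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{x}_{ij'},\mathbf{s},\mathbf{y}) \in \mathcal{T}_{P_{UXX'SY}} \\ \text{for some } j' \neq j}} V^n(\mathbf{y} \mid \mathbf{x}_{ij},\mathbf{s}). \tag{38}
\]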

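Since $e_E(i,j,\mathbf{s}) = 0$ unless $P_{UX} = Q_{UX}$, $P_{U'X'} = Q_{UX}$ and $(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{u}_{i'},\mathbf{x}_{i'j'},\mathbf{s}) \in \mathcal{T}_{P_{UXU'X'S}}$ for some $i' \neq i$ and $j'$, it suffices to bound $e_E(i,j,\mathbf{s})$ only when $P_{UXU'X'SY} \in \mathcal{E}_\eta^\phi$ satisfies
\[
I(UX;U'X'S) \le \big| R + R' - I(U'X';S) \big|^+ + \epsilon. \tag{39}
\]
Otherwise, by (90), the contribution to the first summation in (36) of the terms with $P_{UXU'X'SY} \in \mathcal{E}_\eta^\phi$ not satisfying (39) is less than $\exp\{-n\epsilon/3\}$. Bounding (37) assuming (39), we have
\[
e_E(i,j,\mathbf{s}) \le \sum_{(i',j'):\ (\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{u}_{i'},\mathbf{x}_{i'j'},\mathbf{s}) \in \mathcal{T}_{P_{UXU'X'S}}}\ \ \sum_{\mathbf{y} \in \mathcal{T}_{P_{Y|UXU'X'S}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{u}_{i'},\mathbf{x}_{i'j'},\mathbf{s})} V^n(\mathbf{y} \mid \mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}). \tag{40}
\]
As $V^n(\mathbf{y} \mid \mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s})$ is constant for $\mathbf{y} \in \mathcal{T}_{P_{Y|UXS}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s})$ and this constant is less than or equal to $\big| \mathcal{T}_{P_{Y|UXS}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \big|^{-1}$, the inner sum in (40) is bounded by
\[
\frac{\big| \mathcal{T}_{P_{Y|UXU'X'S}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{u}_{i'},\mathbf{x}_{i'j'},\mathbf{s}) \big|}{\big| \mathcal{T}_{P_{Y|UXS}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \big|} \le (n+1)^c \exp\{ n[ H(Y \mid UXU'X'S) - H(Y \mid UXS) ] \} \le \exp\{ -n[ I(Y;U'X' \mid UXS) - \epsilon ] \},
\]
where $c$, throughout this proof, represents a constant depending only on the sizes of the respective alphabets. Hence, using (91), we have
\[
e_E(i,j,\mathbf{s}) \le \exp\Big\{ -n\Big[ I(Y;U'X' \mid UXS) - \big| R + R' - I(U'X';UXS) \big|^+ - 2\epsilon \Big] \Big\}. \tag{41}
\]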

To further bound $e_E(i,j,\mathbf{s})$ when (39) holds, we distinguish between two cases: (a) $R + R' \le I(U'X';S)$, and (b) $R + R' > I(U'X';S)$. In case (a), from (39) we have
\[
I(UX;U'X' \mid S) \le I(UX;U'X'S) \le \epsilon,
\]
hence by the definition of $\mathcal{E}_\eta^\phi$,
\[
I(Y;U'X' \mid UXS) = I(UXY;U'X' \mid S) - I(UX;U'X' \mid S) \ge \eta - \epsilon.
\]
Since now $R + R' \le I(U'X';S) \le I(U'X';UXS)$, it follows from (41) that
\[
e_E(i,j,\mathbf{s}) \le \exp\{ -n(\eta - 3\epsilon) \}. \tag{42}
\]
In case (b) we obtain from (39) that
\[
I(UX;U'X'S) \le R + R' - I(U'X';S) + \epsilon.
\]
It follows that
\[
\begin{aligned}
R + R' + \epsilon &\ge I(UX;U'X'S) + I(U'X';S) \\
&= I(UX;U'X' \mid S) + I(UX;S) + I(U'X';S) \\
&= I(U'X';UXS) + I(UX;S) \\
&\ge I(U'X';UXS),
\end{aligned}
\]
and hence
\[
\big| R + R' - I(U'X';UXS) \big|^+ \le \big| R + R' + \epsilon - I(U'X';UXS) \big|^+ = R + R' + \epsilon - I(U'X';UXS).
\]
Substituting this into (41), it follows that for case (b)
\[
\begin{aligned}
e_E(i,j,\mathbf{s}) &\le \exp\{ -n[ I(Y;U'X' \mid UXS) - R - R' - \epsilon + I(U'X';UXS) - 2\epsilon ] \} \\
&= \exp\{ -n[ I(U'X';UXSY) - (R + R') - 3\epsilon ] \} \\
&\le \exp\{ -n[ I(X';Y) - (R + R') - 3\epsilon ] \}.
\end{aligned} \tag{43}
\]
By the definition of $\mathcal{E}_\eta^\phi$, $P_{U'X'S'Y} \in \mathcal{C}_\eta^\phi$ for some $S'$. Let $P_{U''X''S''Y''} \in \mathcal{C}_0^\phi$ be defined by $P_{U''X''S''Y''} = Q_{UX} \times P_{S'} \times V$. If $\eta$ is sufficiently small, then $I(X';Y)$ is arbitrarily close to $I(X'';Y'')$, say
\[
I(X';Y) \ge I(X'';Y'') - \frac{\delta}{3}.
\]
Using the definition (12) of $I^*_{Y;X}(Q,V)$ and the assumption made in (32), it follows that
\[
I(X';Y) - (R + R') \ge I(X'';Y'') - (R + R') - \frac{\delta}{3} \ge I^*_{Y;X}(Q,V) - (R + R') - \frac{\delta}{3} \ge \frac{\delta}{3}
\]
if $\eta$ is sufficiently small and depends only on $\delta$. Fixing the heretofore unspecified $\eta$ accordingly (and small enough for the decoding rule to be unambiguous), (43) yields for case (b) that
\[
e_E(i,j,\mathbf{s}) \le \exp\Big\{ -n\Big[ \frac{\delta}{3} - 3\epsilon \Big] \Big\}, \tag{44}
\]
and that concludes the bounding procedure of (37).
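The equality used in passing to (43) is the chain rule $I(Y;A \mid B) + I(A;B) = I(A;BY)$ with $A = U'X'$ and $B = UXS$, followed by dropping variables, which cannot increase mutual information. The following sketch checks the lumped version numerically on a randomly drawn joint distribution. It is only an illustration: the alphabet sizes, the random seed and the helper names are arbitrary assumptions.

```python
import itertools
import math
import random


def random_joint(shape, seed=0):
    """Strictly positive random joint pmf on a product alphabet."""
    rng = random.Random(seed)
    cells = list(itertools.product(*(range(m) for m in shape)))
    weights = [rng.random() + 1e-3 for _ in cells]
    total = sum(weights)
    return {cell: w / total for cell, w in zip(cells, weights)}


def marginal(p, axes):
    """Marginal pmf of the coordinates listed in axes."""
    out = {}
    for cell, mass in p.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0.0) + mass
    return out


def entropy(p, axes):
    """Joint entropy (bits) of the coordinates listed in axes."""
    return -sum(m * math.log2(m) for m in marginal(p, axes).values() if m > 0)


def mutual_info(p, a, b, given=()):
    """I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C)."""
    a, b, c = tuple(a), tuple(b), tuple(given)
    return entropy(p, a + c) + entropy(p, b + c) - entropy(p, a + b + c) - entropy(p, c)


if __name__ == "__main__":
    # coordinates: 0 = A (stands for U'X'), 1 = B (stands for UXS), 2 = Y
    p = random_joint((4, 8, 3), seed=1)
    lhs = mutual_info(p, (2,), (0,), given=(1,)) + mutual_info(p, (0,), (1,))
    rhs = mutual_info(p, (0,), (1, 2))
    print(lhs, rhs, mutual_info(p, (0,), (2,)))
```

The first two printed values agree up to rounding, and the third, $I(A;Y)$, does not exceed them.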

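Since $e_F(i,j,\mathbf{s}) = 0$ unless $P_{UX} = Q_{UX}$, $P_{UX'} = Q_{UX}$ and $(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{x}_{ij'},\mathbf{s}) \in \mathcal{T}_{P_{UXX'S}}$ for some $j' \neq j$, it suffices to bound $e_F(i,j,\mathbf{s})$ only when $P_{UXX'SY} \in \mathcal{F}_\eta^\phi$ satisfies
\[
I(X;X'S \mid U) \le \big| R' - I(X';S \mid U) \big|^+ + \epsilon. \tag{45}
\]
Otherwise, by (92), the contribution to the second summation in (36) of the terms with $P_{UXX'SY} \in \mathcal{F}_\eta^\phi$ not satisfying (45) is less than $\exp\{-n\epsilon/3\}$. Bounding (38) assuming (45), we have
\[
e_F(i,j,\mathbf{s}) \le \sum_{j':\ (\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{x}_{ij'},\mathbf{s}) \in \mathcal{T}_{P_{UXX'S}}}\ \ \sum_{\mathbf{y} \in \mathcal{T}_{P_{Y|UXX'S}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{x}_{ij'},\mathbf{s})} V^n(\mathbf{y} \mid \mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}). \tag{46}
\]
As $V^n(\mathbf{y} \mid \mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s})$ is constant for $\mathbf{y} \in \mathcal{T}_{P_{Y|UXS}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s})$ and this constant is less than or equal to $\big| \mathcal{T}_{P_{Y|UXS}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \big|^{-1}$, the inner sum in (46) is bounded by
\[
\frac{\big| \mathcal{T}_{P_{Y|UXX'S}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{x}_{ij'},\mathbf{s}) \big|}{\big| \mathcal{T}_{P_{Y|UXS}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \big|} \le (n+1)^c \exp\{ n[ H(Y \mid UXX'S) - H(Y \mid UXS) ] \} \le \exp\{ -n[ I(Y;X' \mid UXS) - \epsilon ] \}.
\]
Hence, using (93), we have
\[
e_F(i,j,\mathbf{s}) \le \exp\Big\{ -n\Big[ I(Y;X' \mid UXS) - \big| R' - I(X';XS \mid U) \big|^+ - 2\epsilon \Big] \Big\}. \tag{47}
\]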

To further bound $e_F(i,j,\mathbf{s})$ when (45) holds, we distinguish between two cases: (a) $R' \le I(X';S \mid U)$, and (b) $R' > I(X';S \mid U)$. In case (a), from (45) we have
\[
I(X;X' \mid US) \le I(X;X'S \mid U) \le \epsilon,
\]
hence by the definition of $\mathcal{F}_\eta^\phi$,
\[
I(Y;X' \mid UXS) = I(XY;X' \mid US) - I(X;X' \mid US) \ge \eta - \epsilon.
\]
Since now $R' \le I(X';S \mid U) \le I(X';XS \mid U)$, it follows from (47) that
\[
e_F(i,j,\mathbf{s}) \le \exp\{ -n(\eta - 3\epsilon) \}. \tag{48}
\]
In case (b) we obtain from (45) that
\[
I(X;X'S \mid U) \le R' - I(X';S \mid U) + \epsilon.
\]
It follows that
\[
\begin{aligned}
R' + \epsilon &\ge I(X;X'S \mid U) + I(X';S \mid U) \\
&= I(X;X' \mid US) + I(X;S \mid U) + I(X';S \mid U) \\
&= I(X';XS \mid U) + I(X;S \mid U) \\
&\ge I(X';XS \mid U),
\end{aligned}
\]
and hence
\[
\big| R' - I(X';XS \mid U) \big|^+ \le \big| R' + \epsilon - I(X';XS \mid U) \big|^+ = R' + \epsilon - I(X';XS \mid U).
\]
Substituting this into (47), it follows that for case (b),
\[
\begin{aligned}
e_F(i,j,\mathbf{s}) &\le \exp\{ -n[ I(Y;X' \mid UXS) - R' - \epsilon + I(X';XS \mid U) - 2\epsilon ] \} \\
&= \exp\{ -n[ I(X';XSY \mid U) - R' - 3\epsilon ] \} \\
&\le \exp\{ -n[ I(X';Y \mid U) - R' - 3\epsilon ] \}.
\end{aligned} \tag{49}
\]
By the definition of $\mathcal{F}_\eta^\phi$, $P_{UX'S'Y} \in \mathcal{C}_\eta^\phi$ for some $S'$. Let $P_{U''X''S''Y''} \in \mathcal{C}_0^\phi$ be defined by $P_{U''X''S''Y''} = Q_{UX} \times P_{S'} \times V$. If $\eta$ is sufficiently small, then $I(X';Y \mid U)$ is arbitrarily close to $I(X'';Y'' \mid U'')$, say
\[
I(X';Y \mid U) \ge I(X'';Y'' \mid U'') - \frac{\delta}{3}.
\]
Using the definition (10) of $I^*_{Y;X|U}(Q,V)$ and the assumption in (30), it follows that
\[
I(X';Y \mid U) - R' \ge I(X'';Y'' \mid U'') - R' - \frac{\delta}{3} \ge I^*_{Y;X|U}(Q,V) - R' - \frac{\delta}{3} \ge \frac{\delta}{3}
\]
if $\eta$ is sufficiently small and depends only on $\delta$. Fixing $\eta$ to be the minimum between the heretofore unspecified $\eta$ and the one fixed in the paragraph containing (44), (49) yields for case (b) that
\[
e_F(i,j,\mathbf{s}) \le \exp\Big\{ -n\Big[ \frac{\delta}{3} - 3\epsilon \Big] \Big\}. \tag{50}
\]
From (35), (36), (42), (44), (48), (50) and the observations made in the paragraphs containing (39) and (45), we have that for a proper choice of $\epsilon$ (depending only on $\eta$ and $\delta$)
\[
e_\phi(\mathbf{s}) \le \exp(-n\gamma_\phi)
\]

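where $\gamma_\phi > 0$.

We now turn to prove statement (29) for the decoding rule of user 2 (i.e., the $\psi$ decoding rule). By (33), in order to obtain an exponentially decreasing upper bound on (5), it suffices to deal with sub-codewords and codewords $(\mathbf{u}_i,\mathbf{x}_{ij})$ for which $(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \in \mathcal{T}_{P_{UXS}}$ with $I(UX;S) \le \epsilon$. Then, for $P_{USZ} \notin \mathcal{C}_\eta^\psi$, we have
\[
\begin{aligned}
D(P_{UXSZ} \| P_{UXS} \times W) &= D(P_{UXSZ} \| P_{UX} \times P_S \times W) - I(UX;S) \\
&\ge D\big( P_{USZ} \,\|\, P_U \times P_S \times (Q_{X|U}W) \big) - I(UX;S) \\
&> \eta - \epsilon.
\end{aligned}
\]
Thus, by (3) we have
\[
\begin{aligned}
\sum_{\mathbf{z}:\ P_{\mathbf{u}_i,\mathbf{s},\mathbf{z}} \notin \mathcal{C}_\eta^\psi} W^n(\mathbf{z} \mid \mathbf{x}_{ij},\mathbf{s})
&= \sum_{\mathbf{z}:\ P_{\mathbf{u}_i,\mathbf{s},\mathbf{z}} \notin \mathcal{C}_\eta^\psi} W^n(\mathbf{z} \mid \mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \\
&\le \sum_{P_{UXSZ}:\ P_{USZ} \notin \mathcal{C}_\eta^\psi}\ \sum_{\mathbf{z} \in \mathcal{T}_{P_{Z|UXS}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s})} W^n(\mathbf{z} \mid \mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \\
&\le \sum_{P_{UXSZ}:\ P_{USZ} \notin \mathcal{C}_\eta^\psi} \exp\{ -nD(P_{UXSZ} \| P_{UXS} \times W) \} \\
&\le \sum_{P_{UXSZ}:\ P_{USZ} \notin \mathcal{C}_\eta^\psi} \exp\{ -n(\eta - \epsilon) \} \\
&\le \exp\{ -n(\eta - 2\epsilon) \}.
\end{aligned} \tag{51}
\]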

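Next, notice that if $P_{\mathbf{u}_i,\mathbf{s},\mathbf{z}} \in \mathcal{C}_\eta^\psi$ and $\psi(\mathbf{z}) \neq i$, then condition 2 of Definition 12 must be violated. Let us define
\[
\mathcal{D}_\eta^\psi \triangleq \Big\{ P_{UU'XSZ} :\ P_{USZ} \in \mathcal{C}_\eta^\psi,\ P_{U'S'Z} \in \mathcal{C}_\eta^\psi \text{ for some } S',\ I(UZ;U' \mid S) > \eta \Big\}.
\]
Then it follows that
\[
\sum_{\substack{\mathbf{z}:\ P_{\mathbf{u}_i,\mathbf{s},\mathbf{z}} \in \mathcal{C}_\eta^\psi \\ \psi(\mathbf{z}) \neq i}} W^n(\mathbf{z} \mid \mathbf{x}_{ij},\mathbf{s}) \le \sum_{P_{UU'XSZ} \in \mathcal{D}_\eta^\psi} e_{D^\psi}(i,j,\mathbf{s}), \tag{52}
\]
where
\[
e_{D^\psi}(i,j,\mathbf{s}) = \sum_{\substack{\mathbf{z}:\ (\mathbf{u}_i,\mathbf{u}_{i'},\mathbf{x}_{ij},\mathbf{s},\mathbf{z}) \in \mathcal{T}_{P_{UU'XSZ}} \\ \text{for some } i' \neq i}} W^n(\mathbf{z} \mid \mathbf{x}_{ij},\mathbf{s}). \tag{53}
\]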

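Since $e_{D^\psi}(i,j,\mathbf{s}) = 0$ unless $P_{UX} = Q_{UX}$, $P_{U'} = Q_U$ and $(\mathbf{u}_i,\mathbf{u}_{i'},\mathbf{x}_{ij},\mathbf{s}) \in \mathcal{T}_{P_{UU'XS}}$ for some $i' \neq i$, it suffices to bound $e_{D^\psi}(i,j,\mathbf{s})$ only when $P_{UU'XSZ} \in \mathcal{D}_\eta^\psi$ satisfies
\[
I(UX;U'S) \le \big| R - I(U';S) \big|^+ + \epsilon; \tag{54}
\]
otherwise, by (88), the contribution to the summation in (52) of the terms with $P_{UU'XSZ} \in \mathcal{D}_\eta^\psi$ not satisfying (54) is less than $\exp\{-n\epsilon/3\}$. Bounding (53) assuming (54), we have
\[
e_{D^\psi}(i,j,\mathbf{s}) \le \sum_{i':\ (\mathbf{u}_i,\mathbf{u}_{i'},\mathbf{x}_{ij},\mathbf{s}) \in \mathcal{T}_{P_{UU'XS}}}\ \ \sum_{\mathbf{z} \in \mathcal{T}_{P_{Z|UU'XS}}(\mathbf{u}_i,\mathbf{u}_{i'},\mathbf{x}_{ij},\mathbf{s})} W^n(\mathbf{z} \mid \mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}). \tag{55}
\]
As $W^n(\mathbf{z} \mid \mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s})$ is constant for $\mathbf{z} \in \mathcal{T}_{P_{Z|UXS}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s})$ and this constant is less than or equal to $\big| \mathcal{T}_{P_{Z|UXS}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \big|^{-1}$, the inner sum in (55) is bounded by
\[
\frac{\big| \mathcal{T}_{P_{Z|UU'XS}}(\mathbf{u}_i,\mathbf{u}_{i'},\mathbf{x}_{ij},\mathbf{s}) \big|}{\big| \mathcal{T}_{P_{Z|UXS}}(\mathbf{u}_i,\mathbf{x}_{ij},\mathbf{s}) \big|} \le (n+1)^c \exp\{ n[ H(Z \mid UU'XS) - H(Z \mid UXS) ] \} \le \exp\{ -n[ I(Z;U' \mid UXS) - \epsilon ] \};
\]
hence, using (89), we have
\[
e_{D^\psi}(i,j,\mathbf{s}) \le \exp\Big\{ -n\Big[ I(Z;U' \mid UXS) - \big| R - I(U';UXS) \big|^+ - 2\epsilon \Big] \Big\}. \tag{56}
\]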

To further bound $e_{D^\psi}(i,j,\mathbf{s})$ when (54) holds, we distinguish between two cases: (a) $R \le I(U';S)$, and (b) $R > I(U';S)$. In case (a), from (54) we have
\[
I(UX;U' \mid S) \le I(UX;U'S) \le \epsilon,
\]
hence by the definition of $\mathcal{D}_\eta^\psi$,
\[
\begin{aligned}
I(Z;U' \mid UXS) &= I(ZUX;U' \mid S) - I(UX;U' \mid S) \\
&= I(ZU;U' \mid S) + I(X;U' \mid SZU) - I(UX;U' \mid S) \\
&\ge \eta - \epsilon.
\end{aligned}
\]
Since now $R \le I(U';S) \le I(U';UXS)$, it follows from (56) that
\[
e_{D^\psi}(i,j,\mathbf{s}) \le \exp\{ -n(\eta - 3\epsilon) \}. \tag{57}
\]
In case (b) we obtain from (54) that
\[
I(UX;U'S) \le R - I(U';S) + \epsilon.
\]
It follows that
\[
\begin{aligned}
R + \epsilon &\ge I(UX;U'S) + I(U';S) \\
&= I(UX;U' \mid S) + I(UX;S) + I(U';S) \\
&= I(U';UXS) + I(UX;S) \\
&\ge I(U';UXS),
\end{aligned}
\]
and hence
\[
\big| R - I(U';UXS) \big|^+ \le \big| R + \epsilon - I(U';UXS) \big|^+ = R + \epsilon - I(U';UXS).
\]
Substituting this into (56), it follows that for case (b)
\[
\begin{aligned}
e_{D^\psi}(i,j,\mathbf{s}) &\le \exp\{ -n[ I(Z;U' \mid UXS) - R - \epsilon + I(U';UXS) - 2\epsilon ] \} \\
&= \exp\{ -n[ I(U';UXSZ) - R - 3\epsilon ] \} \\
&\le \exp\{ -n[ I(U';Z) - R - 3\epsilon ] \}.
\end{aligned} \tag{58}
\]
By the definition of $\mathcal{D}_\eta^\psi$, $P_{U'S'Z} \in \mathcal{C}_\eta^\psi$ for some $S'$. Let $P_{U''S''Z''} \in \mathcal{C}_0^\psi$ be defined by $P_{U''S''Z''} = Q_U \times P_{S'} \times (Q_{X|U}W)$. If $\eta$ is sufficiently small, then $I(U';Z)$ is arbitrarily close to $I(U'';Z'')$, say
\[
I(U';Z) \ge I(U'';Z'') - \frac{\delta}{3}.
\]
Using the definition (11) of $I^*_{Z;U}(Q,W)$ and the assumption made in (31), it follows that
\[
I(U';Z) - R \ge I(U'';Z'') - R - \frac{\delta}{3} \ge I^*_{Z;U}(Q,W) - R - \frac{\delta}{3} \ge \frac{\delta}{3}
\]
if $\eta$ is sufficiently small and depends only on $\delta$. Fixing the heretofore unspecified $\eta$ accordingly (and small enough for the decoding rule to be unambiguous), (58) yields for case (b) that
\[
e_{D^\psi}(i,j,\mathbf{s}) \le \exp\Big\{ -n\Big[ \frac{\delta}{3} - 3\epsilon \Big] \Big\}. \tag{59}
\]
From (59), (57), (52), (51) and the observations made in the paragraph containing (54), we have that for a proper choice of $\epsilon$ (depending only on $\eta$ and $\delta$)
\[
e_\psi(\mathbf{s}) \le \exp(-n\gamma_\psi)
\]

where $\gamma_\psi > 0$.

Proof of Theorem 2: The proof of Theorem 2 is identical to that of Proposition 1, except for the use of the decoding rules for the state constraint case and that, instead of Lemmas A.1, A.2 and A.3, we establish the unambiguity of the decoding rules using Lemmas A.4, A.5 and A.6, respectively.

Appendix II
Unambiguity of the Decoding Rule

Lemma A.1: If the AVC $Q_{X|U}W$ is nonsymmetrizable-$\mathcal{U}$ and $\beta > 0$, then for a sufficiently small $\eta$, no quintuple of random variables $U, U', S, S', Z$ can simultaneously satisfy $P_U = P_{U'} = Q_U$ with $\min_{u \in \mathcal{U}} Q_U(u) \ge \beta$,
\[
P_{USZ} \in \mathcal{C}_\eta^\psi, \tag{60}
\]
\[
P_{U'S'Z} \in \mathcal{C}_\eta^\psi, \tag{61}
\]
and
\[
I(UZ;U' \mid S) \le \eta, \qquad I(U'Z;U \mid S') \le \eta. \tag{62}
\]

Proof: Since (60) and (61) imply that $D(P_{USZ} \,\|\, P_U \times P_S \times (Q_{X|U}W)) \le \eta$ and $D(P_{U'S'Z} \,\|\, P_{U'} \times P_{S'} \times (Q_{X|U}W)) \le \eta$, the rest of the proof is identical to [5, Lemma 4] and is therefore omitted.

Lemma A.2: If the AVC $V$ is nonsymmetrizable-$\mathcal{X}$ and $\beta > 0$, then for a sufficiently small $\eta$, no seven-tuple of random variables $U, U', X, X', S, S', Y$ can simultaneously satisfy

$$P_X = P_{X'} = Q_X \quad \text{with} \quad \min_{x \in \mathcal{X}} Q_X(x) \ge \beta, \tag{63}$$

$$P_{UXSY} \in \mathcal{C}_\eta^\phi, \qquad P_{U'X'S'Y} \in \mathcal{C}_\eta^\phi, \tag{64}$$

and

$$I(UXY;U'X' \mid S) \le \eta, \qquad I(U'X'Y;UX \mid S') \le \eta. \tag{65}$$

Proof: By the definition of $\mathcal{C}_\eta^\phi$ (cf. (9)), the condition $P_{UXSY} \in \mathcal{C}_\eta^\phi$ means that

$$D(P_{UXSY} \,\|\, P_{UX} \times P_S \times V) = \sum_{u,x,s,y} P_{UXSY}(u,x,s,y) \log \frac{P_{UXSY}(u,x,s,y)}{P_{UX}(u,x)\, P_S(s)\, V(y \mid x,s)} \le \eta.$$

Upon adding to this

$$I(UXY;U'X' \mid S) = \sum_{u,x,u',x',s,y} P_{UXU'X'SY}(u,x,u',x',s,y) \log \frac{P_{U'X'|UXSY}(u',x' \mid u,x,s,y)}{P_{U'X'|S}(u',x' \mid s)} \le \eta,$$

we obtain

$$\sum_{u,x,u',x',s,y} P_{UXU'X'SY}(u,x,u',x',s,y) \log \frac{P_{UXSY}(u,x,s,y)\, P_{U'X'|UXSY}(u',x' \mid u,x,s,y)}{P_{UX}(u,x)\, P_S(s)\, V(y \mid x,s)\, P_{U'X'|S}(u',x' \mid s)} = D(P_{UXU'X'SY} \,\|\, P_{U'X'S}\, P_{UX}\, V) \le 2\eta.$$

Projecting both probability laws, $P_{UXU'X'SY}$ and $P_{U'X'S} P_{UX} V$, on $\mathcal{X} \times \mathcal{X} \times \mathcal{Y}$, since the divergence does not increase, we get

$$D(P_{XX'Y} \,\|\, Q_X \times Q_X \times \tilde{V}) \le 2\eta,$$

where

$$(Q_X \times Q_X \times \tilde{V})(x,x',y) \triangleq Q_X(x)\, Q_X(x')\, \tilde{V}(y \mid x,x'),$$

with

$$\tilde{V}(y \mid x,x') \triangleq \sum_{s} V(y \mid x,s)\, P_{S|X'}(s \mid x').$$
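The projection step above relies on the fact that marginalizing two distributions onto the same coordinates cannot increase their divergence. The following is a small numerical sketch of that fact only, assuming two arbitrary strictly positive distributions on a four-fold product alphabet; the helper names and alphabet sizes are hypothetical and not taken from the paper.

```python
import itertools
import math
import random


def random_pmf(shape, seed):
    """Random strictly positive pmf on the product alphabet given by `shape`."""
    rng = random.Random(seed)
    keys = list(itertools.product(*(range(k) for k in shape)))
    w = [rng.random() + 1e-3 for _ in keys]
    total = sum(w)
    return {k: x / total for k, x in zip(keys, w)}


def project(p, keep):
    """Marginal of p on the coordinate positions listed in `keep`."""
    out = {}
    for k, v in p.items():
        sub = tuple(k[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + v
    return out


def divergence(p, q):
    """D(p || q) in bits; assumes q > 0 wherever p > 0."""
    return sum(v * math.log2(v / q[k]) for k, v in p.items() if v > 0)


shape = (2, 3, 2, 3)
p = random_pmf(shape, seed=1)
q = random_pmf(shape, seed=2)
full = divergence(p, q)
projected = divergence(project(p, (1, 3)), project(q, (1, 3)))
```

The value `projected` never exceeds `full`, which is the property used to pass from the joint law to its marginal.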

The rest of the proof is similar to the single-user case [5, Lemma 4] and is therefore omitted.

Lemma A.3: If the AVC $V$ is nonsymmetrizable-$\mathcal{X}|\mathcal{U}$, then for a sufficiently small $\eta$, no six-tuple of random variables $U, X, X', S, S', Y$ can simultaneously satisfy

$$P_{UX} = P_{UX'} = Q \quad \text{with} \quad \min_{u \in \mathcal{U}} Q_U(u) > 0, \tag{66}$$

$$P_{UXSY} \in \mathcal{C}_\eta^\phi, \qquad P_{UX'S'Y} \in \mathcal{C}_\eta^\phi, \tag{67}$$

and

$$I(XY;X' \mid US) \le \eta, \qquad I(X'Y;X \mid US') \le \eta. \tag{68}$$

Proof: By the definition of $\mathcal{C}_\eta^\phi$ (cf. (9)), the condition $P_{UXSY} \in \mathcal{C}_\eta^\phi$ means that

$$D(P_{UXSY} \,\|\, P_{UX} \times P_S \times V) = \sum_{u,x,s,y} P_{UXSY}(u,x,s,y) \log \frac{P_{UXSY}(u,x,s,y)}{P_{UX}(u,x)\, P_S(s)\, V(y \mid x,s)} \le \eta.$$

Upon adding to this

$$I(XY;X' \mid US) = \sum_{u,x,x',s,y} P_{UXX'SY}(u,x,x',s,y) \log \frac{P_{X'|UXSY}(x' \mid u,x,s,y)}{P_{X'|US}(x' \mid u,s)} \le \eta,$$

we obtain

$$\sum_{u,x,x',s,y} P_{UXX'SY}(u,x,x',s,y) \log \frac{P_{UXX'SY}(u,x,x',s,y)}{P_{UX}(u,x)\, P_S(s)\, P_{X'|US}(x' \mid u,s)\, V(y \mid x,s)} = D(P_{UXX'SY} \,\|\, P_{UX}\, P_S\, P_{X'|US}\, V) \le 2\eta. \tag{69}$$

Using Pinsker's inequality [14, p. 58], it follows that there exists a positive constant $c$ such that

$$\sum_{u,x,x',s,y} \left| P_{UXX'SY}(u,x,x',s,y) - P_{UX}(u,x)\, P_S(s)\, P_{X'|US}(x' \mid u,s)\, V(y \mid x,s) \right| \le c\sqrt{2\eta}.$$

Substituting $P_{X'|US} = P_{S|X'U}\, P_{X'U} / P_{US}$ and summing over $s$ inside the absolute value, we have

$$\sum_{u,x,x',y} \left| P_{UXX'Y}(u,x,x',y) - P_{UX}(u,x)\, P_{X'|U}(x' \mid u) \sum_{s} \frac{P_U(u)\, P_S(s)}{P_{US}(u,s)}\, P_{S|X'U}(s \mid x',u)\, V(y \mid x,s) \right| \le c\sqrt{2\eta}. \tag{70}$$

Projecting the probability laws in (69) on $\mathcal{U} \times \mathcal{S}$, since the divergence does not increase, we have $D(P_{US} \,\|\, P_U \times P_S) \le 2\eta$, and using Pinsker's inequality it follows that there exists an absolute constant $\tilde{c}$ such that for all $u \in \mathcal{U}$, $s \in \mathcal{S}$ we have

$$\left| P_{US}(u,s) - P_U(u)\, P_S(s) \right| \le \tilde{c}\sqrt{2\eta}. \tag{71}$$
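Pinsker's inequality, which is used twice above, bounds the variational distance between two distributions by the square root of their divergence. The sketch below checks it numerically on random distributions, assuming the divergence is measured in bits, in which case the constant multiplying $\sqrt{2D}$ can be taken as $\sqrt{\ln 2}$. The function names, the alphabet size and the number of trials are illustrative choices, not part of the paper.

```python
import math
import random


def random_pmf(size, rng):
    """Random strictly positive pmf on an alphabet of the given size."""
    w = [rng.random() + 1e-3 for _ in range(size)]
    total = sum(w)
    return [x / total for x in w]


def divergence_bits(p, q):
    """D(p || q) in bits; assumes q > 0 wherever p > 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def l1_distance(p, q):
    """Variational (L1) distance between two pmfs."""
    return sum(abs(a - b) for a, b in zip(p, q))


def pinsker_holds(p, q):
    """Check ||p - q||_1 <= c * sqrt(2 D(p || q)) with c = sqrt(ln 2)."""
    c = math.sqrt(math.log(2))
    bound = c * math.sqrt(2 * divergence_bits(p, q))
    return l1_distance(p, q) <= bound + 1e-12


rng = random.Random(0)
checks = [pinsker_holds(random_pmf(5, rng), random_pmf(5, rng))
          for _ in range(1000)]
```

Every entry of `checks` is true, consistent with bounds of the form $c\sqrt{2\eta}$ whenever the divergence is at most $2\eta$ up to the choice of constant.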

Commencing with the conditions $P_{UX'S'Y} \in \mathcal{C}_\eta^\phi$ and $I(X'Y;X \mid US') \le \eta$, we obtain in a similar manner that there exist positive constants $c$ and $\tilde{c}$ such that

$$\sum_{u,x,x',y} \left| P_{UXX'Y}(u,x,x',y) - P_{UX'}(u,x')\, P_{X|U}(x \mid u) \sum_{s} \frac{P_U(u)\, P_{S'}(s)}{P_{US'}(u,s)}\, P_{S'|XU}(s \mid x,u)\, V(y \mid x',s) \right| \le c\sqrt{2\eta}, \tag{72}$$

and that for all $u \in \mathcal{U}$, $s \in \mathcal{S}$ we have

$$\left| P_{US'}(u,s) - P_U(u)\, P_{S'}(s) \right| \le \tilde{c}\sqrt{2\eta}. \tag{73}$$

From (70) and (72) we have

$$\sum_{\substack{u,x,x',y \\ Q(u,x),\,Q(u,x') \ne 0}} \left| \sum_{s} \frac{P_U(u)\, P_S(s)}{P_{US}(u,s)}\, P_{S|X'U}(s \mid x',u)\, V(y \mid x,s) - \sum_{s} \frac{P_U(u)\, P_{S'}(s)}{P_{US'}(u,s)}\, P_{S'|XU}(s \mid x,u)\, V(y \mid x',s) \right| \le \frac{2c\sqrt{2\eta}}{\gamma}, \tag{74}$$

where

$$\gamma \triangleq \min_{\substack{u,x,x' \\ Q(u,x),\,Q(u,x') \ne 0}} Q_U(u)\, Q_{X|U}(x \mid u)\, Q_{X|U}(x' \mid u).$$

Next, from (74), (71) and (73) we have

$$\max_{\substack{u,x,x',y \\ Q(u,x),\,Q(u,x') \ne 0}} \left| \sum_{s} P_{S|X'U}(s \mid x',u)\, V(y \mid x,s) - \sum_{s} P_{S'|XU}(s \mid x,u)\, V(y \mid x',s) \right| \le \sum_{\substack{u,x,x',y \\ Q(u,x),\,Q(u,x') \ne 0}} \left| \sum_{s} P_{S|X'U}(s \mid x',u)\, V(y \mid x,s) - \sum_{s} P_{S'|XU}(s \mid x,u)\, V(y \mid x',s) \right| \le f(\eta), \tag{75}$$

where $\lim_{\eta \to 0} f(\eta) = 0$. Let $U_1$ and $U_2$ be any pair of channels from $\mathcal{X} \times \mathcal{U}$ to $\mathcal{S}$, let $U = \frac{1}{2}(U_1 + U_2)$, and let

$$F(U) \triangleq \max_{\substack{u,x,x',y \\ Q(u,x),\,Q(u,x') \ne 0}} \left| \sum_{s} U(s \mid x',u)\, V(y \mid x,s) - \sum_{s} U(s \mid x,u)\, V(y \mid x',s) \right|,$$

which is a continuous function on the compact set of all channels from $\mathcal{X} \times \mathcal{U}$ to $\mathcal{S}$. We observe that

$$\max_{\substack{u,x,x',y \\ Q(u,x),\,Q(u,x') \ne 0}} \left| \sum_{s} U_1(s \mid x',u)\, V(y \mid x,s) - \sum_{s} U_2(s \mid x,u)\, V(y \mid x',s) \right| = \max_{\substack{u,x,x',y \\ Q(u,x),\,Q(u,x') \ne 0}} \left| \sum_{s} U_2(s \mid x',u)\, V(y \mid x,s) - \sum_{s} U_1(s \mid x,u)\, V(y \mid x',s) \right| \ge F(U), \tag{76}$$

where the first equality follows since the maximization does not change upon interchanging the two sums and then $x$ and $x'$, and the inequality follows by applying the triangle inequality to the average of the two maximized expressions. Denote the point at which $F(U)$ attains its minimum

by $U^*$; if the AVC $V$ is nonsymmetrizable-$\mathcal{X}|\mathcal{U}$, then $U^*$ does not satisfy (7) and hence $\xi \triangleq F(U^*) > 0$. Substituting $U_1 = P_{S|X'U}$ and $U_2 = P_{S'|XU}$ in (76), we establish $\xi$ to be a strictly positive lower bound on the maximization in (75), which contradicts the upper bound in (75) if $\eta$ is sufficiently small.

Lemma A.4: Let $\Lambda > 0$, and let $\alpha > 0$ and $\beta > 0$ be arbitrarily small. If $\Lambda_W(Q_U) \ge \Lambda + \alpha$, then for a sufficiently small $\eta$, no five-tuple of random variables $U, U', S, S', Z$ can simultaneously satisfy

$$P_U = P_{U'} = Q_U \quad \text{with} \quad \min_{u \in \mathcal{U}} Q_U(u) \ge \beta, \tag{77}$$

$$P_{USZ} \in \mathcal{C}_\eta^\psi(\Lambda), \qquad P_{U'S'Z} \in \mathcal{C}_\eta^\psi(\Lambda), \tag{78}$$

and

$$I(UZ;U' \mid S) \le \eta, \qquad I(U'Z;U \mid S') \le \eta.$$

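Proof: Since (77) and (78) imply that

$$D(P_{U'S'Z} \,\|\, P_{U'} \times P_{S'} \times (Q_{X|U}W)) \le \eta,$$

$$D(P_{USZ} \,\|\, P_U \times P_S \times (Q_{X|U}W)) \le \eta,$$

$$\sum_{u \in \mathcal{U}} \sum_{s \in \mathcal{S}} Q_U(u)\, P_{S'|U'}(s \mid u)\, l(s) \le \Lambda,$$

$$\sum_{u \in \mathcal{U}} \sum_{s \in \mathcal{S}} Q_U(u)\, P_{S|U}(s \mid u)\, l(s) \le \Lambda,$$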

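the rest of the proof is identical to [5, Theorem 2] and is therefore omitted.

Lemma A.5: Let $\Lambda > 0$, and let $\alpha > 0$ and $\beta > 0$ be arbitrarily small. If $\Lambda_V(Q_X) \ge \Lambda + \alpha$, then for a sufficiently small $\eta$, no seven-tuple of random variables $U, U', X, X', S, S', Y$ can simultaneously satisfy

$$P_X = P_{X'} = Q_X \quad \text{with} \quad \min_{x \in \mathcal{X}} Q_X(x) \ge \beta, \tag{79}$$

$$P_{UXSY} \in \mathcal{C}_\eta^\phi(\Lambda), \qquad P_{U'X'S'Y} \in \mathcal{C}_\eta^\phi(\Lambda), \tag{80}$$

and

$$I(UXY;U'X' \mid S) \le \eta, \qquad I(U'X'Y;UX \mid S') \le \eta.$$

Proof: From (79) and (80) we have

$$\sum_{x \in \mathcal{X}} \sum_{s \in \mathcal{S}} Q_X(x)\, P_{S|X}(s \mid x)\, l(s) \le \Lambda \le \Lambda_V(Q_X) - \alpha,$$

$$\sum_{x \in \mathcal{X}} \sum_{s \in \mathcal{S}} Q_X(x)\, P_{S'|X'}(s \mid x)\, l(s) \le \Lambda \le \Lambda_V(Q_X) - \alpha,$$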

hence choosing $U_1 = P_{S'|X'}$ and $U_2 = P_{S|X}$ is suitable according to [5, Lemma A2] to obtain

$$\max_{x,x',y} \left| \sum_{s} V(y \mid x,s)\, U_1(s \mid x') - \sum_{s} V(y \mid x',s)\, U_2(s \mid x) \right| \ge \xi,$$

for some $\xi > 0$. The rest of the proof is identical to that of Lemma A.2 and is therefore omitted.

Lemma A.6: Let $\Lambda > 0$, and let $\alpha > 0$ and $\beta > 0$ be arbitrarily small. If $\Lambda_V(Q, U) \ge \Lambda + \alpha$, then for a sufficiently small $\eta$, no six-tuple of random variables $U, X, X', S, S', Y$ can simultaneously satisfy

$$P_{UX} = P_{UX'} = Q \quad \text{with} \quad \min_{u \in \mathcal{U}} Q_U(u) > 0, \tag{81}$$

$$P_{UXSY} \in \mathcal{C}_\eta^\phi(\Lambda), \qquad P_{UX'S'Y} \in \mathcal{C}_\eta^\phi(\Lambda), \tag{82}$$

and

$$I(XY;X' \mid US) \le \eta, \qquad I(X'Y;X \mid US') \le \eta. \tag{83}$$

Proof: Using arguments similar to those leading to (75), we obtain that any six-tuple of random variables $U, X, X', S, S', Y$ that simultaneously satisfies (81)-(83) must have

$$\max_{\substack{u,x,x',y \\ Q(u,x),\,Q(u,x') \ne 0}} \left| \sum_{s} P_{S|X'U}(s \mid x',u)\, V(y \mid x,s) - \sum_{s} P_{S'|XU}(s \mid x,u)\, V(y \mid x',s) \right| \le f(\eta), \tag{84}$$

where $\lim_{\eta \to 0} f(\eta) = 0$. Setting $U = \frac{1}{2}\left(P_{S|X'U} + P_{S'|XU}\right)$ and

$$F(Q,U) \triangleq \max_{\substack{u,x,x',y \\ Q(u,x),\,Q(u,x') \ne 0}} \left| \sum_{s} U(s \mid x',u)\, V(y \mid x,s) - \sum_{s} U(s \mid x,u)\, V(y \mid x',s) \right|,$$

and using arguments similar to those leading to (76), we have

$$\max_{\substack{u,x,x',y \\ Q(u,x),\,Q(u,x') \ne 0}} \left| \sum_{s} P_{S|X'U}(s \mid x',u)\, V(y \mid x,s) - \sum_{s} P_{S'|XU}(s \mid x,u)\, V(y \mid x',s) \right| \ge F(Q,U). \tag{85}$$

Next, it follows from (81) and (82) that

$$\sum_{u \in \mathcal{U}} \sum_{x \in \mathcal{X}} \sum_{s \in \mathcal{S}} Q(u,x)\, P_{S|X'U}(s \mid u,x)\, l(s) \le \Lambda,$$

$$\sum_{u \in \mathcal{U}} \sum_{x \in \mathcal{X}} \sum_{s \in \mathcal{S}} Q(u,x)\, P_{S'|XU}(s \mid u,x)\, l(s) \le \Lambda,$$

hence, it follows that

$$\sum_{u \in \mathcal{U}} \sum_{x \in \mathcal{X}} \sum_{s \in \mathcal{S}} Q(u,x)\, U(s \mid u,x)\, l(s) \le \Lambda \le \Lambda_V(Q,U) - \alpha. \tag{86}$$

Considering $F(Q,U)$ as a continuous function of $(Q,U)$ ranging over the compact set of all pairs $(Q,U)$ satisfying (86), its minimum is attained for some $(Q^*,U^*)$ satisfying (86). It follows that $U^*$ cannot satisfy (7); hence $F(Q^*,U^*) > 0$ is a positive lower bound on the maximization in (85), which contradicts (84) if $\eta$ is sufficiently small.

Appendix III
The Sub-Codewords and Codewords Properties

The set of codewords $\mathbf{x}_{ij}$, $i = 1, \ldots, M$ and $j = 1, \ldots, M'$, and sub-codewords $\mathbf{u}_i$, $i = 1, \ldots, M$, used in the proofs of Propositions 1-2 and Theorems 2-3, are any sets with the properties stated in the following lemma.

Lemma A.7: For any $\epsilon > 0$, $n > n_0(\epsilon)$, $M > \exp(n\epsilon)$, $M' > \exp(n\epsilon)$ and a type $Q_{UX}$ on $\mathcal{U} \times \mathcal{X}$, there exists a set of sub-codewords $\mathbf{u}_i \in \mathcal{U}^n$, $i = 1, \ldots, M$, each of type $Q_U$, and a set of codewords $\mathbf{x}_{ij} \in \mathcal{X}^n$, $i = 1, \ldots, M$ and $j = 1, \ldots, M'$, such that $\mathbf{x}_{ij} \in T_{Q_{X|U}}(\mathbf{u}_i)$ and for every $\mathbf{s} \in \mathcal{S}^n$, $\mathbf{u} \in T_{Q_U}$, $\mathbf{x} \in T_{Q_{X|U}}(\mathbf{u})$ and every joint type $P_{UU'XX'X''S}$, upon setting $R = \frac{1}{n}\log M$ and $R' = \frac{1}{n}\log M'$, we have

$$\frac{1}{MM'} \left| \{(i,j) : (\mathbf{u}_i, \mathbf{x}_{ij}, \mathbf{s}) \in T_{P_{UXS}}\} \right| \le \exp(-n\epsilon/2) \quad \text{if } I(UX;S) > \epsilon, \tag{87}$$

$$\frac{1}{MM'} \left| \{(i,j) : (\mathbf{u}_i, \mathbf{x}_{ij}, \mathbf{u}_{i'}, \mathbf{s}) \in T_{P_{UXU'S}} \text{ for some } i' \ne i\} \right| \le \exp(-n\epsilon/2) \quad \text{if } I(UX;U'S) > |R - I(U';S)|^+ + \epsilon, \tag{88}$$

$$\left| \{i' : (\mathbf{u}, \mathbf{u}_{i'}, \mathbf{x}, \mathbf{s}) \in T_{P_{UU'XS}}\} \right| \le \exp\left\{n\left(|R - I(U';UXS)|^+ + \epsilon\right)\right\}, \tag{89}$$

$$\frac{1}{MM'} \left| \{(i,j) : (\mathbf{u}_i, \mathbf{x}_{ij}, \mathbf{u}_{i'}, \mathbf{x}_{i'j'}, \mathbf{s}) \in T_{P_{UXU'X'S}} \text{ for some } i' \ne i,\ j'\} \right| \le \exp(-n\epsilon/2) \quad \text{if } I(UX;U'X'S) > |R + R' - I(U'X';S)|^+ + \epsilon, \tag{90}$$

$$\left| \{(i',j') : (\mathbf{u}, \mathbf{x}, \mathbf{u}_{i'}, \mathbf{x}_{i'j'}, \mathbf{s}) \in T_{P_{UXU'X'S}}\} \right| \le \exp\left\{n\left(|R + R' - I(U'X';UXS)|^+ + \epsilon\right)\right\}, \tag{91}$$

and for all $i = 1, \ldots, M$, we have

$$\frac{1}{M'} \left| \{j : (\mathbf{x}_{ij}, \mathbf{x}_{ij'}, \mathbf{s}) \in T_{P_{XX''S|U}}(\mathbf{u}) \text{ for some } j' \ne j\} \right| \le \exp(-n\epsilon/2) \quad \text{if } I(X;X''S \mid U) > |R' - I(X'';S \mid U)|^+ + \epsilon, \tag{92}$$

$$\left| \{j' : (\mathbf{x}, \mathbf{x}_{ij'}, \mathbf{s}) \in T_{P_{XX''S|U}}(\mathbf{u})\} \right| \le \exp\left\{n\left(|R' - I(X'';XS \mid U)|^+ + \epsilon\right)\right\}. \tag{93}$$

Proof: We will show that $M = \exp\{nR\}$ randomly selected sub-codewords and $MM'$ randomly selected codewords, where $M' = \exp\{nR'\}$, possess, with probability close to 1, all the properties (87)-(93). Let $\mathbf{U}_1, \ldots, \mathbf{U}_M$ be independent random sequences of length $n$, each uniformly distributed on $T_{Q_U}$. For each $i = 1, \ldots, M$, let $\mathbf{X}_{i1}, \ldots, \mathbf{X}_{iM'}$ be independent random sequences of length $n$, each uniformly distributed on $T_{Q_{X|U}}(\mathbf{U}_i)$. Fix $\mathbf{s} \in \mathcal{S}^n$, $\mathbf{u} \in T_{Q_U}$, $\mathbf{x} \in T_{Q_{X|U}}(\mathbf{u})$ and a joint type $P_{UU'XX'X''S}$ with $P_{UXS} = P_{\mathbf{u},\mathbf{x},\mathbf{s}}$ and $P_{UX} = P_{U'X'} = P_{UX''} = Q_{UX}$ (any other choice makes the bounds (87)-(93) trivial).

The total number of common messages $i$ is exponential in $n$, the set of all possible combinations of sequences $\mathbf{s} \in \mathcal{S}^n$, $\mathbf{u} \in T_{Q_U}$, $\mathbf{x} \in T_{Q_{X|U}}(\mathbf{u})$ grows exponentially with $n$, and the number of joint types $P_{UU'XX'X''S}$ is polynomial in $n$. Therefore the doubly exponentially decaying probability bounds (96), (105), (107), (108), (109), (110) and (111) ensure that, with probability close to 1, all the inequalities (87)-(93) hold simultaneously for our random selection of sub-codewords and codewords if $n$ is sufficiently large. Any realization of the random sub-codewords and codewords that simultaneously satisfies (87)-(93) is a proper choice.

Using Markov's inequality, the independent random selection and the inequalities

$$2^x \le 1 + x \quad \forall\, 0 \le x \le 1 \tag{94}$$

and

$$1 + a \le e^a \quad \forall\, 0 \le a, \tag{95}$$

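for $I(UX;S) > \epsilon$ we have

$$\Pr\left\{ \frac{1}{MM'} \left| \{(i,j) : (\mathbf{U}_i, \mathbf{X}_{ij}, \mathbf{s}) \in T_{P_{UXS}}\} \right| > \exp\left(-n\frac{\epsilon}{2}\right) \right\} \le \exp\left\{-\frac{1}{2}\exp(n\epsilon/2)\right\}, \tag{96}$$

the proof of which follows steps similar to those in the proof of [5, Lemma 3], thus proving (87).

Let $t \triangleq \exp\{n[|R - I(U';S)|^+ + \epsilon/4]\}$. Using Markov's inequality, the independent random selection and the inequalities (94) and (95), we have

$$\begin{aligned}
\Pr\left\{ \left| \{i' : (\mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{U'S}}\} \right| > t \right\}
&= \Pr\left\{ \exp\left( \sum_{i'} \mathbf{1}\{(\mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{U'S}}\} \right) > \exp(t) \right\} \\
&\le \exp(-t)\, E\left[ \exp\left\{ \sum_{i'} \mathbf{1}\{(\mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{U'S}}\} \right\} \right] \\
&= \exp(-t) \prod_{i'} E\left[ \exp\left\{ \mathbf{1}\{(\mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{U'S}}\} \right\} \right] \\
&\le \exp(-t) \prod_{i'} \left[ 1 + E\left[ \mathbf{1}\{(\mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{U'S}}\} \right] \right] \\
&\le \exp(-t) \prod_{i'} \exp\left\{ \log e \cdot E\left[ \mathbf{1}\{(\mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{U'S}}\} \right] \right\}.
\end{aligned} \tag{97}$$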
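The chain (97) is an exponential Markov bound for a sum of independent indicators. The sketch below compares it with the exact tail of a binomial count, assuming, as inequality (94) suggests, that exp and log are taken to base 2, and assuming for simplicity that all indicators share one success probability `p`. The values of `m`, `p` and the thresholds are illustrative and not from the paper.

```python
import math


def binomial_tail(m, p, t):
    """Exact Pr{Binomial(m, p) > t}."""
    return sum(math.comb(m, k) * p**k * (1 - p)**(m - k)
               for k in range(math.floor(t) + 1, m + 1))


def exponential_markov_bound(m, p, t):
    """Last line of the chain: 2^{-t} * prod_i 2^{log2(e) * E[indicator_i]}."""
    return 2.0 ** (-t + m * p * math.log2(math.e))


m, p = 200, 0.01   # illustrative values only
rows = [(t, binomial_tail(m, p, t), exponential_markov_bound(m, p, t))
        for t in (4, 8, 16, 32)]
```

In each row the exact tail stays below the bound, and the bound decays exponentially in the threshold `t`; since `t` itself is exponential in $n$ in the proof, this is the source of the doubly exponential decay.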

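Using the bounds (1) and (2) on the size of type classes, since $P_{U'} = P_U$ we have

$$\begin{aligned}
E\left[ \mathbf{1}\{(\mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{U'S}}\} \right]
&= \Pr\left\{ (\mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{U'S}} \right\} \\
&= \sum_{\mathbf{u} \in T_{P_{U'|S}}(\mathbf{s})} \Pr(\mathbf{U}_{i'} = \mathbf{u}) \\
&= \frac{\left| T_{P_{U'|S}}(\mathbf{s}) \right|}{\left| T_{P_U} \right|} \\
&\le (n+1)^c \exp\{-nI(U';S)\}.
\end{aligned}$$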
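The last inequality can be checked exactly for small block lengths, since both type-class sizes are multinomial coefficients. The sketch below does this for one joint type, assuming base-2 exponentials and taking the unspecified exponent $c$ to be the size of the $U'$ alphabet; that choice of $c$, the function name and the example counts are assumptions made for illustration.

```python
import math


def type_class_ratio(counts):
    """counts[a][b] = number of positions where (u', s) = (a, b).

    Returns (|T_{U'|S}(s)| / |T_{U'}|, n, empirical I(U';S) in bits).
    """
    n = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]        # type of u'
    col_sums = [sum(col) for col in zip(*counts)]  # type of s

    # Conditional type class: one multinomial coefficient per s-symbol.
    cond_size = 1
    for b, nb in enumerate(col_sums):
        term = math.factorial(nb)
        for row in counts:
            term //= math.factorial(row[b])
        cond_size *= term

    # Unconditional type class of u': a single multinomial coefficient.
    type_size = math.factorial(n)
    for na in row_sums:
        type_size //= math.factorial(na)

    def h(ns):
        return -sum(k / n * math.log2(k / n) for k in ns if k > 0)

    joint = [k for row in counts for k in row]
    info = h(row_sums) + h(col_sums) - h(joint)
    return cond_size / type_size, n, info


counts = [[6, 1, 2], [1, 7, 3]]   # illustrative joint type with n = 20
ratio, n, info = type_class_ratio(counts)
bound = (n + 1) ** len(counts) * 2.0 ** (-n * info)
```

For this example `ratio` is well below `bound`; the polynomial factor is loose at such a small block length, but it is negligible against the exponential term as $n$ grows.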

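Substituting this and $M = \exp(nR)$ in (97), if $n$ is large enough, we can bound (97) by

$$\begin{aligned}
&\exp(-t) \exp\left\{ M \log e\, (n+1)^c \exp[-nI(U';S)] \right\} \\
&\quad = \exp\left\{ -\left[ \exp(n\epsilon/4) \exp\left( n|R - I(U';S)|^+ \right) - \log e\, (n+1)^c \exp\{n[R - I(U';S)]\} \right] \right\} \\
&\quad \le \exp\left\{ -\exp(n\epsilon/8) \left[ \exp\left( n\left[ \epsilon/8 + |R - I(U';S)|^+ \right] \right) - \exp\{n[R - I(U';S)]\} \right] \right\} \\
&\quad \le \exp\{-\exp(n\epsilon/8)\},
\end{aligned}$$

and hence we obtain that

$$\Pr\left\{ \left| \{i' : (\mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{U'S}}\} \right| > \exp\left\{ n\left[ |R - I(U';S)|^+ + \epsilon/4 \right] \right\} \right\} \le \exp\{-\exp(n\epsilon/8)\}. \tag{98}$$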

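Let

$$\tilde{B}_i \triangleq \left\{ i' < i : (\mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{U'S}} \right\},$$

$$B_i \triangleq \begin{cases} \tilde{B}_i & \text{if } |\tilde{B}_i| \le \exp\left\{ n\left[ |R - I(U';S)|^+ + \epsilon/4 \right] \right\}, \\ \emptyset & \text{otherwise,} \end{cases}$$

and

$$f_{ij} \triangleq \begin{cases} 1 & \text{if } (\mathbf{U}_i, \mathbf{X}_{ij}) \in \bigcup_{i' \in B_i} T_{P_{UX|U'S}}(\mathbf{U}_{i'}, \mathbf{s}), \\ 0 & \text{otherwise;} \end{cases}$$

then

$$\begin{aligned}
&\Pr\left\{ \frac{1}{MM'} \left| \{(i,j) : (\mathbf{U}_i, \mathbf{X}_{ij}, \mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{UXU'S}} \text{ for some } i' < i\} \right| > \exp(-n\epsilon/2) \right\} \\
&\quad \le \Pr\left\{ \frac{1}{MM'} \sum_{i,j} f_{ij} > \exp(-n\epsilon/2) \right\} + \Pr\left\{ \left| \{i' : (\mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{U'S}}\} \right| > \exp\left\{ n\left[ |R - I(U';S)|^+ + \epsilon/4 \right] \right\} \right\}.
\end{aligned} \tag{99}$$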

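Using Markov's inequality and the independent random selection, setting

$$G = \left\{ \mathbf{U}_1, \ldots, \mathbf{U}_{M-1}, (\mathbf{X}_{11}, \ldots, \mathbf{X}_{1M'}), \ldots, (\mathbf{X}_{(M-1)1}, \ldots, \mathbf{X}_{(M-1)M'}) \right\},$$

we have

$$\begin{aligned}
\Pr\left\{ \frac{1}{MM'} \sum_{i,j} f_{ij} > \exp(-n\epsilon/2) \right\}
&= \Pr\left\{ \exp\left( \frac{1}{M'} \sum_{i,j} f_{ij} \right) > \exp\{M \exp(-n\epsilon/2)\} \right\} \\
&\le \exp\{-M \exp(-n\epsilon/2)\}\, E\left[ \exp\left( \frac{1}{M'} \sum_{i,j} f_{ij} \right) \right] \\
&= \exp\{-M \exp(-n\epsilon/2)\}\, E\left[ E\left[ \exp\left( \frac{1}{M'} \sum_{i,j} f_{ij} \right) \Bigm| G \right] \right] \\
&= \exp\{-M \exp(-n\epsilon/2)\} \cdot E\left[ \exp\left( \frac{1}{M'} \sum_{\substack{1 \le i \le M-1 \\ 1 \le j \le M'}} f_{ij} \right) E\left[ \exp\left( \frac{1}{M'} \sum_{j} f_{Mj} \right) \Bigm| G \right] \right].
\end{aligned} \tag{100}$$

Using the inequalities (94) and (95) we have

$$E\left[ \exp\left( \frac{1}{M'} \sum_{j} f_{Mj} \right) \Bigm| G \right] \le 1 + E\left[ \frac{1}{M'} \sum_{j} f_{Mj} \Bigm| G \right] \tag{101}$$

$$\le \exp\left\{ \frac{\log e}{M'} \sum_{j} E[f_{Mj} \mid G] \right\}, \tag{102}$$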

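and using the bounds (1) and (2) on the size of type classes together with the independent random selection, we have

$$\begin{aligned}
E[f_{Mj} \mid G]
&= \Pr\left\{ (\mathbf{U}_M, \mathbf{X}_{Mj}) \in \bigcup_{i' \in B_M} T_{P_{UX|U'S}}(\mathbf{U}_{i'}, \mathbf{s}) \Bigm| G \right\} \\
&\le \sum_{i' \in B_M} \Pr\left\{ (\mathbf{U}_M, \mathbf{X}_{Mj}) \in T_{P_{UX|U'S}}(\mathbf{U}_{i'}, \mathbf{s}) \Bigm| G \right\} \\
&= \sum_{i' \in B_M} \sum_{(\mathbf{u},\mathbf{x}) \in T_{P_{UX|U'S}}(\mathbf{U}_{i'}, \mathbf{s})} \Pr\{\mathbf{U}_M = \mathbf{u}, \mathbf{X}_{Mj} = \mathbf{x}\} \\
&= \sum_{i' \in B_M} \sum_{(\mathbf{u},\mathbf{x}) \in T_{P_{UX|U'S}}(\mathbf{U}_{i'}, \mathbf{s})} \frac{1}{\left| T_{P_U} \right| \left| T_{P_{X|U}}(\mathbf{u}) \right|} \\
&\le \sum_{i' \in B_M} \left| T_{P_{UX|U'S}}(\mathbf{U}_{i'}, \mathbf{s}) \right| (1+n)^c \exp\{-nH(UX)\} \\
&\le |B_M|\, (1+n)^c \exp\{-n[-H(UX \mid U'S) + H(UX)]\} \\
&\le (1+n)^c \exp\left\{ -n\left[ I(UX;U'S) - |R - I(U';S)|^+ - \epsilon/4 \right] \right\} \\
&\le (1+n)^c \exp\{-3n\epsilon/4\},
\end{aligned} \tag{103}$$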

where the last step requires that $I(UX;U'S) > |R - I(U';S)|^+ + \epsilon$. By substituting (103) and (102) into (101) and repeating the procedure in (100) $M - 1$ times, we obtain the following bound on the first expression on the r.h.s. of (99):

$$\begin{aligned}
\Pr\left\{ \frac{1}{MM'} \sum_{i,j} f_{ij} > \exp(-n\epsilon/2) \right\}
&\le \exp\{-M \exp(-n\epsilon/2)\} \exp\left\{ M \log e\, (n+1)^c \exp(-3n\epsilon/4) \right\} \\
&= \exp\left\{ -M \exp(-n\epsilon/2) \left[ 1 - \log e\, (n+1)^c \exp(-n\epsilon/4) \right] \right\} \\
&\le \exp\left\{ -\frac{M}{2} \exp(-n\epsilon/2) \right\} \le \exp\left\{ -\frac{1}{2} \exp(n\epsilon/2) \right\},
\end{aligned} \tag{104}$$

where the last two steps follow if $M \ge \exp(n\epsilon)$ and $n$ is large enough. By substituting (104) and (98) into (99) we get

$$\Pr\left\{ \frac{1}{MM'} \left| \{(i,j) : (\mathbf{U}_i, \mathbf{X}_{ij}, \mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{UXU'S}} \text{ for some } i' < i\} \right| > \exp(-n\epsilon/2) \right\} \le \exp\{-\exp(n\epsilon/2)/2\} + \exp\{-\exp(n\epsilon/8)\} \le 2\exp\{-\exp(n\epsilon/8)\},$$

and by symmetry, the same holds when "for some $i' < i$" is replaced by "for some $i' > i$". Thus, we obtain that

$$\Pr\left\{ \frac{1}{MM'} \left| \{(i,j) : (\mathbf{U}_i, \mathbf{X}_{ij}, \mathbf{U}_{i'}, \mathbf{s}) \in T_{P_{UXU'S}} \text{ for some } i' \ne i\} \right| > \exp(-n\epsilon/2) \right\} \le 4\exp\{-\exp(n\epsilon/8)\}, \tag{105}$$

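as long as $I(UX;U'S) > |R - I(U';S)|^+ + \epsilon$, and (88) follows.

Let $w \triangleq \exp\{n[|R - I(U';UXS)|^+ + \epsilon]\}$. Using Markov's inequality, the independent random selection and the inequalities (94) and (95), we have

$$\begin{aligned}
\Pr\left\{ \left| \{i' : (\mathbf{u}, \mathbf{U}_{i'}, \mathbf{x}, \mathbf{s}) \in T_{P_{UU'XS}}\} \right| > w \right\}
&= \Pr\left\{ \exp\left( \sum_{i'} \mathbf{1}\{\mathbf{U}_{i'} \in T_{P_{U'|UXS}}(\mathbf{u},\mathbf{x},\mathbf{s})\} \right) > \exp(w) \right\} \\
&\le \exp(-w)\, E\left[ \exp\left\{ \sum_{i'} \mathbf{1}\{\mathbf{U}_{i'} \in T_{P_{U'|UXS}}(\mathbf{u},\mathbf{x},\mathbf{s})\} \right\} \right] \\
&= \exp(-w) \prod_{i'} E\left[ \exp\left\{ \mathbf{1}\{\mathbf{U}_{i'} \in T_{P_{U'|UXS}}(\mathbf{u},\mathbf{x},\mathbf{s})\} \right\} \right] \\
&\le \exp(-w) \prod_{i'} \left[ 1 + E\left[ \mathbf{1}\{\mathbf{U}_{i'} \in T_{P_{U'|UXS}}(\mathbf{u},\mathbf{x},\mathbf{s})\} \right] \right] \\
&\le \exp(-w) \prod_{i'} \exp\left\{ \log e \cdot E\left[ \mathbf{1}\{\mathbf{U}_{i'} \in T_{P_{U'|UXS}}(\mathbf{u},\mathbf{x},\mathbf{s})\} \right] \right\}.
\end{aligned} \tag{106}$$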

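Using the bounds (1) and (2) on the size of type classes, since $P_{U'} = P_U$ we have

$$\begin{aligned}
E\left[ \mathbf{1}\{\mathbf{U}_{i'} \in T_{P_{U'|UXS}}(\mathbf{u},\mathbf{x},\mathbf{s})\} \right]
&= \Pr\left\{ \mathbf{U}_{i'} \in T_{P_{U'|UXS}}(\mathbf{u},\mathbf{x},\mathbf{s}) \right\} \\
&= \sum_{\mathbf{u}' \in T_{P_{U'|UXS}}(\mathbf{u},\mathbf{x},\mathbf{s})} \Pr(\mathbf{U}_{i'} = \mathbf{u}') \\
&= \frac{\left| T_{P_{U'|UXS}}(\mathbf{u},\mathbf{x},\mathbf{s}) \right|}{\left| T_{P_U} \right|} \\
&\le (n+1)^c \exp\{-nI(U';UXS)\}.
\end{aligned}$$

Substituting this and $M = \exp(nR)$ in (106), if $n$ is large enough, we conclude that

$$\begin{aligned}
\Pr\left\{ \left| \{i' : (\mathbf{u}, \mathbf{U}_{i'}, \mathbf{x}, \mathbf{s}) \in T_{P_{UU'XS}}\} \right| > w \right\}
&\le \exp(-w) \exp\left\{ M \log e\, (n+1)^c \exp[-nI(U';UXS)] \right\} \\
&= \exp\left\{ -\left[ \exp(n\epsilon) \exp\left( n|R - I(U';UXS)|^+ \right) - \log e\, (n+1)^c \exp\{n[R - I(U';UXS)]\} \right] \right\} \\
&\le \exp\left\{ -\exp(n\epsilon/2) \left[ \exp\left( n\left[ \epsilon/2 + |R - I(U';UXS)|^+ \right] \right) - \exp\{n[R - I(U';UXS)]\} \right] \right\} \\
&\le \exp\{-\exp(n\epsilon/2)\},
\end{aligned} \tag{107}$$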

thus proving (89). Using steps similar to those leading to (105) and (107), we can establish the following doubly exponential bounds:

$$\Pr\left\{ \frac{1}{MM'} \left| \{(i,j) : (\mathbf{U}_i, \mathbf{X}_{ij}, \mathbf{U}_{i'}, \mathbf{X}_{i'j'}, \mathbf{s}) \in T_{P_{UXU'X'S}} \text{ for some } i' \ne i,\ j'\} \right| > \exp(-n\epsilon/2) \right\} \le 4\exp\{-\exp(n\epsilon/8)\}, \tag{108}$$

as long as $I(UX;U'X'S) > |R + R' - I(U'X';S)|^+ + \epsilon$,

$$\Pr\left\{ \left| \{(i',j') : (\mathbf{u}, \mathbf{x}, \mathbf{U}_{i'}, \mathbf{X}_{i'j'}, \mathbf{s}) \in T_{P_{UXU'X'S}}\} \right| > \exp\left\{ n\left[ |R + R' - I(U'X';UXS)|^+ + \epsilon \right] \right\} \right\} \le \exp\{-\exp(n\epsilon/2)\}, \tag{109}$$

$$\Pr\left\{ \frac{1}{M'} \left| \{j : (\mathbf{X}_{ij}, \mathbf{X}_{ij'}, \mathbf{s}) \in T_{P_{XX''S|U}}(\mathbf{u}) \text{ for some } j' \ne j\} \right| > \exp(-n\epsilon/2) \right\} \le 4\exp\{-\exp(n\epsilon/8)\}, \tag{110}$$

as long as $I(X;X''S \mid U) > |R' - I(X'';S \mid U)|^+ + \epsilon$, and

$$\Pr\left\{ \left| \{j' : (\mathbf{x}, \mathbf{X}_{ij'}, \mathbf{s}) \in T_{P_{XX''S|U}}(\mathbf{u})\} \right| > \exp\left\{ n\left[ |R' - I(X'';XS \mid U)|^+ + \epsilon \right] \right\} \right\} \le \exp\{-\exp(n\epsilon/2)\}, \tag{111}$$

where (110) and (111) hold for any $1 \le i \le M$.
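The counting argument in the proof of Lemma A.7 combines exponentially many sequence choices and polynomially many joint types with doubly exponentially small failure probabilities. The following rough sketch evaluates the logarithm of the resulting union bound, assuming base-2 exponentials, binary alphabets, a hypothetical common-message rate of one bit per symbol, and the weakest of the stated bounds, $4\exp\{-\exp(n\epsilon/8)\}$, for each of the seven events. All numerical values are illustrative.

```python
import math


def log2_union_bound(n, eps, alphabet_sizes, n_bounds=7, rate=1.0):
    """log2 of an upper bound on the probability that some property fails.

    The bound multiplies the number of messages, sequence triples, joint
    types and events by 4 * 2^{-2^{n * eps / 8}}. `rate` is hypothetical.
    """
    u, x, s = alphabet_sizes
    log_messages = n * rate                    # M = 2^{nR}
    log_sequences = n * math.log2(u * x * s)   # at most |U|^n |X|^n |S|^n
    # Joint types of (U, U', X, X', X'', S): at most (n+1)^{|U|^2 |X|^3 |S|}.
    log_types = (u**2 * x**3 * s) * math.log2(n + 1)
    log_single = 2 - 2.0 ** (n * eps / 8)      # log2 of 4 * 2^{-2^{n eps/8}}
    return (log_messages + log_sequences + log_types
            + math.log2(n_bounds) + log_single)


values = {n: log2_union_bound(n, eps=0.1, alphabet_sizes=(2, 2, 2))
          for n in (100, 1000, 2000, 5000)}
```

With these parameters the bound is vacuous at $n = 100$ but becomes negative in the logarithm, and decreases rapidly, for larger $n$, which illustrates why the statement requires $n > n_0(\epsilon)$.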

References

[1] C. E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. J., vol. 27, pp. 379-423, 623-656, 1948.

[2] D. Blackwell, L. Breiman, and A. J. Thomasian, "The capacity of certain channel classes under random coding," Ann. Math. Statist., vol. 31, pp. 558-567, 1960.

[3] J.-H. Jahn, "Coding of arbitrarily varying multiuser channels," IEEE Trans. Inform. Theory, vol. IT-27, No. 2, pp. 212-226, March 1981.

[4] R. Ahlswede, "Elimination of correlation in random codes for arbitrarily varying channels," Z. Wahrscheinlichkeitstheorie Verw. Gebiete, vol. 44, pp. 159-175, 1978.

[5] I. Csiszár and P. Narayan, "The capacity of the arbitrarily varying channel revisited: Positivity, constraints," IEEE Trans. Inform. Theory, vol. IT-34, No. 2, pp. 181-193, March 1988.

[6] A. Lapidoth and P. Narayan, "Reliable communication under channel uncertainty," IEEE Trans. Inform. Theory, vol. IT-44, No. 6, pp. 2148-2177, October 1998.

[7] J. Gubner, "On the deterministic-code capacity of the multiple-access arbitrarily varying channel," IEEE Trans. Inform. Theory, vol. IT-36, No. 2, pp. 262-275, March 1990.

[8] J. A. Gubner, "State constraints for the multiple-access arbitrarily varying channel," IEEE Trans. Inform. Theory, vol. IT-37, No. 1, pp. 27-35, January 1991.

[9] J. A. Gubner and B. L. Hughes, "Nonconvexity of the capacity region of the multiple-access arbitrarily varying channel subject to constraints," IEEE Trans. Inform. Theory, vol. IT-41, No. 1, pp. 3-13, January 1995.

[10] R. Ahlswede and N. Cai, "Arbitrarily varying multiple-access channels Part I - Ericson's symmetrizability is adequate, Gubner's conjecture is true," IEEE Trans. Inform. Theory, vol. IT-45, No. 2, pp. 742-749, March 1999.

[11] T. Ericson, "Exponential error bounds for random codes in the arbitrarily varying channel," IEEE Trans. Inform. Theory, vol. IT-31, No. 1, pp. 42-48, January 1985.

[12] J. Körner and K. Marton, "General broadcast channels with degraded message sets," IEEE Trans. Inform. Theory, vol. IT-23, No. 1, pp. 60-64, January 1977.

[13] J. Körner and A. Sgarro, "Universally attainable error exponents for broadcast channels with degraded message sets," IEEE Trans. Inform. Theory, vol. IT-26, No. 6, pp. 670-679, November 1980.

[14] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. New York: Academic, 1981.

[15] E. Hof and S. I. Bross, "On the Deterministic-Code Capacity of the Two-User Discrete Memoryless Arbitrarily Varying General Broadcast Channel," in Proc. 39th Annual Conference on Information Sciences and Systems (CISS 2005), Johns Hopkins University, Baltimore, MD, March 2005.

[16] E. Hof, "On the Deterministic-Code Capacity of the Two-User Discrete Memoryless Arbitrarily Varying General Broadcast Channel," M.Sc. thesis, Technion - Israel Institute of Technology, Haifa, Israel, 2005.
