On Properties of Rate-Reliability-Distortion Functions

Ashot N. Harutyunyan and Evgueni A. Haroutunian, Associate Member, IEEE

Abstract—Some important properties of the rate-reliability-distortion function of a discrete memoryless source (DMS) are established. For the binary source and the Hamming distortion measure this function is derived and analyzed. Even that elementary case suffices to show the nonconvexity of the rate-reliability-distortion function in the reliability (error exponent) argument.

Index Terms—Convexity, error exponent, Hamming distance, rate-distortion function, rate-reliability-distortion function, reliability, time-sharing argument.

I. INTRODUCTORY NOTES

We treat the performance bound in the source coding problem under fidelity and reliability (error exponent) criteria, namely, the rate-reliability-distortion function. In this concise correspondence, we summarize some basic facts and results on the properties of that function and the concept as a whole. Shannon [11] defined the rate-distortion function as the minimal coding rate that can be asymptotically achieved for the transmission of information source data with an average distortion less than a predetermined threshold. Also important is the study of the rate-distortion problem under an additional coding characteristic—an exponential decay of the error probability. The maximum error exponent as a function of coding rate and distortion, characterizing the same source coding system, was studied by Marton in [10]. An alternative order of dependence of the three parameters was introduced by Haroutunian and Mekoush in [7], defining the rate-reliability-distortion function as the minimal rate at which the messages of a source can be encoded and then reconstructed by the receiver with an error probability decreasing exponentially in the codeword length. In this approach, the achievability of the coding rate $R$ is considered as a function of a fixed distortion level $\Delta$ and an error exponent $E > 0$. A number of publications by the coauthors and their collaborators during the past years have been devoted to the development of this idea (and of the equivalent approach applied to channel coding introduced in [6]) toward multiuser source coding problems. Among those are the works concerning the multiple descriptions problem [9] and the successive refinement of information [8]. Recently, this approach was adopted by Tuncel and Rose [12]. As an essential advantage of the approach considered in [7], we can emphasize the technical ease of treating the coding rate as a function of distortion and error exponent, which at the same time makes it easy to convert results from the rate-reliability-distortion area into rate-distortion ones by looking at the extremal values of the reliability, e.g., $E \to 0$, $E \to \infty$. The importance of that fact is especially pronounced when one deals with a multidimensional situation. Having solved the problem of finding the rate-reliability-distortion region of a multiterminal system, the corresponding rate-distortion region can be deduced without effort.

As a particular instance of the general observation above, under the limit condition $E \to 0$ the rate-reliability-distortion function turns into the corresponding rate-distortion function. In this correspondence we give a facile proof of this claim in Theorem 2 without knowing the analytical forms of those functions. We then elaborate on the concept of the rate-reliability-distortion function and its properties. The nonconvexity of the rate-reliability-distortion function in the reliability argument $E$ is established as a particular property of this function for the binary source with the Hamming distortion measure. For this specified function, further analysis shows that for each fixed distortion level $\Delta$ there exists a finite value $E_{\max} > 0$ of the reliability argument such that higher reliability constraints do not increase the coding rate as a function of distortion and reliability. $E_{\max}$ can be regarded as the bound of sensible demands on the error probability exponent put by the receiver, at which the function takes on the value of the corresponding "zero-error" rate-distortion function.

Manuscript received May 26, 2003; revised June 22, 2004. The work of E. A. Haroutunian was supported by INTAS under Grant 00-738. The material in this correspondence was presented in part at the Symposium on Information Theory and Some Friendly Neighbors—Ein Wunschkonzert, ZiF, Bielefeld University, Bielefeld, Germany, August 2003. A. N. Harutyunyan is with Epygi Labs AM LLC, 375026 Yerevan, Armenia (e-mail: [email protected]). E. A. Haroutunian is with the Institute for Informatics and Automation Problems of the Armenian National Academy of Sciences and the Yerevan State University, 375044 Yerevan, Armenia (e-mail: [email protected]). Communicated by R. W. Yeung, Associate Editor for Shannon Theory. Digital Object Identifier 10.1109/TIT.2004.836706

Fig. 1. Shannon's one-way communication system.

II. THE RATE-RELIABILITY-DISTORTION FUNCTION, SOME PROPERTIES

The messages of a discrete memoryless source (DMS), encoded in an appropriate manner, must be transmitted to a receiver, the addressee of those messages. The decoder, based on the codeword, has to recover the original message within the required distortion and reliability. The model of such an information transmission system is depicted in Fig. 1. The DMS $X$ is defined as a sequence $\{X_i\}_{i=1}^{\infty}$ of discrete independent and identically distributed random variables taking values in the finite set $\mathcal{X}$, which is the alphabet of the source. Let

$$P = \{P(x),\ x \in \mathcal{X}\} \quad (1)$$

be the generating probability distribution (PD) of the source messages. Since we are interested in memoryless sources, the probability $P^N(\mathbf{x})$ of an $N$-length vector of $N$ successive messages $\mathbf{x} = (x_1, x_2, \ldots, x_N) \in \mathcal{X}^N$ can be calculated as the product of the components' probabilities

$$P^N(\mathbf{x}) = \prod_{n=1}^{N} P(x_n). \quad (2)$$

The finite set $\hat{\mathcal{X}}$, in general different from $\mathcal{X}$, is the reproduction alphabet at the receiver. Let

$$d: \mathcal{X} \times \hat{\mathcal{X}} \to [0, \infty) \quad (3)$$

be the corresponding fidelity (distortion) criterion between the original source and the reconstruction messages. The distortion measure for sequences $\mathbf{x} \in \mathcal{X}^N$ and $\hat{\mathbf{x}} \in \hat{\mathcal{X}}^N$ is assumed to be the average of the components' distortions

$$d(\mathbf{x}, \hat{\mathbf{x}}) = \frac{1}{N} \sum_{n=1}^{N} d(x_n, \hat{x}_n). \quad (4)$$

The block code, denoted by $(f, g)$, is a pair of mappings: a coding

$$f: \mathcal{X}^N \to \{1, 2, \ldots, L(N)\} \quad (5)$$

and a decoding

$$g: \{1, 2, \ldots, L(N)\} \to \hat{\mathcal{X}}^N \quad (6)$$

where $L(N)$ is the volume of the code. The task of the system designer is to ensure restoration of the source messages at the receiver within a given distortion level $\Delta$ and with a "small" error probability, employing a code (5) and (6). The basic problem of Shannon theory is to estimate the minimum of the code volume sufficient for realizing that task. For a given distortion level $\Delta \ge 0$, we consider the following set:

$$\mathcal{A} \triangleq \{\mathbf{x} \in \mathcal{X}^N : g(f(\mathbf{x})) = \hat{\mathbf{x}},\ d(\mathbf{x}, \hat{\mathbf{x}}) \le \Delta\} \quad (7)$$

that is, the set of satisfactorily transmitted vectors of messages. The error probability $e(f, g, P, \Delta, N)$ of a code $(f, g)$ for the source PD $P$, given $\Delta$ and $N$, in terms of (7) can be measured as follows:

$$e(f, g, P, \Delta, N) \triangleq 1 - \Pr(\mathcal{A}) \quad (8)$$

computed according to (1)–(6).
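As a toy numerical illustration of the quantities (2), (4), (7), and (8)—ours, not part of the correspondence—consider a binary source with $N = 3$, per-letter Hamming distortion, and a one-bit code $(f, g)$ that transmits only the majority symbol:

```python
from itertools import product

# Toy illustration of (2), (4), (7), (8): binary source, Hamming distortion,
# N = 3, and a one-bit code (L(N) = 2) that keeps only the majority symbol.
# The code (f, g) is an arbitrary illustrative choice, not from the paper.

N = 3
P = {0: 0.8, 1: 0.2}                      # source PD, as in (1)

def f(x):                                 # coding (5): index of majority symbol
    return 1 if 2 * sum(x) > N else 0

def g(i):                                 # decoding (6): constant reconstruction
    return (i,) * N

def d_avg(x, xh):                         # average Hamming distortion (4)
    return sum(a != b for a, b in zip(x, xh)) / N

def error_prob(Delta):
    """Error probability (8): one minus the probability of the set A in (7)."""
    prob_A = 0.0
    for x in product((0, 1), repeat=N):   # enumerate all of X^N
        p_x = 1.0
        for symbol in x:                  # P^N(x) as the product (2)
            p_x *= P[symbol]
        if d_avg(x, g(f(x))) <= Delta:    # x transmitted satisfactorily
            prob_A += p_x
    return 1.0 - prob_A

print(error_prob(0.2))    # 0.48: only 000 and 111 are reproduced within 0.2
print(error_prob(1 / 3))  # 0.0: every x is within distortion 1/3 of g(f(x))
```

Definitions 1 and 2 below make precise how the minimal code volume $L(N)$ trades against this error probability as $N$ grows.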

First we define the notion of a $\Delta$-achievable rate for this source coding system and the concept of the Shannon rate-distortion function [11], [2]. All exponents and logarithms hereafter are of base $2$.

Definition 1: A number $R \ge 0$ is called a $\Delta$-achievable rate for $\Delta \ge 0$ if, for every $\varepsilon > 0$ and sufficiently large $N$, given PD $P$, there exists a code $(f, g)$ such that

$$\frac{1}{N} \log L(N) \le R + \varepsilon \quad (9)$$

$$e(f, g, P, \Delta, N) \le \varepsilon. \quad (10)$$

The minimum $\Delta$-achievable rate defines the corresponding value of the rate-distortion function [11] for the source $P$ and distortion threshold $\Delta$, denoted here by $R(\Delta, P)$. The properties of the rate-distortion function are well studied; they can be found in the books by Berger [2], Cover and Thomas [3], and Csiszár and Körner [5], and in the paper [1] by Ahlswede. The subject of this correspondence is the study of the achievability of a coding rate involving, in addition to the distortion criterion, the requirement that as $N \to \infty$ the error probability (8) decreases exponentially with a given exponent $E$.

Definition 2: A number $R \ge 0$ we call an $(E, \Delta)$-achievable rate for $\Delta \ge 0$ if, for every $\varepsilon > 0$, $\delta > 0$, given PD $P$ and $E > 0$, and sufficiently large $N$, there exists a code $(f, g)$ such that

$$\frac{1}{N} \log L(N) \le R + \varepsilon \quad (11)$$

and the error probability is exponentially small

$$e(f, g, P, \Delta, N) \le \exp\{-N(E - \delta)\}. \quad (12)$$

It is clear that if $R$ is $(E, \Delta)$-achievable, then any real number larger than $R$ is also $(E, \Delta)$-achievable. Denote by $R(E, \Delta, P)$ the minimal $(E, \Delta)$-achievable rate for a given PD $P$ and call it the rate-reliability-distortion function. The function $R(E, \Delta, P)$ is a generalization of the Shannon rate-distortion function $R(\Delta, P)$. Before discussing the idea of $(E, \Delta)$-achievability and some properties of the function $R(E, \Delta, P)$, we recall the result on the maximum error probability exponent by Marton [10]. She showed that if $R > R(\Delta, P)$, then there exist $N$-length block codes $(f, g)$ such that with $N$ growing the rate of the code approaches $R$

$$N^{-1} \log L(N) \to R$$


and the error probability $e(f, g, P, \Delta, N)$ converges to zero exponentially with exponent

$$F(P, R, \Delta) \triangleq \inf_{P':\, R(\Delta, P') > R} D(P' \parallel P) \quad (13)$$

where $D(P' \parallel P)$ is the Kullback–Leibler information divergence between the PDs $P'$ and $P$

$$D(P' \parallel P) \triangleq \sum_{x} P'(x) \log \frac{P'(x)}{P(x)}.$$

In addition, (13) is the optimal dependence between $R$ and $E$, due to the following.

Theorem 1 (Marton [10]): Let $d$ be a distortion measure on $\mathcal{X} \times \hat{\mathcal{X}}$ and $R < \log|\hat{\mathcal{X}}|$. Then there exist block codes such that with $N$ growing ($N \ge N_0(|\mathcal{X}|, \delta)$)

i) $N^{-1} \log L(N) \to R$;

ii) for every PD $P$ on $\mathcal{X}$, $\Delta \ge 0$, and each $\delta > 0$

$$N^{-1} \log e(f, g, P, \Delta, N) \le -F(P, R, \Delta) + \delta$$

and, furthermore, for every coding procedure satisfying i)

$$\liminf_{N \to \infty} N^{-1} \log e(f, g, P, \Delta, N) \ge -F(P, R, \Delta).$$
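Marton's exponent (13) is easy to approximate numerically once $R(\Delta, P')$ is known. Below is a minimal brute-force sketch—ours, not from the paper—for a binary source with Hamming distortion, using the closed-form binary rate-distortion function recalled in Section IV; the grid size is an arbitrary choice.

```python
import math

# Brute-force approximation of Marton's exponent (13) for a binary source
# with Hamming distortion. Illustrative sketch; the grid size is arbitrary.
# Uses the closed-form binary rate-distortion function recalled in Section IV:
# R(Delta, P') = h(p') - h(Delta) for Delta <= min{p', 1-p'}, and 0 otherwise.

def h(q):
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def kl(q, p):
    """D({q, 1-q} || {p, 1-p}) in bits."""
    def term(a, b):
        return 0.0 if a == 0.0 else a * math.log2(a / b)
    return term(q, p) + term(1.0 - q, 1.0 - p)

def rd_binary(Delta, q):
    return h(q) - h(Delta) if Delta <= min(q, 1.0 - q) else 0.0

def marton_exponent(R, Delta, p, grid=100_000):
    """F(P, R, Delta) = inf of D(P' || P) over P' with R(Delta, P') > R, cf. (13)."""
    vals = [kl(i / grid, p) for i in range(grid + 1)
            if rd_binary(Delta, i / grid) > R]
    return min(vals) if vals else float("inf")

# Divergence to the nearest binary PD whose rate-distortion value exceeds R:
print(marton_exponent(R=0.3, Delta=0.1, p=0.1))
```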

The properties of the error exponent function $F(P, R, \Delta)$ are discussed in [10] and [1]. In particular, Marton [10] questioned the continuity of that function in $R$; Ahlswede [1] later gave the full solution to this problem, proving the discontinuity of $F(P, R, \Delta)$ for general distortion measures other than the Hamming distance. Now we formulate some basic properties of $R(E, \Delta, P)$, including its continuity in the reliability argument $E$.

Lemma 1:

i) Every $(E, \Delta)$-achievable rate is also $\Delta$-achievable.

ii) $R(E, \Delta, P)$ is a nondecreasing function of $E$ (for each fixed $\Delta \ge 0$).

The proof is an evident application of Definitions 1 and 2.

Theorem 2: For any PD $P$ and $\Delta \ge 0$, the limit function of $R(E, \Delta, P)$ when $E \to 0$ is $R(\Delta, P)$:

$$\lim_{E \to 0} R(E, \Delta, P) = R(\Delta, P). \quad (14)$$

Proof: It follows from Lemma 1 that, for any fixed $\Delta \ge 0$ and $P$ and for every $0 < E_2 \le E_1$, the following inequalities hold:

$$R(\Delta, P) \le R(E_2, \Delta, P) \le R(E_1, \Delta, P).$$

For $E_i \to 0$, the sequence $R(E_i, \Delta, P)$ is monotonically nonincreasing and lower bounded by $R(\Delta, P)$, and must therefore have a limit. We have to prove that it is $R(\Delta, P)$. Actually, Theorem 1 implies that every $R > R(\Delta, P)$ is an $(E, \Delta)$-achievable rate for some $E$ defined by (13). If $R - R(\Delta, P) < \delta$, then $R(E, \Delta, P) - R(\Delta, P) < \delta$, since $R(E, \Delta, P) \le R$. Then the truthfulness of (14) follows.

Now, proceeding with some auxiliary notations, we formulate the result of [7] on the analytical form of the rate-reliability-distortion function in Theorem 3, together with the important corollaries to which the latter leads. Let

$$Q = \{Q(\hat{x} \mid x),\ x \in \mathcal{X},\ \hat{x} \in \hat{\mathcal{X}}\}$$

be a conditional PD on $\hat{\mathcal{X}}$ for given $x$, and let $\mathcal{P}(\mathcal{X})$ be the set of all PDs on $\mathcal{X}$. Consider the following subset of those PDs $P'$ from $\mathcal{P}(\mathcal{X})$:

$$\alpha(E, P) \triangleq \{P' : D(P' \parallel P) \le E\}. \quad (15)$$

Let $Q_P$ be a conditional PD such that, for a given distribution $P' \in \alpha(E, P)$ and $\Delta$, the following condition on the expectation of the distortion takes place:

$$E_{P', Q_P} d(X, \hat{X}) \triangleq \sum_{x, \hat{x}} P'(x) Q_P(\hat{x} \mid x) d(x, \hat{x}) \le \Delta. \quad (16)$$

Note that the inequality (16) holds also for $P' = P$. We use the following notations for entropy and mutual information:

$$H_{P'}(X) \triangleq -\sum_{x} P'(x) \log P'(x)$$

$$I_{P', Q_P}(X \wedge \hat{X}) \triangleq \sum_{x, \hat{x}} P'(x) Q_P(\hat{x} \mid x) \log \frac{Q_P(\hat{x} \mid x)}{\sum_{x'} P'(x') Q_P(\hat{x} \mid x')}.$$

Let us introduce the following function:

$$R^*(E, \Delta, P) \triangleq \max_{P' \in \alpha(E, P)}\ \min_{Q_P:\, E_{P', Q_P} d(X, \hat{X}) \le \Delta} I_{P', Q_P}(X \wedge \hat{X}). \quad (17)$$

The solution of the problem of finding the minimal $(E, \Delta)$-achievable rate, reported in [7], is given in the following theorem.

Theorem 3: For every $E > 0$ and $\Delta \ge 0$

$$R(E, \Delta, P) = R^*(E, \Delta, P). \quad (18)$$
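The optimization (17) can be checked numerically by brute force. The sketch below (our illustration; the grid resolutions are arbitrary choices) evaluates $R^*(E, \Delta, P)$ for a binary source with Hamming distortion by scanning the outer maximum over $P' \in \alpha(E, P)$ and the inner minimum over test channels $Q$ subject to (16).

```python
import math
from itertools import product

# Brute-force evaluation of R*(E, Delta, P) in (17) for a binary source with
# Hamming distortion. The grid resolutions are arbitrary illustration choices.

def kl(q, p):
    """D({q, 1-q} || {p, 1-p}) in bits."""
    def term(a, b):
        return 0.0 if a == 0.0 else a * math.log2(a / b)
    return term(q, p) + term(1.0 - q, 1.0 - p)

def mutual_info(p1, q10, q01):
    """I_{P',Q}(X ^ Xhat) for P'(1) = p1 and test channel crossover
    probabilities q10 = Q(1|0), q01 = Q(0|1)."""
    px = {0: 1.0 - p1, 1: p1}
    Q = {(0, 0): 1.0 - q10, (0, 1): q10, (1, 0): q01, (1, 1): 1.0 - q01}
    pxh = {xh: sum(px[x] * Q[x, xh] for x in (0, 1)) for xh in (0, 1)}
    info = 0.0
    for x, xh in product((0, 1), repeat=2):
        joint = px[x] * Q[x, xh]
        if joint > 0.0 and pxh[xh] > 0.0:
            info += joint * math.log2(Q[x, xh] / pxh[xh])
    return info

def r_star(E, Delta, p, grid=50):
    """Outer max over P' in alpha(E, P); inner min over Q meeting (16)."""
    best = 0.0
    for i in range(grid + 1):
        p1 = i / grid
        if kl(p1, p) > E:                  # P' outside alpha(E, P)
            continue
        inner = min(
            mutual_info(p1, a / grid, b / grid)
            for a in range(grid + 1) for b in range(grid + 1)
            # expected Hamming distortion (16): (1-p1)*q10 + p1*q01
            if (1.0 - p1) * (a / grid) + p1 * (b / grid) <= Delta
        )
        best = max(best, inner)
    return best

print(round(r_star(E=0.1, Delta=0.1, p=0.1), 3))
```

Theorem 4 in Section IV gives the exact closed form for this binary Hamming case, against which such brute-force values can be checked.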

We omit the proof of Theorem 3 here. It can be readily specialized from the more general result for the robust descriptions system [9] in the case of one receiver. That proof employs the technique of types [5], [4] and a combinatorial method for the converse. The first consequence for rate-distortion theory, which was proved in Theorem 2 in an alternative way, immediately follows.

Corollary 1: When $E \to 0$, (18) yields the expression for the rate-distortion function [11], [2], [5]

$$R(\Delta, P) = \min_{Q:\, E_{P, Q} d(X, \hat{X}) \le \Delta} I_{P, Q}(X \wedge \hat{X}). \quad (19)$$

Note that an equivalent to the representation (18) in terms of the rate-distortion function is the following:

$$R(E, \Delta, P) = \max_{P' \in \alpha(E, P)} R(\Delta, P'). \quad (20)$$

Corollary 2: When $E \to \infty$, Theorem 3 implies that the minimal asymptotic rate $R(\infty)$ of coding of all vectors from $\mathcal{X}^N$, each being reconstructed within the required distortion threshold $\Delta$, is

$$R(\infty) = \max_{P' \in \mathcal{P}(\mathcal{X})} R(\Delta, P'). \quad (21)$$

This is the so-called "zero-error" rate-distortion function [5], [4].

Let $R(E, P)$ denote the special case of the rate-reliability-distortion function $R(E, \Delta, P)$ for $\Delta = 0$ and the generic PD $P$; call it the rate-reliability function in source coding. Then Theorem 3 results in another corollary.

Corollary 3: For every $E > 0$ and a fixed PD $P$

$$R(E, P) = \max_{P' \in \alpha(E, P)} H_{P'}(X).$$
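As a small worked instance of Corollary 3 (our example, not from the correspondence), take the binary source $P = \{p, 1-p\}$. Since the binary entropy $H_{P'}(X)$ increases as $p'$ approaches $1/2$ and $\alpha(E, P)$ is an interval of PDs around $P$, the maximum is attained either at $\{1/2, 1/2\}$, when it belongs to $\alpha(E, P)$, or at the boundary point $P_E$ with $D(P_E \parallel P) = E$ nearest to $1/2$:

$$R(E, P) = \begin{cases} 1, & \text{if } D(\{1/2, 1/2\} \parallel P) \le E \\ H_{P_E}(X), & \text{otherwise.} \end{cases}$$

For instance, for $p = 1/4$ we have $D(\{1/2, 1/2\} \parallel P) = 1 - \frac{1}{2}\log 3 \approx 0.2075$, so the rate-reliability function already equals the full $1$ bit for every $E \ge 0.2075$; this threshold is precisely the saturation value $E_{\max}$ appearing for the binary Hamming case in Section IV.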

In [1], Ahlswede showed that despite the continuity of Marton's exponent function $F(P, R, \Delta)$ in $R$ (where $R(\Delta, P) < R < R(\infty)$) for Hamming distortion measures, in general it is discontinuous in $R$. In contrast to this fact, we have the following result.

Lemma 2: $R(E, \Delta, P)$ is continuous in $E$.

Proof: First note that $\alpha(E, P)$ is a convex set; that is, if $P' \in \alpha(E, P)$ and $P'' \in \alpha(E, P)$, then

$$\lambda P' + (1 - \lambda) P'' \in \alpha(E, P)$$

because

$$D(\lambda P' + (1 - \lambda) P'' \parallel P) \le \lambda D(P' \parallel P) + (1 - \lambda) D(P'' \parallel P) \le \lambda E + (1 - \lambda) E = E$$

due to the convexity of $D(P' \parallel P)$ in the pair $(P', P)$, and hence in $P'$ for fixed $P$. Hence, taking into account the continuity of $R(\Delta, P)$ in $P$, the continuity of $R(E, \Delta, P)$ in $E$ follows from (20).

III. ON CONVEXITY

It is well known [5] that the rate-distortion function is a nonincreasing and convex function of $\Delta$. Let us establish that the property of convexity in $\Delta$ remains in force for the rate-reliability-distortion function.

Lemma 3: $R(E, \Delta, P)$ is a convex function in $\Delta$.

Proof: For fixed $E$, let the points $(\Delta_1, R_1)$ and $(\Delta_2, R_2)$ belong to the curve $R(E, \Delta, P)$ with $\Delta_1 \le \Delta_2$. We shall prove that for every $\lambda$ from $(0, 1)$

$$R(E, \lambda \Delta_1 + (1 - \lambda) \Delta_2, P) \le \lambda R(E, \Delta_1, P) + (1 - \lambda) R(E, \Delta_2, P).$$

Let $\Delta_\lambda \triangleq \lambda \Delta_1 + (1 - \lambda) \Delta_2$. Consider for any fixed PD $P'$ the rate-distortion function (19). Using the fact that the rate-distortion function $R(\Delta, P')$ is a convex function in $\Delta$, one can readily deduce

$$R(E, \Delta_\lambda, P) = \max_{P' \in \alpha(E, P)} R(\Delta_\lambda, P')$$
$$\le \max_{P' \in \alpha(E, P)} \left(\lambda R(\Delta_1, P') + (1 - \lambda) R(\Delta_2, P')\right)$$
$$\le \lambda \max_{P' \in \alpha(E, P)} R(\Delta_1, P') + (1 - \lambda) \max_{P' \in \alpha(E, P)} R(\Delta_2, P')$$
$$= \lambda R(E, \Delta_1, P) + (1 - \lambda) R(E, \Delta_2, P).$$

Hereafter, elaborating on the binary Hamming rate-reliability-distortion function, we conclude that $R(E, \Delta, P)$ is not convex in $E$.

IV. THE BINARY HAMMING RATE-RELIABILITY-DISTORTION FUNCTION

For the binary source $\mathcal{X} = \{0, 1\}$ with PD $P = \{p, 1 - p\}$ and Hamming distance

$$d(x, \hat{x}) = \begin{cases} 0, & \text{if } x = \hat{x} \\ 1, & \text{if } x \ne \hat{x} \end{cases}$$

we denote the rate-distortion function and the rate-reliability-distortion function by $R_{BH}(\Delta, P)$ and $R_{BH}(E, \Delta, P)$, respectively. Let $H_P(X)$ and $H_\Delta(X)$, $0 \le \Delta \le 1$, be the following binary entropy functions:

$$H_P(X) \triangleq -p \log p - (1 - p) \log(1 - p)$$
$$H_\Delta(X) \triangleq -\Delta \log \Delta - (1 - \Delta) \log(1 - \Delta).$$

Similar notations of entropies are in force for other binary PDs. It is known (see [3]) that

$$R_{BH}(\Delta, P) = \begin{cases} H_P(X) - H_\Delta(X), & 0 \le \Delta \le \min\{p, 1 - p\} \\ 0, & \Delta > \min\{p, 1 - p\}. \end{cases}$$

Using the following notation:

$$p_{\max} \triangleq \max_{P' \in \alpha(E, P)} \min\{p', 1 - p'\}$$

for the binary Hamming case we have the following.

Theorem 4: For every $E > 0$ and $\Delta \ge 0$

$$R_{BH}(E, \Delta, P) = \begin{cases} H_{P_E}(X) - H_\Delta(X), & \text{if } p \notin [\alpha_1, \alpha_2],\ 0 \le \Delta \le p_{\max} \\ 1 - H_\Delta(X), & \text{if } p \in [\alpha_1, \alpha_2],\ 0 \le \Delta \le p_{\max} \\ 0, & \text{if } \Delta > p_{\max} \end{cases} \quad (22)$$

where

$$[\alpha_1, \alpha_2] = \left[\frac{2^E - \sqrt{2^{2E} - 1}}{2^{E+1}},\ \frac{2^E + \sqrt{2^{2E} - 1}}{2^{E+1}}\right] = \left[\frac{1}{2}\left(1 - \sqrt{1 - 2^{-2E}}\right),\ \frac{1}{2}\left(1 + \sqrt{1 - 2^{-2E}}\right)\right]$$

and for $P_E = \{p_E, 1 - p_E\}$

$$D(P_E \parallel P) = E.$$

Proof: From (17) and (19) we can derive

$$R_{BH}(E, \Delta, P) = \begin{cases} \max_{P' \in \alpha(E, P)} (H_{P'}(X) - H_\Delta(X)), & 0 \le \Delta \le p_{\max} \\ 0, & \Delta > p_{\max}. \end{cases}$$

Let

$$0 \le \Delta \le \max_{P' \in \alpha(E, P)} \min\{p', 1 - p'\} = p_{\max}.$$

Our task is the simplification of

$$\max_{P' \in \alpha(E, P)} (H_{P'}(X) - H_\Delta(X)) = \max_{P' \in \alpha(E, P)} H_{P'}(X) - H_\Delta(X).$$

Note that if the PD $\{1/2, 1/2\} \in \alpha(E, P)$, then

$$\max_{P' \in \alpha(E, P)} H_{P'}(X) = 1$$

which takes place when

$$\frac{1}{2} \log \frac{1}{2p} + \frac{1}{2} \log \frac{1}{2(1 - p)} \le E$$

or

$$p(1 - p) \ge 2^{-2(E+1)}. \quad (23)$$

The condition (23) may be rewritten as the quadratic inequality

$$p^2 - p + 2^{-2(E+1)} \le 0$$

which is equivalent to $p \in [\alpha_1, \alpha_2]$. Consequently, the value of $R_{BH}(E, \Delta, P)$ is constant and equals $1 - H_\Delta(X)$ for all PDs with $p$ from $[\alpha_1, \alpha_2]$; this interval, as $E$ grows, tends to the segment $[0, 1]$.

Now consider the case when $p \notin [\alpha_1, \alpha_2]$. We show that

$$\max_{P' \in \alpha(E, P)} H_{P'}(X) = H_{P_E}(X) \quad (24)$$

where $P_E = \{p_E, 1 - p_E\}$ and $D(P_E \parallel P) = E$, assuming $p_E$ has the value nearest to $1/2$ that results from the equation $D(P_E \parallel P) = E$. The assertion (24) will be true by the following argument.

Lemma 4: The function

$$D(P' \parallel P) = p' \log \frac{p'}{p} + (1 - p') \log \frac{1 - p'}{1 - p}$$

is a monotone function of $p'$ for $P' \in \alpha(E, P)$ with $p' \in [\min\{p, 1 - p\}, 1/2]$.


Proof: Let $P_1 = \{p_1, 1 - p_1\}$ and $P_2 = \{p_2, 1 - p_2\}$ be binary PDs with $p \le p_1 \le p_2 \le 1/2$. It is required to prove the inequality $D(P_2 \parallel P) \ge D(P_1 \parallel P)$. Since $p_1$ lies between $p$ and $p_2$, we can represent $P_1 = \lambda P + (1 - \lambda) P_2$ (for some $0 < \lambda < 1$) and, as in the proof of Lemma 2, write

$$D(P_1 \parallel P) = D(\lambda P + (1 - \lambda) P_2 \parallel P) \le \lambda D(P \parallel P) + (1 - \lambda) D(P_2 \parallel P) = (1 - \lambda) D(P_2 \parallel P) \le D(P_2 \parallel P).$$

Therefore, Lemma 4 is proved (at the same time proving that $p_{\max} = p_E$) and (24) obtains, which gives us (22).

It is interesting and important to observe that Theorem 4 and (22) show that for every binary PD $P$ and distortion level $\Delta$ there exists a reliability value $E_{\max} > 0$ at which $R_{BH}(E, \Delta, P)$ reaches its maximal possible value $1 - H_\Delta(X)$ and remains constant afterwards. It means that there is a sensible receiver demand on the reliability criterion, above which increasingly stronger demands on it can be satisfied asymptotically by the same code rate. Note that this constant is the value of the rate-distortion function for the binary equiprobable source and the Hamming distortion measure under the condition $\Delta \le \min\{p, 1 - p\}$. This fact has its origins in (21). Some calculations and pictorial illustrations of the rate-reliability-distortion function for the case considered in this section can be found in [9].

Theorem 5: $R_{BH}(E, \Delta, P)$ is not convex in $E$.

Proof: It immediately follows from the form (22) of the rate-reliability-distortion function for the binary Hamming case: for any $P$, one can easily choose a distortion level $\Delta > p_E$ such that $R_{BH}(E, \Delta, P)$ will be constantly zero on some interval of values of the reliability $E$. The corresponding graphical representations in [9, e.g., Fig. 2(b)] add visual clarity to this logic.

Remark: A question naturally relevant to the convexity property can arise: is the time-sharing argument, widely used in rate-distortion theory for multiterminal sources, applicable to the rate-reliability-distortion approach? In other words, are the convex combinations of two points on the hyperplane of $(E, \Delta)$-achievable rates also $(E, \Delta)$-achievable? One can readily verify that this important tool is no longer applicable.

Nevertheless, we can prove the concavity of the binary Hamming rate-reliability-distortion function $R_{BH}(E, \Delta, P)$ in the reliability argument on the interval of its positiveness. That interval, under certain conditions, can coincide with the whole domain of definition $(0, \infty)$ of that function. Let $E_{\inf}(\Delta)$ be the infimum of the reliability values $E$ for which $\Delta \le p_E$, for given PD $P$ and distortion level $\Delta \ge 0$.

Lemma 5: $R_{BH}(E, \Delta, P)$ is concave in $E$ on the interval of its positiveness $(E_{\inf}(\Delta), \infty)$.

Proof: First note that the condition $\Delta \le p_E$ provides the positiveness of $R_{BH}(E, \Delta, P)$ on $(E_{\inf}(\Delta), \infty)$. For fixed $\Delta$ and $P$ there exists a value $E_{\max}$ of the reliability such that if $E \ge E_{\max}$, then $R_{BH}(E, \Delta, P)$ is constant and equal to $1 - H_\Delta(X)$. Since $1 - H_\Delta(X)$ is the maximal value of the binary Hamming rate-reliability-distortion function, it remains to prove the concavity of $R_{BH}(E, \Delta, P)$ on the interval $(E_{\inf}(\Delta), E_{\max}]$. Let $0 < E_1 < E_2$ with $E_i \in (E_{\inf}(\Delta), E_{\max}]$, $i = 1, 2$. Then it follows from (22) that

$$R_{BH}(E_1, \Delta, P) = H_{P_{E_1}}(X) - H_\Delta(X)$$
$$R_{BH}(E_2, \Delta, P) = H_{P_{E_2}}(X) - H_\Delta(X)$$

where $D(P_{E_1} \parallel P) = E_1$ and $D(P_{E_2} \parallel P) = E_2$. For $0 < \lambda < 1$, denoting $E_\lambda \triangleq \lambda E_1 + (1 - \lambda) E_2$, we have

$$R_{BH}(E_\lambda, \Delta, P) = \max_{P' \in \alpha(E_\lambda, P)} H_{P'}(X) - H_\Delta(X)$$
$$= H_{P_{E_\lambda}}(X) - H_\Delta(X)$$
$$\overset{(a)}{\ge} H_{\lambda P_{E_1} + (1 - \lambda) P_{E_2}}(X) - H_\Delta(X)$$
$$\overset{(b)}{\ge} \lambda H_{P_{E_1}}(X) + (1 - \lambda) H_{P_{E_2}}(X) - \lambda H_\Delta(X) - (1 - \lambda) H_\Delta(X)$$
$$= \lambda \left(H_{P_{E_1}}(X) - H_\Delta(X)\right) + (1 - \lambda)\left(H_{P_{E_2}}(X) - H_\Delta(X)\right)$$
$$= \lambda R_{BH}(E_1, \Delta, P) + (1 - \lambda) R_{BH}(E_2, \Delta, P)$$

where $D(P_{E_\lambda} \parallel P) = E_\lambda$ and the inequality (a) follows from the inequality

$$D(\lambda P_{E_1} + (1 - \lambda) P_{E_2} \parallel P) \le \lambda D(P_{E_1} \parallel P) + (1 - \lambda) D(P_{E_2} \parallel P) = \lambda E_1 + (1 - \lambda) E_2$$

and the inequality (b) follows from the concavity of the entropy. So the lemma is proved.

Note that, in particular, $E_{\inf}(\Delta)$ can take on the value $0$, providing the concavity of the binary Hamming rate-reliability-distortion function $R_{BH}(E, \Delta, P)$ on the whole domain $(0, \infty)$. The explicit form (22) of that function allows us to conclude that this always holds when $R_{BH}(\Delta, P) > 0$, and when $R_{BH}(\Delta, P) = 0$ it holds under the condition $R_{BH}(E, \Delta, P) > R_{BH}(\Delta, P)$ for all values of $E$ from $(0, \infty)$. For illustrations on this point we refer the reader again to the example elaborated in [9].
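The closed form (22) is straightforward to evaluate numerically. The sketch below (our illustration; the bisection depth is an arbitrary choice) computes $R_{BH}(E, \Delta, P)$ by solving $D(P_E \parallel P) = E$ for $p_E$, and the printed sweep over $E$ exhibits both the zero interval exploited in Theorem 5 and the saturation at $1 - H_\Delta(X)$ beyond $E_{\max}$.

```python
import math

# Numerical evaluation of the closed form (22) of the binary Hamming
# rate-reliability-distortion function. Illustrative sketch; tolerances are
# arbitrary choices.

def h(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def kl(q, p):
    """D({q, 1-q} || {p, 1-p}) in bits."""
    def term(a, b):
        return 0.0 if a == 0.0 else a * math.log2(a / b)
    return term(q, p) + term(1.0 - q, 1.0 - p)

def p_E(E, p):
    """The solution of D(P_E || P) = E nearest to 1/2. By the 0 <-> 1 symmetry
    we may assume p <= 1/2; kl(., p) is increasing on [p, 1/2] (Lemma 4)."""
    p = min(p, 1.0 - p)
    if kl(0.5, p) <= E:                 # {1/2, 1/2} already lies in alpha(E, P)
        return 0.5
    lo, hi = p, 0.5                     # invariant: kl(lo, p) <= E < kl(hi, p)
    for _ in range(60):                 # bisection
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if kl(mid, p) <= E else (lo, mid)
    return lo

def r_bh(E, Delta, p):
    pe = p_E(E, p)                      # p_max = p_E (see the proof of Lemma 4)
    if Delta > pe:
        return 0.0                      # third case of (22)
    return h(pe) - h(Delta)             # first two cases of (22); h(1/2) = 1

p, Delta = 0.1, 0.2                     # note Delta > min{p, 1-p} = 0.1
for E in (0.05, 0.1, 0.2, 0.4, 0.8):
    print(f"E = {E:4}: R_BH = {r_bh(E, Delta, p):.4f}")
# The rate stays at 0 on an initial interval of E (the nonconvexity argument
# of Theorem 5) and saturates at 1 - H_Delta(X) once E >= E_max = D({1/2,1/2}||P).
```

With $p = 0.1$ and $\Delta = 0.2$ this sweep prints zeros up to roughly $E \approx 0.064$ and the constant $1 - H_{0.2}(X) \approx 0.278$ from $E_{\max} \approx 0.737$ on, matching the flat-rise-flat shape behind Theorems 4 and 5.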

ACKNOWLEDGMENT

The authors are thankful to the reviewers and Prof. R. W. Yeung, Associate Editor for Shannon Theory, for suggested corrections and stimulating comments.

REFERENCES


[1] R. F. Ahlswede, “Extremal properties of rate-distortion functions,” IEEE Trans. Inform. Theory, vol. 36, pp. 166–171, Jan. 1990.
[2] T. Berger, Rate Distortion Theory: A Mathematical Basis for Data Compression. Englewood Cliffs, NJ: Prentice-Hall, 1971.
[3] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[4] I. Csiszár, “The method of types,” IEEE Trans. Inform. Theory, vol. 44, pp. 2505–2523, Oct. 1998.
[5] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. New York: Academic, 1981.
[6] E. A. Haroutunian, “Upper estimate of transmission rate for memoryless channel with countable number of output signals under given error probability exponent” (in Russian), in Proc. 3rd All Union Conf. Theory of Information Transmission and Coding, Uzhgorod, Tashkent, Uzbekistan, U.S.S.R., 1967, pp. 83–86.
[7] E. A. Haroutunian and B. Mekoush, “Estimates of optimal rates of codes with given error probability exponent for certain sources” (in Russian), in Abstracts 6th Int. Symp. Information Theory, vol. 1, Tashkent, Uzbekistan, U.S.S.R., 1984, pp. 22–23.
[8] E. A. Haroutunian and A. N. Harutyunyan, “Successive refinement of information with reliability criterion,” in Proc. IEEE Int. Symp. Information Theory, Sorrento, Italy, June 2000, p. 205.
[9] E. A. Haroutunian, A. N. Harutyunyan, and A. R. Ghazaryan, “On rate-reliability-distortion function for robust descriptions system,” IEEE Trans. Inform. Theory, vol. 46, pp. 2690–2697, Nov. 2000.
[10] K. Marton, “Error exponent for source coding with a fidelity criterion,” IEEE Trans. Inform. Theory, vol. IT-20, pp. 197–199, Mar. 1974.
[11] C. E. Shannon, “Coding theorems for a discrete source with a fidelity criterion,” in IRE Nat. Conv. Rec., 1959, pp. 142–163.
[12] E. Tuncel and K. Rose, “Error exponents in scalable source coding,” IEEE Trans. Inform. Theory, vol. 49, pp. 289–296, Jan. 2003.
