
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 46, NO. 7, NOVEMBER 2000

On Rate-Reliability-Distortion Function for a Robust Descriptions System

Evgueni A. Haroutunian, Associate Member, IEEE, Ashot N. Harutyunyan, and Anahit R. Ghazaryan

Abstract—A source coding problem is considered for a robust descriptions system with one encoder and many decoders. The rate-reliability-distortion function is specified. An example is given showing a distinction in the calculation of the rate-reliability-distortion function as compared with the rate-distortion function.

Index Terms—Hamming distance, multiple descriptions, rate-distortion function, reliability function, rate-reliability-distortion dependence, robust descriptions, source coding with fidelity criterion.

I. INTRODUCTION

The aim of this work is to develop a generalization (the rate-reliability-distortion function) of the rate-distortion function for robust descriptions systems. The achievable coding rate $R$ is considered in relation to the demands of the receivers relevant to both distortion levels and reliability. The problem of source coding with respect to fidelity criterion and reliability was formulated for a one-way system by Haroutunian and Mekoush [9]. We apply the same idea to a special model of a multiple description system (see Fig. 1), earlier studied by El Gamal and Cover [8] and named the robust descriptions system. Some models of multiple description systems have also been considered in [1], [4], [5], [8], [10], [12], [13], and [15].

Messages of a discrete memoryless stationary source encoded by one encoder must be transmitted to $K$ different receivers. Every receiver, based on the same codeword, tries to recover the original message within the framework of acceptable distortions and reliabilities. The source $X$ is defined as a sequence $\{X_i\}_{i=1}^{\infty}$ of discrete independent and identically distributed random variables taking values in the finite set $\mathcal{X}$, which is the alphabet of messages of the source. Let

$$P^* = \{P^*(x),\ x \in \mathcal{X}\}$$

be the probability distribution of the source messages. Since we study the memoryless source, the probability $P^{*N}(\mathbf{x})$ of the $N$-length vector of $N$ successive messages $\mathbf{x} = (x_1, x_2, \ldots, x_N) \in \mathcal{X}^N$ is defined as a product

$$P^{*N}(\mathbf{x}) = \prod_{n=1}^{N} P^*(x_n).$$
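For illustration (our addition, not part of the original correspondence; the binary alphabet and the distribution values are hypothetical), the product formula can be evaluated directly:

```python
import numpy as np

# Minimal sketch: probability of an N-length message vector under a
# memoryless source, P*^N(x) = prod_n P*(x_n). The alphabet {0, 1} and
# the distribution p_star are illustration values only.
p_star = np.array([0.15, 0.85])        # P* = (p*, 1 - p*)

def prob_of_sequence(x, p=p_star):
    """Product distribution of a memoryless source."""
    return float(np.prod(p[np.asarray(x)]))

print(prob_of_sequence([0, 1, 1, 0, 1]))  # p*^2 * (1 - p*)^3
```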

Finite sets $\mathcal{X}^k$, $k = \overline{1, K}$, in general different from $\mathcal{X}$, are the reproduction alphabets of the corresponding receivers. Let

$$d_k : \mathcal{X} \times \mathcal{X}^k \to [0, \infty), \qquad k = \overline{1, K}$$

be the corresponding fidelity (distortion) criteria between the source alphabet and the corresponding reconstruction alphabets.

Fig. 1. A robust descriptions system.

The distortion measures for sequences of length $N$ are assumed to be the averages of the components' distortions

$$d_k(\mathbf{x}, \mathbf{x}^k) = \frac{1}{N} \sum_{n=1}^{N} d_k(x_n, x_n^k), \qquad k = \overline{1, K}.$$

We denote by $(f, F) = (f, F^1, F^2, \ldots, F^K)$ the family of $(K+1)$ mappings, with one encoder

$$f : \mathcal{X}^N \to \{1, 2, \ldots, L(N)\}$$

and $K$ separate decoders

$$F^k : \{1, 2, \ldots, L(N)\} \to (\mathcal{X}^k)^N, \qquad k = \overline{1, K}.$$
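To make these objects concrete, here is a small Python sketch (our own illustration, not the code construction used later in the proof): a toy random codebook plays the role of the encoder $f$ and the decoders $F^k$, with the average Hamming distortion of Section IV as $d_k$.

```python
import numpy as np

# Sketch of the objects just defined, for the Hamming case of Section IV.
# The code (f, F^1, ..., F^K) here is a trivial placeholder.
def hamming_distortion(x, xk):
    """Average per-letter Hamming distortion d_k(x, x^k)."""
    x, xk = np.asarray(x), np.asarray(xk)
    return float(np.mean(x != xk))

N, L = 8, 4
rng = np.random.default_rng(0)
codebook = rng.integers(0, 2, size=(L, N))        # one reproduction per index

def f(x):                                          # encoder: X^N -> {1,...,L(N)}
    return int(np.argmin([hamming_distortion(x, c) for c in codebook]))

def F_k(j):                                        # decoder k: index -> (X^k)^N
    return codebook[j]

x = rng.integers(0, 2, size=N)
print(hamming_distortion(x, F_k(f(x))))
```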

The task of the system is to restore the messages of the source $X$ at the $k$th receiver within the corresponding distortion level $\Delta_k$ and error probability exponent (reliability) $E_k$, $k = \overline{1, K}$. For given distortion levels $\Delta_k \ge 0$, $k = \overline{1, K}$, we consider the following sets:

$$A_k = \{\mathbf{x} \in \mathcal{X}^N : F^k(f(\mathbf{x})) = \mathbf{x}^k,\ d_k(\mathbf{x}, \mathbf{x}^k) \le \Delta_k\}, \qquad k = \overline{1, K}$$

and the error probabilities of the code $(f, F)$

$$e_k(f, F^k, \Delta_k, N) = 1 - P^{*N}(A_k), \qquad k = \overline{1, K}.$$

For brevity, we denote

$$(E_1, \ldots, E_K) = E, \qquad (\Delta_1, \ldots, \Delta_K) = \Delta.$$

A number $R \ge 0$ is called an $(E, \Delta)$-achievable rate for $E_k > 0$, $\Delta_k \ge 0$, $k = \overline{1, K}$, if for every $\varepsilon > 0$ and sufficiently large $N$ there exists a code $(f, F)$ such that (logs and exps are taken to base 2)

$$\frac{1}{N} \log L(N) \le R + \varepsilon$$

and the error probabilities are exponentially small:

$$e_k(f, F^k, \Delta_k, N) \le \exp\{-N E_k\}, \qquad k = \overline{1, K}.$$

Denote by $R(E, \Delta)$ the minimal $(E, \Delta)$-achievable rate and call it the rate-reliability-distortion function. The function $R(E, \Delta)$ is a generalization of the corresponding rate-distortion function $R(\Delta)$, which was specified by El Gamal and Cover in [8]; $R(\Delta)$ is the limit function of $R(E, \Delta)$ when $E_k \to 0$, $k = \overline{1, K}$.
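The definition above is directly checkable numerically. The following sketch (our own; all inputs are hypothetical measured quantities for some code of block length $N$) tests both the rate and the reliability conditions:

```python
import math

# Sketch of the (E, Delta)-achievability conditions just stated.
def is_E_Delta_achievable(R, N, L_N, error_probs, E, eps=0.01):
    """Check (1/N) log2 L(N) <= R + eps and e_k <= 2^{-N E_k} for all k."""
    rate_ok = math.log2(L_N) / N <= R + eps
    reliab_ok = all(e <= 2 ** (-N * Ek) for e, Ek in zip(error_probs, E))
    return rate_ok and reliab_ok

print(is_E_Delta_achievable(R=0.5, N=100, L_N=2**51,
                            error_probs=[1e-4, 1e-9], E=[0.1, 0.25]))
```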

Manuscript received February 2, 1999; revised October 17, 1999. This work was supported in part by INTAS under Grant 94-469. The material in this correspondence was presented in part at the 13th Prague Conference on Information Theory, Statistical Decision Functions and Random Processes. The authors are with the Institute for Informatics and Automation Problems of the Armenian National Academy of Sciences and of the Yerevan State University, Yerevan 375044, Armenia (e-mail: [email protected]; [email protected]). Communicated by N. Merhav, Associate Editor for Source Coding. Publisher Item Identifier S 0018-9448(00)09865-5.

We specify the rate-reliability-distortion function in the next section. The proof is expounded in Section III. Section IV contains an example of the calculation of the rate-reliability-distortion function for the case of a binary source, with two receivers ($K = 2$) and Hamming distortion measures. This example demonstrates some peculiarities of the rate-reliability-distortion function $R(E, \Delta)$ as a function of $E$ and $\Delta$ for different cases. It is shown that, in contradistinction to the case of the rate-distortion function, where the demand of the receiver with the smallest distortion level is decisive, for the rate-reliability-distortion function the decisive element may be the rate of the receiver with the greatest reliability. Some possible forms of $R(E, \Delta)$ are illustrated in Figs. 2–6.


II. FORMULATION OF RESULT

Let $P = \{P(x),\ x \in \mathcal{X}\}$ be a probability distribution (PD) on $\mathcal{X}$ and

$$Q = \{Q(x^1, \ldots, x^K \mid x),\ x \in \mathcal{X},\ x^k \in \mathcal{X}^k,\ k = \overline{1, K}\}$$

be a conditional distribution on $\mathcal{X}^1 \times \cdots \times \mathcal{X}^K$ for a given $x$. We denote the marginal distributions

$$Q(x^k \mid x) = \sum_{x^j \in \mathcal{X}^j,\ j = \overline{1, K},\ j \ne k} Q(x^1, \ldots, x^K \mid x), \qquad k = \overline{1, K},$$

and the divergence

$$D(P \,\|\, P^*) = \sum_{x} P(x) \log \frac{P(x)}{P^*(x)}.$$

Consider the following sets of distributions:

$$\alpha(E_k) = \{P : D(P \,\|\, P^*) \le E_k\}, \qquad k = \overline{1, K}.$$

We assume without loss of generality that

$$0 < E_1 \le E_2 \le \cdots \le E_K$$

and note that this implies $\alpha(E_k) \subseteq \alpha(E_{k+1})$, $k = \overline{1, K-1}$.

Denote by $\Phi(P, E_k, \Delta) = Q_P^k(x^1, \ldots, x^K \mid x) = Q_P^k$ the function which puts into correspondence to the PD $P$ some conditional PD $Q_P^k$ such that, for a given $\Delta$, the following conditions hold: if $P \in \alpha(E_1)$, then for $\Phi(P, E_1, \Delta) = Q_P^1$

$$E_{P,Q}\, d_k(X, X^k) = \sum_{x, x^k} P(x) Q_P^1(x^k \mid x)\, d_k(x, x^k) \le \Delta_k, \qquad k = \overline{1, K}; \tag{1}$$

and if $P \in \alpha(E_j) - \alpha(E_{j-1})$, $j = \overline{2, K}$, then for $\Phi(P, E_j, \Delta) = Q_P^j$

$$E_{P,Q}\, d_k(X, X^k) = \sum_{x, x^k} P(x) Q_P^j(x^k \mid x)\, d_k(x, x^k) \le \Delta_k, \qquad k = \overline{j, K}. \tag{2}$$

Denote by $M(P, E_k, \Delta)$ the set of all such functions $\Phi(P, E_k, \Delta)$ for given $\Delta$, $P$, and $E_k$, $k = \overline{1, K}$, and by $M(\Delta)$ the set of all functions $\Phi(P^*, \Delta)$ for which inequalities (1) take place for $P = P^*$ and the given $\Delta$. For brevity, we shall just write $\Phi(P, E_k)$ and $\Phi(P^*)$.

We use the following notations for entropy and information:

$$H_P(X) = -\sum_{x} P(x) \log P(x)$$

$$I_{P,Q}(X \wedge X^1, \ldots, X^K) = \sum_{x, x^1, \ldots, x^K} P(x) Q_P(x^1, \ldots, x^K \mid x) \log \frac{Q_P(x^1, \ldots, x^K \mid x)}{\sum_{x} P(x) Q_P(x^1, \ldots, x^K \mid x)}.$$
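These quantities are easy to evaluate numerically. A minimal sketch (ours; the binary distribution values are illustrative, and $P^*$ is assumed strictly positive wherever $P$ is positive):

```python
import numpy as np

# Numerical sketch of divergence, alpha(E_k) membership, and entropy.
def divergence(P, P_star):
    """D(P || P*) in bits; assumes P_star > 0 wherever P > 0."""
    P, P_star = np.asarray(P, float), np.asarray(P_star, float)
    mask = P > 0
    return float(np.sum(P[mask] * np.log2(P[mask] / P_star[mask])))

def in_alpha(P, P_star, E_k):
    """Membership in alpha(E_k) = {P : D(P || P*) <= E_k}."""
    return divergence(P, P_star) <= E_k

def entropy(P):
    P = np.asarray(P, float)
    P = P[P > 0]
    return float(-np.sum(P * np.log2(P)))

P_star = [0.15, 0.85]
print(divergence([0.3, 0.7], P_star), in_alpha([0.3, 0.7], P_star, 0.1))
```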

Let us introduce the following function:

$$R^*(E, \Delta) = \max\Big[\max_{P \in \alpha(E_1)}\ \min_{\Phi(P, E_1) \in M(P, E_1, \Delta)} I_{P, \Phi(P, E_1)}(X \wedge X^1, \ldots, X^K),$$
$$\max_{P \in \alpha(E_2) - \alpha(E_1)}\ \min_{\Phi(P, E_2) \in M(P, E_2, \Delta)} I_{P, \Phi(P, E_2)}(X \wedge X^2, \ldots, X^K),$$
$$\ldots,\ \max_{P \in \alpha(E_K) - \alpha(E_{K-1})}\ \min_{\Phi(P, E_K) \in M(P, E_K, \Delta)} I_{P, \Phi(P, E_K)}(X \wedge X^K)\Big].$$

Theorem: For every $E_k > 0$, $\Delta_k \ge 0$, $k = \overline{1, K}$,

$$R(E, \Delta) = R^*(E, \Delta).$$

Corollary 1: When $E_k = E$, $k = \overline{1, K}$,

$$R(E, \Delta) = \max_{P \in \alpha(E)}\ \min_{\Phi(P, E) \in M(P, E, \Delta)} I_{P, \Phi(P, E)}(X \wedge X^1, \ldots, X^K).$$

Corollary 2 (El Gamal and Cover [8, Theorem 2]): When $E \to 0$, we obtain the rate-distortion function

$$R(\Delta) = \min_{\Phi(P^*) \in M(\Delta)} I_{P^*, \Phi(P^*)}(X \wedge X^1, \ldots, X^K).$$

III. PROOF OF THE THEOREM

We apply the typical sequences technique [6], [7]. The proof of the inequality

$$R^*(E, \Delta) \ge R(E, \Delta) \tag{3}$$

is based on the idea of "importance," for each $k$, of the source messages' types $P$ which are not farther (in the sense of divergence) from the generic probability distribution $P^*$ than the corresponding $E_k$, $k = \overline{1, K}$, and on the following random coding lemma about coverings of types of vectors, which is a modification of the covering lemmas from [2], [7], and [10].

Lemma: Let, for $\varepsilon > 0$,

$$J_k(P, Q) = \exp\{N(I_{P,Q}(X \wedge X^k, \ldots, X^K) + \varepsilon)\}, \qquad k = \overline{1, K}.$$

Then for every type $P$ and conditional distribution $Q$ there exist collections of vectors

$$\{(\mathbf{x}^k_{j_k}, \ldots, \mathbf{x}^K_{j_k}) \in T_{P,Q}(X^k, \ldots, X^K),\ j_k = \overline{1, J_k(P, Q)}\}, \qquad k = \overline{1, K}$$

such that the sets

$$\{T_{P,Q}(X \mid \mathbf{x}^k_{j_k}, \ldots, \mathbf{x}^K_{j_k}),\ j_k = \overline{1, J_k(P, Q)}\}, \qquad k = \overline{1, K}$$

cover $T_P(X)$ for $N$ large enough.

The proof of the lemma is similar to the proofs of the lemmas in [2], [7], [10], [13], and is given in the Appendix.

Now let us represent the set of all source messages of length $N$ as a union of all disjoint types of vectors

$$\mathcal{X}^N = \bigcup_{P \in \mathcal{P}(\mathcal{X}, N)} T_P(X)$$

where $\mathcal{P}(\mathcal{X}, N)$ is the set of all possible types $P$ of vectors in $\mathcal{X}^N$.
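The type decomposition can be made concrete for a binary alphabet, where the number of type classes grows only polynomially (here linearly) in $N$ while the number of sequences grows exponentially. A small self-contained check (our illustration):

```python
from collections import Counter
from itertools import product
from math import comb

# Sketch of the method of types for a binary alphabet: X^N splits into
# N + 1 disjoint type classes T_P(X), and |T_P(X)| = C(N, N*P(1)).
N = 6
classes = Counter(sum(x) for x in product((0, 1), repeat=N))
for ones, size in sorted(classes.items()):
    # type P = (1 - ones/N, ones/N); the class size is a binomial coefficient
    assert size == comb(N, ones)
print(len(classes), "type classes;", 2 ** N, "sequences in total")
```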


Let $\delta$ be a positive number. Then for every $k = \overline{1, K}$, for $N$ large enough, we can estimate the probability of appearance of source sequences of types beyond $\alpha(E_k + \delta)$:

$$P^{*N}\Big(\bigcup_{P \notin \alpha(E_k + \delta)} T_P(X)\Big) = \sum_{P \notin \alpha(E_k + \delta)} P^{*N}(T_P(X)) \le (N+1)^{|\mathcal{X}|} \exp\Big\{-N \min_{P \notin \alpha(E_k + \delta)} D(P \,\|\, P^*)\Big\}$$
$$\le \exp\{-N E_k - N\delta + |\mathcal{X}| \log(N+1)\} \le \exp\{-N(E_k + \delta/2)\}.$$

Here the first and the second inequalities follow from the well-known properties of types and from the definition of the set $\alpha(E_k)$. Consequently, in order to obtain the desired levels of the error probabilities $e_k$, it is sufficient to construct a "good" encoding function for the vectors with types $P$ from $\alpha(E_K + \delta)$.

Let us fix some type $P \in \alpha(E_K + \delta)$. If $P \in \alpha(E_1 + \delta)$, let us fix some $\Phi(P, E_1) \in M(P, E_1, \Delta)$. If $P \in \alpha(E_k + \delta) - \alpha(E_{k-1} + \delta)$, where $2 \le k \le K$, let us fix some $\Phi(P, E_k) \in M(P, E_k, \Delta)$. Denote $\Phi(P, E_k) = Q_P^k$. According to the lemma, for any $k = \overline{1, K}$ there exists a covering

$$\{T_{P, Q_P^k}(X \mid \mathbf{x}^k_{j_k}, \ldots, \mathbf{x}^K_{j_k}),\ j_k = \overline{1, J_k(P, Q_P^k)}\}$$

for $T_P(X)$. Let

$$C(P, Q_P^k, j_k) = T_{P, Q_P^k}(X \mid \mathbf{x}^k_{j_k}, \ldots, \mathbf{x}^K_{j_k}) - \bigcup_{j' < j_k} T_{P, Q_P^k}(X \mid \mathbf{x}^k_{j'}, \ldots, \mathbf{x}^K_{j'});$$

these sets are disjoint and also cover $T_P(X)$. For a source vector $\mathbf{x}$ of type $P \in \alpha(E_s + \delta) - \alpha(E_{s-1} + \delta)$ the encoder transmits the index $j_s$ of the set $C(P, Q_P^s, j_s)$ containing $\mathbf{x}$, that is, $f(\mathbf{x}) = j_s$ when $\mathbf{x} \in C(P, Q_P^s, j_s)$; one additional index $j_0$ is used for all vectors of types beyond $\alpha(E_K + \delta)$. The decoding functions are

$$F^k(j_s) = \begin{cases} \mathbf{x}^k_{j_s}, & \text{for } s \le k \\ \bar{\mathbf{x}}, & \text{for } s > k, \end{cases} \qquad F^k(j_0) = \bar{\mathbf{x}}, \qquad k = \overline{1, K},$$

where $\bar{\mathbf{x}}$ is an arbitrary fixed vector. According to the definition of the code $(f, F)$, to the lemma, and to the inequalities (1) and (2), for $P \in \alpha(E_k + \delta)$ and $\mathbf{x} \in T_P(X)$ the reproduction $\mathbf{x}^k = F^k(f(\mathbf{x}))$ satisfies

$$d_k(\mathbf{x}, \mathbf{x}^k) = \frac{1}{N} \sum_{n=1}^{N} d_k(x_n, x^k_n) = E_{P, Q_P^s}\, d_k(X, X^k) \le \Delta_k,$$

so the $k$th receiver can err only on source vectors of types beyond $\alpha(E_k + \delta)$, whence by the estimate above

$$e_k(f, F^k, \Delta_k, N) \le \exp\{-N(E_k + \delta/2)\} \le \exp\{-N E_k\}$$

for $N$ large enough. For a fixed type $P$ and its conditional distribution $\Phi(P, E_k)$, the number of indices used in the encoding (denote it by $L_{P, \Phi(P, E_k)}(N)$) is

$$L_{P, \Phi(P, E_k)}(N) = \exp\{N(I_{P, \Phi(P, E_k)}(X \wedge X^k, \ldots, X^K) + \varepsilon)\}.$$

The worst types among $\alpha(E_k + \delta)$ (of which there is only a polynomial number) and their optimal (among $M(P, E_k, \Delta)$) conditional distributions $\Phi(P, E_k)$, $k = \overline{1, K}$, determine the corresponding bound for the transmission rate:

$$\frac{1}{N} \log L(N) - \varepsilon - \frac{|\mathcal{X}| \log(N+1)}{N} \le \max\Big[\max_{P \in \alpha(E_1 + \delta)}\ \min_{\Phi(P, E_1) \in M(P, E_1, \Delta)} I_{P, \Phi(P, E_1)}(X \wedge X^1, \ldots, X^K),$$
$$\max_{P \in \alpha(E_2 + \delta) - \alpha(E_1 + \delta)}\ \min_{\Phi(P, E_2) \in M(P, E_2, \Delta)} I_{P, \Phi(P, E_2)}(X \wedge X^2, \ldots, X^K), \ldots,$$
$$\max_{P \in \alpha(E_K + \delta) - \alpha(E_{K-1} + \delta)}\ \min_{\Phi(P, E_K) \in M(P, E_K, \Delta)} I_{P, \Phi(P, E_K)}(X \wedge X^K)\Big]. \tag{4}$$

Letting $N \to \infty$ and then $\delta, \varepsilon \to 0$, we obtain from (4) that $R(E, \Delta) \le R^*(E, \Delta)$, that is, inequality (3).
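The exponential estimate used at the start of the proof can be checked numerically for the binary case. In the following sketch (ours; the parameter values are illustrative) the exact probability of a type class is compared with the divergence exponent:

```python
from math import comb, log2

# Numerical check of the type-probability estimate: for a binary source,
# P*^N(T_P(X)) = C(N, m) p*^m (1 - p*)^(N - m), where m = N * P(1), and
# -(1/N) log2 P*^N(T_P(X)) -> D(P || P*) as N grows.
p_star, N, m = 0.15, 200, 60           # type P = (0.7, 0.3), illustrative
P1, P0 = m / N, 1 - m / N
log2_prob = log2(comb(N, m)) + m * log2(p_star) + (N - m) * log2(1 - p_star)
D = P1 * log2(P1 / p_star) + P0 * log2(P0 / (1 - p_star))
print(-log2_prob / N, D)               # the two agree up to O(log(N)/N)
```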

We pass to the proof of the inverse inequality

$$R^*(E, \Delta) \le R(E, \Delta). \tag{5}$$

Let $\varepsilon > 0$ be fixed and assume that a given code $(f, F)$ of block length $N$ has $(E, \Delta)$-achievable rate $R$. It is sufficient to show that for some $\Phi(P, E_k) = Q_P^k(x^1, \ldots, x^K \mid x) \in M(P, E_k, \Delta)$, $k = \overline{1, K}$, for $N$ large enough, the following inequality holds:

$$\frac{1}{N} \log L(N) + \varepsilon \ge \max\Big[\max_{P \in \alpha(E_1 - \varepsilon)} I_{P, \Phi(P, E_1)}(X \wedge X^1, \ldots, X^K),$$
$$\max_{P \in \alpha(E_2 - \varepsilon) - \alpha(E_1 - \varepsilon)} I_{P, \Phi(P, E_2)}(X \wedge X^2, \ldots, X^K), \ldots,$$
$$\max_{P \in \alpha(E_K - \varepsilon) - \alpha(E_{K-1} - \varepsilon)} I_{P, \Phi(P, E_K)}(X \wedge X^K)\Big]. \tag{6}$$

We can write

$$\Big|\bigcap_{k=1}^{K} A_k \cap T_P(X)\Big| = |T_P(X)| - \Big|\bigcup_{k=1}^{K} \overline{A}_k \cap T_P(X)\Big|.$$

For $P \in \alpha(E_1 - \varepsilon)$ the following estimates hold. Every $\mathbf{x} \in T_P(X)$ has probability $P^{*N}(\mathbf{x}) = \exp\{-N(H_P(X) + D(P \,\|\, P^*))\}$, so that

$$\Big|\bigcup_{k=1}^{K} \overline{A}_k \cap T_P(X)\Big| \le \sum_{k=1}^{K} \big|\overline{A}_k \cap T_P(X)\big| \le \exp\{N(H_P(X) + D(P \,\|\, P^*))\} \sum_{k=1}^{K} \exp\{-N E_k\}$$
$$\le K \exp\{-N E_1\} \exp\{N(H_P(X) + E_1 - \varepsilon)\} = \exp\Big\{N\Big(H_P(X) + \frac{\log K}{N} - \varepsilon\Big)\Big\} \le \exp\{N(H_P(X) - \varepsilon/2)\}$$

for $N$ large enough. Hence

$$\Big|\bigcap_{k=1}^{K} A_k \cap T_P(X)\Big| \ge (N+1)^{-|\mathcal{X}|} \exp\{N H_P(X)\} - \exp\{N(H_P(X) - \varepsilon/2)\}$$
$$= \exp\{N(H_P(X) - \varepsilon/2)\}\Big[\frac{\exp\{N \varepsilon/2\}}{(N+1)^{|\mathcal{X}|}} - 1\Big] \ge \exp\{N(H_P(X) - \varepsilon/2)\} \tag{7}$$

for $N$ large enough. A unique collection of vectors $(\mathbf{x}^1, \ldots, \mathbf{x}^K)$ corresponds to each $\mathbf{x} \in \bigcap_{k=1}^{K} A_k \cap T_P(X)$, such that $\mathbf{x}^k = F^k(f(\mathbf{x}))$, $k = \overline{1, K}$. This collection of $K$ vectors determines a conditional type $Q$ for which

$$(\mathbf{x}^1, \ldots, \mathbf{x}^K) \in T_{P, Q}(X^1, \ldots, X^K \mid \mathbf{x}).$$

Since $\mathbf{x} \in \bigcap_{k=1}^{K} A_k$, then

$$E_{P, Q}\, d_k(X, X^k) = d_k(\mathbf{x}, \mathbf{x}^k) \le \Delta_k, \qquad k = \overline{1, K}.$$

So $Q \in M(P, E_1, \Delta)$. The set of all vectors $\mathbf{x} \in \bigcap_{k=1}^{K} A_k \cap T_P(X)$ is divided into classes corresponding to these conditional types $Q$ and is the union of all such $Q$-shells. Let us select the $Q_P^1$-shell ($Q_P^1 = \Phi(P, E_1)$) having maximal cardinality for the given $P$ and denote it by

$$\bigcap_{k=1}^{K} A_k \cap T_P(X)(\Phi(P, E_1)). \tag{8}$$

Using the polynomial upper estimate [6], [7] for the number of conditional types $Q$, we have for $N$ large enough

$$\Big|\bigcap_{k=1}^{K} A_k \cap T_P(X)(\Phi(P, E_1))\Big| \ge (N+1)^{-|\mathcal{X}| \prod_{k=1}^{K} |\mathcal{X}^k|} \Big|\bigcap_{k=1}^{K} A_k \cap T_P(X)\Big| \ge \exp\{N(H_P(X) - \varepsilon)\}.$$

Let $D^{1, \ldots, K}$ be the set of all collections $(\mathbf{x}^1, \ldots, \mathbf{x}^K)$ which satisfy $F^k(f(\mathbf{x})) = \mathbf{x}^k$, $k = \overline{1, K}$, for some $\mathbf{x} \in \bigcap_{k=1}^{K} A_k \cap T_P(X)$, $\mathbf{x} \in T_{P, \Phi(P, E_1)}(X \mid \mathbf{x}^1, \ldots, \mathbf{x}^K)$. In accordance with the definition of the code, $|D^{1, \ldots, K}| \le L(N)$. Then

$$\Big|\bigcap_{k=1}^{K} A_k \cap T_P(X)(\Phi(P, E_1))\Big| \le \sum_{(\mathbf{x}^1, \ldots, \mathbf{x}^K) \in D^{1, \ldots, K}} |T_{P, \Phi(P, E_1)}(X \mid \mathbf{x}^1, \ldots, \mathbf{x}^K)| \le L(N) \exp\{N H_{P, \Phi(P, E_1)}(X \mid X^1, \ldots, X^K)\}.$$

From the last inequality, (7), and (8) it is not difficult to conclude that

$$L(N) \ge \exp\{N(I_{P, \Phi(P, E_1)}(X \wedge X^1, \ldots, X^K) - \varepsilon)\}$$

for $P \in \alpha(E_1 - \varepsilon)$. Therefore,

$$\frac{1}{N} \log L(N) \ge \max_{P \in \alpha(E_1 - \varepsilon)} I_{P, \Phi(P, E_1)}(X \wedge X^1, \ldots, X^K) - \varepsilon.$$

Similarly, we can show that for $N$ large enough and $P \in \alpha(E_k - \varepsilon) - \alpha(E_{k-1} - \varepsilon)$, $k = \overline{2, K}$,

$$\Big|\bigcap_{j=k}^{K} A_j \cap T_P(X)\Big| \ge \exp\{N(H_P(X) - \varepsilon/2)\}. \tag{9}$$

By analogy with the selection of the class in (8), for each $k = \overline{2, K}$ we can choose the classes

$$\bigcap_{j=k}^{K} A_j \cap T_P(X)(\Phi(P, E_k)), \qquad k = \overline{2, K}. \tag{10}$$

For any $k = \overline{2, K}$ let us denote by $D^{k, \ldots, K}$ the set of all collections $(\mathbf{x}^k, \ldots, \mathbf{x}^K)$ for which there exists

$$\mathbf{x} \in T_{P, \Phi(P, E_k)}(X \mid \mathbf{x}^k, \ldots, \mathbf{x}^K)$$

such that $F^j(f(\mathbf{x})) = \mathbf{x}^j$, $j = \overline{k, K}$. Note that $|D^{k, \ldots, K}| \le L(N)$. Consequently,

$$\Big|\bigcap_{j=k}^{K} A_j \cap T_P(X)(\Phi(P, E_k))\Big| \le \sum_{(\mathbf{x}^k, \ldots, \mathbf{x}^K) \in D^{k, \ldots, K}} |T_{P, \Phi(P, E_k)}(X \mid \mathbf{x}^k, \ldots, \mathbf{x}^K)| \le L(N) \exp\{N H_{P, \Phi(P, E_k)}(X \mid X^k, \ldots, X^K)\}.$$

Taking into account (9), (10), and the last inequality, for

$$P \in \alpha(E_k - \varepsilon) - \alpha(E_{k-1} - \varepsilon), \qquad k = \overline{2, K}$$

we obtain that

$$L(N) \ge \exp\{N(I_{P, \Phi(P, E_k)}(X \wedge X^k, \ldots, X^K) - \varepsilon)\}.$$

Then

$$\frac{1}{N} \log L(N) \ge \max_{P \in \alpha(E_k - \varepsilon) - \alpha(E_{k-1} - \varepsilon)} I_{P, \Phi(P, E_k)}(X \wedge X^k, \ldots, X^K) - \varepsilon.$$

Taking into account the arbitrariness of $\varepsilon$ and the continuity in $E_k$, $k = \overline{1, K}$, of all the functions in the expressions above, we complete the proof of the inequality (6), hence of (5) as well.

IV. EXAMPLE

In this section we consider the rate-reliability-distortion function $R(E, \Delta)$ for the robust descriptions system of Fig. 1, for the case of a binary source with $K = 2$ and Hamming distortion measures. Calculations and graphics were executed using the "Mathematica" integrated environment.

Fig. 2. (a) $R_{BH}(E_1, \Delta_1)$ for $p^* = 0.15$, $\Delta_1 = 0.1$. (b) $R_{BH}(E_1, \Delta_1)$ for $p^* = 0.15$, $\Delta_1 = 0.3$.

The binary source is characterized by the alphabets $\mathcal{X} = \mathcal{X}^1 = \mathcal{X}^2 = \{0, 1\}$, with generic probability distribution $P^* = (p^*, 1 - p^*)$ and Hamming distortion measures

$$d_1(x, x^1) = \begin{cases} 0, & x = x^1 \\ 1, & x \ne x^1 \end{cases} \qquad d_2(x, x^2) = \begin{cases} 0, & x = x^2 \\ 1, & x \ne x^2. \end{cases}$$

We denote by $R_{BH}(E, \Delta)$ the binary Hamming rate-reliability-distortion function and by $R_{BH}(\Delta)$ the binary Hamming rate-distortion function. For the ordinary one-way system, the binary Hamming rate-distortion function $R_{BH}(\Delta_1)$ was specified in [3] and [6]. The binary Hamming rate-reliability-distortion function $R_{BH}(E_1, \Delta_1)$ for the same system was derived in [11] and [13]. For $E_1$ such that $(1/2, 1/2) \in \alpha(E_1)$

$$R_{BH}(E_1, \Delta_1) = \begin{cases} 1 - H(\Delta_1), & \text{when } \Delta_1 \le 1/2 \\ 0, & \text{when } \Delta_1 > 1/2 \end{cases} \tag{11}$$

where $H(\Delta_1) = -\Delta_1 \log \Delta_1 - (1 - \Delta_1) \log(1 - \Delta_1)$ is the binary entropy. Let $P_{E_1} = (p_{E_1}, 1 - p_{E_1})$, where $p_{E_1}$ is the solution nearest to $1/2$ of the equation

$$D(P_{E_1} \,\|\, P^*) = E_1.$$

We shall later also define $P_{E_2}$ analogously. For $E_1$ such that $(1/2, 1/2) \notin \alpha(E_1)$

$$R_{BH}(E_1, \Delta_1) = \begin{cases} H_{P_{E_1}}(X) - H(\Delta_1), & \text{when } \Delta_1 \le p_{E_1} \\ 0, & \text{when } \Delta_1 > p_{E_1}. \end{cases} \tag{12}$$

It was shown in [11] and [13] that $R_{BH}(E_1, \Delta_1)$ as a function of $\Delta_1$ is concave, and as a function of $E_1$, for $\Delta_1$ such that $R_{BH}(\Delta_1) > 0$, it is convex (see Fig. 2(a)). For $\Delta_1$ such that $R_{BH}(\Delta_1) = 0$, it is convex when $E_1 > E_1(\Delta_1)$, where $E_1(\Delta_1)$ is the minimal $E_1$ for which $\Delta_1 \le p_{E_1}$ (see Fig. 2(b)). $R_{BH}(E_1, \Delta_1)$ as a function of $E_1$ and $\Delta_1$ is illustrated in Fig. 3.

We consider the case $\Delta_1 < \Delta_2$. It is not difficult to verify that in this case the binary Hamming rate-distortion function $R_{BH}(\Delta)$ is determined by the demands of the receiver with the smallest distortion level. By analogy with [6, Ch. 13], we can write

$$R_{BH}(\Delta) = R_{BH}(\Delta_1) = \begin{cases} H_{P^*}(X) - H(\Delta_1), & \text{if } \Delta_1 \le \min(p^*, 1 - p^*) \\ 0, & \text{if } \Delta_1 > \min(p^*, 1 - p^*). \end{cases}$$
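Formulas (11) and (12) are straightforward to evaluate. The sketch below (our own helper names; it assumes $p^* < 1/2$ so that $p_{E_1}$ can be found by bisection on $[p^*, 1/2]$) computes the one-way function $R_{BH}(E_1, \Delta_1)$:

```python
from math import log2

# Sketch of formulas (11)-(12): the one-way binary Hamming
# rate-reliability-distortion function. Helper names are ours.
def H(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def D(p, q):
    """Binary divergence D((p, 1-p) || (q, 1-q)) in bits."""
    return p * log2(p / q) + (1 - p) * log2((1 - p) / (1 - q))

def p_E(E, p_star):
    """Solution of D(P_E || P*) = E nearest to 1/2 (bisection on [p*, 1/2])."""
    lo, hi = p_star, 0.5
    for _ in range(80):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if D(mid, p_star) < E else (lo, mid)
    return (lo + hi) / 2

def R_BH_one_way(E1, Delta1, p_star):
    if D(0.5, p_star) <= E1:                # (1/2, 1/2) in alpha(E_1): (11)
        return 1 - H(Delta1) if Delta1 <= 0.5 else 0.0
    pE1 = p_E(E1, p_star)                   # otherwise: (12)
    return H(pE1) - H(Delta1) if Delta1 <= pE1 else 0.0

print(R_BH_one_way(E1=0.09, Delta1=0.1, p_star=0.15))
```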

Fig. 3. $R_{BH}(E_1, \Delta_1)$ as a function of $E_1$ and $\Delta_1$.

For the case $E_1 < E_2$ we have from the theorem

$$R_{BH}(E, \Delta) = \max\Big[\max_{P \in \alpha(E_1)}\ \min_{\Phi(P, E_1) \in M(P, E_1, \Delta)} I_{P, \Phi(P, E_1)}(X \wedge X^1, X^2),\ \max_{P \in \alpha(E_2) - \alpha(E_1)}\ \min_{\Phi(P, E_2) \in M(P, E_2, \Delta)} I_{P, \Phi(P, E_2)}(X \wedge X^2)\Big].$$

By analogy with the calculation of the binary Hamming rate-reliability-distortion function $R_{BH}(E_1, \Delta_1)$ we have the following equalities. For $E_1$ such that $(1/2, 1/2) \in \alpha(E_1)$

$$\max_{P \in \alpha(E_1)}\ \min_{\Phi(P, E_1) \in M(P, E_1, \Delta)} I_{P, \Phi(P, E_1)}(X \wedge X^1, X^2) = \begin{cases} 1 - H(\Delta_1), & \text{when } \Delta_1 \le 1/2 \\ 0, & \text{when } \Delta_1 > 1/2. \end{cases}$$

For $E_1$ such that $(1/2, 1/2) \notin \alpha(E_1)$

$$\max_{P \in \alpha(E_1)}\ \min_{\Phi(P, E_1) \in M(P, E_1, \Delta)} I_{P, \Phi(P, E_1)}(X \wedge X^1, X^2) = \begin{cases} H_{P_{E_1}}(X) - H(\Delta_1), & \text{when } \Delta_1 \le p_{E_1} \\ 0, & \text{when } \Delta_1 > p_{E_1}. \end{cases}$$

For $E_2$ such that $(1/2, 1/2) \in \alpha(E_2) - \alpha(E_1)$

$$\max_{P \in \alpha(E_2) - \alpha(E_1)}\ \min_{\Phi(P, E_2) \in M(P, E_2, \Delta)} I_{P, \Phi(P, E_2)}(X \wedge X^2) = \begin{cases} 1 - H(\Delta_2), & \text{when } \Delta_2 \le 1/2 \\ 0, & \text{when } \Delta_2 > 1/2. \end{cases}$$

For $E_2$ such that $(1/2, 1/2) \notin \alpha(E_2) - \alpha(E_1)$

$$\max_{P \in \alpha(E_2) - \alpha(E_1)}\ \min_{\Phi(P, E_2) \in M(P, E_2, \Delta)} I_{P, \Phi(P, E_2)}(X \wedge X^2) = \begin{cases} H_{P_{E_2}}(X) - H(\Delta_2), & \text{when } \Delta_2 \le p_{E_2} \\ 0, & \text{when } \Delta_2 > p_{E_2}. \end{cases}$$
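Continuing the previous sketch (it reuses `H`, `D`, `p_E`, and `R_BH_one_way`), the two-receiver function is the maximum of the two terms above. The sketch assumes the case analyzed in the text, $(1/2, 1/2) \notin \alpha(E_1)$, and the parameter values are the illustrative ones from the figure captions:

```python
# Continuation of the previous sketch; assumes (1/2, 1/2) not in alpha(E_1),
# so that R_BH_one_way evaluates exactly the per-receiver terms above.
def R_BH(E1, E2, Delta1, Delta2, p_star):
    term1 = R_BH_one_way(E1, Delta1, p_star)   # receiver 1 term
    term2 = R_BH_one_way(E2, Delta2, p_star)   # receiver 2 term
    return max(term1, term2)

print(R_BH(E1=0.09, E2=0.49, Delta1=0.1, Delta2=0.13, p_star=0.15))
```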

Fig. 4. $R_{BH}(E, \Delta)$ as a function of $\Delta_1$ and $\Delta_2$ for $P^* = (0.15, 0.85)$, $E_1 = 0.09$, $E_2 = 0.49$.

Fig. 5. $R_{BH}(E, \Delta)$ as a function of $E_1$ and $E_2$ for $P^* = (0.15, 0.85)$, $\Delta_1 = 0.1$, $\Delta_2 = 0.13$.

Fig. 6. $R_{BH}(E, \Delta)$ for $P^* = (0.15, 0.85)$, $\Delta_1 = 0.1$, $\Delta_2 = 0.13$, $E_1 = 0.09$.

TABLE I. Some variants of the parameters $P^*$, $\Delta_1$, $\Delta_2$, $E_1$, and $E_2$ and the calculated values of $p_{E_1}$, $H_{P_{E_1}}(X) - H(\Delta_1)$, $1 - H(\Delta_2)$, $R_{BH}(\Delta_1)$, and $R_{BH}(E, \Delta)$ in the case (14).

$R_{BH}(E, \Delta)$ as a function of $\Delta_1$, $\Delta_2$ (for fixed $E_1$, $E_2$) is illustrated in Fig. 4. An example of $R_{BH}(E, \Delta)$ as a function of $E_1$, $E_2$ (for fixed $\Delta_1$, $\Delta_2$) is shown in Fig. 5. It is not difficult to show that $R_{BH}(E, \Delta)$ as a function of $E_2$, when $(1/2, 1/2) \in \alpha(E_1)$, is constant and equal to its maximal possible value $1 - H(\Delta_1)$.

Let us assume that

$$\Delta_1 < \min(p^*, 1 - p^*). \tag{13}$$

Then $R_{BH}(\Delta) = H_{P^*}(X) - H(\Delta_1)$. Let us consider the following two cases for fixed $\Delta_1$ satisfying (13) and $E_1$ such that $(1/2, 1/2) \notin \alpha(E_1)$, $\Delta_1 < p_{E_1}$.

For $E_2$ such that

$$(1/2, 1/2) \in \alpha(E_2), \qquad \Delta_2 < 1/2 \tag{14}$$

we obtain

$$R_{BH}(E, \Delta) = \max[H_{P_{E_1}}(X) - H(\Delta_1),\ 1 - H(\Delta_2)],$$

and when

$$H_{P_{E_1}}(X) - H(\Delta_1) < 1 - H(\Delta_2) \tag{15}$$

then

$$R_{BH}(E, \Delta) = R_{BH}(E_2, \Delta_2) = 1 - H(\Delta_2).$$

For $E_2$ such that

$$(1/2, 1/2) \notin \alpha(E_2), \qquad \Delta_2 < p_{E_2} \tag{16}$$

we obtain

$$R_{BH}(E, \Delta) = \max[H_{P_{E_1}}(X) - H(\Delta_1),\ H_{P_{E_2}}(X) - H(\Delta_2)]$$

and when

$$H_{P_{E_1}}(X) - H(\Delta_1) < H_{P_{E_2}}(X) - H(\Delta_2) \tag{17}$$

then

$$R_{BH}(E, \Delta) = H_{P_{E_2}}(X) - H(\Delta_2).$$

Note that $R_{BH}(E, \Delta)$ as a function of $E_2$, when $(1/2, 1/2) \notin \alpha(E_1)$ and (15) does not hold, is constant and equals $H_{P_{E_1}}(X) - H(\Delta_1)$. $R_{BH}(E, \Delta)$ as a function of $E_2$, when $(1/2, 1/2) \notin \alpha(E_1)$ and (15) is valid, is presented in Fig. 6.

In Table I, some values of the parameters $P^*$, $\Delta_1$, $\Delta_2$, $E_1$, and $E_2$ meeting the condition (14) are chosen such that inequality (15) holds. In this table, the calculated values of $p_{E_1}$, $H_{P_{E_1}}(X) - H(\Delta_1)$, $1 - H(\Delta_2)$, $R_{BH}(\Delta_1)$, and $R_{BH}(E, \Delta)$ are given.
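The entries of such a table can be reproduced with the same helpers (a continuation of the sketches above). The printed values are computed by the code for one illustrative parameter row, taken from the figure captions, and are not copied from the paper's table:

```python
# Continuation of the same sketch: values of the kind listed in Table I,
# for one illustrative parameter row satisfying (13)-(15).
p_star, D1, D2, E1, E2 = 0.15, 0.1, 0.13, 0.09, 0.49
pE1 = p_E(E1, p_star)
term1 = H(pE1) - H(D1)        # H_{P_{E_1}}(X) - H(Delta_1)
term2 = 1 - H(D2)             # case (14): (1/2, 1/2) in alpha(E_2)
R_Delta1 = H(p_star) - H(D1)  # R_BH(Delta_1), under (13)
print(pE1, term1, term2, R_Delta1, max(term1, term2))
```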


TABLE II. Some variants of the parameters $P^*$, $\Delta_1$, $\Delta_2$, $E_1$, and $E_2$ and the calculated values of $p_{E_1}$, $p_{E_2}$, $H_{P_{E_1}}(X) - H(\Delta_1)$, $H_{P_{E_2}}(X) - H(\Delta_2)$, $R_{BH}(\Delta_1)$, and $R_{BH}(E, \Delta)$ in the case (16).

In Table II, some variants of the values of the parameters $P^*$, $\Delta_1$, $\Delta_2$, $E_1$, and $E_2$ meeting the condition (16) are chosen so that (17) holds. In this table, the calculated values of $p_{E_1}$, $p_{E_2}$, $H_{P_{E_1}}(X) - H(\Delta_1)$, $H_{P_{E_2}}(X) - H(\Delta_2)$, $R_{BH}(\Delta_1)$, and $R_{BH}(E, \Delta)$ are presented.

APPENDIX
PROOF OF THE LEMMA

Using the method of random selection, following [7, Sec. 2, proof of Lemma 4.1], we show the existence of the $K$ coverings for $T_P(X)$. Let $k$, $k = \overline{1, K}$, be fixed and let $\{\varphi_{j_k},\ j_k = \overline{1, J_k(P, Q)}\}$ be a collection of random vectors, independent and identically distributed on $T_{P,Q}(X^k, \ldots, X^K)$. Denote by $\psi(\mathbf{x})$ the characteristic function of the complement of the random set $\bigcup_{j_k=1}^{J_k(P,Q)} T_{P,Q}(X \mid \varphi_{j_k})$:

$$\psi(\mathbf{x}) = \begin{cases} 1, & \text{if } \mathbf{x} \notin \bigcup_{j_k=1}^{J_k(P,Q)} T_{P,Q}(X \mid \varphi_{j_k}) \\ 0, & \text{otherwise.} \end{cases}$$

It is necessary to show that for $N$ sufficiently large

$$\Pr\Big\{\sum_{\mathbf{x} \in T_P(X)} \psi(\mathbf{x}) < 1\Big\} > 0$$

since this is equivalent to the existence of the required covering. We have

$$\Pr\Big\{\sum_{\mathbf{x} \in T_P(X)} \psi(\mathbf{x}) \ge 1\Big\} \le |T_P(X)| \Pr\Big\{\mathbf{x} \notin \bigcup_{j_k=1}^{J_k(P,Q)} T_{P,Q}(X \mid \varphi_{j_k})\Big\}.$$

Taking into account the independence of the random vectors $\varphi_{j_k}$, $j_k = \overline{1, J_k(P, Q)}$, and the polynomial estimate for the numbers of conditional types [7], we have for $N$ large enough

$$\Pr\Big\{\mathbf{x} \notin \bigcup_{j_k=1}^{J_k(P,Q)} T_{P,Q}(X \mid \varphi_{j_k})\Big\} = \big[1 - |T_{P,Q}(X \mid \mathbf{x}^k, \ldots, \mathbf{x}^K)| \cdot |T_P(X)|^{-1}\big]^{J_k(P,Q)}$$
$$\le \Big[1 - \exp\Big\{N\big(H_{P,Q}(X \mid X^k, \ldots, X^K) - H_P(X)\big) - |\mathcal{X}| \prod_{j=k}^{K} |\mathcal{X}^j| \log(N+1)\Big\}\Big]^{J_k(P,Q)}$$
$$= \big[1 - \exp\{-N(I_{P,Q}(X \wedge X^k, \ldots, X^K) + \varepsilon')\}\big]^{J_k(P,Q)}$$

where

$$\varepsilon' = N^{-1} |\mathcal{X}| \prod_{j=k}^{K} |\mathcal{X}^j| \log(N+1) \le \varepsilon/2$$

for $N$ large enough. Applying the inequality $(1 - t)^s \le \exp\{-st\}$ (which holds for each $0 < t < 1$ and $s > 0$) with

$$t = \exp\{-N(I_{P,Q}(X \wedge X^k, \ldots, X^K) + \varepsilon/2)\} \qquad \text{and} \qquad s = J_k(P, Q),$$

we continue the estimation:

$$\Pr\Big\{\sum_{\mathbf{x} \in T_P(X)} \psi(\mathbf{x}) \ge 1\Big\} \le \exp\{N H_P(X)\} \exp\big\{-J_k(P, Q) \exp\{-N(I_{P,Q}(X \wedge X^k, \ldots, X^K) + \varepsilon/2)\}\big\} = \exp\big\{N H_P(X) - \exp\{N \varepsilon/2\}\big\},$$

and when $N$ is large enough, then

$$\Pr\Big\{\sum_{\mathbf{x} \in T_P(X)} \psi(\mathbf{x}) \ge 1\Big\} < 1.$$
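The closing union-bound step can be sanity-checked numerically: with $s = J_k(P, Q)$ and $t$ as above, $s \cdot t = \exp\{N\varepsilon/2\}$ eventually dominates $N H_P(X)$, driving the bound below 1. A toy check (ours; the entropy and information values are hypothetical):

```python
import math

# Toy check of the union bound closing the Appendix:
# Pr{covering fails} <= |T_P(X)| (1 - t)^s <= exp2{N*H} * exp{-s*t},
# with s = J_k = exp2{N*(MI + eps)} and t = exp2{-N*(MI + eps/2)},
# so s*t = exp2{N*eps/2} eventually swamps N*H.
H_val, MI, eps = 0.7, 0.4, 0.1            # hypothetical base-2 values

for N in (50, 100, 200, 400):
    st = 2.0 ** (N * eps / 2)             # J_k * t
    log2_bound = N * H_val - st * math.log2(math.e)
    print(N, "bound < 1:", log2_bound < 0)
```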

REFERENCES

[1] R. Ahlswede, "The rate-distortion region for multiple descriptions without excess rate," IEEE Trans. Inform. Theory, vol. IT-31, pp. 721–726, Nov. 1985.
[2] R. Ahlswede, "Coloring hypergraphs. A new approach to multi-user source coding—I and II," J. Combin. Inform. Syst. Sci., vol. 4, no. 1, pp. 75–115, 1979.
[3] T. Berger, Rate Distortion Theory: A Mathematical Basis for Data Compression. Englewood Cliffs, NJ: Prentice-Hall, 1971.
[4] T. Berger and Z. Zhang, "New results in binary multiple descriptions," IEEE Trans. Inform. Theory, vol. IT-33, pp. 502–521, July 1987.
[5] T. Berger and Z. Zhang, "Multiple description source coding with no excess marginal rate," IEEE Trans. Inform. Theory, vol. 41, pp. 349–357, Mar. 1995.
[6] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[7] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. New York: Academic, 1981.
[8] A. El Gamal and T. M. Cover, "Achievable rates for multiple descriptions," IEEE Trans. Inform. Theory, vol. IT-28, pp. 851–857, Nov. 1982.
[9] E. A. Haroutunian and B. Mekoush, "Estimates of optimal rates of codes with given error probability exponent for certain sources" (in Russian), in Abstracts 6th Int. Symp. Information Theory, vol. 1, Tashkent, U.S.S.R., 1984, pp. 22–23.
[10] E. A. Haroutunian and R. S. Maroutian, "(E, Δ)-achievable rates for multiple descriptions of random varying source," Probl. Contr. Inform. Theory, vol. 20, no. 2, pp. 165–178, 1991.
[11] E. A. Haroutunian and A. N. Haroutunian, "The binary Hamming rate-reliability-distortion function," Math. Probl. Comp. Sci. (Trans. Inst. Informatics and Automation Problems of the NAS of RA and of the Yerevan State Univ.), vol. XVIII, pp. 40–45, 1997.
[12] A. N. Haroutunian and E. A. Haroutunian, "An achievable rates-reliabilities-distortions dependence for source coding with three descriptions," Math. Probl. Comp. Sci. (Trans. Inst. Informatics and Automation Problems of the NAS of RA and of the Yerevan State Univ.), vol. XVII, pp. 70–75, 1997.
[13] A. N. Haroutunian, "Investigation of achievable interdependence between coding rates and reliability for several classes of sources," Thesis of Kandidat of Science (in Russian), Inst. Inform. Automation Problems of the NAS of RA and of YSU, Yerevan, Nov. 1997.
[14] E. A. Haroutunian, A. N. Haroutunian, and A. R. Kazarian, "On rate-reliabilities-distortions function of source with many receivers," in Proc. Joint Session of 6th Prague Symp. Asymptotic Statistics and 13th Prague Conf. Information Theory, Statistical Decision Functions and Random Processes, vol. 1, Prague, Czech Republic, 1998, pp. 217–220.
[15] R. S. Maroutian, "Achievable rates for multiple descriptions with given exponent and distortion levels" (in Russian), Probl. Pered. Inform., vol. 26, no. 1, 1990.
