Robust Coding for Uncertain Sources: A Minimax Approach

Farzad Rezaei, Charalambos D. Charalambous†

School of Information Technology and Engineering, University of Ottawa, Canada
† Department of Electrical and Computer Engineering, University of Cyprus, Cyprus

E-mail: [email protected], [email protected]

International Symposium on Information Theory, September 2005, Adelaide, Australia

Table of Contents

1. Introduction
2. Problem Formulation
3. Solution to the Maximization Problem
4. Robust Shannon Coding
5. Robust Huffman Coding
6. A Numerical Example
7. Relationship to the Maximum Entropy
8. Minimax Redundancy
9. Future Research

Introduction

Given the source distribution → Find the code with the shortest average length.

1) Shannon Code

2) Huffman Code

Coding for a single source with unknown statistics → use the empirical distribution [Davisson73, Barron92, Barron98, Csiszar98]

Our problem: coding for a class of sources, described by a relative entropy constraint.

Introduction

There is a known nominal distribution w.r.t. which relative entropy is considered.

This class contains infinitely many sources.

We wish to design a code that is robust in the sense of average length.

i.e., one code for the whole class of sources that performs reasonably well for all of them.


Problem Formulation

Let Σ be a finite alphabet with |Σ| = M, and let M(Σ) be the set of probability distributions on Σ.

Each source has a distribution denoted by ν; the nominal distribution, denoted by µ, is known.

Uncertainty description: M_R = {ν ∈ M(Σ) : H(ν|µ) ≤ R}, where R is given.

Minimax problem (D-ary code):

$$J(\ell^*, \nu^*) = \inf_{(\ell_1, \dots, \ell_M)} \ \sup_{\nu \in M(\Sigma)} E_\nu(\ell) \qquad (1)$$

subject to H(ν|µ) ≤ R and $\sum_{i=1}^M D^{-\ell_i} \le 1$ (Kraft inequality).


Problem Formulation

Lagrangian:

$$L_{\lambda,s}(\ell, \nu) = E_\nu(\ell) - s\big(H(\nu|\mu) - R\big) + \lambda\Big(\sum_{i=1}^M D^{-\ell_i} - 1\Big)$$

and the associated dual functional

$$L_{\lambda,s}(\ell, \nu^*) = \sup_{\nu \in M(\Sigma)} L_{\lambda,s}(\ell, \nu)$$

Here, s > 0 and λ are Lagrange multipliers.

The supremum over ν is independent of the Kraft inequality.


Solution to the Maximization Problem

Duality relation between relative entropy and free energy → large deviations theory. See [meneghini96], [deuschel-stroock89].

Theorem 1 Assume s > 0. Then the dual functional is given by

$$L_{\lambda,s}(\ell, \nu^*) = sR + s \log\Big(\sum_{i=1}^M e^{\ell_i/s}\,\mu_i\Big) + \lambda\Big(\sum_{i=1}^M D^{-\ell_i} - 1\Big)$$

Moreover, the supremum is attained at

$$\nu_i^{*,s} = \frac{e^{\ell_i/s}\,\mu_i}{\sum_{j=1}^M e^{\ell_j/s}\,\mu_j}, \qquad \forall i \in \{1, \dots, M\} \qquad (2)$$

The worst-case distribution occurs on the boundary of the constraint; that is, $s_0 = \arg\min_{s>0} L_{\lambda,s}(\ell, \nu^{*,s})$ satisfies $H(\nu^{*,s_0}|\mu) = R$.
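As a quick numerical illustration of (2): a minimal sketch (in Python/NumPy; the function and variable names are ours, not the paper's) of the exponentially tilted worst-case distribution, assuming candidate lengths ℓ, the nominal µ, and s > 0 are given.

```python
import numpy as np

def worst_case_distribution(ell, mu, s):
    """nu^{*,s} from (2): nu_i is proportional to exp(ell_i / s) * mu_i."""
    weights = np.exp(np.asarray(ell, dtype=float) / s) * np.asarray(mu, dtype=float)
    return weights / weights.sum()

# Toy check: symbols with longer codewords have their probability inflated
# relative to the nominal distribution.
mu = [0.7, 0.2, 0.1]
ell = [1, 2, 4]
print(worst_case_distribution(ell, mu, s=1.0))
```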

Robust Shannon Coding

By the Kuhn-Tucker conditions, the optimum codeword lengths are

$$\ell_j^* = \Big\lceil \frac{s^*}{1 + s^* \ln D}\, \ln\Big(\frac{C}{\mu_j}\Big) \Big\rceil, \qquad \forall j \in \{1, \dots, M\} \qquad (3)$$

where $C = \sum_{i=1}^M e^{\ell_i^*/s^*}\,\mu_i$. The $\ell_i^*$'s and $s^*$ follow from a double minimization with respect to the lengths and s.

There are (M + 1) unknowns $(\ell_1^*, \dots, \ell_M^*, s^*)$.

(3) gives M equations, and $H(\nu^{*,s^*}|\mu) = R$ gives the remaining one. We call this the Robust Shannon Code.


Robust Shannon Coding

Uniform µ: all codeword lengths are equal. Then $\nu^{*,s}$ is also uniform, and s* → ∞.

This shows that we gain nothing by using the robust coding method if the nominal distribution is itself uniform. This result holds for any R > 0.

The worst-case distribution is µ itself.


Robust Shannon Coding

Theorem 2 The optimum distribution is given by

1) $$\nu_i^{*,s} = \frac{\mu_i^\alpha}{\sum_{j=1}^M \mu_j^\alpha}, \qquad \alpha = \frac{s \ln D}{1 + s \ln D}, \qquad \forall i \in \{1, \dots, M\} \qquad (4)$$

2) $H(\nu^{*,s}|\mu)$ is a non-increasing function of s.

3) The codeword lengths are $\ell^*(s) = \big\lceil \log_D\big(\tfrac{1}{\nu^{*,s}}\big) \big\rceil$.

Theorem 3 A necessary condition for existence of a solution is

$$R \le \frac{1}{M} \sum_{i=1}^M \ln\Big(\frac{1}{\mu_i}\Big) - \ln M = H(\eta|\mu) \qquad (5)$$

where η is the uniform distribution on Σ.
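The closed form (4), the rounding in part 3, and the existence condition (5) are easy to evaluate numerically; the following is a hedged sketch (our own Python/NumPy, with illustrative function names and a made-up nominal distribution).

```python
import numpy as np

def optimal_distribution(mu, s, D=2):
    """Closed form (4): nu_i = mu_i^alpha / sum_j mu_j^alpha, alpha = s ln D / (1 + s ln D)."""
    alpha = s * np.log(D) / (1.0 + s * np.log(D))
    w = np.asarray(mu, dtype=float) ** alpha
    return w / w.sum()

def codeword_lengths(nu, D=2):
    """Theorem 2, part 3: ell_i = ceil(log_D(1 / nu_i))."""
    return np.ceil(-np.log(np.asarray(nu)) / np.log(D)).astype(int)

def max_admissible_R(mu):
    """Right-hand side of (5): H(eta|mu) with eta the uniform distribution (in nats)."""
    mu = np.asarray(mu, dtype=float)
    return float(np.mean(np.log(1.0 / mu)) - np.log(len(mu)))

mu = [0.5, 0.25, 0.15, 0.1]
print(max_admissible_R(mu))            # largest R for which a solution can exist
nu = optimal_distribution(mu, s=0.5)
print(nu, codeword_lengths(nu))
```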

Robust Shannon Coding

Remark 1 Coding with respect to the distribution ν* leads to an average length close to the Rényi entropy. This is similar to the result given in [merhav91].

Lemma 1 Suppose $\{\ell_1^*, \dots, \ell_M^*\}$ and $s^*$ correspond to the robust Shannon code. Then

$$\ell_{\max}^* \le \log_D\Big(\frac{1}{\mu_{\min}}\Big), \qquad s^* \le \frac{1}{R}\Big(\log_D\Big(\frac{1}{\mu_{\min}}\Big) - H_D(\mu)\Big)$$
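For concreteness, a short sketch (our own Python, not from the paper) of the two Lemma 1 bounds; the bound on s* is what initializes the search described on the next slide.

```python
import numpy as np

def lemma1_bounds(mu, R, D=2):
    """Upper bounds of Lemma 1 on the longest codeword length and on s*."""
    mu = np.asarray(mu, dtype=float)
    logD = lambda x: np.log(x) / np.log(D)
    ell_max_bound = logD(1.0 / mu.min())          # l*_max <= log_D(1 / mu_min)
    H_D = float(np.sum(mu * logD(1.0 / mu)))      # entropy of mu in base D
    s_bound = (logD(1.0 / mu.min()) - H_D) / R    # s* <= (1/R)(log_D(1/mu_min) - H_D(mu))
    return ell_max_bound, s_bound

print(lemma1_bounds([0.5, 0.25, 0.15, 0.1], R=0.05))
```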


Robust Shannon Coding

1) Initialize s at the upper bound on s* from Lemma 1, and compute
$$C(s) = \Big(\sum_{i=1}^M \mu_i^\alpha\Big)^{1/\alpha}, \qquad \text{where } \alpha = \frac{s \ln D}{1 + s \ln D}.$$

2) Compute the (real-valued) codeword lengths
$$\ell_i = \frac{s}{1 + s \ln D}\, \ln\Big(\frac{C(s)}{\mu_i}\Big).$$

3) Compute the induced worst-case distribution
$$\nu_i^{*,s} = \frac{e^{\ell_i/s}\,\mu_i}{\sum_{j=1}^M e^{\ell_j/s}\,\mu_j}.$$

4) If $H(\nu^{*,s}|\mu) < R$, decrease s by a fixed step size ∆ and go back to step 1 (recomputing C(s) with the new s).

5) Continue steps (1) to (4) until $|H(\nu^{*,s_0}|\mu) - R| \le \delta$, where δ > 0 is a prescribed tolerance. (A code sketch of this procedure follows the list.)
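Below is a minimal end-to-end sketch of steps (1)-(5) (our own Python/NumPy under stated assumptions: relative entropy measured in nats, a fixed step size, and the Lemma 1 bound as the starting value of s; the nominal distribution in the example is illustrative, not from the paper).

```python
import numpy as np

def robust_shannon_code(mu, R, D=2, step=1e-3, tol=1e-4):
    """Step-size search over s for the robust Shannon code (steps 1-5 above)."""
    mu = np.asarray(mu, dtype=float)
    lnD = np.log(D)
    logD = lambda x: np.log(x) / lnD

    # Initialization: Lemma 1 upper bound on s*.
    H_D = float(np.sum(mu * logD(1.0 / mu)))
    s = (logD(1.0 / mu.min()) - H_D) / R

    while True:
        alpha = s * lnD / (1.0 + s * lnD)
        C = np.sum(mu ** alpha) ** (1.0 / alpha)         # step 1: C(s)
        ell = (s / (1.0 + s * lnD)) * np.log(C / mu)     # step 2: real-valued lengths
        w = np.exp(ell / s) * mu                         # step 3: tilted distribution
        nu = w / w.sum()
        H = float(np.sum(nu * np.log(nu / mu)))          # H(nu|mu), in nats
        if abs(H - R) <= tol or H > R or s <= step:      # step 5: stop once the boundary is reached
            break
        s -= step                                        # step 4: H < R, so decrease s

    lengths = np.ceil(logD(1.0 / nu)).astype(int)        # integer lengths, as in Theorem 2, part 3
    return lengths, s, nu

lengths, s_star, nu_star = robust_shannon_code([0.5, 0.25, 0.15, 0.1], R=0.05)
print(lengths, s_star)
```

Because $H(\nu^{*,s}|\mu)$ is non-increasing in s (Theorem 2, part 2), starting at the upper bound and decreasing s drives the relative entropy up toward R, so the loop terminates on or just past the boundary of the constraint.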
