
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-29, NO. 1, JANUARY 1983

Uniqueness of Locally Optimal Quantizer for Log-Concave Density and Convex Error Weighting Function

JOHN C. KIEFFER

Abstract-It is desired to encode a random variable $X$ using an $N$-level quantizer $Q$ to minimize the expected distortion $E[\rho(|X - Q(X)|)]$, where the error weighting function $\rho$ is convex, strictly increasing and continuously differentiable. It is shown that if $X$ has a log-concave density, then there exists a unique locally optimal quantizer $Q^*$ and Lloyd's Method I may be used to find $Q^*$. Trushkin had earlier shown this result for the error weighting functions $\rho(t) = t$ and $\rho(t) = t^2$.

I. INTRODUCTION

Let $\sigma$, $\tau$ be extended real numbers with $\sigma < \tau$. Let $\rho: [0, \infty) \to [0, \infty)$ be a strictly increasing, continuously differentiable, convex function. Let $f: (\sigma, \tau) \to (0, \infty)$ be a log-concave probability density function such that

$$\int_\sigma^\tau \rho(|t - u|)\, f(t)\, dt < \infty, \qquad u \in R, \tag{1}$$

where $R$ denotes the set of real numbers. Let $N$ be an integer such that $N \ge 2$. We call $Q: (\sigma, \tau) \to R$ an $N$-level quantizer if there exists a vector $x = (x_1, \ldots, x_{N-1})$ satisfying $\sigma$ …

-/‘P(I t - do,> A) I)f(t> dt 01

uSa> 5 inf,F(a, whence g(a,, PJ -+ da, P) by 1) and 4).

gh

i = l;*.,N,

t - g(an, P,) I).f(t> dt

1

lb+) +I (2,,3f,,,) (+(I

t-

Q;(t)

I).$>,

0.

we can apply the dominated convergence theorem to con-


$$d(\bar{x}) = \int_\sigma^\tau \rho(|t - Q_{\bar{x}}(t)|)\, f(t)\, dt \le \limsup_{n \to \infty} \int_\sigma^\tau \rho(|t - Q_{x(n)}(t)|)\, f(t)\, dt.$$

Remark: An examination of the preceding proofs will show that Lemma 2 is valid for all pairs $(\rho, f)$, where $\rho$ is as has been previously specified (convex, strictly increasing, continuously differentiable) and $f: (\sigma, \tau) \to (0, \infty)$ is any probability density for which (1) holds. Thus the density $f$ need not be log-concave for there to exist at least one absolutely optimal quantizer; however, we will need to assume $f$ is log-concave to show uniqueness.

IV. STATEMENT OF MAIN RESULT

Let $T: O_N \to O_N$ be the map such that $T(x_1, \ldots, x_{N-1}) = (u_1, \ldots, u_{N-1})$, where

$$u_i = \tfrac{1}{2}\,[\,g(x_{i-1}, x_i) + g(x_i, x_{i+1})\,], \qquad i = 1, \ldots, N-1.$$

V. PROOFS

Recall from Section II that G is the real-valued function of three variables given by

The right side of the preceding inequality is upper-bounded by $\inf\{d(x): x \in O_N\}$, and so by 4), $\bar{x}$ must be in $O_N$, completing the proof.

. .,x,-,)

i = l;..,N-

1.

The following lemma shows that in searching for locally optimal $N$-level quantizers, we may restrict our search to quantizers of type $Q_x$, where $x \in O_N$ is a fixed point of $T$ (i.e., $Tx = x$).

Lemma 3: Let $Q = Q_{x,y}$ be a locally optimal $N$-level quantizer. Then $Q = Q_x$ and $Tx = x$.

$$G(\alpha, \beta, u) = \int_\alpha^\beta H(u - t)\, \rho'(|t - u|)\, f(t)\, dt, \qquad u \in R,$$
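As a hedged illustration of how the function $G$ determines the quantization level $g(\alpha, \beta)$, the following Python sketch finds the zero of $G(\alpha, \beta, \cdot)$ by bisection. It assumes that $H$ is the sign function (so that $G$ is the derivative in $u$ of $\int_\alpha^\beta \rho(|t-u|) f(t)\,dt$), that $\alpha$ and $\beta$ are finite, and it uses a composite Simpson rule for the integrals. The helper names `simpson`, `G`, `g` and `normal_pdf` are illustrative and do not come from the paper.

```python
import math

def simpson(func, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

def G(alpha, beta, u, rho_prime, f):
    """G(alpha, beta, u) with H taken to be the sign function (assumption).

    The integral is split at t = u so that each piece has a smooth integrand.
    """
    left = simpson(lambda t: rho_prime(u - t) * f(t), alpha, u)
    right = simpson(lambda t: rho_prime(t - u) * f(t), u, beta)
    return left - right

def g(alpha, beta, rho_prime, f, iters=60):
    """Zero of u -> G(alpha, beta, u) on [alpha, beta], found by bisection.

    G is nondecreasing in u because rho is convex, negative at u = alpha
    and positive at u = beta, so bisection applies.
    """
    lo, hi = alpha, beta
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if G(alpha, beta, mid, rho_prime, f) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def normal_pdf(t):
    """Standard normal density, a log-concave example."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

# rho(t) = t^2 has rho'(s) = 2s; the level is then the conditional mean.
level_squared = g(-1.0, 2.0, lambda s: 2.0 * s, normal_pdf)
# rho(t) = t has rho'(s) = 1; the level is then the conditional median.
level_absolute = g(-1.0, 2.0, lambda s: 1.0, normal_pdf)
```

For $\rho(t) = t^2$ the computed level should agree with the centroid of the density on $(\alpha, \beta)$, and for $\rho(t) = t$ with the median, which gives a simple sanity check on the sign convention assumed for $H$.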

~-(a [P(lXi-Al)

Xi) Y,) = 0, -

P(lxi-X+~

(4)

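$$G(\alpha, \beta, u) = f(\alpha)\,\rho(u - \alpha) - f(\beta)\,\rho(\beta - u) + \int_\alpha^\beta \rho(|t - u|)\, f'(t)\, dt, \qquad \sigma \le \alpha < u < \beta \le \tau. \tag{5}$$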
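The integration-by-parts form (5) can be checked numerically against the defining integral of $G$. The sketch below does this for one illustrative case only: the standard normal density, $\rho(t) = t^2$, finite $\alpha < u < \beta$, and $H$ taken to be the sign function (an assumption about notation defined in a part of the paper not reproduced here). The function names are hypothetical.

```python
import math

def f(t):
    """Standard normal density (log-concave)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def f_prime(t):
    """Derivative of the standard normal density."""
    return -t * f(t)

def rho(s):
    """Illustrative error weighting function rho(s) = s^2."""
    return s * s

def rho_prime(s):
    return 2.0 * s

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

def G_integral(alpha, beta, u):
    """Defining form: integral of sign(u - t) * rho'(|t - u|) * f(t)."""
    left = simpson(lambda t: rho_prime(u - t) * f(t), alpha, u)
    right = simpson(lambda t: rho_prime(t - u) * f(t), u, beta)
    return left - right

def G_by_parts(alpha, beta, u):
    """Form (5): boundary terms plus integral of rho(|t - u|) * f'(t)."""
    boundary = f(alpha) * rho(u - alpha) - f(beta) * rho(beta - u)
    interior = simpson(lambda t: rho(abs(t - u)) * f_prime(t), alpha, beta)
    return boundary + interior

difference = abs(G_integral(-1.0, 2.0, 0.3) - G_by_parts(-1.0, 2.0, 0.3))
```

The two values should agree up to quadrature error, since the $\rho(0) f(u)$ terms produced at $t = u$ by the two partial integrations cancel.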

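(In the above expression, $f'$ denotes the almost everywhere defined derivative of $f$, which exists since $f$ is log-concave and therefore absolutely continuous [5, theorem A]. By $f(\sigma)\rho(u - \sigma)$ we mean zero if $\sigma = -\infty$ and $[\lim_{t \to \sigma+} f(t)]\,\rho(u - \sigma)$ if $\sigma > -\infty$; by $f(\tau)\rho(\tau - u)$ we mean zero if $\tau = \infty$ and $[\lim_{t \to \tau-} f(t)]\,\rho(\tau - u)$ if $\tau < \infty$.) Let $G_1$, $G_2$, $G_3$ denote the following partial derivative functions:

$$G_1(\alpha, \beta, u) = \lim_{t \to 0} t^{-1}[G(\alpha + t, \beta, u) - G(\alpha, \beta, u)], \qquad \sigma < \alpha < u < \beta \le \tau;$$

$$G_2(\alpha, \beta, u) = \lim_{t \to 0} t^{-1}[G(\alpha, \beta + t, u) - G(\alpha, \beta, u)], \qquad \sigma \le \alpha < u < \beta < \tau;$$

$$G_3(\alpha, \beta, u) = \lim_{t \to 0} t^{-1}[G(\alpha, \beta, u + t) - G(\alpha, \beta, u)], \qquad \sigma \le \alpha < u < \beta \le \tau.$$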

These partial derivative functions exist on the indicated domains and are given by

i = 1; . .,N;

I>lf(x,)=O, i = l;..,N

dt,

Integrating by parts and using Lemma A2 of the Appendix, we have the following alternate expression for G:

Proof: Taking the partial derivatives with respect to the coordinates of $x$ and $y$ (which exist by the fundamental theorem of calculus and Lemma A1 of the Appendix) and setting them equal to zero, we obtain

ii>


of using the iterates of $T$ in this way to find the optimal quantizer is called Lloyd's Method I [4].
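A minimal sketch of Lloyd's Method I as a fixed-point iteration of the map $T$ follows. It is written for one illustrative case only, not for the paper's general setting: the standard normal density on $(\sigma, \tau) = (-\infty, \infty)$ with $\rho(t) = t^2$, for which $g(a, b)$ is the conditional mean on $(a, b)$ and has a closed form. The function names, the stopping tolerance and the iteration cap are assumptions made for the example.

```python
import math

def phi(t):
    """Standard normal density; evaluates to 0.0 at +/- infinity."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def Phi(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def centroid(a, b):
    """g(a, b) for rho(t) = t^2: conditional mean of N(0, 1) on (a, b)."""
    return (phi(a) - phi(b)) / (Phi(b) - Phi(a))

def lloyd_step(x):
    """One application of T: each threshold moves to the midpoint of the
    two neighbouring quantization levels."""
    ext = [-math.inf] + list(x) + [math.inf]
    levels = [centroid(ext[i], ext[i + 1]) for i in range(len(ext) - 1)]
    return [0.5 * (levels[i] + levels[i + 1]) for i in range(len(x))]

def lloyd_method_1(x0, tol=1e-12, max_iter=10000):
    """Iterate T from the threshold vector x0 until successive iterates
    differ by less than tol in the max norm (or max_iter is reached)."""
    x = list(x0)
    for _ in range(max_iter):
        nxt = lloyd_step(x)
        if max(abs(p - q) for p, q in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x

# Four-level quantizer, started from a deliberately asymmetric point.
thresholds = lloyd_method_1([-2.0, 0.5, 3.0])
```

Because the theorem asserts a unique fixed point, the iteration started from any increasing threshold vector should reach the same symmetric thresholds; for four levels these are approximately $-0.98$, $0$ and $0.98$.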

elude that

i>


$$G_1(\alpha, \beta, u) = -\rho'(u - \alpha)\, f(\alpha),$$

- 1.

From i), we obtain (using 5)) $y_i = g(x_{i-1}, x_i)$ and so $Q = Q_x$. From ii), $(y_i + y_{i+1})/2 = x_i$, whence $Tx = x$.

We quote now our main result, to be proved in Section V.

Theorem: Under the given assumptions on $\rho$ and $f$ ($\rho: [0, \infty) \to [0, \infty)$ convex, strictly increasing and continuously differentiable and $f: (\sigma, \tau) \to (0, \infty)$ a log-concave probability density for which (1) holds), $T$ has a unique fixed point $x^*$. Furthermore, for every $x \in O_N$, $\lim_{n \to \infty} T^n x = x^*$.

As a corollary, we see from Lemmas 2 and 3 that if $x^*$ is the unique fixed point of $T$, then $Q_{x^*}$ is the unique $N$-level locally optimal quantizer, which is absolutely optimal. Thus we have the optimal quantizer $Q_{x^*}$ once we have found $x^*$; $x^*$ can be found as the limit of the sequence $\{T^n x_0\}_{n=1}^{\infty}$, where the initial value $x_0$ is any point in $O_N$. The method

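$\sigma < \alpha < u < \beta \le \tau$;

$$G_2(\alpha, \beta, u) = -\rho'(\beta - u)\, f(\beta), \qquad \sigma \le \alpha < u < \beta < \tau;$$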

- t)p’(( t - u I)f’(t) u+Y 0, if u < (Y < /I 5 r ora~a

dt = /kt g =- ; f%(l

- df(t)

dt

5.4)

dt - /‘d(f g

- s)f’(t)

dt,

+%(a,

5.5)

t -

g I>fW

-f’W fWl

dtds

dt

The functions $g_1$ and $g_2$ are positive on their respective domains, and are continuous on $\{(\alpha, \beta): \sigma < \alpha < \beta < \tau\}$. Also, $g_1$ is continuous on $\{(\alpha, \tau): \sigma < \alpha < \tau\}$, $g_2$ is continuous on $\{(\sigma, \beta): \sigma < \beta < \tau\}$, and

5.3)

P>) + %(a, P> gh

G,h

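$g_1(\alpha, \beta) + g_2(\alpha, \beta) \le 1$, if $\sigma < \alpha < \beta < \tau$, with equality only if $\log f$ is affine on $(\alpha, \beta)$;

$g_2(\sigma, \beta) \le 1$, if $\sigma < \beta < \tau$, with equality only if $\sigma = -\infty$ and $\log f$ is affine on $(\sigma, \beta)$;

$g_1(\alpha, \tau) \le 1$, if $\sigma < \alpha < \tau$, with equality only if $\tau = \infty$ and $\log f$ is affine on $(\alpha, \tau)$.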
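The first of these inequalities can be illustrated numerically. The sketch below reads $g_1$ and $g_2$ as the partial derivatives of $g$ with respect to its first and second arguments (an assumption, since their definition is in a part of the paper not reproduced here) and estimates them by central differences for $\rho(t) = t^2$. Two densities are compared: the exponential density $e^{-t}$ on $(0, \infty)$, whose logarithm is affine, and the standard normal density, whose logarithm is not. All function names are illustrative.

```python
import math

def centroid_normal(a, b):
    """Conditional mean of N(0, 1) on (a, b), i.e. g(a, b) for rho(t) = t^2."""
    pdf = lambda t: math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    cdf = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return (pdf(a) - pdf(b)) / (cdf(b) - cdf(a))

def centroid_exponential(a, b):
    """Conditional mean on (a, b) of the density exp(-t), 0 < a < b."""
    num = (a + 1.0) * math.exp(-a) - (b + 1.0) * math.exp(-b)
    den = math.exp(-a) - math.exp(-b)
    return num / den

def partials(g, a, b, h=1e-5):
    """Central-difference estimates of (g_1, g_2) at the point (a, b)."""
    g1 = (g(a + h, b) - g(a - h, b)) / (2.0 * h)
    g2 = (g(a, b + h) - g(a, b - h)) / (2.0 * h)
    return g1, g2

exp_g1, exp_g2 = partials(centroid_exponential, 0.5, 2.0)
nor_g1, nor_g2 = partials(centroid_normal, -0.5, 1.0)
exp_sum = exp_g1 + exp_g2   # log f affine: expected to equal 1
nor_sum = nor_g1 + nor_g2   # log f strictly concave: expected below 1
```

For the exponential density the centroid simply shifts with the interval, so the two partial derivatives sum to one, which matches the equality case; for the normal density the sum falls strictly below one.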

Proof: Since g is the unique function satisfying G(cw,/3, g(a, /3)) = 0 (u 5 a < p 5 r), G, is continuous on {(% P, u>: u : u P>> P> - ida,

P>>y

P, da> P>) + G,b+ P> do, P>>

= G,h P, .i(u> P>>(l - g,h o

The integral in a) is /h;o’(g - t)f’(t) a

% (a,P,da, P>>

G,(a, 7, da,

7)) + $(a,

= G,(a, 7, g(a, r))(l a>

- gl(a, r>>,

By (6) the left side of 5.4) is equal to the integral in a) of Lemma 4, and so 5.1) follows. Again applying a) of Lemma 4, if the left side of 5.5) vanishes then $\log f$ is affine on $(\sigma, \beta)$ and $f(\sigma)\rho'(g(\sigma, \beta) - \sigma) = 0$ (which can only happen if $\sigma = -\infty$), and if the left side of 5.6) vanishes then $\log f$ is affine on $(\alpha, \tau)$ and $f(\tau)\rho'(\tau - g(\alpha, \tau)) = 0$ (which only happens if $\tau = \infty$). Hence 5.2) and 5.3) hold.

The following lemma is an easy consequence of Lemma 5, using the expressions for the elements of $DT(x)$ (the $(N-1) \times (N-1)$ derivative matrix of $T$ at $x \in O_N$) in terms of $g_1$ and $g_2$ given in [3].

Lemma 6: The map $T: O_N \to O_N$ is continuously differentiable. Each $DT(x)$ ($x \in O_N$) is a nonnegative, irreducible matrix such that each row sum is less than or equal to one, the row sums of $DT(x)$ all being equal to one only if $(\sigma, \tau) = (-\infty, \infty)$ and $\log f$ is affine on each subinterval $(x_{i-1}, x_i)$, $i = 1, \ldots, N$.

If $x = (x_1, \ldots, x_{N-1})$ is a vector of real numbers let $\|x\|$ denote $\max_i |x_i|$. Applying Lemma 6 and the mean value theorem, we have

$$\|Tx - Ty\| \le \|x - y\|, \qquad x, y \in O_N. \tag{8}$$
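The properties of $DT(x)$ stated in Lemma 6 can be inspected numerically in a simple case. The following sketch, which assumes the standard normal density on $(-\infty, \infty)$ and $\rho(t) = t^2$ (so that $g$ is the conditional mean), estimates the derivative matrix of $T$ by central differences and collects its row sums. The function names and the step size are illustrative choices.

```python
import math

def phi(t):
    """Standard normal density; evaluates to 0.0 at +/- infinity."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def Phi(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def centroid(a, b):
    """g(a, b) for rho(t) = t^2: conditional mean of N(0, 1) on (a, b)."""
    return (phi(a) - phi(b)) / (Phi(b) - Phi(a))

def lloyd_step(x):
    """The map T: thresholds move to midpoints of neighbouring levels."""
    ext = [-math.inf] + list(x) + [math.inf]
    levels = [centroid(ext[i], ext[i + 1]) for i in range(len(ext) - 1)]
    return [0.5 * (levels[i] + levels[i + 1]) for i in range(len(x))]

def jacobian(x, h=1e-6):
    """Central-difference estimate of the derivative matrix DT(x)."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        tp, tm = lloyd_step(xp), lloyd_step(xm)
        for i in range(n):
            J[i][j] = (tp[i] - tm[i]) / (2.0 * h)
    return J

J = jacobian([-1.0, 0.0, 1.0])
row_sums = [sum(row) for row in J]
```

In this example the estimated matrix is tridiagonal with nonnegative entries, and every row sum is strictly below one, consistent with the lemma because the logarithm of the normal density is not affine on any interval.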

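Let $\bar{O}_N$ be the set of all vectors $(x_1, \ldots, x_{N-1})$ with real components such that $\sigma \le x_1 \le \cdots \le x_{N-1} \le \tau$. Let $\bar{T}: \bar{O}_N \to \bar{O}_N$ be the map such that $\bar{T}(x_1, \ldots, x_{N-1}) = (u_1, \ldots, u_{N-1})$, where

$$u_i = \tfrac{1}{2}\,[\,\bar{g}(x_{i-1}, x_i) + \bar{g}(x_i, x_{i+1})\,], \qquad i = 1, \ldots, N-1.$$

Then $\bar{T}$ extends $T$, and since $\bar{g}$ is continuous, $\bar{T}$ is continuous. Also, as remarked in [3],

$$\bar{T}^{N-1}(\bar{O}_N) \subset O_N. \tag{9}$$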

Lemma 7: For each $x \in O_N$, $\{T^n x\}$ converges to a fixed point of $T$.


Proof: Let $d$ be the function defined in the proof of Lemma 2. Then, as remarked in [3], for each $x \in O_N$, $d(Tx) \le d(x)$ with equality only if $Tx = x$. Fix $x \in O_N$. If $y^* \in O_N$ is a limit point of $\{T^n x\}$ then $y^*$ is a fixed point of $T$. (For, if not, then $d(Ty^*) < d(y^*)$ and then for some $\varepsilon > 0$, $d(T^n x) \ge d(T^{n+1} x) + \varepsilon$ for infinitely many $n$, which is impossible since $\{d(T^n x)\}$ is a decreasing nonnegative sequence.) But by (8), if $y^*$ is a fixed point of $T$ and a limit point of $\{T^n x\}$, then $T^n x \to y^*$. So all we have to do is show that $\{T^n x\}$ has a limit point in $O_N$. Let $x^*$ be a fixed point of $T$ (which exists by Lemmas 2 and 3). Then $\|T^n x - x^*\| = \|T^n x - T^n x^*\| \le \|x - x^*\|$ for all $n$. Therefore $\{T^n x\}$ is a bounded subset of $R^{N-1}$, and so must have a limit point $\bar{y}$ in $\bar{O}_N$. By (9), $\bar{T}^{N-1}\bar{y} = y^*$ is in $O_N$, and by continuity of $\bar{T}$ on $\bar{O}_N$, $y^*$ is a limit point of $\{T^n x\}$ since $\bar{y}$ is.

Because of these last two lemmas, hypotheses a)-c) of Lemma A3 of the Appendix hold and so $T$ has a unique fixed point.

$$\int_\sigma^\tau \rho(|t - u|)\, |g(t)|\, dt < \infty, \qquad u \in R.$$

UER,

is continuously differentiable, and

$$F'(u) = \int_\sigma^\tau H(u - t)\, \rho'(|t - u|)\, g(t)\, dt, \qquad u \in R.$$

Proof: Fix $u \in R$. Let $u_n \to u$ ($u_n \ne u$). Then for almost every $t \in (\sigma, \tau)$,

$$\frac{\rho(|t - u_n|) - \rho(|t - u|)}{u_n - u} \to H(u - t)\, \rho'(|u - t|).$$

The reader may easily complete the proof using the dominated convergence theorem.

Lemma A2: a) $\int_\sigma^\tau \rho(|t - u|)\, |f'(t)|\, dt < \infty$, $u \in R$; b) For every $u \in R$, $\lim_{t \to \sigma} \rho(|t - u|) f(t) = 0$ if $\sigma = -\infty$, and $\lim_{t \to \tau} \rho(|t - u|) f(t) = 0$ if $\tau = \infty$.

u + l>f(t>

dt 2 &(t, n

- u)f(t,),

A similar argument

. . applies ,f u = -00 and limsup,,,p(( t - u I)f(t)

> 0. Therefore

b) holds. Let g be a function on {(cu, /3): u 5 LY< p I T} such that a +P(l t + P - 1 I) + P(l t + P + 1 I).

Proof:

t - #I> If'(t) I dt 2 2Q,

whence p( t, - u)f( t,) + 0, a contradiction.

Set p = sup ( u, ( . Then, for every n and t, n

- u I)f(t)


of $T_N(\alpha, \beta)$. b) If $\sigma \le \alpha < \beta \le \tau$ and $N \ge 2$, $T_N(\alpha, \beta)$ is continuously differentiable and for each $x \in O_N(\alpha, \beta)$, the $(N-1) \times (N-1)$ derivative matrix at $x$, $[DT_N(\alpha, \beta)](x)$, is nonnegative, irreducible, and has each row sum less than or equal to one, with each row sum being equal to one only if $(\alpha, \beta) = (\sigma, \tau)$. c) There exists $u \in (\sigma, \tau)$ such that $u \in \{x_1, \ldots, x_{N-1}\}$ for every $x \in O_N(\sigma, \tau)$ for which $[DT_N(\sigma, \tau)](x)$ has all of its row sums equal to one. Then $T_N(\sigma, \tau)$ has a unique fixed point, $N \ge 2$.

7 s0


where the quantity $Q$ is the right side of (10). Hence, a) of Lemma A2 will follow once we have proved b) of that same lemma. Suppose $\tau = \infty$ and $\limsup_{t \to \infty} \rho(|t - u|) f(t) > 0$. Pick a real $C > u - 1$ so that $f$ is decreasing on $[C, \infty)$. Pick $t_n \to \infty$ so that $\{[t_n - 1, t_n]\}$ are disjoint subintervals of $[C, \infty)$, and $\{\rho(t_n - u) f(t_n)\}$ is bounded away from zero. Then,

Lemma

t - u I)g(t)


where x,, = LY,xN = p.

Then,

F

jkl 0

x E O,(a, p), then {?“;(a, p)(x)}?=,

= [‘P(I


if $\sigma < c_1 < c_2 < \tau$. Since $(\sigma, \tau)$ can be partitioned into two intervals on one of which $f' \ge 0$ holds and on the other of which $f' \le 0$ holds, we must have

u, = ;Ig(+,,

Define F: R --f R by F(u)


intervals

Let g: (u, r) + R satisfy

Al:


that T,v(~,~)(x,,~~~,xN-,)=(~,,~~~,uN-,),

APPENDIX Lemma


sup p(I t - uI>f(t) O-=Lff(t> -0

dt,

(10)

[1] R. G. Bartle, The Elements of Real Analysis. New York: John Wiley, 1964.

[2] R. M. Gray, J. C. Kieffer, and Y. Linde, "Locally optimal block quantizer design," Inform. Contr., vol. 45, pp. 178-198, May 1980.

[3] J. C. Kieffer, "Exponential rate of convergence for Lloyd's Method I," IEEE Trans. Inform. Theory (Special issue on quantization), vol. IT-28, pp. 205-210, March 1982.

[4] S. P. Lloyd, "Least squares quantization in PCM," Bell Telephone Labs Memorandum, Murray Hill, NJ, 1957; reprinted in IEEE Trans. Inform. Theory (Special issue on quantization), vol. IT-28, pp. 129-137, March 1982.

[5] A. W. Roberts and D. E. Varberg, Convex Functions. New York: Academic, 1973.

[6] A. V. Trushkin, "Sufficient conditions for uniqueness of a locally optimal quantizer for a class of convex error weighting functions," IEEE Trans. Inform. Theory (Special issue on quantization), vol. IT-28, pp. 187-198, March 1982.

Comparative Performance of Quantum Signals in Unimodal and Bimodal Binary Optical Communications

CARL W. HELSTROM,

Abstract-The performance of two-photon coherent-state (TCS) signals and integral-quantum signals in unimodal and bimodal free-space optical communications is compared with that of ordinary coherent signals under the constraint of fixed average error probability. For the unimodal channel of known phase the minimum attainable error probability for both on-off and antipodal TCS signals received in thermal noise is calculated by applying perturbation theory to the detection operator equation. For unimodal antipodal TCS signaling the threshold receiver is also considered. For channels of random phase the optimum photon-counting receiver is analyzed. In all cases the advantage of the nonclassical signal over an ordinary coherent signal vanishes as the transmittance of the channel goes to zero.

I. INTRODUCTION

A CLASSICAL signal is defined in quantum communication theory as one generated by a transmitter aperture field in a classical state, that is, in a state with a nonnegative P-representation. The elemental classical signal is the ordinary coherent signal, which arises from a transmitting field in a pure coherent state of the kind elaborately analyzed by Glauber [1]. A pulse emitted by a laser operating well above threshold closely resembles this ideal classical signal. Signals arising from transmitter fields in nonclassical states are called quantum signals; of particular interest are those generated by states without well-behaved P-representations. Although such quantum signals have not yet been

Manuscript received May 20, 1980; revised February 26, 1982. This material is based on research supported by the National Science Foundation under Grant ENG 77-04500. This paper was partially presented at the 1981 IEEE International Symposium on Information Theory, Santa Monica, February 9- 12, 198 1. The author is with the Department of Electrical Engineering and Computer Sciences, University of California, San Diego, La Jolla, CA 92093.

FELLOW, IEEE

created, not even in the laboratory, their potential advantages over classical signals in communications ought to be investigated. Quantum signals arising from two-photon coherent states, the TCS signals, have been extensively studied by Yuen and Shapiro [2]-[4]. Mandel [5] proposed communicating with signals generated by transmitter states having fixed numbers of photons; we shall call them integral-quantum (IQ) signals. We shall compare the performance of these kinds of quantum signals with that of ordinary coherent signals under various conditions. Our results may assist in judging the potential usefulness of quantum signals in optical communications.

Both unimodal and bimodal binary free-space communications will be considered. In a unimodal system a single mode of the electromagnetic field in the aperture of the transmitter is excited, and a single mode of the aperture field of the receiver responds. The degree of coupling between transmitting and receiving modes is measured by the amplitude transmittance $\kappa$, so defined that if the transmitting mode contains a classical field with a complex amplitude $\alpha_0$, the complex amplitude of the receiving mode is $\kappa\alpha_0$. The energy transmittance $|\kappa|^2$ is approximately

$$|\kappa|^2 \approx A_T A_R / (\lambda R)^2,$$

with $A_T$ and $A_R$ the areas of the transmitter and receiver apertures, respectively, $\lambda$ the wavelength of the electromagnetic radiation, and $R$ the distance from transmitter to receiver. The condition

$$R \gg (A_T A_R)^{1/2} / \lambda \quad \text{or} \quad |\kappa| \ll 1,$$

is required in order that only a single mode of the receiving aperture be significantly excited by the transmitted field;

0018-9448/83/0100-0047$01.00 ©1983 IEEE
