
Capacity Bounds for Discrete-Time, Amplitude-Constrained, Additive White Gaussian Noise Channels

arXiv:1511.08742v1 [cs.IT] 27 Nov 2015

Andrew Thangaraj, Gerhard Kramer and Georg Böcherer

Abstract—The capacity-achieving input distribution of the discrete-time, additive white Gaussian noise (AWGN) channel with an amplitude constraint is discrete and seems difficult to characterize explicitly. A dual capacity expression is used to derive analytic capacity upper bounds for scalar and vector AWGN channels. The scalar bound improves on McKellips' bound and is within 0.1 bits of capacity for all signal-to-noise ratios (SNRs). The two-dimensional bound is within 0.15 bits of capacity provably up to 4.5 dB, and numerical evidence suggests a similar gap for all SNRs.

Index Terms—additive white Gaussian noise channel, amplitude constraint, capacity

I. INTRODUCTION

The most commonly-studied channel model for communications is the additive white Gaussian noise (AWGN) channel. The AWGN model is interesting only with constraints on the channel input or output. Depending on the application, one is interested in limiting, e.g., the average input (or output) variance or the input amplitude. Input (or output) variance constraints result in elegant analytic capacity expressions such as Shannon's (1/2) log(1 + SNR) formula. The amplitude constraint seems less tractable, and typical analyses use Smith's methods [1] to show that the capacity-achieving input distribution has discrete amplitudes, see [2], [3], [4], [5] and references therein. A recent line of work studies the peak-to-average power ratio (PAPR) of good codes [6]. An alternative approach is by McKellips [7], who develops analytic and tight capacity upper bounds by bounding the channel output entropy. We instead use the dual capacity expression in [8, p. 128] (see also [9, Eqn. (7)]) and study both scalar and vector channels. The dual approach was also used in [10] for scalar AWGN channels with non-negative channel inputs. Our models and results differ from those in [10]: we do not impose a non-negativity constraint (this difference turns out to be minor for the scalar case), we study vector channels that include the important practical case of two-dimensional (complex) AWGN channels, and we develop certain bounds

A. Thangaraj is with the Department of Electrical Engineering, Indian Institute of Technology Madras, Chennai 600036, India, Email: [email protected]. G. Kramer and G. Böcherer are with the Department of Electrical and Computer Engineering, Technical University of Munich, Arcisstraße 21, 80333 München, Germany, Email: [email protected], [email protected]. This paper was presented in part at the IEEE International Symposium on Information Theory (ISIT) 2015, Hong Kong.

in more detail. Our bounds are within 0.15 bits of capacity provably up to 4.5 dB, and numerically for all SNRs, for two-dimensional (complex) AWGN channels.

This paper is organized as follows. Section II presents functions, integrals, and bounds that we need later. Sections III-V develop the one-, two-, and n-dimensional bounds, as well as two refinements. The appendices contain technical proofs.

II. PRELIMINARIES

Consider the following functions:

  (a) ψ(x) = (1/√(2π)) e^{−x²/2}
  (b) Q(x) = ∫_x^∞ ψ(z) dz
  (c) I_0(x) = (1/π) ∫_0^π e^{x cos φ} dφ
  (d) Q_1(a, b) = ∫_b^∞ z e^{−(z²+a²)/2} I_0(az) dz
  (e) D(p‖q) = ∫ p(z) log( p(z)/q(z) ) dz

where (a) is the Gaussian density, (b) is the Q-function, (c) is the modified Bessel function of the first kind of integer order 0, (d) is the Marcum Q-function, and (e) is the informational divergence between the densities p and q. Logarithms to the base e and base 2 are denoted as log and log₂, respectively. A few useful properties are Q_1(a, 0) = 1 and the bounds

  (x/(1+x²)) ψ(x) < Q(x) < (1/x) ψ(x)    (1)

for x > 0 (the left inequality holds for x = 0 as well). We also consider the integrals:

  ∫_x^∞ y ψ(y) dy = −ψ(y)|_x^∞ = ψ(x)    (2)
  ∫_x^∞ y² ψ(y) dy = ∫_x^∞ y (−dψ(y)) = x ψ(x) + Q(x)    (3)
  ∫_x^∞ y³ ψ(y) dy = ∫_x^∞ y² (−dψ(y)) = (x² + 2) ψ(x).    (4)
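The functions (a)-(d) above are all available in standard numerical libraries. The following sketch (our own illustration, not from the paper; function names are ours) evaluates them with SciPy and checks the property Q_1(a, 0) = 1 and the sandwich bounds (1):

```python
import numpy as np
from scipy import integrate, special, stats

def psi(x):
    # (a) standard Gaussian density
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def Q(x):
    # (b) Gaussian Q-function, Q(x) = P(N(0,1) > x)
    return stats.norm.sf(x)

def marcum_q1(a, b):
    # (d) Marcum Q-function Q_1(a, b), via the noncentral chi-square
    # survival function: Q_1(a, b) = P(chi2 with 2 dof, nc a^2 > b^2)
    return stats.ncx2.sf(b**2, 2, a**2)

a = 1.7
# Property Q_1(a, 0) = 1
assert abs(marcum_q1(a, 0.0) - 1.0) < 1e-12
# Sandwich bounds (1) at x = 1
x = 1.0
assert x / (1 + x**2) * psi(x) < Q(x) < psi(x) / x

# Cross-check Q_1 against direct numerical integration of (d); the
# exponentially scaled Bessel function i0e avoids overflow, since
# z exp(-(z^2+a^2)/2) I_0(az) = z exp(-(z-a)^2/2) i0e(az) for az >= 0.
direct, _ = integrate.quad(
    lambda z: z * np.exp(-(z - a)**2 / 2) * special.i0e(a * z), 0.8, np.inf)
assert abs(direct - marcum_q1(a, 0.8)) < 1e-7
```

The noncentral chi-square route is a standard identity for the Marcum Q-function and sidesteps hand-rolled Bessel integration.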

For sequences f(n) and g(n), the standard big-O notation f(n) = O(g(n)) denotes that |f(n)| ≤ c|g(n)| for a constant c and sufficiently large n [11]. Finally, a useful upper bound on the capacity C of a memoryless channel p_{Y|X}(·) is based on the dual capacity expression [8, p. 128], [9, Eqn. (7)]. The bound is

  C ≤ max_{x∈S} D( p_{Y|X}(·|x) ‖ q_Y(·) )    (5)

where q_Y(·) is any choice of "test" density and S is the set of permitted x.

III. REAL AWGN CHANNEL

Consider the real-alphabet AWGN channel

  Y = X + Z    (6)

where Z is a Gaussian random variable with mean 0 and variance 1. The channel density is

  p_{Y|X}(y|x) = ψ(y − x).    (7)

Consider the amplitude constraint |X| ≤ A where A > 0. We choose a family of test densities

  q_Y(y) = { β/(2A),                          |y| ≤ A
           { ((1−β)/√(2π)) e^{−(|y|−A)²/2},  |y| > A    (8)

where β ∈ [0, 1] is a parameter to be optimized. The test density is illustrated in Fig. 1. It is a mixture of two distributions: a uniform distribution in the interval |y| ≤ A and a "split and scaled" Gaussian density for |y| > A. The parameter β specifies the mixing proportion.

Fig. 1. Test densities for the real AWGN channel: height β/(2A) on |y| ≤ A and tails ((1−β)/√(2π)) e^{−(|y|−A)²/2} for |y| > A.

Inserting (8) into the divergence in (5), we have

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) = ∫_{−∞}^∞ p_{Y|X}(y|x) log( p_{Y|X}(y|x)/q_Y(y) ) dy
  = −(1/2) log(2πe) − log(β/(2A)) [1 − Q(A+x) − Q(A−x)] − log( (1−β)/√(2π) ) [Q(A+x) + Q(A−x)]
    + (1/2){ [(A+x)² + 1] Q(A+x) − (A+x) ψ(A+x) + [(A−x)² + 1] Q(A−x) − (A−x) ψ(A−x) }
  = log( 2A/(β√(2πe)) ) + log( β√(2πe)/((1−β)2A) ) [Q(A−x) + Q(A+x)] + (1/2) [g(A−x) + g(A+x)]    (9)

where g(u) ≜ u² Q(u) − u ψ(u).

A. McKellips' bound

Using (1), we readily see that g(u) ≤ 0 for u > 0. By symmetry, we may restrict attention to 0 ≤ x ≤ A so that g(A−x) + g(A+x) ≤ 0. Using (9), we thus have

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) ≤ log( 2A/(β√(2πe)) ) + log( β√(2πe)/((1−β)2A) ) [Q(A−x) + Q(A+x)].    (10)

To recover McKellips' bound [7], we choose

  β = 2A/(√(2πe) + 2A)    (11)

to make the second term in (10) equal to zero, and obtain

  C ≤ log( 1 + 2A/√(2πe) ).    (12)

We now combine (12) with the capacity under the (weaker) average power constraint E[X²] ≤ A². The noise power is 1, so the signal-to-noise ratio (SNR) is P = A². We thus arrive at McKellips' bound in [7]:

  C ≤ min{ log( 1 + √(2P/(πe)) ),  (1/2) log(1 + P) }.    (13)

Observe that the high-SNR power loss is 10 log₁₀(πe/2) ≈ 6.30 dB. However, this comparison is based on equating the maximum power P with the average power. If we instead use the uniform distribution for X, then the average power is P/3 and the high-SNR power loss reduces to the high-SNR shaping loss of 10 log₁₀(πe/6) ≈ 1.53 dB (see [12]).

B. Refined Bound

McKellips' bound seems tight for high SNR, roughly above 6 dB. For low SNR, below 0 dB, the average-constraint bound (1/2) log(1 + P) is tight. For the intermediate range between 0 and 6 dB, we derive a better bound next. Consider β for which log( β√(2πe)/((1−β)2A) ) is positive, i.e., consider the range

  β ≥ 2A/(√(2πe) + 2A).    (14)

Observe that Q(A−x) + Q(A+x) increases with x if x ∈ [0, A], since the derivative evaluates to ψ(A−x) − ψ(A+x), which is positive for x ∈ [0, A]. Thus, the RHS of (10) is maximized by setting x = A, and we obtain the bound

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) ≤ log( 2A/(β√(2πe)) ) + log( β√(2πe)/((1−β)2A) ) [1/2 + Q(2A)]
  = (1/2 − Q(2A)) log( 2A/√(2πe) ) − (1/2 − Q(2A)) log β − (1/2 + Q(2A)) log(1−β).    (15)

Setting β = 1/2 − Q(2A) minimizes the RHS of (15). However, from (14) this choice of β is valid only if

  1/2 − Q(2A) ≥ 2A/(√(2πe) + 2A)    (16)

which is equivalent to A ≤ 2.0662 ≈ √(πe/2). Therefore, using P = A², we have the bound

  C ≤ β(P) log √(2P/(πe)) + H_e(β(P)),   P ≤ 6.303 dB    (17)

where β(P) = 1/2 − Q(2√P) and H_e(x) = −x log(x) − (1−x) log(1−x) is the binary entropy function with the units of nats. The bounds are plotted in Fig. 2, where the lower bound is taken from [7] with optimized input distribution. The refined bound is best for SNR from 0 to 5 dB but is valid only for SNR below 6.3 dB.

Fig. 2. Capacity bounds for scalar AWGN channels: lower bound, (1/2) log₂(1+SNR), McKellips bound, and refined bound. The rate units are bits per channel use.

IV. COMPLEX AWGN CHANNEL

Consider next the complex-alphabet AWGN channel

  Y = X + Z    (18)

where Z = Z_R + jZ_I, j = √(−1), and Z_R, Z_I are independent Gaussian random variables with mean 0 and variance 1. The channel density in Cartesian coordinates with x = [x_R, x_I] and y = [y_R, y_I] is

  p_{Y|X}(y|x) = (1/(2π)) e^{−‖y−x‖²/2}.    (19)

In spherical coordinates with x = [|x|, φ_x] and y = [|y|, φ_y], the density is

  p_{Y|X}(y|x) = (1/(2π)) e^{−(|y|² + |x|² − 2|x||y| cos(φ_y − φ_x))/2}.    (20)

Consider again the peak power constraint |X| ≤ A where A > 0. We choose the test density

  q_Y(y) = { β/(πA²),                                        |y| ≤ A
           { ((1−β)/(2π(1 + √(π/2) A))) e^{−(|y|−A)²/2},    |y| > A.    (21)

Again, the test density is uniform in the disc |y| ≤ A and is a "split and scaled" Gaussian density for |y| > A. Inserting (21) into the divergence (5), we have¹

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) = −log(2πe) − E[log q_Y(Y)]
  = log( πA²/(2πeβ) ) − E[ log( (πA²/β) q_Y(Y) ) ]
  = log( A²/(2eβ) ) − ∫_A^∞ e^{−(z²+|x|²)/2} I_0(z|x|) [ log( (1−β)πA² / (2π(1 + √(π/2) A)β) ) − (z−A)²/2 ] z dz    (22)
  = log( A²/(2eβ) ) + log( 2e(1 + √(π/2) A)β / ((1−β)A²) ) Q_1(|x|, A) − g̃(|x|, A)    (23)

where

  g̃(|x|, A) = ∫_A^∞ [ 1 − (z−A)²/2 ] e^{−(z²+|x|²)/2} I_0(z|x|) z dz.    (24)

Lemma 1: g̃(|x|, A) in (24) is positive for |x| ∈ [0, A].

Proof: We use the definition of I_0(x) to re-write (24) as

  (1/√(2π)) ∫_0^π e^{−(|x|²/2) sin²φ} [ ∫_A^∞ ( 2 − (z−A)² ) z ψ(z − |x| cos φ) dz ] dφ.    (25)

The integral in square brackets can be simplified by substituting z̃ = z − |x| cos φ and u = A − |x| cos φ to become

  ∫_u^∞ [ 2 − (z̃−u)² ] (z̃ − u + A) ψ(z̃) dz̃.    (26)

Using (2)-(4), the integral (26) evaluates to

  u(A−u) [ψ(u) − u Q(u)] + (A+u) Q(u)    (27)

or alternatively to

  −(A−u) [ (1+u²) Q(u) − u ψ(u) ] + 2A Q(u).    (28)

Note that we have 0 ≤ u ≤ A + |x|. We consider two cases.
• 0 ≤ u ≤ A: We have u(A−u) ≥ 0, and the bound on the right-hand side of (1) tells us that (27) is positive.
• A ≤ u ≤ A + |x|: We have A − u ≤ 0, and the bound on the left-hand side of (1) tells us that (28) is positive.
Thus, the expression (27) (or equivalently (28)) is positive. But this implies that the integrals in (25)-(26) are all positive, and we conclude that g̃(|x|, A) is positive.

A. McKellips-type Bound

Using Lemma 1 in (23), we have

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) ≤ log( A²/(2eβ) ) + log( 2e(1 + √(π/2) A)β / ((1−β)A²) ) Q_1(|x|, A).    (29)

¹By symmetry we may restrict attention to φ_x = 0, i.e., real x satisfying 0 ≤ x ≤ A.

By choosing β to make the second term above zero, we have

  β = A² / ( A² + 2e(1 + √(π/2) A) )    (30)

which results in the McKellips-type bound

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) ≤ log( 1 + √(π/2) A + A²/(2e) ).    (31)

Since this bound is independent of x, we have

  C < log( 1 + √(π/2) A + A²/(2e) ).    (32)

We combine (32) with the capacity under the (weaker) average power constraint E[|X|²] ≤ A². Observe that the complex noise has power 2, so the corresponding SNR is P = A²/2. We thus have the simple bound

  C ≤ min{ log( 1 + √(πP) + P/e ),  log(1 + P) }    (33)

where we measure the rate per complex symbol (two real dimensions). The high-SNR power loss is 10 log₁₀(e) ≈ 4.34 dB. Again, however, this comparison is based on equating the maximum power P with the average power. If we instead use the uniform distribution for X, then the average power is P/2 and the high-SNR power loss reduces to a shaping loss of 10 log₁₀(e/2) ≈ 1.33 dB.

B. Refined Bound

We refine the upper bound for the complex AWGN channel in a manner similar to the refinement in the real AWGN case. First, rewrite the final expression for D( p_{Y|X}(·|x) ‖ q_Y(·) ) in (22) as

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) = log( A²/(2eβ) ) + log( 2β(1 + √(π/2) A) / ((1−β)A²) ) Q_1(|x|, A) + g(|x|, A)    (34)

where

  g(|x|, A) = ∫_A^∞ ( (z−A)²/2 ) e^{−(z²+|x|²)/2} I_0(z|x|) z dz.

The functions Q_1(|x|, A) and g(|x|, A) are both increasing in |x|. This is proved as part of the general n-dimensional case in Appendix B. Hence, for a positive log( 2β(1 + √(π/2) A) / ((1−β)A²) ), the expression (34) is maximized at |x| = A, and we obtain

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) ≤ log( A²/(2eβ) ) + log( 2β(1 + √(π/2) A) / ((1−β)A²) ) Q_1(A, A) + g(A, A)    (35)

provided that

  β ≥ A² / ( A² + 2(1 + √(π/2) A) ).    (36)

Rewriting the bound of (35) as

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) ≤ Q_1(A, A) log( (1 + √(π/2) A)/e ) + (1 − Q_1(A, A)) log( A²/(2e) ) + g(A, A)
    − (1 − Q_1(A, A)) log β − Q_1(A, A) log(1−β)    (37)

we see that β = 1 − Q_1(A, A) minimizes the RHS of (37). However, from (36) this choice of β is valid only if

  1 − Q_1(A, A) ≥ A² / ( A² + 2(1 + √(π/2) A) )    (38)

which requires A < 2.36 numerically. Therefore, setting P = A²/2 we have

  C ≤ (1 − β(P)) log(1 + √(πP)) + β(P) log(P/e) + H_e(β(P)) − g̃(√(2P), √(2P)),   P ≤ 4.45 dB    (39)

where β(P) = 1 − Q_1(√(2P), √(2P)) and we have used the relationship g̃(|x|, A) = Q_1(|x|, A) − g(|x|, A). The bounds are plotted in Fig. 3. The lower bound is obtained by evaluating mutual information for the equi-probable complex constellation

  {0} ∪ ⋃_{0 ≤ k ≤ ⌊A/2⌋−1} { (A−2k) e^{jnθ_k} : 0 ≤ n ≤ N_k }    (40)

where θ_k = 2π/(3(A−2k)) and N_k = ⌊3(A−2k)⌋. We see that the refined upper bound, valid for SNR less than 4.45 dB, is close to the lower bound. The numerical evaluation of min_β max_x D( p_{Y|X}(·|x) ‖ q_Y(·) ) is seen to yield the best bounds throughout.

Fig. 3. Capacity bounds for complex AWGN channels: lower bound, log₂(1+P), McKellips-type bound, refined bound, and numerical evaluation. The rate units are bits per 2 dimensions.
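The closed-form upper bounds (13) and (33) are straightforward to evaluate. A minimal sketch (ours, not from the paper; function names are hypothetical), working in nats:

```python
import numpy as np

def scalar_upper_bound_nats(P):
    # Scalar McKellips bound (13): min of the peak-power bound (12)
    # with P = A^2 and the average-power capacity (1/2) log(1 + P).
    return min(np.log(1 + np.sqrt(2 * P / (np.pi * np.e))),
               0.5 * np.log(1 + P))

def complex_upper_bound_nats(P):
    # Complex McKellips-type bound (33): min of the peak-power bound (32)
    # with P = A^2/2 and the average-power capacity log(1 + P).
    return min(np.log(1 + np.sqrt(np.pi * P) + P / np.e),
               np.log(1 + P))

# Each bound is positive and never exceeds its average-power term;
# the average-power term is active at low SNR, the peak term at high SNR.
for P in (0.5, 2.0, 10.0, 100.0):
    assert 0 < scalar_upper_bound_nats(P) <= 0.5 * np.log(1 + P)
    assert 0 < complex_upper_bound_nats(P) <= np.log(1 + P)
```

Divide by log 2 to convert to the bits-per-channel-use units used in Figs. 2 and 3.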

C. Analytical lower bound

A lower bound on the capacity of real AWGN channels was derived in [12] by using PAM-like constellations. The input was peak-power constrained. By selecting the number of points suitably, PAM achieves rates within a small gap from the average-power constrained capacity (1/2) log(1 + SNR). For two dimensions, consider the constellation

  A_N = {0} ∪ ⋃_{n=1}^{N−1} { (n + 0.5)Δ e^{j(l+0.5)θ_n} : l = 0, 1, …, 2n }    (41)

where N ≥ 2 is a positive integer, Δ is a positive real number, and θ_n = 2π/(2n+1). The set A_N contains N² points, including the origin and (2n+1) equally-spaced points on a circle of radius (n+0.5)Δ for n = 1, 2, …, N−1. The 9 points of A_3 are shown in Fig. 4 for illustration.

Define a random variable U jointly distributed with X ~ Unif(A_N) such that X̃ = X + U is uniformly distributed in the circle of radius NΔ around the origin. For the constellation A_3, the distribution of X̃|X = x is illustrated through the shading around each point x ∈ A_3 in Fig. 4. Specifically, for the constellation A_N, U is defined so that (X+U)|X = 0 is uniform in the circle of radius Δ around the origin, and (X+U)|X = (n+0.5)Δ e^{j(l+0.5)θ_n} is uniform in the region

  { (r cos θ, r sin θ) : n ≤ r/Δ ≤ n + 1,  l ≤ θ/θ_n ≤ l + 1 }.    (42)

Fig. 4. The two-dimensional constellation A_3. X denotes a constellation point. For x ∈ A_3, (X+U)|X = x is distributed uniformly in the shaded region around x.

As before, the received value is Y = X + Z. Since X̃ − X − Y forms a Markov chain, we have I(X; Y) ≥ I(X̃; Y) by the data processing inequality. We further lower bound I(X̃; Y) by using a strategy similar to the real case in [12]. However, unlike the real case, U and X are correlated, resulting in additional computations.

Since I(X̃; Y) = h(X̃) − h(X̃|Y) and h(X̃) = log₂(πN²Δ²), we lower bound I(X̃; Y) by first upper bounding h(X̃|Y) as follows:

  h(X̃|Y = y) = −∫ p(x̃|y) log₂ p(x̃|y) dx̃ ≤ −∫ p(x̃|y) log₂ q_y(x̃) dx̃    (43)

where q_y(x̃) is any valid density parametrized by y. Choosing q_y(x̃) = (1/(2πs²)) e^{−‖x̃−ky‖²/(2s²)} (with parameters k and s to be optimized later), we obtain

  h(X̃|Y = y) ≤ log₂(2πs²) + (log₂ e) (1/(2s²)) E[ ‖X̃ − ky‖² | Y = y ]
  h(X̃|Y) ≤ log₂(2πs²) + (log₂ e) (1/(2s²)) E[ ‖X̃ − kY‖² ].    (44)

Expanding using X̃ = X + U and Y = X + Z, and using the fact that Z is independent of X and U, we have

  E[ ‖X̃ − kY‖² ] = N²Δ²/2 − 2(1 + ρ_N) P_N k + (P_N + 2) k²    (45)

where P_N ≜ E[|X|²], ρ_N ≜ Re(E[X* U])/P_N, and we have used E[‖X̃‖²] = N²Δ²/2. To obtain the lowest upper bound for E[‖X̃ − kY‖²], we set k = k* = (1 + ρ_N)P_N/(2 + P_N) and s² = (1/2) E[‖X̃ − k*Y‖²]. Simplifying, we have

  h(X̃|Y) ≤ log₂( πe [ N²Δ²/2 − P_N²(1 + ρ_N)²/(P_N + 2) ] ).    (46)

We continue to lower bound I(X̃; Y) by

  log₂(πN²Δ²) − log₂( πe [ N²Δ²/2 − P_N²(1 + ρ_N)²/(P_N + 2) ] )
  = log₂ N² − log₂(e/2) − log₂( N² − P_N²(1 + ρ_N)²/((1 + P_N/2)Δ²) ).    (47)

Defining C̃ = log₂(1 + P_N/2), and setting N² = α 2^{C̃} = α(1 + P_N/2) and simplifying, we have

  I(X̃; Y) ≥ C̃ − log₂(e/2) − log₂( N²/α − P_N²(1 + ρ_N)²/(N²Δ²) ).    (48)

In Appendix C, we show that P_N = (Δ²N²/2)(1 − O(1/N²)), that −0.66 ≤ ρ_N N²/(1 − O(1/N²)) ≤ −0.64, and provide details of the simplification needed to obtain the following lower bound:

  I(X̃; Y) ≥ C̃ − 0.45 − log₂( 1 + 1.82/α ) + O(1/N²).    (49)

We see that the gap to the average-power constrained capacity with a finite constellation can be made as small as 0.45 bits by choosing a large enough α at high rates (large N). For moderate N, choosing α = 4 results in a gap of less than 1 bit to capacity. Finally, the rate in (47) is achieved at a peak-power constraint of |X| ≤ (N − 0.5)Δ, or an equivalent SNR = (N − 0.5)²Δ²/2. This lets one compare the analytical lower bound against the other bounds shown in Fig. 3.
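The constellation (41) and the average power derived for it in Appendix C can be cross-checked numerically. The following sketch is ours (the helper name is hypothetical); it builds A_N and verifies |A_N| = N² and the closed form for P_N:

```python
import numpy as np

def constellation(N, delta):
    # Build A_N from (41): the origin plus (2n+1) equally spaced points
    # on the circle of radius (n + 0.5)*delta, for n = 1, ..., N-1.
    pts = [0j]
    for n in range(1, N):
        theta = 2 * np.pi / (2 * n + 1)
        r = (n + 0.5) * delta
        pts += [r * np.exp(1j * (l + 0.5) * theta) for l in range(2 * n + 1)]
    return np.array(pts)

N, delta = 6, 1.0
A = constellation(N, delta)
assert len(A) == N**2  # |A_N| = N^2

# Average power of the uniform distribution on A_N matches the closed
# form P_N = (delta^2/2) * (N^2 - (1 + 1/N^2)/2) from Appendix C.
P_N = np.mean(np.abs(A)**2)
closed_form = delta**2 / 2 * (N**2 - 0.5 * (1 + 1 / N**2))
assert abs(P_N - closed_form) < 1e-9
```

Averaging mutual information over such a constellation is how the lower-bound curves in Fig. 3 are produced; the snippet only checks the combinatorial and power claims.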

V. n-DIMENSIONAL AWGN CHANNEL

Consider next the n-dimensional AWGN channel

  Y = X + Z    (50)

where Z = [Z_1, Z_2, …, Z_n] has independent Gaussian entries with mean 0 and variance 1. The channel density in Cartesian coordinates with x = [x_1, …, x_n] and y = [y_1, …, y_n] is

  p_{Y|X}(y|x) = (1/(2π)^{n/2}) e^{−‖y−x‖²/2}.    (51)

The n-dimensional spherical coordinate system has a radial coordinate r and n−1 angular coordinates φ_i, i = 1, 2, …, n−1, where the domain of φ_i, i = 1, 2, …, n−2, is [0, π), and the domain of φ_{n−1} is [0, 2π).² For a point x with spherical coordinates x = [r_x, φ_{x,1}, …, φ_{x,n−1}] we can compute the Cartesian coordinates x = [x_1, …, x_n] via

  x_1 = r_x cos(φ_{x,1})
  x_2 = r_x sin(φ_{x,1}) cos(φ_{x,2})
  x_3 = r_x sin(φ_{x,1}) sin(φ_{x,2}) cos(φ_{x,3})
  ⋮
  x_{n−1} = r_x sin(φ_{x,1}) ⋯ sin(φ_{x,n−2}) cos(φ_{x,n−1})
  x_n = r_x sin(φ_{x,1}) ⋯ sin(φ_{x,n−2}) sin(φ_{x,n−1}).

In spherical coordinates the channel density has a complex form due to the many sine and cosine terms. However, by symmetry we may restrict attention to points x with φ_{x,i} = 0 for all i. For such x, the channel density in n-dimensional spherical coordinates is simply

  p_{Y|X}(y|x) = (1/(2π)^{n/2}) e^{−(r_y² + r_x² − 2 r_x r_y cos φ_{y,1})/2}.    (52)

Consider now the n-dimensional amplitude constraint ‖X‖ ≤ A where A > 0. We choose the test density

  q_Y(y) = { β/Vol(A),                                      r_y ≤ A
           { ((1−β)/(k_n(A)(2π)^{n/2})) e^{−(r_y−A)²/2},   r_y > A    (53)

where Vol(r) is the volume of an n-dimensional ball with radius r and k_n(A) is a constant that ensures that q_Y(·) is a density. Again, the test density is uniform in the ball r_y ≤ A and is a "split and scaled" Gaussian density for r_y > A. We have³

  Vol(r) = π^{n/2} r^n / Γ(n/2 + 1)    (54)

where Γ(·) is Euler's gamma function. To compute k_n(A), observe that we require

  (1/(k_n(A)(2π)^{n/2})) ∫_A^∞ ∫_0^π ⋯ ∫_0^π ∫_0^{2π} e^{−(r−A)²/2} dV = 1    (55)

where

  dV = r^{n−1} dr [ ∏_{i=1}^{n−2} sin^{n−1−i}(φ_i) dφ_i ] dφ_{n−1}    (56)

is the spherical volume element in n dimensions. We thus have

  k_n(A) = (2 / (2^{n/2} Γ(n/2))) ∫_A^∞ e^{−(r−A)²/2} r^{n−1} dr
         = (2^{(2−n)/2} / Γ(n/2)) Σ_{i=0}^{n−1} C(n−1, i) A^i ∫_0^∞ e^{−r²/2} r^{n−1−i} dr.    (57)

Using the standard integral

  ∫_0^∞ x^n e^{−ax²} dx = Γ((n+1)/2) / (2 a^{(n+1)/2})

the expression for k_n(A) in (57) simplifies to

  k_n(A) = Σ_{i=0}^{n−1} C(n−1, i) ( Γ((n−i)/2) / (2^{i/2} Γ(n/2)) ) A^i.    (58)

For example, for n = 1 we have k_1 = 1, and for n = 2 we have k_2 = 1 + √(π/2) A.

A. McKellips-type Bound

The divergence in (5) can be written as

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) = −(n/2) log(2πe) − E[log q_Y(Y)]
  = log( Vol(A)/((2πe)^{n/2} β) ) − E[ log( (Vol(A)/β) q_Y(Y) ) ].    (59)

The expectation in the above equation can be simplified and written as

  (2^{(3−n)/2}/Γ((n−1)/2)) ∫_0^π sin^{n−2}(φ_1) e^{−(r_x²/2) sin²φ_1}
    × [ ∫_A^∞ ψ(r_y − r_x cos φ_1) ( log( (1−β)Vol(A)/((2π)^{n/2} β k_n(A)) ) − (r_y−A)²/2 ) r_y^{n−1} dr_y ] dφ_1.    (60)

For n ≥ 2, define the functions

  Ĩ_n(x) = (2 / (2^{(n−1)/2} Γ((n−1)/2) √(2π))) ∫_0^π e^{x cos φ} (sin φ)^{n−2} dφ    (61)
  Q_n(x, A) = ∫_A^∞ e^{−(z²+x²)/2} Ĩ_n(zx) z^{n−1} dz    (62)
  g_n(x, A) = ∫_A^∞ ( (z−A)²/2 ) e^{−(z²+x²)/2} Ĩ_n(zx) z^{n−1} dz    (63)
  g̃_n(x, A) = ∫_A^∞ ( n/2 − (z−A)²/2 ) e^{−(z²+x²)/2} Ĩ_n(zx) z^{n−1} dz.    (64)

In terms of the above functions, we can write

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) = log( Vol(A)/((2πe)^{n/2} β) ) + log( (2πe)^{n/2} β k_n(A) / ((1−β)Vol(A)) ) Q_n(x, A) − g̃_n(x, A).    (65)

²See http://en.wikipedia.org/wiki/N-sphere.
³See http://en.wikipedia.org/wiki/Volume_of_an_n-ball.
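As a sanity check on (57)-(58), not from the paper itself, the closed form for k_n(A) can be compared against direct numerical integration of its defining integral (function names are ours):

```python
import math
from scipy.integrate import quad

def k_n_closed(n, A):
    # k_n(A) from (58):
    # sum_{i=0}^{n-1} C(n-1, i) Gamma((n-i)/2) / (2^{i/2} Gamma(n/2)) A^i
    return sum(math.comb(n - 1, i)
               * math.gamma((n - i) / 2) / (2**(i / 2) * math.gamma(n / 2))
               * A**i
               for i in range(n))

def k_n_numeric(n, A):
    # Defining integral from (57):
    # (2 / (2^{n/2} Gamma(n/2))) int_A^inf exp(-(r-A)^2/2) r^{n-1} dr
    val, _ = quad(lambda r: math.exp(-(r - A)**2 / 2) * r**(n - 1),
                  A, math.inf)
    return 2 / (2**(n / 2) * math.gamma(n / 2)) * val

A = 1.5
assert abs(k_n_closed(1, A) - 1.0) < 1e-12                       # k_1 = 1
assert abs(k_n_closed(2, A) - (1 + math.sqrt(math.pi / 2) * A)) < 1e-12
for n in range(1, 6):
    assert abs(k_n_closed(n, A) - k_n_numeric(n, A)) < 1e-8
```

The check covers the two worked examples (k_1, k_2) and the general formula for small n.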

In Appendix A, we show that g̃_n(x, A) is positive. We make the second log term in (65) zero by choosing

  β = Vol(A) / ( Vol(A) + (2πe)^{n/2} k_n(A) )    (66)

and we obtain the bound

  C ≤ log( k_n(A) + Vol(A)/(2πe)^{n/2} ).

We combine this result with the capacity under the (weaker) average power constraint E[‖X‖²] ≤ A². Observe that the noise has power n, so the corresponding SNR is P = A²/n. We thus have the bound

  C ≤ min{ log( k_n(√(nP)) + Vol(√(nP))/(2πe)^{n/2} ),  (n/2) log(1 + P) }    (67)

where we measure the rate per n-dimensional symbol.

B. Refined Bound

The refinement is similar to the 2-dimensional case. We rewrite D( p_{Y|X}(·|x) ‖ q_Y(·) ) in (65) as follows:

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) = log( Vol(A)/((2πe)^{n/2} β) ) + log( (2π)^{n/2} k_n(A) β / ((1−β)Vol(A)) ) Q_n(x, A) + g_n(x, A)    (68)

where we have used the relationship g̃_n(x, A) = (n/2) Q_n(x, A) − g_n(x, A). As shown in Appendix B, the functions Q_n(x, A) and g_n(x, A) are both increasing in x. Hence, for a positive log( (2π)^{n/2} k_n(A) β / ((1−β)Vol(A)) ), the RHS of (68) is maximized at x = A, and we obtain the bound

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) ≤ log( Vol(A)/((2πe)^{n/2} β) ) + log( (2π)^{n/2} β k_n(A) / ((1−β)Vol(A)) ) Q_n(A, A) + g_n(A, A)    (69)

provided that

  β ≥ Vol(A) / ( (2π)^{n/2} k_n(A) + Vol(A) ).    (70)

Rewriting (69) as

  D( p_{Y|X}(·|x) ‖ q_Y(·) ) ≤ Q_n(A, A) log( k_n(A)/e^{n/2} ) + (1 − Q_n(A, A)) log( Vol(A)/(2πe)^{n/2} ) + g_n(A, A)
    − (1 − Q_n(A, A)) log β − Q_n(A, A) log(1−β)    (71)

we see that β = 1 − Q_n(A, A) minimizes the RHS of (71). However, from (70) this choice of β is valid only if

  1 − Q_n(A, A) ≥ Vol(A) / ( (2π)^{n/2} k_n(A) + Vol(A) )    (72)

which requires A < A_n*, where A_n* is the smallest positive value that results in equality in (72). The value A_n* can be determined numerically. Choosing P = A²/n and P_n* = (A_n*)²/n, we obtain the bound

  C ≤ (1 − β_n(P)) log k_n(√(nP)) + β_n(P) log( Vol(√(nP))/(2πe)^{n/2} ) + H_e(β_n(P)) − g̃_n(√(nP), √(nP)),   P ≤ P_n*    (73)

where β_n(P) = 1 − Q_n(√(nP), √(nP)).

C. Volume-based Lower Bound

To obtain a lower bound, we use the volume-based method introduced in [13]. The capacity of the n-dimensional AWGN channel Y = X + Z with the peak constraint ‖X‖ ≤ A is

  C = lim_{m→∞} (1/m) sup_{p(xᵐ) ∈ P_n^m} I(Xᵐ; Yᵐ)    (74)

where P_n^m is the set of all valid distributions satisfying the n-dimensional peak-power constraint. Now consider

  (1/m) I(Xᵐ; Yᵐ) = (1/m) ( h(Yᵐ) − h(Zᵐ) ) = (1/m) h(Yᵐ) − (n/2) log(2πe).    (75)

Using the entropy-power inequality, we have

  sup_{p(xᵐ)∈P_n^m} e^{(2/(mn)) h(Yᵐ)} ≥ sup_{p(xᵐ)∈P_n^m} e^{(2/(mn)) h(Xᵐ)} + e^{(2/(mn)) h(Zᵐ)}
  ≥ e^{(2/(mn)) m log Vol(A)} + e^{log(2πe)}
  = (Vol(A))^{2/n} + 2πe.    (76)

Therefore, we have

  sup_{p(xᵐ)∈P_n^m} (1/m) h(Yᵐ) ≥ (n/2) log( (Vol(A))^{2/n} + 2πe ).    (77)

Using (77) in (75), we have

  C ≥ (n/2) log( 1 + (Vol(A))^{2/n}/(2πe) ) = log( O(A^{n−2}) + Vol(A)/(2πe)^{n/2} ).    (78)

Comparing (78) with the McKellips-type upper bound in (67), we see that the two bounds meet as SNR tends to infinity, and they differ by O(A^{n−1}) inside the logarithm. Thus, for moderate SNR the McKellips-type bound could be improved. Comparing with the refined bound is not straightforward, and a numerical comparison for n = 2 is shown in Fig. 3.

The case n = 4 is interesting because coherent optical communication with two polarizations results in a 4-dimensional signal space. The bounds for n = 4 are plotted in Fig. 5. As expected, the volume-based lower bound meets the McKellips-type bound at high SNR. The analytical refined bound is valid for SNR less than P_4* ≈ 7.92 dB. Numerical evaluation of min_β max_x D( p_{Y|X}(·|x) ‖ q_Y(·) ) is seen to be close to the lower bound for moderate SNR. For lower SNR, the volume-based lower bound is not expected to be tight.
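The closing comparison between (67) and (78) can be illustrated numerically for n = 4; in this sketch (ours, function names hypothetical) the gap between the two bounds is positive and shrinks as the amplitude grows, consistent with the bounds meeting at high SNR:

```python
import math

def vol(n, r):
    # Volume of an n-ball of radius r, from (54).
    return math.pi**(n / 2) * r**n / math.gamma(n / 2 + 1)

def k_n(n, A):
    # Closed form (58) for the normalization constant k_n(A).
    return sum(math.comb(n - 1, i)
               * math.gamma((n - i) / 2) / (2**(i / 2) * math.gamma(n / 2))
               * A**i for i in range(n))

def upper_mckellips(n, A):
    # McKellips-type upper bound (67), peak term, in nats per n dimensions.
    return math.log(k_n(n, A) + vol(n, A) / (2 * math.pi * math.e)**(n / 2))

def lower_volume(n, A):
    # Volume-based lower bound (78), in nats per n dimensions.
    return n / 2 * math.log(1 + vol(n, A)**(2 / n) / (2 * math.pi * math.e))

# The lower bound stays below the upper bound, and the gap (in nats)
# shrinks as the amplitude A grows.
n = 4
gaps = [upper_mckellips(n, A) - lower_volume(n, A) for A in (2.0, 8.0, 32.0)]
assert all(g > 0 for g in gaps)
assert gaps[0] > gaps[1] > gaps[2]
```

As the text notes, the residual difference is O(A^{n−1}) inside the logarithm, so the gap decays only slowly in dB.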

Fig. 5. Capacity bounds for 4-dimensional AWGN channels: McKellips-type bound, (n/2) log₂(1+P), refined bound, numerical evaluation, and lower bound. The rate units are bits per 4 dimensions.

D. Remarks

Based on extensive numerical evaluations, we conjecture that the expression for D( p_{Y|X}(·|x) ‖ q_Y(·) ) in (68), denoted as

  D_n(β, x) ≜ log( Vol(A)/((2πe)^{n/2} β) ) + log( (2π)^{n/2} k_n(A) β / ((1−β)Vol(A)) ) Q_n(x, A) + g_n(x, A)    (79)

is maximized over x ∈ [0, A] (for a fixed β) at the endpoints x = 0 or x = A. Under this conjecture, we have

  min_β max_x D_n(β, x) = min_β max( D_n(β, 0), D_n(β, A) ).    (80)

Now, writing D_n(β, x) as

  D_n(β, x) = −(1 − Q_n(x, A)) log β − Q_n(x, A) log(1−β) + terms independent of β    (81)

we see that, for a fixed x and β ∈ [0, 1], D_n(β, x) achieves a minimum at β̂_n(x) = 1 − Q_n(x, A), decreases with β for β ∈ [0, β̂_n(x)], and increases for β ∈ [β̂_n(x), 1]. Let β_n*(A) be the value of β for which D_n(β, 0) = D_n(β, A). Using the above, the minmax in (80) evaluates to

  min{ D_n(β_n*(A), A),  max( D_n(β̂_n(0), 0), D_n(β̂_n(0), A) ),  max( D_n(β̂_n(A), 0), D_n(β̂_n(A), A) ) }.    (82)

To obtain an expression for β_n*(A), we simplify D_n(β, 0) = D_n(β, A), resulting in

  β_n*(A) = Vol(A) / ( Vol(A) + (2π)^{n/2} e^{c_n(A)} k_n(A) )    (83)

where c_n(A) = ( g_n(A, A) − g_n(0, A) ) / ( Q_n(0, A) − Q_n(A, A) ). Interestingly, as A → ∞, c_n(A) → −1/2, and we see that β_n*(A) tends to the McKellips-type expression in (66). Finally, we remark that the best upper bounds are as follows:
1) at low SNR: the average-power capacity (n/2) log(1 + P);
2) at moderate SNR: the refined upper bound, which evaluates to D_n(β̂_n(A), A);
3) at high SNR: D_n(β_n*(A), A), which tends to the McKellips-type bound.
The exact range of low, moderate, and high SNR depends on the dimension n.

ACKNOWLEDGMENT

G. Kramer and G. Böcherer were supported by the German Federal Ministry of Education and Research in the framework of an Alexander von Humboldt Professorship.

REFERENCES

[1] J. G. Smith, "The information capacity of amplitude- and variance-constrained scalar Gaussian channels," Inf. and Control, vol. 18, no. 3, pp. 203-219, 1971.
[2] S. Shamai and I. Bar-David, "The capacity of average and peak-power-limited quadrature Gaussian channels," IEEE Trans. Inf. Theory, vol. 41, no. 4, pp. 1060-1071, Jul. 1995.
[3] A. Tchamkerten, "On the discreteness of capacity-achieving distributions," IEEE Trans. Inf. Theory, vol. 50, no. 11, pp. 2773-2778, Nov. 2004.
[4] T. Chan, S. Hranilovic, and F. Kschischang, "Capacity-achieving probability measure for conditionally Gaussian channels with bounded inputs," IEEE Trans. Inf. Theory, vol. 51, no. 6, pp. 2073-2088, Jun. 2005.
[5] N. Sharma and S. Shamai, "Characterizing the discrete capacity achieving distribution with peak power constraint at the transition points," in IEEE Int. Symp. Inf. Theory and Its Appl., Dec. 2008, pp. 1-6.
[6] Y. Polyanskiy and Y. Wu, "Peak-to-average power ratio of good codes for Gaussian channel," IEEE Trans. Inf. Theory, vol. 60, no. 12, pp. 7655-7660, Dec. 2014.
[7] A. McKellips, "Simple tight bounds on capacity for the peak-limited discrete-time channel," in IEEE Int. Symp. Inf. Theory, 2004, pp. 348-348.
[8] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. Cambridge University Press, 2011.
[9] A. Lapidoth and S. M. Moser, "Capacity bounds via duality with applications to multiple-antenna systems on flat fading channels," IEEE Trans. Inf. Theory, vol. 49, no. 10, pp. 2426-2467, Oct. 2003.
[10] A. Lapidoth, S. Moser, and M. Wigger, "On the capacity of free-space optical intensity channels," IEEE Trans. Inf. Theory, vol. 55, no. 10, pp. 4449-4461, Oct. 2009.
[11] T. H. Cormen, C. Stein, R. L. Rivest, and C. E. Leiserson, Introduction to Algorithms, 2nd ed. McGraw-Hill Higher Education, 2001.
[12] L. H. Ozarow and A. D. Wyner, "On the capacity of the Gaussian channel with a finite number of input levels," IEEE Trans. Inf. Theory, vol. 36, no. 6, pp. 1426-1428, Nov. 1990.
[13] V. Jog and V. Anantharam, "An energy harvesting AWGN channel with a finite battery," in IEEE Int. Symp. Inf. Theory, Jun. 2014, pp. 806-810.

APPENDIX A

We show that the function g̃_n(x, A) in (64) is positive. First, we rewrite g̃_n(x, A) as

  g̃_n(x, A) = (1/(2^{(n−1)/2} Γ((n−1)/2))) ∫_0^π sin^{n−2}(φ) e^{−(x²/2) sin²φ} [ ∫_A^∞ ( n − (z−A)² ) ψ(z − x cos φ) z^{n−1} dz ] dφ.    (84)

Now it suffices to show that the integral over z above is positive. We set z̃ = z − x cos φ and u = A − x cos φ and simplify the integral as

  ∫_u^∞ [ n − (z̃−u)² ] (z̃ − u + A)^{n−1} ψ(z̃) dz̃.    (85)

For example, for n = 2 we recover (26). Now (85) can be rewritten as

  Σ_{i=0}^{n−1} C(n−1, i) A^{n−1−i} ∫_u^∞ [ n(z̃−u)^i − (z̃−u)^{i+2} ] ψ(z̃) dz̃.    (86)

We claim that the integral inside the summation above is positive. In fact, we prove the following stronger inequality for u ≥ 0 and a non-negative integer i:

  ∫_u^∞ [ (i+1)(z−u)^i − (z−u)^{i+2} ] ψ(z) dz ≥ 0.    (87)

Since n ≥ i + 1 in (86), the inequality (87) implies that the integral in (86) is positive. A proof of (87) is as follows:

  (i+1) ∫_u^∞ (z−u)^i ψ(z) dz
  =(a) [ (z−u)^{i+1} ψ(z) ]_u^∞ + ∫_u^∞ z (z−u)^{i+1} ψ(z) dz
  = ∫_u^∞ (z−u)^{i+2} ψ(z) dz + u ∫_u^∞ (z−u)^{i+1} ψ(z) dz

where (a) uses integration by parts together with ψ′(z) = −z ψ(z). The above simplifies to (87) because u ∫_u^∞ (z−u)^{i+1} ψ(z) dz ≥ 0.

APPENDIX B

We use the double factorial notation

  n!! = { 2·4⋯n,  n even
        { 1·3⋯n,  n odd.

Using e^{x cos φ} = Σ_{m=0}^∞ (x cos φ)^m / m! in the integral of (61), we get the Taylor series expansion for Ĩ_n(x), n ≥ 2, as

  Ĩ_n(x) = (2^{(2−n)/2} / (Γ((n−1)/2) √π)) Σ_{m=0}^∞ (x^m/m!) a_{n−2,m}    (88)

where a_{n,m} = ∫_0^π sin^n(φ) cos^m(φ) dφ. For odd m, it is easy to see that a_{n,m} = 0. For even m, we have

  a_{n,m} = { π (m−1)!! (n−1)!! / (n+m)!!,  n even
            { 2 (m−1)!! (n−1)!! / (n+m)!!,  n odd.    (89)

Using the above a_{n,m} in (88) and simplifying, we have

  Ĩ_n(x) = d_n Σ_{k=0}^∞ ( (2k−1)!! / (n+2k−2)!! ) x^{2k}/(2k)!    (90)

with a suitably-defined d_n. We show that the derivatives of the functions Q_n(x, A) and g_n(x, A) in (62) and (63) with respect to x are non-negative. First, the derivative of e^{−x²/2} Ĩ_n(zx) with respect to x, in Taylor series form, is seen to be

  d/dx ( e^{−x²/2} Ĩ_n(zx) )
  = d_n e^{−x²/2} Σ_{k=1}^∞ ( (2k−3)!! / (n+2k−2)!! ) z^{2k} x^{2k−1}/(2k−2)!
    − d_n e^{−x²/2} Σ_{k=0}^∞ ( (2k−1)!! / (n+2k−2)!! ) z^{2k} x^{2k+1}/(2k)!
  = d_n e^{−x²/2} Σ_{k=0}^∞ ( z^{2k+2}/(n+2k) − z^{2k} ) ( (2k−1)!! / (n+2k−2)!! ) x^{2k+1}/(2k)!.    (91)

Using (91) in the definition of Q_n(x, A), we see that

  dQ_n(x, A)/dx = Σ_{k=0}^∞ t_{n,k}(x) ∫_A^∞ ( z^{2k+2}/(n+2k) − z^{2k} ) z^{n−1} e^{−z²/2} dz    (92)

where t_{n,k}(x) ≜ d_n e^{−x²/2} ( (2k−1)!! / (n+2k−2)!! ) x^{2k+1}/(2k)!. We now show that the integral in (92) is non-negative, which implies that the derivative of Q_n(x, A) with respect to x is non-negative. We have

  ∫_A^∞ z^{2k} z^{n−1} e^{−z²/2} dz = ∫_A^∞ ( z^{n+2k}/(n+2k) )′ e^{−z²/2} dz
  = −A^{n+2k} e^{−A²/2}/(n+2k) + ∫_A^∞ ( z^{2k+2}/(n+2k) ) z^{n−1} e^{−z²/2} dz    (93)

where we used integration by parts in the last step. Rearranging the above equation, since A^{n+2k} e^{−A²/2}/(n+2k) > 0, we see that the integral in (92) is non-negative.

For g_n(x, A), a similar approach is used. Using (91) in the definition of g_n(x, A), we see that

  dg_n(x, A)/dx = Σ_{k=0}^∞ t_{n,k}(x) ∫_A^∞ ( z^{2k+2}/(n+2k) − z^{2k} ) ( (z−A)²/2 ) z^{n−1} e^{−z²/2} dz.    (94)

A proof for the non-negativity of the integral in (94) is

  ∫_A^∞ z^{2k} ( (z−A)²/2 ) z^{n−1} e^{−z²/2} dz = ∫_A^∞ ( z^{n+2k}/(n+2k) )′ ( (z−A)²/2 ) e^{−z²/2} dz
  = ∫_A^∞ ( z^{2k+2}/(n+2k) ) ( (z−A)²/2 ) z^{n−1} e^{−z²/2} dz − ∫_A^∞ ( z^{n+2k}/(n+2k) ) (z−A) e^{−z²/2} dz    (95)

where we used integration by parts in the last step. Rearranging the above equation, since ( z^{n+2k}/(n+2k) ) (z−A) e^{−z²/2} > 0 for z > A, we see that the integral in (94) is non-negative.
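The key inequality (87) from Appendix A can be spot-checked numerically. The following sketch (ours, not part of the paper) integrates the left-hand side of (87) for several pairs (i, u):

```python
import math
from scipy.integrate import quad

def psi(z):
    # Standard Gaussian density (a) from Section II.
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def lhs_87(i, u):
    # Integral in (87): int_u^inf [(i+1)(z-u)^i - (z-u)^{i+2}] psi(z) dz
    val, _ = quad(
        lambda z: ((i + 1) * (z - u)**i - (z - u)**(i + 2)) * psi(z),
        u, math.inf)
    return val

# (87) claims the integral is non-negative for u >= 0 and integer i >= 0;
# at u = 0 it is exactly zero (one-sided Gaussian moment identity), so we
# allow a small numerical tolerance there.
for i in range(5):
    for u in (0.0, 0.3, 1.0, 3.0):
        assert lhs_87(i, u) >= -1e-8
```

The u = 0 case is tight, matching the proof, where the slack term u ∫_u^∞ (z−u)^{i+1} ψ(z) dz vanishes.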

APPENDIX C

Since X is uniform in A_N and |A_N| = N², we have

  P_N = E[‖X‖²] = (1/N²) Σ_{n=1}^{N−1} Σ_{l=0}^{2n} (n+0.5)² Δ²
      = (Δ²/N²) Σ_{n=1}^{N−1} (2n+1)( n² + n + 1/4 )    (96)
      = (Δ²/2) ( N² − (1/2)(1 + 1/N²) ).    (97)

To evaluate ρ_N = Re(E[X* U])/P_N, we consider E[X*(X+U)] = E[E[X*(X+U)|X]]. Now X*(X+U)|X = 0 is zero with probability 1, and (X+U)|X = (n+0.5)Δ e^{j(l+0.5)θ_n} is uniform in the region specified in (42). So X*(X+U)|X = (n+0.5)Δ e^{j(l+0.5)θ_n} is uniform in the region

  { (r cos θ, r sin θ) : n(n+0.5) ≤ r/Δ² ≤ (n+1)(n+0.5),  −0.5 ≤ θ/θ_n ≤ 0.5 }.    (98)

A calculation shows that

  E[ X*(X+U) | X = (n+0.5)Δ e^{j(l+0.5)θ_n} ] = Δ² ( n² + n + 1/3 ) sinc( π/(2n+1) )    (99)

where sinc x = (sin x)/x. Therefore, we have

  E[X*(X+U)] = (1/N²) Σ_{n=1}^{N−1} Σ_{l=0}^{2n} Δ² ( n² + n + 1/3 ) sinc( π/(2n+1) )
             = (Δ²/N²) Σ_{n=1}^{N−1} (2n+1)( n² + n + 1/3 ) sinc( π/(2n+1) ).    (100)

Since E[X*(X+U)] = E[‖X‖²] + E[X* U], we have

  ρ_N P_N = (Δ²/N²) Σ_{n=1}^{N−1} (2n+1) [ ( n² + n + 1/3 ) sinc( π/(2n+1) ) − ( n² + n + 1/4 ) ].    (101)

The sequence a_n = (n² + n + 1/4) − (n² + n + 1/3) sinc(π/(2n+1)) is increasing and converges to (π² − 2)/24 ≤ 0.33, with a_1 ≥ 0.32. We thus have

  −0.33 (1 − 1/N²) ≤ ρ_N P_N / Δ² ≤ −0.32 (1 − 1/N²).    (102)

From (97), using P_N = (Δ²N²/2)(1 − O(1/N²)), we have

  −0.66 (1 − O(1/N²)) ≤ ρ_N N² ≤ −0.64 (1 − O(1/N²)).    (103)

The expression inside the second log term in (48) is

  N²/α − P_N²(1 + ρ_N)²/(N²Δ²).

Since P_N = (Δ²N²/2)(1 − 0.5/N² − O(1/N⁴)), we have

  P_N²/(N²Δ²) = (P_N/2)(1 − 0.5/N² − O(1/N⁴)) = (N²/α − 1)(1 − 0.5/N² − O(1/N⁴))

where we have also used N² = α(1 + P_N/2). We arrive at

  N²/α − P_N²(1 + ρ_N)²/(N²Δ²) = N²/α − (N²/α − 1)(1 − 0.5/N² − O(1/N⁴))(1 + ρ_N)²
  ≤ 1 + 1.82/α + O(1/N²)

using ρ_N N² ≥ −0.66(1 − O(1/N²)). This explains the final analytical lower bound given in (49).