On the Capacity of MIMO Interference Channels

Xiaohu Shang (Department of EECS, Syracuse University; Email: [email protected])
Biao Chen (Department of EECS, Syracuse University; Email: [email protected])
Gerhard Kramer (Bell Laboratories, Alcatel-Lucent; Email: [email protected])
H. Vincent Poor (Department of EE, Princeton University; Email: [email protected])
Abstract— The capacity region of the multiple-input multiple-output (MIMO) interference channel (IC) in which the channel matrices are square and invertible is studied. The capacity region under strong interference is established, where the definition of strong interference parallels that of the scalar channel. Moreover, the sum-rate capacity under Z interference, noisy interference, and mixed interference is established. These results generalize known results for the scalar Gaussian IC.
[Fig. 1. The MIMO IC: the transmitted signals x_1 and x_2 pass through the channel matrices H_1, H_2, H_3 and H_4, and the noise vectors z_1 and z_2 are added to form the received signals y_1 and y_2.]

This research was supported in part by the National Science Foundation under Grants CCF-05-46491, ECS-05-01534, ANI-08-38807 and CNS-0625637.
I. INTRODUCTION

The interference channel (IC) models the situation in which transmitters communicate with their respective receivers while generating interference to all other receivers. This channel model was mentioned in [1, Section 14], and its capacity region is still generally unknown. In [2], Carleial showed that interference does not reduce capacity when it is very strong. This result follows because the interference can be decoded and subtracted at each receiver before decoding the desired message. Later, Han and Kobayashi [3] and Sato [4] showed that the capacity region of the strong interference channel is the same as that of a compound multiple access channel. In both of the above cases, the interference is fully decoded at both receivers.

When the interference is not strong, the capacity region is unknown. The best inner bound is that of Han and Kobayashi [3], which was later simplified by Chong et al. in [5], [6]. Etkin et al. and Telatar and Tse showed that Han and Kobayashi's inner bound is within one bit of the capacity region of scalar Gaussian ICs [7], [8]. Various outer bounds have been developed in [7]–[12]. Special ICs, such as the degraded IC and the one-sided IC (ZIC), have been studied in [13], [14]. The sum-rate capacity of the ZIC was established in [13], [15], and Costa proved the equivalence of the ZIC and the degraded IC for the scalar Gaussian case [14]. A recent result in [10]–[12] has shown that if a simple condition is satisfied, then treating interference as noise achieves the sum-rate capacity. The sum-rate capacity under mixed interference, i.e., when one receiver experiences strong interference and the other experiences weak interference, was derived in [11] and [16].

In this paper, we study the sum-rate capacity of the two-user Gaussian multiple-input multiple-output (MIMO) IC shown in Fig. 1. The received signals are defined as
y_1 = H_1 x_1 + H_2 x_2 + z_1,
y_2 = H_3 x_1 + H_4 x_2 + z_2,   (1)
where x_i, i = 1, 2, is the transmitted signal of user i, which is subject to an average block power constraint P_i; z_i, i = 1, 2, is a Gaussian random vector with zero mean and identity covariance matrix; and H_j, j = 1, ..., 4, are the channel matrices. For simplicity, we assume that the H_j's are real and that H_1 and H_4 are invertible. However, we remark that one can generalize our results to non-invertible or rectangular channel matrices (see Remark 1).

For the MIMO IC, Telatar and Tse [8] showed that Han and Kobayashi's region is within one bit per receive antenna of the capacity region. Some upper bounds were discussed in [17], and some lower bounds on the sum-rate capacity based on Han and Kobayashi's region were discussed in [18]. However, capacity results for the MIMO IC are still lacking. In our work, assuming the channel matrices are invertible, we derive the sum-rate capacity under noisy interference, strong interference and mixed interference, as well as one-sided interference. The capacity region of the MIMO IC with strong interference is also obtained.

The rest of the paper is organized as follows: we present our main results and proofs in Sections II and III; numerical results are given in Section IV; and we conclude in Section V.

Before proceeding we introduce some notation which will be used in the paper.
• Italic font X denotes a scalar; the bold fonts x and X denote vectors and matrices, respectively.
• A ⪰ B means that A − B is positive semi-definite.
• I denotes the identity matrix and 0 denotes the zero matrix.
• |X|, X^T, X^H, X^{-1} and X^{-T} denote, respectively, the determinant, transpose, conjugate transpose, inverse, and transpose of the inverse of the matrix X.
• x^n = [x_1^T, x_2^T, ..., x_n^T]^T is a long vector which consists of a sequence of vectors x_i, i = 1, ..., n.
• ||S|| denotes the size of the set S.
• abs(·) denotes the absolute value.
• x ∼ N(0, Σ) means that the random vector x is Gaussian distributed with zero mean and covariance matrix Σ.
• E[·] denotes expectation; Cov(·) denotes covariance matrix; I(·;·) denotes mutual information; h(·) denotes differential entropy with logarithm base e; and log(·) = log_e(·).
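For concreteness, the following minimal sketch (our own illustration, not part of the paper; the matrices, dimensions and powers are arbitrary placeholders) draws one use of the channel in (1):

```python
import numpy as np

rng = np.random.default_rng(0)
t = 2  # number of antennas per node (hypothetical)

# Arbitrary real channel matrices H1, ..., H4 (placeholders).
H1, H2, H3, H4 = (rng.standard_normal((t, t)) for _ in range(4))

# Gaussian codebooks: x_i ~ N(0, S_i) with tr(S_i) <= P_i.
P1 = P2 = 1.0
S1 = (P1 / t) * np.eye(t)  # uniform power allocation, as an example
S2 = (P2 / t) * np.eye(t)
x1 = rng.multivariate_normal(np.zeros(t), S1)
x2 = rng.multivariate_normal(np.zeros(t), S2)

# Zero-mean, identity-covariance noise at each receiver.
z1, z2 = rng.standard_normal(t), rng.standard_normal(t)

y1 = H1 @ x1 + H2 @ x2 + z1  # receiver 1: x2 is interference
y2 = H3 @ x1 + H4 @ x2 + z2  # receiver 2: x1 is interference
```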
II. MAIN RESULTS

Theorem 1: For the MIMO IC defined in (1), and where the channel matrices H_1 and H_4 are square and invertible, the sum-rate capacity is achieved by treating interference as noise at both receivers if, for any covariance matrices S_i, i = 1, 2, with tr(S_i) ≤ P_i, the following conditions are satisfied:

max_{α^H α = 1} abs(α^H M^{-1/2} W_1 M^{-1/2} α) ≤ 1/2,   (2)

and

max_{α^H α = 1} abs(α^H M^{-1/2} W_2 M^{-1/2} α) ≤ 1/2,   (3)

where

M = I − A_1^T A_1 − A_2 A_2^T,   (4)
W_1 = A_1^T A_2^T,   (5)
W_2 = A_2^T A_1^T,   (6)

and

A_1 = (I + H_2 S_2 H_2^T) H_1^{-T} H_3^T,   (7)
A_2 = (I + H_3 S_1 H_3^T) H_4^{-T} H_2^T.   (8)

The sum-rate capacity is the solution of the following optimization problem:

max  (1/2) log|I + H_1 S_1 H_1^T (I + H_2 S_2 H_2^T)^{-1}| + (1/2) log|I + H_4 S_2 H_4^T (I + H_3 S_1 H_3^T)^{-1}|
subject to  tr(S_1) ≤ P_1, tr(S_2) ≤ P_2, S_1 ⪰ 0, S_2 ⪰ 0.   (9)

Remark 1: Theorem 1 can be generalized to MIMO ICs in which the channel matrices H_1 and H_4 are non-invertible or rectangular. In those cases, two additional conditions must be satisfied so that the matrices A_1 and A_2 exist. This result will be reported in a subsequent paper.

Remark 2: In the scalar case, if we have H_1 = H_4 = 1, H_2 = √a and H_3 = √b, then from (2) and (3) we obtain

√a (1 + bP_1) + √b (1 + aP_2) ≤ 1.   (10)

Therefore Theorem 1 extends the noisy-interference sum-rate capacity of the scalar IC [10]–[12] to the MIMO IC.

Remark 3: Theorem 1 remains valid when the power constraint is replaced with a covariance matrix constraint. This extension applies to all the following theorems.

Theorem 2: For the MIMO IC defined in (1), and where the channel matrices H_1 and H_4 are square and invertible, the sum-rate capacity is achieved by treating interference as noise at both receivers if, for any covariance matrices S_i, i = 1, 2, with tr(S_i) ≤ P_i, there exist symmetric positive definite matrices Σ_1 and Σ_2 satisfying the following conditions:

Σ_1 ⪯ I − A_2 Σ_2^{-1} A_2^T,   (11)

and

Σ_2 ⪯ I − A_1 Σ_1^{-1} A_1^T,   (12)

where A_1 and A_2 are defined in Theorem 1.

Theorem 2 is another description of a sufficient condition for single-user detection to be sum-rate optimal. It can be shown that in the scalar case, (11) and (12) reduce to (10).
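The following NumPy sketch (ours, not from the paper) evaluates conditions (2) and (3) for a single candidate pair (S_1, S_2); Theorem 1 requires them to hold for every feasible pair, so a full check would sweep or optimize over S_1 and S_2. The maximization over unit-norm complex α is the numerical radius of B = M^{-1/2} W_i M^{-1/2}, which we approximate here by gridding the identity r(B) = max_θ λ_max((e^{iθ}B + e^{-iθ}B^H)/2).

```python
import numpy as np

def inv_sqrt(M):
    """M^{-1/2} for symmetric positive definite M, via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    if w.min() <= 0:
        raise ValueError("M in (4) must be positive definite")
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def numerical_radius(B, n_grid=720):
    """max_{a^H a = 1} |a^H B a| = max_theta lmax((e^{i th} B + e^{-i th} B^H)/2)."""
    return max(np.linalg.eigvalsh((np.exp(1j * th) * B
                                   + np.exp(-1j * th) * B.conj().T) / 2)[-1]
               for th in np.linspace(0.0, 2 * np.pi, n_grid, endpoint=False))

def noisy_interference_ok(H1, H2, H3, H4, S1, S2):
    """Evaluate (2) and (3) for one candidate pair (S1, S2)."""
    I = np.eye(H1.shape[0])
    A1 = (I + H2 @ S2 @ H2.T) @ np.linalg.inv(H1.T) @ H3.T   # eq. (7)
    A2 = (I + H3 @ S1 @ H3.T) @ np.linalg.inv(H4.T) @ H2.T   # eq. (8)
    M = I - A1.T @ A1 - A2 @ A2.T                            # eq. (4)
    W1, W2 = A1.T @ A2.T, A2.T @ A1.T                        # eqs. (5), (6)
    Mi = inv_sqrt(M)
    return (numerical_radius(Mi @ W1 @ Mi) <= 0.5
            and numerical_radius(Mi @ W2 @ Mi) <= 0.5)
```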
Theorem 3: For the MIMO IC defined in (1) with H_3 = 0, and with H_2 and H_4 square and invertible, the sum-rate capacity is

C* = max_{tr(S_1)≤P_1, tr(S_2)≤P_2} (1/2) log|I + H_1 S_1 H_1^T (I + H_2 S_2 H_2^T)^{-1}| + (1/2) log|I + H_4 S_2 H_4^T|,   (13)

if the following condition is satisfied:

H_2^T H_2 ≺ H_4^T H_4.   (14)

Furthermore,

C* = max_{tr(S_1)≤P_1, tr(S_2)≤P_2} min{ (1/2) log|I + H_1 S_1 H_1^T + H_2 S_2 H_2^T|, (1/2) log|I + H_1 S_1 H_1^T| + (1/2) log|I + H_4 S_2 H_4^T| },   (15)

if

H_2^T H_2 ⪰ H_4^T H_4.   (16)

Theorem 3 gives the sum-rate capacity of a MIMO ZIC. Specifically, when H_2^T H_2 ≺ H_4^T H_4 we consider the interference to be weak, and the sum-rate capacity can be achieved by treating the interference as noise. When H_2^T H_2 ⪰ H_4^T H_4 we consider the interference to be strong, and the sum-rate capacity can be achieved by fully decoding the interference.
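As an illustration (ours, not from the paper), the sketch below classifies a ZIC instance by the eigenvalues of H_4^T H_4 − H_2^T H_2 and evaluates the objective of (13) at one feasible pair (S_1, S_2); the outer maximization in (13) would still have to be carried out by a numerical optimizer.

```python
import numpy as np

def _logdet(A):
    sign, val = np.linalg.slogdet(A)
    assert sign > 0, "matrix must be positive definite"
    return val

def zic_regime(H2, H4, tol=1e-10):
    """Classify the ZIC of Theorem 3 via the eigenvalues of H4^T H4 - H2^T H2."""
    eig = np.linalg.eigvalsh(H4.T @ H4 - H2.T @ H2)
    if eig.min() > tol:
        return "weak"    # (14) holds: treat the interference as noise
    if eig.max() <= tol:
        return "strong"  # (16) holds: fully decode the interference
    return "neither"     # Theorem 3 makes no claim in this case

def zic_weak_objective(H1, H2, H4, S1, S2):
    """Objective of (13) for one feasible pair (S1, S2), in nats."""
    I = np.eye(H1.shape[0])
    tin = 0.5 * (_logdet(I + H2 @ S2 @ H2.T + H1 @ S1 @ H1.T)
                 - _logdet(I + H2 @ S2 @ H2.T))    # user 1 treats x2 as noise
    return tin + 0.5 * _logdet(I + H4 @ S2 @ H4.T)  # user 2 is interference-free
```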
Theorem 4: For the MIMO IC defined in (1), and where the channel matrices are square and invertible, if H_2^T H_2 ⪰ H_4^T H_4 and H_3^T H_3 ⪰ H_1^T H_1, then the sum-rate capacity is

C* = max_{tr(S_1)≤P_1, tr(S_2)≤P_2} min{ (1/2) log|I + H_1 S_1 H_1^T + H_2 S_2 H_2^T|, (1/2) log|I + H_3 S_1 H_3^T + H_4 S_2 H_4^T|, (1/2) log|I + H_1 S_1 H_1^T| + (1/2) log|I + H_4 S_2 H_4^T| }.   (17)

Theorem 4 shows that if H_2^T H_2 ⪰ H_4^T H_4 and H_3^T H_3 ⪰ H_1^T H_1 are satisfied, then both receivers experience strong interference. Thus the channel acts as a compound MIMO multiple access channel, and the sum-rate capacity is achieved by fully decoding the interference at both receivers.

Theorem 5: For the MIMO IC defined in (1), and where the channel matrices are square and invertible, if H_2^T H_2 ≺ H_4^T H_4 and H_3^T H_3 ⪰ H_1^T H_1, then the sum-rate capacity is

C* = max_{tr(S_1)≤P_1, tr(S_2)≤P_2} min{ (1/2) log|I + H_3 S_1 H_3^T + H_4 S_2 H_4^T|, (1/2) log|I + H_1 S_1 H_1^T (I + H_2 S_2 H_2^T)^{-1}| + (1/2) log|I + H_4 S_2 H_4^T| }.   (18)
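For a numerical reading of (18), the following sketch (ours) evaluates the inner min for one feasible pair (S_1, S_2); the first term is the compound multiple-access bound at receiver 2, and the second is receiver 1's treat-interference-as-noise rate plus receiver 2's interference-free rate.

```python
import numpy as np

def _logdet(A):
    sign, val = np.linalg.slogdet(A)
    assert sign > 0
    return val

def mixed_interference_objective(H1, H2, H3, H4, S1, S2):
    """Inner min of (18) for one feasible pair (S1, S2); C* maximizes this value."""
    I = np.eye(H1.shape[0])
    mac = 0.5 * _logdet(I + H3 @ S1 @ H3.T + H4 @ S2 @ H4.T)  # rx 2 decodes both
    tin = (0.5 * (_logdet(I + H2 @ S2 @ H2.T + H1 @ S1 @ H1.T)
                  - _logdet(I + H2 @ S2 @ H2.T))              # rx 1 treats x2 as noise
           + 0.5 * _logdet(I + H4 @ S2 @ H4.T))
    return min(mac, tin)
```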
Theorem 5 gives the sum-rate capacity of the MIMO IC with mixed interference, i.e., H_2^T H_2 ≺ H_4^T H_4 and H_3^T H_3 ⪰ H_1^T H_1. The sum-rate capacity is achieved by treating interference as noise at the receiver that experiences weak interference and by fully decoding the interference at the receiver that experiences strong interference.

Theorem 6: For the MIMO IC defined in (1) with H_3 = 0 and all other channel matrices being square and invertible, if H_2^T H_2 ⪰ H_4^T H_4, then the capacity region is

∪_{tr(S_1)≤P_1, tr(S_2)≤P_2} { (R_1, R_2) :
R_1 ≤ (1/2) log|I + H_1 S_1 H_1^T|,
R_2 ≤ (1/2) log|I + H_4 S_2 H_4^T|,
R_1 + R_2 ≤ (1/2) log|I + H_1 S_1 H_1^T + H_2 S_2 H_2^T| }.

Theorem 7: For the MIMO IC defined in (1), and where the channel matrices H_2 and H_4 are square and invertible, if H_2^T H_2 ⪰ H_4^T H_4 and H_3^T H_3 ⪰ H_1^T H_1, the capacity region is

∪_{tr(S_1)≤P_1, tr(S_2)≤P_2} { (R_1, R_2) :
R_1 ≤ (1/2) log|I + H_1 S_1 H_1^T|,
R_2 ≤ (1/2) log|I + H_4 S_2 H_4^T|,
R_1 + R_2 ≤ (1/2) log|I + H_1 S_1 H_1^T + H_2 S_2 H_2^T|,
R_1 + R_2 ≤ (1/2) log|I + H_3 S_1 H_3^T + H_4 S_2 H_4^T| }.

Theorems 6 and 7 give the capacity regions of the MIMO ZIC and the MIMO IC under strong interference.

Finally, we connect the MIMO IC with the parallel Gaussian interference channel (PGIC), which is a special case of (1) with all H_i's being diagonal matrices. In [19] we present conditions under which single-user detection for each sub-channel is sum-rate optimal under the assumption that coding and decoding are independent across sub-channels. The following theorem proves that independent coding and decoding is indeed sum-rate optimal under noisy interference if some conditions are satisfied.

Theorem 8: For the MIMO IC defined in (1) with H_i = diag(h_{i1}, ..., h_{it}), i = 1, ..., 4, let P*_{1i} and P*_{2i} be the optimal solution of the following optimization problem:

max  (1/2) Σ_{i=1}^t [ log(1 + h_{1i}^2 P_{1i} / (1 + h_{2i}^2 P_{2i})) + log(1 + h_{4i}^2 P_{2i} / (1 + h_{3i}^2 P_{1i})) ]
subject to  Σ_{i=1}^t P_{1i} ≤ P_1, Σ_{i=1}^t P_{2i} ≤ P_2, P_{1i} ≥ 0, P_{2i} ≥ 0.   (19)

Then the sum-rate capacity is

C* = Σ_{i=1}^t C_i(P*_{1i}, P*_{2i})   (20)
   = (1/2) Σ_{i=1}^t [ log(1 + h_{1i}^2 P*_{1i} / (1 + h_{2i}^2 P*_{2i})) + log(1 + h_{4i}^2 P*_{2i} / (1 + h_{3i}^2 P*_{1i})) ],

if

abs(h_{1i} h_{2i}) (1 + h_{3i}^2 P*_{1i}) + abs(h_{3i} h_{4i}) (1 + h_{2i}^2 P*_{2i}) ≤ abs(h_{1i} h_{4i})   (21)

for all i = 1, ..., t, and

∩_{i=1}^t ∂C_i(P*_{1i}, P*_{2i}) ≠ ∅,   (22)

where ∂C_i(P*_{1i}, P*_{2i}) denotes the subdifferential of C_i(·, ·) at the point (P*_{1i}, P*_{2i}), and ∅ denotes the empty set. The notion of subdifferential follows that in [20].

Theorem 8 illustrates that if each sub-channel (each antenna pair of the MIMO IC) satisfies the noisy-interference condition, then independent coding across sub-channels with single-user detection at each sub-channel achieves the sum-rate capacity.
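The per-sub-channel quantities in (19)–(21) are scalars, so they can be checked directly. The sketch below (ours; it takes the power allocations as given, whereas Theorem 8 requires (21) at the optimal allocation from (19) together with the subdifferential condition (22)) evaluates the condition and the single-user-detection sum rate:

```python
import numpy as np

def pgic_noisy_condition(h1, h2, h3, h4, P1, P2):
    """Elementwise check of (21); h_k and P_k are length-t arrays of gains/powers."""
    lhs = (np.abs(h1 * h2) * (1.0 + h3**2 * P1)
           + np.abs(h3 * h4) * (1.0 + h2**2 * P2))
    return lhs <= np.abs(h1 * h4)  # boolean per sub-channel

def pgic_sum_rate(h1, h2, h3, h4, P1, P2):
    """Single-user-detection sum rate of (19)-(20) for given powers, in nats."""
    return 0.5 * np.sum(np.log1p(h1**2 * P1 / (1.0 + h2**2 * P2))
                        + np.log1p(h4**2 * P2 / (1.0 + h3**2 * P1)))
```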
III. PROOF OF THE MAIN RESULTS
A. Preliminaries

We introduce some lemmas that we use to prove our main results.

Lemma 1: [21, Lemma 1] Let x_1, ..., x_n be zero-mean random vectors and denote the covariance matrix of the stacked vector [x_1^T, ..., x_n^T]^T as K. Let S be a subset of {1, 2, ..., n} and S̄ be its complement. Then we have

h(x_S | x_S̄) ≤ h(x*_S | x*_S̄),   (23)

where [x*_1^T, ..., x*_n^T]^T ∼ N(0, K).

Lemma 2: Let x_i^n = [x_{i,1}^T, ..., x_{i,n}^T]^T, i = 1, ..., k, be k stacked random vectors, each of which consists of n vectors. Let y^n = [y_1^T, ..., y_n^T]^T be n Gaussian random vectors with covariance matrix

Σ_{i=1}^k λ_i Cov(x_i^n) = Cov(y^n),   (24)

where Σ_{i=1}^k λ_i = 1 and λ_i ≥ 0. Let S be a subset of {1, 2, ..., n} and S̄ be its complement. Then we have

Σ_{i=1}^k λ_i h(x_{i,S} | x_{i,S̄}) ≤ h(y_S | y_S̄).   (25)

The proof of Lemma 2 is given in the Appendix. Lemma 2 shows the concave-like property of the conditional entropy h(x_S | x_S̄) over the covariance matrix Cov(x^n).

Consider a special case of Lemma 2 with n = 1, S̄ being the empty set, and λ_i = 1/k. We obtain the following lemma.

Lemma 3: Let x^k be a set of k random vectors. Then

h(x^k) ≤ k · h(x̂*),   (26)

where x̂* is a Gaussian vector with the covariance matrix

Cov(x̂*) = (1/k) Σ_{i=1}^k Cov(x_i).   (27)

Let n = 2, ||S|| = ||S̄|| = 1 and λ_i = 1/k. We obtain another special case of Lemma 2.

Lemma 4: Let x^k and y^k be two sequences of random vectors. Then we have

h(y^k | x^k) ≤ k · h(ŷ* | x̂*),   (28)

where x̂* and ŷ* are Gaussian vectors with the joint covariance matrix

Cov([x̂*; ŷ*]) = (1/k) Σ_{i=1}^k Cov([x_i; y_i]).   (29)

The proof is straightforward from Lemma 2 by noticing that h(y^k | x^k) ≤ Σ_{i=1}^k h(y_i | x_i).

Lemma 5: [22, Lemma II.2] Let x* ∼ N(0, K_x), and let z and z* be real random vectors (independent of x*) with the same covariance matrix K_z. If z* ∼ N(0, K_z) and z has any other distribution with covariance matrix K_z, then

I(x*; x* + z) ≥ I(x*; x* + z*).   (30)

If K_z ≻ 0, then equality is achieved if and only if z ∼ N(0, K_z).

Lemma 6: Let x^n be a sequence of n zero-mean random vectors. Let z and z̃ be two independent Gaussian random vectors, and let z^n and z̃^n be two sequences of random vectors, each independent and identically distributed (i.i.d.) as z and z̃, respectively. Then

h(x^n + z^n) − h(x^n + z^n + z̃^n) ≤ n h(x̂* + z) − n h(x̂* + z + z̃),   (31)

where x̂* is a zero-mean Gaussian random vector with covariance matrix

Cov(x̂*) = (1/n) Σ_{i=1}^n Cov(x_i).   (32)

The proof is given in the Appendix.

Lemma 7: The block matrix [I, A; A^T, B] ⪰ 0 if and only if B ⪰ A^T A. The proof is omitted.

Lemma 8: [23, Theorem 5.2] Suppose W is nonsingular and M is positive definite. Then the matrix equation

X + W^H X^{-1} W = M   (33)

has a positive definite solution X if and only if

max_{α^H α = 1} abs(α^H M^{-1/2} W M^{-1/2} α) ≤ 1/2.   (34)
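To make Lemma 8 concrete, the following sketch (ours, not part of the proof) searches for a positive definite solution of (33) by the fixed-point iteration X ← M − W^H X^{-1} W. Starting from X = M, the iterates are reported in the literature on this equation (cf. [23]) to converge to the maximal solution whenever one exists, and condition (34) guarantees existence.

```python
import numpy as np

def riccati_max_solution(W, M, iters=1000, tol=1e-12):
    """Fixed-point iteration for X + W^H X^{-1} W = M, eq. (33); a sketch only."""
    X = M.astype(complex)
    for _ in range(iters):
        # np.linalg.solve(X, W) computes X^{-1} W; raises if X becomes singular,
        # which can happen when condition (34) fails.
        Xn = M - W.conj().T @ np.linalg.solve(X, W)
        if np.linalg.norm(Xn - X) < tol:
            X = Xn
            break
        X = Xn
    Xh = (X + X.conj().T) / 2  # symmetrize against round-off
    if np.linalg.eigvalsh(Xh).min() <= 0:
        raise ValueError("no positive definite solution found")
    return Xh
```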
B. Proof of Theorem 1

Suppose the channel is used n times. The transmit and receive vector sequences are denoted by x_i^n and y_i^n for user i, i = 1, 2. For the jth use of the channel, the covariance matrix of x_{i,j} is denoted by S_{i,j}, j = 1, ..., n, and we use the power constraints

Σ_{j=1}^n tr(S_{i,j}) ≤ n P_i.   (35)

From Fano's inequality we have that the achievable sum rate R_1 + R_2 must satisfy

n(R_1 + R_2) − nε
≤ I(x_1^n; y_1^n) + I(x_2^n; y_2^n)
≤ I(x_1^n; y_1^n, H_3 x_1^n + n_1^n) + I(x_2^n; y_2^n, H_2 x_2^n + n_2^n)
= h(H_3 x_1^n + n_1^n) − h(n_1^n) + h(y_1^n | H_3 x_1^n + n_1^n) − h(H_2 x_2^n + z_1^n | n_1^n)
  + h(H_2 x_2^n + n_2^n) − h(n_2^n) + h(y_2^n | H_2 x_2^n + n_2^n) − h(H_3 x_1^n + z_2^n | n_2^n),   (36)

where z_i^n = [z_{i,1}^T, z_{i,2}^T, ..., z_{i,n}^T]^T, i = 1, 2, with all the z_{i,j}, j = 1, ..., n, independent of each other; n_i^n = [n_{i,1}^T, n_{i,2}^T, ..., n_{i,n}^T]^T, where the n_{i,j} are i.i.d. Gaussian vectors with zero mean and covariance matrices Σ_i. We further let n_i be correlated with z_i, with E[z_i n_i^T] = A_i. We can write the joint distribution of z_i and n_i as

[z_i; n_i] ∼ N(0, [I, A_i; A_i^T, Σ_i]),  i = 1, 2,   (37)

and we have

Cov(z_i | n_i) = I − A_i Σ_i^{-1} A_i^T.   (38)

Let

Σ_1 = I − A_2 Σ_2^{-1} A_2^T,   (39)

so we have

Cov(n_1) = Cov(z_2 | n_2).   (40)

Since n_{1,j} is independent of n_{1,k}, and z_{2,j} is independent of n_{2,k}, for any j ≠ k, we have

Cov(n_1^n) = Cov(z_2^n | n_2^n).   (41)

Therefore we have

h(H_3 x_1^n + n_1^n) − h(H_3 x_1^n + z_2^n | n_2^n) = 0.   (42)

Similarly, let

Σ_2 = I − A_1 Σ_1^{-1} A_1^T,   (43)

so we have

h(H_2 x_2^n + n_2^n) − h(H_2 x_2^n + z_1^n | n_1^n) = 0.   (44)
Therefore, if (39) and (43) hold, then (42) and (44) are constants regardless of the distributions of x_1^n and x_2^n. Then we can write

h(H_3 x_1^n + n_1^n) − h(H_3 x_1^n + z_2^n | n_2^n) = n h(H_3 x̂_1* + n_1) − n h(H_3 x̂_1* + z_2 | n_2),   (45)

h(H_2 x_2^n + n_2^n) − h(H_2 x_2^n + z_1^n | n_1^n) = n h(H_2 x̂_2* + n_2) − n h(H_2 x̂_2* + z_1 | n_1),   (46)

where x̂_1* and x̂_2* are zero-mean Gaussian vectors with respective covariance matrices

Cov(x̂_1*) = (1/n) Σ_{i=1}^n S_{1,i} ≜ Ŝ_1*,   (47)

and

Cov(x̂_2*) = (1/n) Σ_{i=1}^n S_{2,i} ≜ Ŝ_2*.   (48)

Next, by Lemma 4 we have

h(y_1^n | H_3 x_1^n + n_1^n) = h(H_1 x_1^n + H_2 x_2^n + z_1^n | H_3 x_1^n + n_1^n)
≤ n h(H_1 x̂_1* + H_2 x̂_2* + z_1 | H_3 x̂_1* + n_1)
= (n/2) log|H_1 Ŝ_1* H_1^T + H_2 Ŝ_2* H_2^T + I − (H_1 Ŝ_1* H_3^T + A_1)(H_3 Ŝ_1* H_3^T + Σ_1)^{-1}(H_3 Ŝ_1* H_1^T + A_1^T)| + (n/2) log 2πe.   (49)

Similarly, we obtain

h(y_2^n | H_2 x_2^n + n_2^n)
≤ (n/2) log|H_4 Ŝ_2* H_4^T + H_3 Ŝ_1* H_3^T + I − (H_4 Ŝ_2* H_2^T + A_2)(H_2 Ŝ_2* H_2^T + Σ_2)^{-1}(H_2 Ŝ_2* H_4^T + A_2^T)| + (n/2) log 2πe.   (50)

On substituting (45)–(50) into (36) we have

R_1 + R_2 − ε
≤ (1/2) log|H_3 Ŝ_1* H_3^T + Σ_1| − (1/2) log|Σ_1| − (1/2) log|H_2 Ŝ_2* H_2^T + I − A_1 Σ_1^{-1} A_1^T|
  + (1/2) log|H_1 Ŝ_1* H_1^T + H_2 Ŝ_2* H_2^T + I − (H_1 Ŝ_1* H_3^T + A_1)(H_3 Ŝ_1* H_3^T + Σ_1)^{-1}(H_3 Ŝ_1* H_1^T + A_1^T)|
  + (1/2) log|H_2 Ŝ_2* H_2^T + Σ_2| − (1/2) log|Σ_2| − (1/2) log|H_3 Ŝ_1* H_3^T + I − A_2 Σ_2^{-1} A_2^T|
  + (1/2) log|H_4 Ŝ_2* H_4^T + H_3 Ŝ_1* H_3^T + I − (H_4 Ŝ_2* H_2^T + A_2)(H_2 Ŝ_2* H_2^T + Σ_2)^{-1}(H_2 Ŝ_2* H_4^T + A_2^T)|
(a),(b): = (1/2) log|I + H_1 Ŝ_1* H_1^T (I + H_2 Ŝ_2* H_2^T)^{-1}| + (1/2) log|I + H_4 Ŝ_2* H_4^T (I + H_3 Ŝ_1* H_3^T)^{-1}|
(c): ≤ (1/2) log|I + H_1 S_1* H_1^T (I + H_2 S_2* H_2^T)^{-1}| + (1/2) log|I + H_4 S_2* H_4^T (I + H_3 S_1* H_3^T)^{-1}|,   (51)

where in (a) we let

A_1 = (I + H_2 Ŝ_2* H_2^T) H_1^{-T} H_3^T,   (52)

and

A_2 = (I + H_3 Ŝ_1* H_3^T) H_4^{-T} H_2^T;   (53)

equality (b) is from the fact that |I − UV| = |I − VU|; and inequality (c) is from the assumption that S_1* and S_2* optimize (9), the equality holding when Ŝ_1* = S_1* and Ŝ_2* = S_2*.

The sum rate in (51) is also achievable by treating interference as noise at each receiver. Therefore the sum-rate capacity is given by (51) if there exist Gaussian vectors n_1 and n_2 with the distribution in (37) that satisfy (39), (43), (52) and (53).

We consider the existence of n_1. From Lemma 7, n_1 exists if and only if

Σ_1 ⪰ A_1^T A_1,   (54)

with A_1 defined in (52). From (43) and the Woodbury identity [24]

(A + CBC^T)^{-1} = A^{-1} − A^{-1} C (B^{-1} + C^T A^{-1} C)^{-1} C^T A^{-1},   (55)

we have

Σ_2^{-1} = I − A_1 (−Σ_1 + A_1^T A_1)^{-1} A_1^T.   (56)

On substituting (56) into (39) we have

Σ_1 = I − A_2 A_2^T + A_2 A_1 (A_1^T A_1 − Σ_1)^{-1} A_1^T A_2^T.   (57)

Define

X = Σ_1 − A_1^T A_1,   (58)

and substitute (4) and (5) into (57). We then have the following matrix equation:

X + W_1^T X^{-1} W_1 = M.   (59)

Equation (59) is a special case of a discrete algebraic Riccati equation [23]. From Lemma 8, with M symmetric and positive definite, (59) has a symmetric positive definite solution X if and only if (2) holds. Therefore n_1 exists under condition (2). Similarly, n_2 exists under condition (3). Therefore, if (2) and (3) hold for any S_i satisfying the power constraint, then for any choice of S_{i,j}, i = 1, 2, j = 1, ..., n, the sum rate must satisfy (51). This completes our proof.

C. Proof of Theorem 2

In the proof of Theorem 1, we let (39) and (43) hold and obtain (45) and (46). On the other hand, by Lemma 6, if (11) and (12) hold, then we can still obtain (45) and (46), with the equalities replaced by inequalities in the required direction. The rest of the proof of Theorem 2 is the same as the proof of Theorem 1. Therefore, treating interference as noise achieves the sum-rate capacity if there exist Σ_1 and Σ_2 that satisfy (11) and (12).

D. Proof of Theorem 3

We provide two proofs of the first part of Theorem 3, i.e., the case H_2^T H_2 ≺ H_4^T H_4. The first proof applies the same genie-aided method used in the proof of Theorem 1. The second proof does not need a genie and is based on Lemma 6.

1) Genie-aided proof: This proof is similar to the proof of Theorem 1 but much simpler. Assume a Gaussian vector n which has the joint distribution with z_1 given by

[z_1; n] ∼ N(0, [I, A; A^T, Σ]).   (60)

Let n^n be a sequence of n column random vectors, each i.i.d. as n. Then from Fano's inequality we have

n(R_1 + R_2) − nε
≤ I(x_1^n; y_1^n) + I(x_2^n; y_2^n)
≤ I(x_1^n; y_1^n, H_1 x_1^n + n^n) + I(x_2^n; y_2^n)
= h(H_1 x_1^n + n^n) + h(H_1 x_1^n + H_2 x_2^n + z_1^n | H_1 x_1^n + n^n) − h(H_2 x_2^n + z_1^n | n^n) + h(H_4 x_2^n + z_2^n) − h(n^n) − h(z_2^n)
(a),(b): ≤ n h(H_1 x̂_1* + n) + n h(H_1 x̂_1* + H_2 x̂_2* + z_1 | H_1 x̂_1* + n) − n h(H_2 x̂_2* + z_1 | n) + n h(H_4 x̂_2* + z_2) − n h(n) − n h(z_2)
(c): = (n/2) log|I + H_1 Ŝ_1* H_1^T A^{-1}| + (n/2) log|I + H_4 Ŝ_2* H_4^T|
= (n/2) log|I + H_1 Ŝ_1* H_1^T (I + H_2 Ŝ_2* H_2^T)^{-1}| + (n/2) log|I + H_4 Ŝ_2* H_4^T|
≤ (n/2) max_{tr(S_1)≤P_1, tr(S_2)≤P_2} { log|I + H_1 S_1 H_1^T (I + H_2 S_2 H_2^T)^{-1}| + log|I + H_4 S_2 H_4^T| },   (61)

where (a) is from Lemmas 3 and 4, with x̂_i* a zero-mean Gaussian vector with Cov(x̂_i*) = (1/n) Σ_{j=1}^n Cov(x_{i,j}), i = 1, 2; in (b) we let

H_4^{-1} H_4^{-T} = H_2^{-1} (I − A Σ^{-1} A^T) H_2^{-T},   (62)

and thus

−h(H_2 x_2^n + z_1^n | n^n) + h(H_4 x_2^n + z_2^n) = −n log(abs|H_2|) + n log(abs|H_4|) = −n h(H_2 x̂_2* + z_1 | n) + n h(H_4 x̂_2* + z_2);   (63)

and in (c) we let

A = I + H_2 Ŝ_2* H_2^T.   (64)

In order that all the equalities in (61) hold, there must exist n such that the covariance matrix in (60) satisfies (62) and (64). From (62) and (64) we have

Σ = A^T (I − H_2 H_4^{-1} H_4^{-T} H_2^T)^{-1} A.   (65)

Therefore n exists if and only if

I − H_2 H_4^{-1} H_4^{-T} H_2^T ≻ 0,   (66)

which is equivalent to

H_2^T H_2 ≺ H_4^T H_4.   (67)

2) Proof based on Lemma 6: Starting from Fano's inequality we have

n(R_1 + R_2) − nε
≤ I(x_1^n; y_1^n) + I(x_2^n; y_2^n)
= h(H_1 x_1^n + H_2 x_2^n + z_1^n) − h(H_2 x_2^n + z_1^n) + h(H_4 x_2^n + z_2^n) − h(z_2^n)
(a): ≤ n h(H_1 x̂_1* + H_2 x̂_2* + z_1) − h(x_2^n + H_2^{-1} z_1^n) + h(x_2^n + H_4^{-1} z_2^n) − h(z_2^n) − n log(abs|H_2|) + n log(abs|H_4|)
(b): ≤ n h(H_1 x̂_1* + H_2 x̂_2* + z_1) − n h(x̂_2* + H_2^{-1} z_1) + n h(x̂_2* + H_4^{-1} z_2) − h(z_2^n) − n log(abs|H_2|) + n log(abs|H_4|)
= (n/2) log|I + H_1 Ŝ_1* H_1^T (I + H_2 Ŝ_2* H_2^T)^{-1}| + (n/2) log|I + H_4 Ŝ_2* H_4^T|
≤ (n/2) max_{tr(S_1)≤P_1, tr(S_2)≤P_2} { log|I + H_1 S_1 H_1^T (I + H_2 S_2 H_2^T)^{-1}| + log|I + H_4 S_2 H_4^T| },   (68)

where (a) is from Lemma 3 and x̂_i* is a zero-mean Gaussian vector with Cov(x̂_i*) = (1/n) Σ_{j=1}^n Cov(x_{i,j}); and (b) is from Lemma 6.

Next we prove the second part of Theorem 3. The achievability of the sum rate is straightforward by letting the first receiver decode both messages. We need only show the converse. Start from Fano's inequality and notice that H_2^{-1} H_2^{-T} ⪯ H_4^{-1} H_4^{-T}; then the second and third terms of (b) in (68) become

−h(x_2^n + H_2^{-1} z_1^n) + h(x_2^n + H_4^{-1} z_2^n)
= −h(x_2^n + H_2^{-1} z_1^n) + h(x_2^n + H_2^{-1} z_1^n + z̃^n)
= I(z̃^n; x_2^n + H_2^{-1} z_1^n + z̃^n)
≤ I(z̃^n; H_2^{-1} z_1^n + z̃^n)
= −h(H_2^{-1} z_1^n) + h(H_4^{-1} z_2^n),   (69)

where z̃ ∼ N(0, H_4^{-1} H_4^{-T} − H_2^{-1} H_2^{-T}). On substituting (69) back into (68) we have

R_1 + R_2 − ε ≤ (1/2) log|I + H_1 Ŝ_1* H_1^T + H_2 Ŝ_2* H_2^T|.   (70)

On the other hand, we have

n(R_1 + R_2) − nε ≤ I(x_1^n; y_1^n | x_2^n) + I(x_2^n; y_2^n) ≤ (n/2) log|I + H_1 Ŝ_1* H_1^T| + (n/2) log|I + H_4 Ŝ_2* H_4^T|.   (71)

From (70) and (71) we have

R_1 + R_2 − ε ≤ min{ (1/2) log|I + H_1 Ŝ_1* H_1^T + H_2 Ŝ_2* H_2^T|, (1/2) log|I + H_1 Ŝ_1* H_1^T| + (1/2) log|I + H_4 Ŝ_2* H_4^T| }
≤ max_{tr(S_1)≤P_1, tr(S_2)≤P_2} min{ (1/2) log|I + H_1 S_1 H_1^T + H_2 S_2 H_2^T|, (1/2) log|I + H_1 S_1 H_1^T| + (1/2) log|I + H_4 S_2 H_4^T| }.   (72)

E. Proof of Theorem 4

The achievability is straightforward by letting both receivers decode both messages. We need only show the converse, which can be shown by setting H_2 = 0 and H_3 = 0, respectively, and using Theorem 3.
F. Proof of Theorem 5

The achievability part is straightforward: user 2 first decodes the message of user 1 and then decodes its own message, while user 1 treats the signal from user 2 as noise. To prove the converse, we first let H_3 = 0 and use the first part of Theorem 3, and then let H_2 = 0 and use the second part of Theorem 3. We obtain

R_1 + R_2 ≤ max_{tr(S_1)≤P_1, tr(S_2)≤P_2} min{ (1/2) log|I + H_3 S_1 H_3^T + H_4 S_2 H_4^T|, (1/2) log|I + H_1 S_1 H_1^T (I + H_2 S_2 H_2^T)^{-1}| + (1/2) log|I + H_4 S_2 H_4^T|, (1/2) log|I + H_1 S_1 H_1^T| + (1/2) log|I + H_4 S_2 H_4^T| }.   (73)

We complete the proof by pointing out that the last term of (73) is redundant because of the second term.

G. Proof of Theorems 6, 7 and 8

Theorems 6 and 7 are consequences of Theorems 3 and 4, respectively. The proof is straightforward and hence is omitted. The proof of Theorem 8 is also omitted due to lack of space.

IV. NUMERICAL RESULT

Consider a symmetric MIMO IC with two transmit antennas and two receive antennas. Let H_1 = H_4 = I and H_2 = H_3 = √a [λ_1, ρ; ρ, λ_2], where a varies from 0 to 1. Fig. 2 shows the noisy-interference sum-rate capacity versus a for different λ_1, λ_2 and ρ. There is a range of a within which the channel has noisy interference. Fig. 2 shows that this range of a and the sum-rate capacity both decrease as the norms of H_2 and H_3 increase.

[Fig. 2. Sum-rate capacity vs. a, for (λ_1 = 0.5, λ_2 = 1, ρ = 0.3), (λ_1 = λ_2 = 1, ρ = 0.3), and (λ_1 = λ_2 = 1, ρ = 0.9).]

V. CONCLUSION

We have extended capacity results for scalar ICs to MIMO ICs and have obtained the sum-rate capacity of the MIMO IC with noisy interference, strong interference, and Z interference, as well as the capacity region of the MIMO IC with strong interference.

ACKNOWLEDGMENT

We gratefully acknowledge S. Annapureddy and V. Veeravalli of the University of Illinois at Urbana-Champaign for pointing out an error in the original manuscript related to Theorems 1 and 2.

APPENDIX
A. Proof of Lemma 2

Define a discrete random variable E with the distribution

P_E(E = i) = λ_i,  i = 1, ..., k.   (74)

Let the conditional distribution of x^n be

p_{x^n | E}(x^n | E = i) = p_{x_i^n}(x^n).   (75)

Then the probability density function of x^n is

p_{x^n} = Σ_{i=1}^k λ_i · p_{x_i^n}.   (76)

Therefore

Cov(x^n) = Σ_{i=1}^k λ_i Cov(x_i^n) = Cov(y^n).   (77)

Then from Lemma 1 we have

h(x_S | x_S̄) ≤ h(y_S | y_S̄).   (78)

From (75) we have

h(x_S | x_S̄, E) = Σ_{i=1}^k P_E(E = i) h(x_{i,S} | x_{i,S̄}) = Σ_{i=1}^k λ_i h(x_{i,S} | x_{i,S̄}) ≤ h(x_S | x_S̄).   (79)

Therefore we have

Σ_{i=1}^k λ_i h(x_{i,S} | x_{i,S̄}) ≤ h(y_S | y_S̄).   (80)

B. Proof of Lemma 6

h(x^n + z^n) − h(x^n + z^n + z̃^n)
= −I(z̃^n; x^n + z^n + z̃^n)
(a): ≤ −I(z̃^n; x*^n + z^n + z̃^n)
= −h(z̃^n) + h(z̃^n | x*^n + z^n + z̃^n)
(b): ≤ −n h(z̃) + n h(z̃ | x̂* + z + z̃)
= −n I(z̃; x̂* + z + z̃)
= n h(x̂* + z) − n h(x̂* + z + z̃),   (81)
where (a) is from Lemma 5 and x*^n has the same covariance matrix as x^n, and (b) is from Lemma 4.

REFERENCES

[1] C. E. Shannon, "Two-way communication channels," in Proc. 4th Berkeley Symp. Math. Stat. and Prob., Berkeley, CA, 1961, vol. 1, pp. 611–644. Also available in Claude E. Shannon: Collected Papers, IEEE Press, New York, 1993.
[2] A. B. Carleial, "A case where interference does not reduce capacity," IEEE Trans. Inf. Theory, vol. 21, pp. 569–570, Sep. 1975.
[3] T. S. Han and K. Kobayashi, "A new achievable rate region for the interference channel," IEEE Trans. Inf. Theory, vol. 27, pp. 49–60, Jan. 1981.
[4] H. Sato, "The capacity of the Gaussian interference channel under strong interference," IEEE Trans. Inf. Theory, vol. 27, pp. 786–788, Nov. 1981.
[5] H. F. Chong, M. Motani, H. K. Garg, and H. El Gamal, "On the Han–Kobayashi region for the interference channel," IEEE Trans. Inf. Theory, vol. 54, pp. 3188–3195, Jul. 2008.
[6] G. Kramer, "Review of rate regions for interference channels," in Proc. International Zurich Seminar, Zurich, Feb. 22–24, 2006, pp. 162–165.
[7] R. H. Etkin, D. N. C. Tse, and H. Wang, "Gaussian interference channel capacity to within one bit," submitted to IEEE Trans. Inf. Theory, 2007.
[8] E. Telatar and D. Tse, "Bounds on the capacity region of a class of interference channels," in Proc. IEEE International Symposium on Information Theory, Nice, France, Jun. 2007, pp. 2871–2874.
[9] G. Kramer, "Outer bounds on the capacity of Gaussian interference channels," IEEE Trans. Inf. Theory, vol. 50, pp. 581–586, Mar. 2004.
[10] X. Shang, G. Kramer, and B. Chen, "A new outer bound and the noisy-interference sum-rate capacity for Gaussian interference channels," submitted to IEEE Trans. Inf. Theory. http://arxiv.org/abs/0712.1987, Dec. 2007.
[11] A. S. Motahari and A. K. Khandani, "Capacity bounds for the Gaussian interference channel," submitted to IEEE Trans. Inf. Theory. http://arxiv.org/abs/0801.1306, Jan. 2008.
[12] V. S. Annapureddy and V. Veeravalli, "Gaussian interference networks: Sum capacity in the low interference regime and new outer bounds on the capacity region," submitted to IEEE Trans. Inf. Theory. http://arxiv.org/abs/0802.3495, Feb. 2008.
[13] H. Sato, "On degraded Gaussian two-user channels," IEEE Trans. Inf. Theory, vol. 24, pp. 634–640, Sep. 1978.
[14] M. H. M. Costa, "On the Gaussian interference channel," IEEE Trans. Inf. Theory, vol. 31, pp. 607–615, Sep. 1985.
[15] I. Sason, "On achievable rate regions for the Gaussian interference channel," IEEE Trans. Inf. Theory, vol. 50, pp. 1345–1356, Jun. 2004.
[16] Y. Weng and D. Tuninetti, "On Gaussian interference channels with mixed interference," in Proc. Information Theory and Applications Workshop, San Diego, CA, Jan. 2008.
[17] S. Vishwanath and S. A. Jafar, "On the capacity of vector Gaussian interference channels," in Proc. IEEE Information Theory Workshop, San Antonio, TX, Oct. 2004.
[18] X. Shang, B. Chen, and M. J. Gans, "On achievable sum rate for MIMO interference channels," IEEE Trans. Inf. Theory, vol. 52, no. 9, pp. 4313–4320, Sep. 2006.
[19] X. Shang, B. Chen, and G. Kramer, "On sum-rate capacity of parallel Gaussian symmetric interference channels," in Proc. Globecom, New Orleans, LA, Nov. 2008, to appear.
[20] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, 1997.
[21] J. A. Thomas, "Feedback can at most double Gaussian multiple access channel capacity," IEEE Trans. Inf. Theory, vol. 33, no. 5, pp. 711–716, Sep. 1987.
[22] S. N. Diggavi and T. M. Cover, "The worst additive noise under a covariance constraint," IEEE Trans. Inf. Theory, vol. 47, no. 7, pp. 3072–3081, Nov. 2001.
[23] J. C. Engwerda, A. C. M. Ran, and A. L. Rijkeboer, "Necessary and sufficient conditions for the existence of a positive definite solution of the matrix equation X + A*X^{-1}A = Q," Linear Algebra and Its Applications, vol. 186, pp. 255–275, 1993.
[24] G. H. Golub and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore, MD, 3rd edition, 1996.