Jul 26, 2013 - [1] M.A. Maddah-Ali, A.S. Motahari, and A.K. Khandani, âCommunication over MIMO X channels: Interference align- ment, decomposition, and ...
Two Performance-limiting Factors for Interference Alignment: Channel Diversity Order and the Number of
arXiv:1307.6125v2 [cs.IT] 26 Jul 2013
Data Streams Per User∗ Ruoyu Sun† and Zhi-Quan Luo† July, 2013
Abstract Consider an interference channel with a finite diversity order L (i.e., each channel matrix is a generic linear combination of L fixed basis matrices) whereby each user sends d independent data streams to its intended receiver. In this paper, we study the effect of finite L and d on the achievable Degrees of Freedom (DoF) via vector space interference alignment (IA). The main result of this work is that for d = 1, the total number of users K that can communicate interference-free using linear transceivers is upper bounded by NL + N 2 /4, where N is the dimension of the signalling space. An immediate consequence of this upper bound is that for a K-user SISO interference channel extensions (not necessarily generic), the maximum DoF is no more than q with T channel
5 1 min 4 K, L + 4 T . Additionally, for a MIMO interference channel with d = 1, we show that the maximum DoF lies in [Mr + Mt − 1, Mr + Mt ] regardless of the number of generic channel extensions, where Mr and Mt are the number of receive/transmit antennas (per user) respectively. In establishing these DoF bounds, we also provide some clarifications on the use of the dimensionality counting technique.
1 Introduction With the rapid growth of wireless data traffic, multiuser interference has become a major performance limiting factor in today’s cellular wireless networks, making proper interference management absolutely crucial in order to achieve high network throughput. Among various interference management techniques, interference alignment (IA) has recently attracted considerable attention due to its significant potential to mitigate interference in wireless communication networks. ∗
This work is supported in part by the National Science Foundation, grant number TF-0728676. The authors are with the Department of Electrical and Computer Engineering, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455. Emails: {sunxx394,luozq}@ece.umn.edu. †
Introduced in [1] for MIMO X channels and in [2] for interference channels, interference alignment has been shown to achieve the maximum Degrees of Freedom (DoF). Roughly speaking, the DoF is the first-order approximation of the network capacity in the high SNR regime and can be interpreted as the number of data streams that can be transmitted in an interference-free manner. For a K-user time-varying or frequency selective SISO interference channel, Cadambe and Jafar [2] constructed an asymptotic IA scheme (referred herein as C-J scheme) that achieves K/2 DoF, provided that the number of independent channel extensions grows exponentially in K 2 . This surprising result matches the outer bound proven by Host-Madsen and Nosratinia [3], and implies that each user can get “half-the-cake”, with the cake representing the maximum achievable DoF in a point-to-point channel, regardless of the number of interfering users present in the system. This result can be easily extended to a MIMO interference channel where each transmitter/receiver has the same number of antennas [2]. The C-J scheme in [2] belongs to the class of vector space IA strategies that apply linear transceivers and align the interference subspaces at each receiver into a low dimensional space. This is in contrast to signal level IA schemes which apply lattice codes at transmitters with the goal of aligning the codes of interference at each receiver into few lattice points. Although signal level IA schemes can theoretically achieve a higher DoF than vector space IA schemes (e.g., for constant SISO interference channel [4] and constant MIMO interference channel [5]), they require unrealistic assumptions of infinite precision of the channel state information and exponentially (in terms of the number of users) large codebooks. In this paper, we focus on vector space IA schemes and study the maximum DoF they can achieve. To achieve the K/2 DoF in a K-user interference channel, the C-J scheme [2] requires at least independent channel extensions which may be unrealistic in practice when K is large. How will a polynomial (in K) number of independent or dependent channel extensions affect the achievable DoF? For a MIMO IC with no channel extension, the references [6–8] have analyzed the achievable DoF using algebraic techniques. In particular, the authors of [6] formulated the IA condition as a polynomial system of equations with beamforming vectors being the variables, and defined the notion of a “proper” system for which the number of equations is no more than the number of variables in every subsystem of equations. For the single beam case (i.e. every user transmits a single data stream), they applied the Bernstein’s theorem to show that the IA condition is feasible only if it is proper, thus obtaining a DoF upper bound by counting the number of equations and number of variables (referred herein as the dimensionality counting argument). More recently, Razaviyayn et al. [7] and Bresler et al. [8] independently established the same DoF upper bound (based on the counting argument) without the restriction of one data stream per user. For the special case that each user transmits d data streams in a M × M MIMO channel with no channel extension, their result implies that the DoF is upper bounded by 2M (more precisely, 2MK/(K + 1)). Compared to the M DoF that can be achieved by simply using orthogonalizing strategies, the gain brought about by interference alignment, called “alignment gain” in [8], is upper bounded by 2. This is significantly smaller than the K/2 alignment gain for a K-user 2 M × M MIMO interference channel with O(eK ) independent channel extensions [2]. The achievability of the DoF upper bound has been established for the case dk = d, ∀ k and Mk , Nk divisible by d in [7], 2 O(eK )
2
and for the case Mk = Nk = N, dk = d, ∀ k in [8], where dk is the number of data streams user k transmits and Mk , Nk are the number of antennas at transmitter k and receiver k respectively. The issue of how to design linear transceivers that can achieve interference alignment has been considered in [9]. When a finite number of time or frequency extensions are allowed, the maximum DoF has only been studied for the case K = 3 [2, 10]. In particular, for a 3-user M × M MIMO interference channel, reference [2] showed that the maximum DoF is 3M/2, either with an even M and no channel extension or with an odd M and 2 channel extensions. Reference [10] has further considered a 3-user SISO interference channel with L independent channel extensions for a general L, and characterized the maximum DoF as a function of L. However, it is not clear how to extend these results to general K. In this paper, we consider an interference channel with a finite diversity order L whereby each channel matrix is a generic linear combination of L fixed basis matrices. This channel model is rather general and can be used to describe many practical channels such as the L-tap SISO interference channel or the SISO interference channel with L blocks of generic time or frequency extensions. Suppose each user sends d independent data streams to its intended receiver. In this paper, we study the effect of finite L and d on the achievable Degrees of Freedom (DoF) via vector space interference alignment. The main result of this work is that for an interference channel with diversity order L and ambient transmit dimension N and assuming d = 1, the total number of users K that can communicate interference-free using linear transceivers is upper bounded by NL + N 2 /4. We emphasize that our result applies to any number of users, any channel diversity order and any number of channel extensions. An immediate consequence of this upper bound is that for a K-user SISO interference channel where every user transmits a single q data stream(d = 1), the maximum DoF achievable via interference alignment is no more 1 5 than min 4 K, L + 4 T , where T is the number of channel extensions (not necessarily generic extensions). Compared to the K/2 DoF achievable for the multi-beam case [2], our result shows that the single-stream restriction is a major limiting factor on the system throughput in an interference system. Additionally, for a MIMO interference channel with d = 1, we show that the maximum DoF lies in [Mr + Mt − 1, Mr + Mt ] regardless of number of generic channel extensions, where Mr and Mt are the number of receive/transmit antennas (per user) respectively.
Apart from establishing new DoF upper bounds, we provide some clarifications in this paper on the use of Bernstein’s theorem (more specifically, the so called counting argument to determine the feasibility of a system of polynomial equations). In particular, in Section 4 we discuss some misconceptions on how to determine if a system of polynomial equations has a solution.
3
2 System Model 2.1 Channel Diversity Order We say a K-user symmetric interference channel has a diversity order L if each channel matrix Hi j , 1 ≤ i, j ≤ K is a linear combination of L fixed matrices A1 , . . . , AL ∈ CNr ×Nt : Hi j = τ1i j A1 + · · · + τiLj AL , where A1 , A2 , . . . , AL are linearly independent. We call A1 , . . . , AL the building blocks of this interference channel. For a symmetric interference channel with T time or frequency extensions where each transmitter (receiver) is equipped with Mt (Mr ) antennas, the dimensions of the channel matrices Nt , Nr are related to T, Mt , Mr through the relations Nt = Mt T, Nr = Mr T . We are interested in the case that the coefficients τli j , 1 ≤ i, j ≤ K, 1 ≤ l ≤ L are generic (e.g. independently drawn from the same or different continuous random distributions). Strictly speaking, a property is said to hold for generic x = (x1 , . . . , xn ) ∈ Cn if there is a nonzero polynomial f such that the property holds in set {x | f (x) , 0}. In the so called Zariski topology, such a set {x | f (x) , 0} is called a Zariski open set, and has measure 1. In this paper, a property is generic if it holds over a Zariski open set. In order to avoid degenerate channel matrices, we impose some mild requirements on the building blocks A1 , . . . , AL . Define Ψ = {(A1 , . . . , AL ) | Aℓ ∈ CNr ×Nt , ℓ = 1, . . . , L; ∃ k1 , . . . , kL ∈ C, s.t. rank(k1 A1 + k2 A2 + · · · + kL AL ) = min{Nr , Nt }}.
(1)
Then for any (A1 , . . . , AL ) ∈ Ψ, each channel matrix Hi j = τ1i j A1 + · · · + τiLj AL is full rank for generic (τli j )1≤i, j≤K,1≤l≤L . We describe below several channel models with finite diversity order. Example 2.1: A SISO interference channel with L generic time or frequency extensions has a (L) diversity order L. Indeed, each channel matrix Hi j = diag(Hi(1) j , . . . , Hi j ) is a generic linear combination of L diagonal matrices diag(1, 0, . . . , 0), diag(0, 1, . . . , 0), . . . , diag(0, 0, . . . , 1), which serve as the building blocks. Such a channel model is commonly used in the interference alignment studies [2, 10]. Example 2.2: A SISO interference channel with L-tap discrete channels and N ≥ L frequency extensions has a diversity order L. Indeed, suppose the channel between all transmitter-receiver pairs (k, j) is an L-tap channel with the same time delays λ1 < λ2 < · · · < λL , where λℓ ∈ {0, 1, . . . , N − 1}. The channel matrix (in the frequency domain) Hi j is a N × N diagonal 2πλ √ matrix, and can be expressed 2(N−1)πλl √ PL l l −1 as Hi j = l=1 τi j Al , where Al = diag 1, e N −1 , . . . , e N , and τli j is the l’th tap channel coefficient between transmitter j and receiver i. Example 2.3: A SISO network with L blocks of generic channel extensions has a diversity order L. More specifically, suppose there are N channel extensions that can be divided into L blocks, where the channel coefficients in the same block are equal, and are independent across blocks. This is the 4
well-known block-fading channel model. As a simple example, suppose N1 +N2 channel extensions are divided into two blocks with N1 , N2 channel extensions respectively and the channel matrices can be ex pressed as Hi j = diag τ1i j IN1 , τ2i j IN2 , then the building blocks are A1 = diag IN1 , 0 , A2 = diag 0, IN2 . In general, the building blocks are block diagonal matrices consisting of identity matrices and zero matrices. Such a channel model considers a more general “coherence block structure” than the model with constant extensions or generic extensions. Special asymmetric coherence block structures have been considered in blind IA [11, 12] to achieve high DoF. For a general symmetric coherence block structure, the DoF has not been studied in the literature yet. Example 2.4a: A constant MIMO interference channel (“constant” means flat-fading channel and no channel extension) where each transmitter has Mt antennas and each receiver has Mr antennas (we call Mt × Mr MIMO channel) has a diversity order Mt Mr . The channel matrix Hi j is a Mr × Mt matrix with generic entries. The building blocks are Ekl , 1 ≤ k ≤ Mr , 1 ≤ l ≤ Mt , where Ekl denotes the matrix with only one nonzero entry 1 in the entry (k, l). Such a channel model has been studied in references [7, 8]. Example 2.4b: A Mt × Mr MIMO channel with T constant channel extensions has a diversity H¯ i j 0 · · · 0 0 ¯ i j · · · 0 H order Mt Mr . The channel matrix Hi j is a T Mr × T Mt block diagonal matrix . .. .. , .. .. . . . 0 . . . 0 · · · H¯ i j where H¯ i j is a Mr × Mt matrix with generic entries. The building blocks are block diagonal matrices 0 · · · 0 Ekℓ 0 Ekℓ · · · 0 . .. .. . To our knowledge, such a channel model has not been considered in interfer.. .. . . . 0 . . . 0 · · · Ekℓ ence alignment area. Example 2.4c: A Mt × Mr MIMO channel with T generic channel extensions has a diversity order 1 H 0 · · · 0 i j 2 ··· 0 H 0 ij Mt Mr T . The channel matrix Hi j is a Mr T × Mt T block diagonal matrix . .. , where .. .. .. . . . 0 . . . 0 · · · H T ij each block Hil j is a Mr × Mt matrix with generic entries. Such a channel model is a generalization of the SISO channel in Example 2.1, and has been investigated in [5, 13] for the case T → ∞. At last, we show an example where asymmetric complex signaling (ACS) can double the channel diversity order. Example 2.5: Using ACS, an SISO IC with constant channel extensions has a diversity order 2. The idea of ACS is to convert each complex channel to two real channels, thus doubling the ambient dimension and the diversity order. In a point-to-point channel with input x ∈ C, channel h ∈ C, noise 5
n ∈ C and output y ∈ C, we have y = hx + n
Re(y) Re(h) −Im(h) Re(x) Re(n) , + = Im(n) Im(h) Re(h) Im(x) Im(y)
⇐⇒
where Re(·), Im(·) denote the real part and the imaginary part of a complex number respectively. In a K-user SISO IC with T constant channel extensions, suppose the channel matrix between transmitter j and receiver k is a T × T diagonal matrix diag(hk j , . . . , hk j ). The channel diversity order is 1, with the identity matrix I being the building block. Using ACS, the channel matrix becomes a 2T × 2T block diagonal matrix 0 0 Re(hk j ) −Im(hk j ) · · · 0 0 Im(hk j ) Re(hk j ) · · · . . . . . . .. .. .. .. Hk j = .. 0 0 . . . · · · Re(h ) −Im(h ) k j k j 0 0... · · · Im(hk j ) Re(hk j )
The two basis matrices (or building blocks) are 1 0 0 1 .. . I = .. . 0 0 . . . 0 0...
· · · 0 0 · · · 0 0 . . .. .. . . . · · · 1 0 ··· 0 1
and
This gives a diversity order of L = 2.
0 −1 1 0 .. . . . . 0 0 . . . 0 0...
· · · 0 0 · · · 0 0 .. . . .. . . . . · · · 0 −1 ··· 1 0
We remark that the notion of channel diversity order we define here is different from the number of channel extensions. Although in Example 2.1 with generic channel extensions, channel diversity order coincides with the number of channel extensions, in general this is not the case. For instance, the networks in Example 2.4a and Example 2.4b have the same diversity order, but different numbers of channel extensions. In Example 2.3, the number of channel extensions is larger than the channel diversity order. In general, channel diversity is a more limited resource than the number of channel extensions.
2.2 System Model and IA Condition Consider a K-user interference channel with diversity order L, i.e. each channel matrix Hi j is the linear combination of L fixed matrices A1 , . . . , AL ∈ CNr ×Nt , where A1 , . . . , AL ∈ Ψ as defined in (1). In this work, we focus on vector space interference alignment strategies which deploy linear transmit and receive beamformers and align the subspaces of interference into a low dimensional space at each receiver.
6
Specifically, suppose transmitter k intends to transmit dk independent data streams sk ∈ Cdk ×1 to receiver k. In the transmitter side, a linear beamforming matrix Vk ∈ CNt ×dk is used to encode sk , i.e. transmitter k sends a signal xk = Vk sk . Receiver k receives a signal X X yk = Hkk xk + Hk j x j + nk = Hkk Vk sk + Hk j V j s j + nk , j,k
j,k
where nk ∈ C Nr ×1 is a zero mean additive Gaussian white noise. The receiver k applies a receive beamforming matrix Uk ∈ CNr ×dk to process the received signal to obtain an estimate of the signal X (2) UkH Hk j V j s j + UkH nk . sˆk = UkH yk = UkH Hkk Vk sk + j,k
In vector space interference alignment, we want to design beamforming matrices {Vk } such that all interference is aligned into a small space that is linearly independent of the signal space, thus the interference can be eliminated by multiplying zero-forcing matrices {Uk }. More formally, {Vk , Uk } must satisfy the following IA condition [6–8, 14]: UkH Hk j V j = 0, rank(UkH Hkk Vk )
∀ 1 ≤ k , j ≤ K, = dk ,
∀ 1 ≤ k ≤ K.
(3)
The first equation implies that the interference from any user can be eliminated at the receiver side by using the zero-forcing beamforming matrix Uk . The second equation ensures that no information in sk is lost when multiplied by UkH Hkk Vk . We say that a DoF d¯ is achievable by a vector space IA scheme if there exist Vk ∈ CNt ×dk , Uk ∈ CNr ×dk , k = 1, . . . , K, such that (3) holds and d¯ = (d1 + · · · + dK )/T . Throughout this paper, we only consider the total DoF (or DoF for short) that is achievable by a vector space IA scheme.
3 Main Results and Discussions Consider an interference channel with a given diversity order. In this section, we present several DoF bounds for interference channels in which each user transmits a single data stream (single-beam case, d = 1). The first result shows that in a Mt × Mr MIMO IC with generic channel extensions, the DoF is upper bounded by Mr + Mt if d = 1. This implies that the DoF is bounded by 2 for a SISO IC with d = 1, regardless of the number of generic channel extensions. The proof of this result will be provided in Section 5. Theorem 3.1 Consider a K-user Mt × Mr MIMO interference channel with T channel extensions and dk = 1 for all k. The following results hold for for almost all channel coefficients. 7
(a) A vector space IA scheme exists only if K ≤ (Mr + Mt )T − 1.
(4)
(b) If K = (Mr + Mt − 1)T, then a vector space IA scheme exists. (c) The maximum DoF achievable via vector space IA is no more than Mr + Mt , and no less than Mr + Mt − 1. Theorem 3.1 is equivalent to the following result: if the channel coefficients are independently drawn from continuous random distributions, then the properties (a), (b) and (c) hold with probability one. By a variant of our proof, it is possible to show a slightly stronger result that the same properties in fact hold for generic channel coefficients (i.e. for all channel coefficients in a Zariski open set). Similarly, Theorem 3.2 can be shown to hold for “generic” coefficients. Notice that part (c) of Theorem 3.1 is a direct consequence of part (a) and part (b) since for the single-beam case the DoF equals K/T . Part (b) of Theorem 3.1 can be proved by the result in [7]. In fact, [7] shows that when T = 1 and K = Mr + Mt − 1, the DoF tuple (1, . . . , 1) is achievable. For general T , we can use an orthogonalizing scheme to achieve (1, . . . , 1) for K = (Mr + Mt − 1)T users: for each of the T channel uses, (Mr + Mt − 1) users transmit a single data stream interference-free while other users do not transmit; in total, K = (Mr + Mt − 1)T users can transmit a single data stream interference-free. It remains to prove part (a) of Theorem 3.1 and Section 5 is devoted to such a proof. Part (a) of Theorem 3.1 has been studied in [15] for SISO IC, and more recently in [16] for MIMO IC under some additional assumptions (e.g., the beamforming solutions are independent and marginally isotropic). Their results require beamforming solutions to have no zero entries. Theorem 3.1 states that for the single-beam case, the maximum DoF lies in the region [Mr + Mt − 1, Mr + Mt ] for any T . When T = 1 (i.e. for a constant MIMO IC), this result has been proved in [7, 8], and [7] shows that the maiximal DoF is indeed Mr + Mt − 1. Thus, for the single-beam case, increasing the number of generic channel extensions does not lead to a significant improvement of the DoF. When Mt = Mr = M, Theorem 3.1 implies that the DoF is no more than 2M for the single-beam case. This result is in sharp contrast to K2 M DoF in a M × M MIMO IC when each user is allowed to transmit multiple data streams [2]. Therefore, the single data stream restriction appears to be a throughput limiting factor even in the presence of an arbitrary number of generic channel extensions. It will be interesting to find how the DoF scales with the number of data streams that each user can transmit. The second result considers a general channel model with any number of antennas, channel extensions or channel diversity order. We provide a universal upper bound on the number of users that can be accommodated to achieve IA. This upper bound is a function of the diversity order L and the minimum of transmit dimension and receive dimension min{Nt , Nr }. The proof of this result will be presented in Section 6. 8
Theorem 3.2 Consider a K-user Mt × Mr MIMO interference channel with T channel extensions. Let Nt = Mt T, Nr = Mr T and N = min{Nt , Nr }. Suppose the channel diversity order is L, i.e. the channel matrices Hi j = τ1i j A1 + · · · + τiLj AL ∈ CNr ×Nt , , the DoF tuple with (A1 , . . . , AL ) ∈ Ψ defined in (1). Then for almost all coefficients τli j 1≤i, j≤K,1≤l≤L (d1 , . . . , dK ) = (1, 1, . . . , 1) is achievable via a vector space IA scheme only if K ≤ NL +
N2 . 4
(5)
Theorem 3.2 can be used to derive DoF upper bounds of various channel models for the singlebeam case. We first consider the SISO IC, and provide two upper bounds: one in terms of K and another in terms of T and L. Corollary 3.1 Consider a K-user SISO interference channel with a diversity order L and T channel extensions. If d = 1, then the maximum DoF achievable via a vector space IA scheme is at most q 1 5 min 4 K, L + 4 T . Proof of Corollary 3.1: In the SISO IC, the transmit dimension Nt and receive dimension Nr are both equal to T , thus N = min{Nt , Nr } = T . According to Theorem 3.2, we have K ≤ TL +
T2 , 4
(6)
thus the DoF for the single-beam case d¯ can be bounded as Kd K 1 d¯ = = ≤ L + T. T T 4 Since the channel matrices are T × T diagonal matrices, the diversity order L should be no more than T . Then (6) implies T2 5 K ≤ T2 + ≤ T 2. 4 4 Thus the DoF for the single-beam case r 5 K ¯ ≤ K. d= T 4 Q.E.D. Corollary to Example 2.1-2.3 in Section 2.1. Specifically, the DoF is upper bounded q 3.1 applies 1 5 by min 4 K, L + 4 T for the single-beam case in the following networks: the SISO IC with T = L generic time or frequency extensions, L-tap SISO IC with T frequency extensions, the SISO IC with L generic blocks of T channel extensions.
9
Compared with the K/2 achievable DoF for a SISO IC in the multi-beam case, Theorem 3.1 proves √ a O( K) upper bound in a SISO IC for the single-beam case, regardless of the channel diversity order or the number of channel extensions. Theorem 3.1 also shows that the DoF for the single-beam case is bounded by L + 41 T , which increases linearly both with L and T . When the T channel extensions are generic, Theorem 3.1 proves a stronger result that the DoF is bounded by 2. When the channel √ extensions are not generic, it is not known whether the O( K) bound or the L + 14 T bound can be improved to a constant bound. Let us now apply Theorem 3.2 to the MIMO IC. For an MIMO IC with generic channel extensions (Example 2.4a and Example 2.4c), the DoF bound obtained from Theorem 3.2 is weaker than the bound in Theorem 3.1 (we omit the comparison). For an Mt × Mr MIMO IC with T constant channel extensions (Exmaple 2.4b), Nt = Mt T, Nr = Mr T and L = Mt Mr . Assume Mt ≤ Mr , then N = Mt T , and (5) becomes K ≤ NL +
N2 1 = Mt2 Mr T + Mt2 T 2 . 4 4
Thus, the DoF for the single-beam case 1 K ≤ Mt2 Mr + Mt2 T, T 4 which is linear over T . This bound is possibly loose; nevertheless, it is a first upper bound for this channel model.
4 Solvability of Polynomial Equations and Bernstein’s Theorem The zero-forcing conditions for interference alignment can be interpreted as a system of polynomial equations with the beamforming matrices being the variables. Thus the solvability of a system of polynomial equations is crucial for the existence of an interference alignment scheme. In this section, we clarify some misconceptions related to the dimensionality counting argument. Consider a system of polynomial equations P:
fi (x1 , . . . , xn ) = 0,
i = 1, . . . , m,
(7)
where fi , i = 1, . . . , m are polynomials in variables x1 , . . . , xn . The number of equations is m and the number of variables is n.
4.1
Solvability of a System with Random Coefficients
We say Q is a subsystem of P if Q consists of a subset of equations in P, denoted as Q ⊆ P. 10
(8)
For any system of equations Q, denote Ne (Q) as the number of (nonzero) equations in Q and Nv (Q) as the number of variables in Q. For a system P with randomly generated coefficients, it is tempting to conjecture: P is solvable iff in every subsystemQ of P, Ne (Q) ≤ Nv (Q).
(9)
The underlying “intuition” for this dimensionality counting argument is as follows. The variables represent the “degrees of freedom” in the system, and the equations represent the constraints. When the number of constraints Ne (Q) exceeds the number of degrees of freedom Nv (Q) for some subsystem Q, the subsystem should not be solvable; consequently, the original system P is not solvable. When Ne (Q) ≤ Nv (Q), ∀Q ⊆ P, there are enough degrees of freedom in every subsystem Q, thus the original system P should have a solution. It is easy to argue that (9) is incorrect. If there are contradicting equations in P (for instance, ax1 = bx2 , ax1 = bx2 + c, where a, b, c are random), Ne (Q) ≤ Nv (Q), ∀Q ⊆ P may not be sufficient for the solvability of P. The converse direction is also false as is shown by the following example. Example 4.1. Consider the zero-forcing condition in the IA problem of [2]: P:
UkH Hk j V j = 0,
∀ 1 ≤ k , j ≤ K,
(10)
where Hk j ’s are N × N diagonal matrices with generic diagonal entries, Uk , Vk ∈ CN×dk . For simplicity, assume dk = d, ∀ k. To avoid the trivial solution Uk = 0, ∀ k, we can scale each column of Uk by a nonzero number to make one entry of each column to be 1. Similarly, we scale each column of Vk to make one entry to be 1. Now, the number of variables in each Uk or Vk is d(N − 1). Thus, the total number of variables is Nv (P) = 2Kd(N − 1). The number of equations is Ne (P) = K(K − 1)d2 . If (9) were true, then a necessary condition for (10) to have a solution is Ne (P) ≤ Nv (P), i.e. K(K − 1)d2 ≤ 2Kd(N − 1) However, (10) is solvable for any d ≥ 1. Let U¯ k , Uk = 0⌊ N ⌋×d
=⇒
d≤
2(N − 1) . K−1
0 N ⌈ ⌉×d , Vk = 2 V¯ k
2
(11)
(12)
N N where U¯ k ∈ C⌈ 2 ⌉×d , V¯ k ∈ Cd×⌊ 2 ⌋ are arbitrary matrices. Then UkH Hk j V j = 0, ∀ j , k. Thus, Ne (P) ≤ Nv (P) is not a necessary condition for (10) to be solvable.
Although the beamforming matrices defined in (12) can eliminate all the interference, the transmitted signals are also eliminated. One may argue that a practical IA solution needs to satisfy the full rank condition, i.e. rank(UkH Hkk Vk ) = d, ∀ k. However, the C-J scheme in [2] did construct beamforming matrices that satisfy both (10) and the full rank condition for any d that is arbitrarily close to N/2. This shows that the straightforward dimensionality counting argument can not work in this case. 11
In general, to achieve a DoF that is much higher than what is achievable by orthogonalizing schemes, we need to find solutions to an overdetermined system of polynomial equations with Ne (P) ≫ Nv (P). In other words, designing a good IA scheme amounts to finding a strong counterexample to (9).
4.2 Solvability of a System with Generic Coefficients Another tempting conjecture is the following (used in [17, Section IV.B]): A polynomial system P with generic coefficients is solvable only if Ne (P) ≤ Nv (P).
(13)
Here, we assume each polynomial in P has fixed monomial terms and the coefficients of these monimials are generic (e.g. independently generated from continuous random distributions). The statement (13) is incorrect in general. One simple counterexample is the following: Example 4.2 Consider the following system with n = 2 variables and m = 3 equations: f1 (x1 , x2 ) = a1 x21 + a2 x1 = 0, f2 (x1 , x2 ) = b1 x1 x2 + b2 x1 = 0,
(14)
f3 (x1 , x2 ) = c1 x1 x2 + c2 x1 = 0, where ai , bi , ci , i = 1, 2 are generic coefficients. Obviously, (x1 , x2 ) = (0, x2 ) is a solution to (14) for any x2 ; thus (14) has infinitely many solutions in C2 . Another counterexample is Example 4.1 (SISO IC with generic channel extensions) for the case d = 1 . The coefficients of the K(K − 1) equations in the zero-forcing condition (10) are drawn from different channel matrices, thus are generic. We have shown that (12) is a solution to (10) for any K, thus m = K(K − 1) ≤ n = 2K(N − 1) is not a necessary condition for (10) to be solvable. It is possible that (13) holds for specific fi ’s, such as the following example, but it requires some nontrivial arguments. Example 4.3 The IA condition for the constant MIMO interference channel for the single beam case is: ukH Hk j v j = 0, ukH Hkk vk
, 0,
∀ 1 ≤ k , j ≤ K,
(15a)
∀ k,
(15b)
where Hk j is a Mr × Mt matrix with generic entries, and v j ∈ C Mt ×1 , uk ∈ C Mr ×1 are beamformers. The nonzero solution to (15a) uk , 0, vk , 0, ∀ k will satisfy (15b) automatically since Hkk is a matrix with generic entries. Thus, (15) is solvable iff (15a) has a nonzero solution. (15a) has m = K(K − 1) equations, where the coefficients of different equations come from different channel matrices. Thus, (15a) is a system of polynomial equations with generic coefficients. To obtain the nonzero solution uk , 0, vk , 0, ∀ k, we can scale one entry of each uk , vk to be 1. Thus, the number of variables 12
becomes n = K(Mt + Mr − 2). The following proposition shows that m ≤ n is a necessary condition for (15a) to have a nonzero solution. Proposition 4.1 A necessary condition for (15a) to have a solution uk , 0, vk , 0, ∀ k is K(K − 1) ≤ K(Mt + Mr − 2). Hence, the DoF for the single-beam case is no more than Mt + Mr − 1. Proposition 4.1 is a special case of Theorem 3.1, and also a special case of the results in [7, 8]. Reference [6] was the first to provide a proof for this result by using Bernstein’s theorem. Suppose fi is a polynomial function with mi monomials, and ci j is the coefficient of the j’th monomial, j ∈ {1, 2, . . . , mi }. The number of common solutions of polynomials fi , ∀ i is simply the number of solutions of the system equations fi = 0, ∀ i. Define C∗ , C\{0}. Bernstein’s theorem [6, Theorem 4] Given n polynomials f1 , . . . , fn ∈ C[x1 , . . . , xn ] with common solutions in (C∗ )n , let Pi be the Newton polytope of fi in Rn . For independent random coefficients ci j , ∀ i ∈ {1, . . . , n} and ∀ j ∈ {1, 2, . . . , mi }, the number of common solutions is exactly equal to the mixed volume of Newton polytopes, MV(P1 , . . . , Pn ). We will not define the Newton polytope Pi and mixed volume MV(P1 , . . . , Pn ) here; interested readers can refer to [6] or [18] for the definitions and more information. For our purpose of deciding the solvability of system of equations, we only need to know that the mixed volume MV(P1 , . . . , Pn ) is a finite number for any polynomials f1 , . . . , fn . An important clarification is needed in the above Bernstein’s theorem. Namely, the term “common solutions” should be interpreted as “common solutions in (C∗ )n ”, that is, “strictly nonzero common solutions (i.e., with no zero entries)”; see [19, Theorem 3.1], [20, Thereom 3.2]. Following private communication with the authors of [6], it has become clear that as intended by [6], the “common solutions” in the above Bernstein’s theorem and the related analysis in [6] all mean “strictly nonzero common solutions”, which was originally misunderstood by us in version 1 of this paper posted on ArXiv. To illustrate the subtle and yet important difference of “solutions in (C∗ )n ” and “solutions in Cn ”, we revisit Example 4.2. Example 4.2 (revisited): Consider the system P which consists of the first two equations in (14). We already show that P has infinitely many solutions in Cn . If we require x1 , x2 ∈ C∗ , P has only one solution (x1 , x2 ) = (−a2 /a1 , −b2 /b1 ); thus P has only one solution in (C∗ )n . Bernstein’s theorem can be used to prove Proposition 4.1, see the last paragraph of Section VI in [6] for a proof sketch. It should be noted that certain details were not explicitly stated in the proof. These are: (1) Because Bernstein’s theorem counts only strictly non-zero solutions in (C∗ )n , throughout the proof sketch, wherever Bernstein’s theorem is applied, the term “solution” to a polynomial system should be interpreted as a strictly nonzero solution; (2) The structureless property of the channel, i.e., 13
that the channel is not partial to any particular basis representation, implies that there is no loss of generality in restricting the solutions to (C∗ )n , as opposed to Cn . This can be formalized by a unitary transformation argument presented by the authors of [6] in private communication. Bernstein’s theorem and the dimensionality counting argument have also been applied to other IA problems in [15–17] to derive performance bounds. It should be noted that the corresponding results hold when the IA solutions are restricted to be strictly nonzero, rather than general nonzero solutions in Cn . This is because Bernstein’s theorem only deals with the number of solutions in (C∗ )n . Another important issue related to the use of Bernstein’s theorem is a nontrivial assumption that is often made in the algebraic geometry literature. In particular, many versions of Bernstein’s theorem assume that the number of solutions in (C∗ )n is finite, such as [18, p. 346, (5.4)], [21, Theorem 1], [22, Theorem1.1] (the latter two references counted the number of “isolated” solutions in (C∗ )n , which can be understood as “the number of solutions in (C∗ )n when this number is finite”). Obviously, this assumption does not always hold: the system x1 + x2 = 1, (x1 + x2 )2 = 1 has infinitely many solutions. However, in order to determine the feasibility of IA systems, we are interested in the following question: under what condition does the finiteness assumption hold? This question is answered in some versions of Bernstein’s theorem (e.g. [19, Theorem 3.1], [6, Theorem 4]) that the genericity is enough for this assumption to hold. More specifically, these versions state that for a generic system, the number of solutions in (C∗ )n is exactly the mixed volume MV(P1 , . . . , Pn ), which is a finite number.
4.3
Exploring the Number of Solutions with Zero Components
One may ask whether Bernstein’s theorem can be extended to consider the number of solutions in Cn as opposed to (C∗ )n . In fact, such an extension of Bernstein’s theorem has been studied extensively; see [21–23] and the references therein. It seems that most effort has been been made on the root counting under the assumption that the system has a finite number of solutions in Cn . For the question when the finiteness assumption holds, several conditions have been provided in the literature. In particular, a simple sufficient condition is provided in [21, Lemma 5], and a necessary and sufficient condition is provided in [22, Lemma 3]. We are interested in exploring the conditions under which an overdetermined system has no solution in Cn , since for such systems a simple dimensionality counting will provide a DoF upper bound. The genericity is not enough, since a generic overdetermined system may have infinitely many solutions in Cn (e.g. Example 4.2). The results in [21, 22] can not be directly applied to show that a generic overdetermined system has no solution under the conditions of these results. One may argue that if a subsystem has a finite number of solutions in Cn , then these solutions can not satisfy the remaining polynomial equations with generic coefficients, thus the system is not solvable. A simple counterexample to this argument is that (x1 , x2 , x3 ) = (0, 0, 1) satisfies a polynomial equation ax1 +bx2 x3 +cx1 x3 = 0 for generic a, b, c. To make this argument work, we need to assume that these finite number of solutions
14
are in (C∗ )n . This motives us to view a solution in Cn with support J as a solution in (C∗ )|J| (this idea is also adopted in [21, Lemma 5]). In what follows, we apply this idea to provide a sufficient condition under which a generic overdetermined system can not have a solution in Cn . The support of a vector x = (x1 , . . . , xn ) ∈ Cn is defined as the set { j | x j , 0}. For any support J, define a truncated variable xJ = (x j ) j∈J and its complement variable x¯ J = (xi )i |P J | = 0 for T ≥ 3. The second reason is more serious: we need to take (22b) into account. Unlike the constant MIMO IC, the solution to the zero forcing condition (22a) does not satisfy the full rank condition (22b) automatically. For example, (22a) has a nonzero solution for any K, T ≥ 2, namely, uk = (1, 0, . . . , 0),
v j = (0, v j2 , . . . , v jT ), 17
∀ k, j
(23)
and yet this solution does not satisfy (22b). Unfortunately, it is not easy to utilize the condition (22b) explicitly since it is not a system of equations but a system of inequalities. For the hybrid system of equations and inequalities (22), simply applying Bernstein’s theorem type of results cannot provide any bound on the achievable DoF. We need to use techniques that can determine the feasibility of a hybrid system of equations and inequalities. Our proof technique can be briefly described as follows. We will first use Claim 4.1 to prove an intermediate bound (31) (see Section 5.2), and then prove Theorem 3.1 by an induction analysis in which the condition (22b) is implicitly utilized.
5.2 Proof of Theorem 3.1 for a SISO IC We define the maximal K as f (T ), i.e. f (T ) , max K | (22) is solvable for a positive measure of hk j (t)
1≤k, j≤K, 1≤t≤T
.
(24)
When T = 0, we define f (0) = 0. To prove part (a) of Theorem 3.1, we only need to prove the following bound: f (T ) ≤ 2T − 1,
∀ T ≥ 1.
(25)
We will do so using an induction argument. For the basis of the induction (T = 1), it is easy to show that f (1) ≤ 2T − 1 = 1. In fact, any two nonzero 1-dimensional vectors (i.e. scalars) are linearly dependent, thus when K ≥ 2 the IA condition (22) has no solution. Now suppose (25) holds for each positive integer that is smaller than T . We prove that (25) holds for T . Suppose K satisfies that the IA condition (22) has a solution {˜uk , v˜ k }k=1,...,K for a positive measure of hk j (t) . For a vector x = (x1 , . . . , xT )T ∈ CT , denote the support of x as supp(x) = { j | x j , 0}. Each hk j (t) corresponds to (at least) one collection of supports {supp(˜uk ), supp(˜vk )}k=1,...,K . Since there are finitely many possible collections of supports, it follows that there exists a positive mea sure of hk j (t) which corresponds to the same collection of supports {Rk , S k }k=1,...,K , where Rk , S k ⊆ {1, . . . , T }. Denote this set of hk j (t) as H which has a positive measure. Define a set of transmitter-receiver pairs as Ω , {(k, j) | 1 ≤ k , j ≤ K, supp(˜uk ) ∩ supp(˜v j ) = ∅} = {(k, j) | 1 ≤ k , j ≤ K, Rk ∩ S j = ∅} . 18
(26)
The complement of Ω in the set {(k, j) | 1 ≤ k , j ≤ K} is Ωc = {(k, j) | 1 ≤ k , j ≤ K, Rk ∩ S j , ∅} . Furthermore, we denote ak , bk as the number of nonzero entries in u˜ k , v˜ k respectively, i.e. ak , |supp(˜uk )| = |Rk |, bk , |supp(˜vk )| = |S k |,
k = 1, . . . , K.
Since u˜ k , 0, v˜k , 0, ∀ k, it follows that ak ≥ 1 and bk ≥ 1. We will bound |Ωc | by Claim 4.1 and bound |Ω| by the induction hypothesis. We first provide an upper bound on |Ωc |. Fix rk ∈ Rk , sk ∈ S k , k = 1, . . . , K. In the system of equations (22a), scale each uk by 1/ukrk and each vk by 1/vksk . Then we obtain a new system of polynomial equations, denoted as P, with K(K − 1) equations and 2K(T − 1) variables. For any hk j (t) ∈ H, the IA condition (22) has a solution {˜uk , v˜ k }k=1,...,K with supports supp(˜uk ) = Rk , supp(˜vk ) = S k ,
k = 1, . . . , K.
(27)
Since rk ∈ Rk , sk ∈ S k , we have u˜ krk , 0, v˜ ksk , 0. Scale each u˜ k by 1/˜ukrk and each v˜ k by 1/˜vksk , we obtain a new set of beamformers {˜uk /˜ukrk , v˜ k /˜vksk }k=1,...,K , which is a solution to the system P. For simplicity, we still denote the scaled version {˜uk /˜ukrk , v˜ k /˜vksk }k=1,...,K as {˜uk , v˜ k }k=1,...,K (i.e. assume u˜ krk = 1, v˜ ksk = 1, ∀k). The beamforming vectors u˜ k , v˜ k , k = 1, . . . , K can be concatenated to form a vector in C2K(T −1) (after discarding the 2K entries u˜ krk , v˜ ksk , k = 1, . . . , K since they have been normalized to 1). Let J be the support of this concatenated vector. Then J is a fixed subset of {1, . . . , 2K(T − 1)} determined by Rk \{rk }, S k \{sk }, k = 1, . . . , K. The size of J is X X X |J| = (ak − 1) + (bk − 1) = (ak + bk ) − 2K. (28) k
k
k
Consider the system of equations P J , obtained by restricting P to the subset of variables indexed by J (i.e. set ukr = 0 if r < Rk and vks = 0 if s < S k in P. See (17) for a formal definition of P J ). We claim that the number of nonzero polynomial equations in P J is exactly |Ωc |, i.e. |P J | = |Ωc |. Note that ukH Hk j v j =
P
∗ t ukt hk j (t)v jt .
⇐⇒
Then we have
ukH Hk j v j = 0 does not become a zero equation in P J u˜ kH Hk j v˜ j , 0 (since the support of {˜uk , v˜ k } is J)
⇐⇒
∃ t such that u˜ ∗kt hk j (t)˜v jt , 0
⇐⇒
∃ t ∈ supp(˜uk ) ∩ supp(˜v j )
⇐⇒
supp(˜uk ) ∩ supp(˜v j ) , ∅
⇐⇒
(k, j) ∈ Ωc .
19
(29)
Therefore, |Ωc | equals the number of nonzero equations in P J , which proves (29). We then prove the following bound on |P J | by Claim 4.1: |P J | ≤ |J|.
(30)
Assume the contrary, that |P J | ≥ |J| + 1. According to Claim 4.1, the system P J has no solution in (C∗ )|J| for generic hk j (t) . This contradicts the fact that for a positive measure of hk j (t) (in H), the system P J has a solution {˜uk , v˜ k }k=1,...,K in (C∗ )|J| . Thus (30) is proved. Plugging (28) and (29) into (30), we obtain an upper bound on |Ωc |: X |Ωc | ≤ (ak + bk ) − 2K. (31) k
Next, we provide an upper bound on |Ω|. Define Ωk = { j | (k, j) ∈ Ω} = { j | j ∈ {1, . . . , K}\{k}, supp(˜uk ) ∩ supp(˜v j ) = ∅} . Then Ω =
S
k=1,...,K
Ωk . We will prove the following bound on |Ωk |: |Ωk | ≤ f (T − ak ),
k = 1, . . . , K.
(32)
Without loss of generality, assume supp(˜uk ) = {1, 2, . . . , ak }. Since supp(˜v j ) ∩ supp(˜uk ) = ∅, ∀ j ∈ Ωk , we have v˜ j1 = · · · = v˜ jak = 0, ∀ j ∈ Ωk . (33) Consider a restriction of the beamforming vectors and channel matrices to the (T − ak ) dimensional space. Specifically, we define x j = (˜u j,ak +1 , . . . , u˜ jT ), y j = (˜v j,ak +1 , . . . , v˜ jT ), ∀ j ∈ Ωk , Hˆ i j = diag hi j (ak + 1), . . . , hi j (T ) , ∀ i, j ∈ Ωk . It follows from (33) that u˜ iH Hi j v˜ j = xiH Hˆ i j y j , ∀ i, j ∈ Ωk . Since {˜uk , v˜ k }k=1,...,K satisfies the IA condition (22), we have that {xk , yk }k=1,...,K satisfies xiH Hˆ i j y j = 0, xHj Hˆ j j y j , 0,
∀ i , j ∈ Ωk , ∀ j ∈ Ωk .
(34)
Note that (34) is the IA condition for a SISO IC with (T − ak ) channel extensions and |Ωk | users. For a positive measure of hk j (t) (in H), (34) has a solution {xk , yk }k=1,...,K . By the definition of f (·) in (24), we have |Ωk | ≤ f (T − ak ), which proves (32). Since ak ≥ 1, it follows from the induction hypothesis that f (T − ak ) ≤ 2(T − ak ) − 1 when ak < T . Since f (0) = 0, we have f (T − ak ) = 2(T − ak ) when ak = T . In summary, we have f (T − ak ) ≤ 2(T − ak ), 20
k = 1, . . . , K.
(35)
Combining (32) and (35), we obtain |Ωk | ≤ 2(T − ak ),
k = 1, . . . , K.
(36)
P Summing up (36) for k = 1, . . . , K, we obtain |Ω| ≤ k 2(T − ak ). Similarly, we can prove |Ω| ≤ P k 2(T − bk ). Combining these two bounds on |Ω|, we get X X 1 X |Ω| ≤ 2(T − ak ) + 2(T − bk ) = 2KT − (ak + bk ). (37) 2 k k k Finally, combining the bounds on |Ω| and |Ωc | (cf. (31) and (37)), we can obtain the following bound on K X X K(K − 1) = |Ωc | + |Ω| ≤ (ak + bk ) − 2K + 2KT − (ak + bk ) k
k
which simplifies to K − 1 ≤ −2 + 2T or equivalently K ≤ 2T − 1. Thus f (T ) ≤ 2T − 1 holds for T . This completes the induction step, so that (25) holds for any T , as desired.
6 Proof of Theorem 3.2 Consider a K-user interference channel of diversity order L, with each channel matrix given by Hi j = τ1i j A1 + · · · + τiLj AL , where A1 , A2 , . . . , AL ∈ Ψ ⊆ C Nr ×Nt (see the definition of Ψ in (1)). In the single-beam case (i.e., dk = 1 for each k), the IA condition (3) becomes ukH Hk j v j = 0, ukH Hkk vk , 0,
∀ 1 ≤ k , j ≤ K,
(38a)
∀ 1 ≤ k ≤ K,
(38b)
where v j ∈ CNt ×1 , uk ∈ CNr ×1 are transmit and receive beamforming vectors respectively. Due to the symmetry between the transmit and receive beamforming vectors in the IA condition (38), we can assume without loss of generality that Nt ≤ Nr so that N = min{Nr , Nt } = Nt . In the proof, we first eliminate the receive beamforming vectors from the IA condition (38) to obtain Hkk vk y
span j∈{1,2,...,K}\{k} {Hk j v j } ,
Hkk vk ∈ CNr ×1 \{0},
∀ 1 ≤ k ≤ K,
∀ 1 ≤ k ≤ K,
(39)
where the notation y signifies linear independence. The IA condition (39) is solvable iff the IA condition (38) is solvable. Furthermore, since the basis matrices (A1 , . . . , AL ) ∈ Ψ satisfy the condition (1) and Nr ≤ N, it follows that Hkk has full column rank. Thus the condition Hkk vk , 0 can be equivalently stated as vk , 0. Consequently, the IA condition (39) can be further simplified as Hkk vk y
span j∈{1,2,...,K}\{k} {Hk j v j } ,
vk ∈ CN×1 \{0},
∀ 1 ≤ k ≤ K. 21
∀ 1 ≤ k ≤ K,
(40)
The proof of Theorem 3.1 cannot be directly applied to prove Theorem 3.2. There are two difficulties in applying Claim 4.1 and the induction analysis. First, Claim 4.1 is based on Bernstein’s Theorem, which only applies to a system with generic coefficients. However, the coefficients of the system (38a) are not generic. Therefore, we need other algebraic tools to determine the solvability of (38a). Reference [7] has analyzed the solvability of a special type of systems. Specifically, they show that if each equation takes the form that a generic parameter equals a polynomial function of the variables, then a necessary condition for the solvability is that the number of equations does not exceed the number of equations. We generalize this result to a system where each equation has the form that a generic parameter equals a rational function of variables. See Section 6.1 for more details. The second difficulty has to do with the use of induction analysis. For the IA condition (38), we need to consider the case that a subset S of v j ’s lie in a N ′ -dim linear space, where N ′ < N. To ′ ′ this end, we can express v j as v j = D¯v j , j ∈ S , where D ∈ CN×N , v¯ j ∈ CN ×1 . Then ukH Hk j v j = ′ ukH Hk j D¯v j = ukH H¯ k j v¯ j , where the new channel matrix H¯ k j = Hk j D ∈ CNr ×N . At the first glance, it seems that {¯v j } j∈S satisfies an IA condition with lower ambient transmit dimension N ′ . However, the new “channel matrix” H¯ k j depends on D, the basis matrix of span{v j , j ∈ S } which can be designed by us. As a result, H¯ k j ’s cannot be viewed as the channel matrices for an interference channel. To resolve this difficulty, we use a “lifting” technique. Specifically, we show that the original IA condition (38) implies a “lifted” IA condition in a higher dimensional space. Thus, to upper bound the number of users K such that (38) is feasible, we only need to bound K such that the lifted IA condition is feasible. As will be seen, the induction analysis applies to the lifted IA condition.
6.1 Review of Field Theory and a Useful Lemma In this subsection, we review the theory of transcendence degree (see Chapter 1 and Chapter 8 of [24]) and prove a useful lemma that can be used to establish the infeasibility of a class of systems of polynomial equations. Let Ω/F be a field extension, i.e. F and Ω are two fields such that F ⊆ Ω. Denote F[x1 , . . . , xn ] as the polynomial ring which consists of all polynomials in variables x1 , . . . , xn with coefficients in F. We say α1 , . . . , αn ∈ Ω are algebraically independent over F if the only polynomial P in F[x1 , . . . , xn ] that satisfies P(α1 , . . . , αn ) = 0 is P = 0. Otherwise, we say α1 , . . . , αn are algebraically dependent (i.e. there exists a nonzero polynomial P such that P(α1 , . . . , αn ) = 0). The notion of algebraically independence/dependence is analogous to linear independence/dependence in linear algebra. Example 6.1: Consider the field extension R/Q, where R is the field of real numbers, and Q is the field of rational numbers. For any rational number q, π and q are algebraically independent over Q √ since π is not the root of any polynomial with rational coefficients. On the other hand, π and 2 π + 1 √ are algebraically dependent over Q because P(π, 2 π + 1) = 0, where P(z1 , z2 ) = 4z1 − (z2 − 1)2 . Example 6.2: Consider the field extension C(x1 , x2 )/C, where C(x1 , x2 ) is the field of rational 22
functions in variables (x1 , x2 ) with complex coefficients. A rational function has the form f1 / f2 , where f1 , f2 are polynomials and f2 , 0. Note that the field of rational functions C(x1 , x2 ) is different from the ring of polynomials C[x1 , x2 ]. The three rational functions g1 (x1 , x2 ) =
x22 , x1 + 1
g2 (x1 , x2 ) = x1 ,
g3 (x1 , x2 ) = x1 x2
are algebraically dependent over C. This is because P(g1 , g2 , g3 ) = 0 where P(z1 , z2 , z3 ) = z1 z22 (z2 + 1) − z23 . We say α ∈ Ω is algebraically dependent on A ⊆ Ω if there exists β1 , . . . , βm ∈ A and a polynomial P(x1 , . . . , xm , y) such that P(β1 , . . . , βm , α) = 0 (i.e. β1 , . . . , βm , α are algebraically dependent) and P(β1 , . . . , βm , y) is not a zero polynomial in the variable y. We say a set B is algebraically dependent on A if every element of B is algebraically dependent on A. Note that the statement “B is algebraically dependent on A” is analogous to “B can be spanned by A” in linear algebra. In linear algebra, a basis of a linear space is defined as a minimum set of vectors that can span all vectors in the linear space. In field theory, the transcendence basis is defined as follows: we say A is a transcendental basis for the filed extension Ω/F if A ∈ Ω is an algebraically independent set such that Ω is algebraically dependent on A. The two definitions are both consistent with the notion of “minimum spanning set”: a “basis” should be a minimum set of elements that can “span” all elements. It can be shown that the transcendence basis exists and any two transcendence bases have the same cardinality (possibly infinite). Based on these results, we can define the transcendence degree of Ω/F as the cardinality of a transcendence basis for Ω/F. Example 6.3: Denote C(x1 , . . . , xn ) as the field of rational functions in variables x1 , . . . , xn with coefficients from C. It can be shown that g1 (x1 , . . . , xn ) = x1 , . . . , gn (x1 , . . . , xn ) = xn is a transcendence basis of C(x1 , . . . , xn )/C. Hence the transcendence degree of C(x1 , . . . , xn )/C is n. The correspondences between linear algebra and the theory of transcendence degree are summarized in the following table [24]: Linear algebra
Transcendence
linearly independent
algebraically independent
A can be spanned by B
A is algebraically dependent on B
basis
transcendence basis
dimension
transcendence degree
In linear algebra, it can be shown that any n + 1 vectors in a n-dimensional linear space are linearly dependent. In field theory, a similar result holds: for a field extension Ω/F with transcendence degree n, any n + 1 elements in Ω are algebraically dependent over F (A simple proof is to use Theorem 8.5 in [24]). The result for the case that F = C, Ω = C(x1 , . . . , xn ) is useful for our problem, and we restate it in the following proposition. 23
Proposition 6.1 Consider the field extension C(x1 , . . . , xn )/C. Any n + 1 rational functions g1 (x1 , . . . , xn ), . . . , gn+1 (x1 , . . . , xn ) are algebraically dependent, i.e. there exists a nonzero polynomial P ∈ C[z1 , . . . , zn+1 ] such that P(g1 , . . . , gn+1 ) = 0, or equivalently, P (g1 (x1 , . . . , xn ), . . . , gn+1 (x1 , . . . , xn )) = 0, ∀ x1 , . . . , xn . In Example 6.2, we have shown that the three rational functions g1 , g2 , g3 ∈ C(x1 , x2 ) are algebraically dependent. Proposition 6.1 implies that any three rational functions in x1 , x2 are algebraically dependent. Proposition 6.1 is more general than Example 4 in Section III of [7], which states that any n + 1 polynomial functions in C[x1 , . . . , xn ] are algebraically dependent. Reference [7] used this fact to show that for a certain type of polynomial systems of equations, a necessary condition for the solvability is that the number of variables is no less than the number of equations. Using the more general result Proposition 6.1, we show below that the dimensionality counting argument works for a broader class of systems. Lemma 6.1 Consider a system of polynomial equations and inequalities αi fi (x) + gi (x) = 0,
i = 1, . . . , m,
(41a)
fi (x) , 0,
i = 1, . . . , m,
(41b)
where α = (α1 , . . . , αm ) is a parameter vector, x = (x1 , . . . , xn ) is the variable and fi , gi , i = 1, . . . , m are polynomial functions of x. If for a positive measure of α, the above polynomial system (41) has a solution x, then the number of equations cannot exceed the number of variables, i.e. m ≤ n. Besides the polynomial equations in (41a), Lemma 6.1 adds an requirement that the coefficients of αi are nonzero. This requirement guarantees that αi can be expressed as a rational function −gi (x)/ fi (x) if x is a solution to (41) ((41b) ensures that these rational functions are well defined). In the special case of fi = 1 for all i, Lemma 6.1 reduces to the key result proved in [7]. The difference of Lemma 6.1 and the corresponding result in [7] only lies in how αi is expressed: αi equals a rational function in the former case and a polynomial function in the latter case. Proof of Lemma 6.1: We prove by contradiction. Assume the contrary that m > n. Denote I as the set of α such that (41) has a solution x. The assumption of Lemma 6.1 is that I has a positive measure. By Proposition 6.1, the m rational functions −gi (x)/ fi (x), i = 1, . . . , m, in n variable x1 , . . . , xn are algebraically dependent. Thus, there exists a nonzero polynomial P ∈ C[z1 , . . . , zm ] such that ! g1 (x) gm (x) P − ,...,− = 0, ∀ x. (42) f1 (x) fm (x) We claim that P(α1 , . . . , αm ) = 0, 24
∀ α ∈ I.
(43)
In fact, for any α ∈ I, there exists a solution x to (41), i.e. αi fi (x) = −gi (x) and fi (x) , 0, i = 1, . . . , m. Thus for any α ∈ I, there exists an x such that αi = −gi (x)/ fi (x), i = 1, . . . , m. Applying (42), we have ! gm (x) g1 (x) ,...,− P(α1 , . . . , αm ) = P − = 0. f1 (x) fm (x) Thus, P(α1 , . . . , αm ) = 0 for a positive measure of α, implying that P is a zero polynomial, which is a contradiction. Q.E.D.
6.2
Proof of Theorem 3.2 for L = 2
The proof of Theorem 3.2 consists of two steps: first, we “lift” the IA condition to a higher dimensional IA condition; second, using Lemma 6.1, we prove the bound for the “lifted” IA condition by induction on N. In this subsection, we prove Theorem 3.2 for the case L = 2. The case of general L can be treated similarly and is given in Appendix B. Also, without loss of generality, we can assume τ1i j = 1, ∀ i, j since multiplying Hi j by a nonzero scalar 1/τ1i j does not affect the IA condition (40). For simplicity, we further denote τ2i j as τi j . Now the channel model becomes Hi j = A1 + τi j A2 .
(44)
Denote f (N) as the maximal K such that IA is feasible; more specifically, f (N) , max{K | for a positive measure of (τi j ), the DoF tuple (d1 , . . . , dK ) = (1, 1, . . . , 1) is achievable via vector space IA} (45) = max{K | for a positive measure of (τi j ), ∃ {vk }k=1,...,K , s.t. IA condition (40) holds }. Note that f (N) depends on the receive dimension Nr and the building blocks A1 , A2 . What we need to prove is: for any Nr and A1 , A2 ∈ Ψ as defined in (1), we have f (N) ≤ 2N +
N2 . 4
(46)
Bounding f (N) directly by the IA condition is difficult. Instead, we consider a lifted IA condition: v j vk , ∀ 1 ≤ k ≤ K, | j ∈ {1, 2, . . . , K}\{k} y span τk j v j τkk vk (47) vk ∈ CN×1 \{0},
∀ 1 ≤ k ≤ K.
Define g(N) as the maximal K for the lifted IA condition to hold; more specifically, g(N) = max{K | for a positive measure of (τi j ), ∃ {vk }k=1,...,K , s.t. IA condition (47) holds}.
(48)
Note that g(N) does not depend on Nr , A1 , A2 . The following lemma shows that f (N) is upper bounded by g(N). 25
Lemma 6.2 For any Nr , N and any Nr × N building blocks A1 , A2 ∈ Ψ as defined in (1), we have f (N) ≤ g(N).
(49)
K satisfies the IA condition Proof of Lemma 6.2: We only need to prove: for any given (τi j ), if {vk }k=1 K satisfies the lifted IA condition (47). We prove by contradiction. Assume {v }K (40), then {vk }k=1 k k=1 does not satisfy (47), then the independence condition does not hold for some k, i.e. X X ∃λ1 , . . . , λK , s.t. vk = λ j v j , τkk vk = λ j τk j v j . (50) j,k
j,k
Left multiplying the lth equation of (50) by Al , l = 1, 2 and summing up yields X Hkk vk = λ j Hk j v j , j,k
where we have used the equation Hk j = A1 + τk j A2 . This contradicts the assumption that {vk } satisfies the IA condition (40). Q.E.D. According to Lemma 6.2, we only need to prove g(N) ≤ 2N +
N2 . 4
(51)
We prove (51) by induction on N. For the basis of the induction, we consider the case N = 1. We need to prove K ≤ 2N + N 2 /4 = 2.25 or equivalently K ≤ 2. Assume K ≥ 3 and for a positive measure of K . Since scaling each v by a nonzero factor 1/v (τk j ), the lifted IA condition (47) has a solution {vk }k=1 j j does not affect the linear independence condition, we have v j vk | j ∈ {1, 2, . . . , K}\{k} y span , for k = 1, τk j v j τkk vk 1 1 1 ⇐⇒ y span . (52) , τ11 τ12 τ13 The linear independence relation (52) can not hold for a positive measure of (τ11 , τ12 , τ13 ), a contradiction. Therefore, (51) holds for N = 1.
Suppose (51) holds for 1, . . . , N − 1. We will prove (51) holds for N. To this end, suppose K satisfies that for a positive measure of (τi j ), ∃ vk ∈ CN×1 , k = 1, . . . , K, such that the lifted IA condition u1 (47) holds. Let us introduce the receive beamforming vector uk = k2 ∈ C2N×1 , where u1k , u2k ∈ CN×1 , uk and transform the lifted IA condition (47) to the following zero-forcing type IA condition: v j H = 0, ⇐⇒ (u1k )H v j + τk j (u2k )H v j = 0, ∀ k , j, uk (53a) τk j v j vk H , 0, ∀ k. (53b) uk τkk vk 26
Define a set of transmitter-receiver pairs as n o Ω = (k, j) | 1 ≤ k , j ≤ K, (u1k )H v j = (u2k )H v j = 0 . The complement of Ω with respect to {(k, j) | 1 ≤ k , j ≤ K} is given by n o Ωc = (k, j) | 1 ≤ k , j ≤ K, (u2k )H v j , 0 . In addition, define n o Ωk = { j | (k, j) ∈ Ω} = j | j ∈ {1, . . . , K}\{k}, (u1k )H v j = (u2k )H v j = 0 . Then Ω =
S
1≤k≤K
Ωk .
Notice that for a positive measure of (τi j ), the lifted zero-forcing type IA condition (53) has a solution {uk , vk }k=1,...,K . In other words, each (τi j ) corresponds to at least one solution {uk , vk }k=1,...,K which defines a collection of sets {Ωk }k=1,...,K ⊆ {1, . . . , K}K . Since there are finitely many subsets of {1, . . . , K}K , we must have one collection of sets {Ωk }k=1,...,K that corresponds to a positive measure of (τi j ). Denote this set of {τi j } (which has a positive measure) as H0 . Consider the collection of sets {Ωk }1≤k≤K that corresponds to (τi j ) ∈ H0 . We will bound |Ω| by induction and bound |Ωc | by Lemma 6.1. Specifically, to derive an upper bound on |Ω|, we use the definition of Ωk to obtain Uk , span{u1k , u2k } ⊥ Vk , span{v j | j ∈ Ωk },
∀ 1 ≤ k ≤ K,
which implies dim(Vk ) + dim(Uk ) ≤ N, u1 Since uk = k2 is a nonzero vector, dim(Uk ) ≥ 1. Then uk
∀ 1 ≤ k ≤ K.
(54)
pk , dim(Vk ) ≤ N − dim(Uk ) ≤ N − 1.
Although all (τi j ) in H0 correspond to the same collection of sets {Ωk }k=1,...,K , they may correspond to different sets of dimensions {pk }k=1,...,K . Using a similar argument as before, we can show that there exist a positive measure of (τi j ) in H0 which corresponds to the same set of dimensions {pk }k=1,...,K . These (τi j ) form a subset of H0 , denoted as H. Consider the collection of dimensions {pk }k=1,...,K that corresponds to (τi j ) ∈ H. We prove that |Ωk | ≤ g(pk ) ≤ 2pk +
p2k , 4
∀ 1 ≤ k ≤ K.
(55)
Let D ∈ CN×pk be the basis matrix of Vk , and suppose v j = D¯v j , ∀ j ∈ Ωk , where v¯ j ∈ C pk \{0}.
27
From the lifted IA condition (47), we have v j vi | j ∈ Ωk \{i} , ∀ i ∈ Ωk , y span τi j v j τii vi D¯v j D¯vi | j ∈ Ω \{i} , ∀ i ∈ Ωk , ⇐⇒ y span k τi j D¯v j τii D¯vi v¯ i v¯ j =⇒ , ∀ i ∈ Ωk . y span | j ∈ Ωk \{i} τii v¯ i τi j v¯ j
(56a) (56b)
The last step can be proved by contradiction. If (56b) does not hold, then the LHS (left-hand of side) D 0 , we (56b) belongs to the space on the RHS (right-hand side) of (56b). Multiply both sides by 0 D obtain that the LHS of (56a) belongs to the space on the RHS of (56a), which contradicts (56a). Now for a positive measure of (τi j )i, j∈Ωk , there exist v¯ i ∈ C pk \{0}, i ∈ Ωk that satisfy (56b). By
the definition of g(·), |Ωk | ≤ g(pk ). Using the induction hypothesis, g(pk ) ≤ 2pk + proved. Summing up (55) for k = 1, . . . , K, we have K X X p2k |Ω| = |Ωk | ≤ 2pk + . 4 k=1 k
p2k 4 .
Thus, (55) is
(57)
Next, we provide an upper bound on |Ωc |. Equation (53a) implies that τk j = −
(u1k )H v j (u2k )H v j
,
∀ (k, j) ∈ Ωc .
(58)
The system of equations (58) in variables u1i , u2i , vi , i = 1, . . . , K (parameterized by (τk j )) has |Ωc | equations. We compute the number of free variables in {uk , vk }1≤k≤K . Since scaling v j does not affect the system of equations (58), we can scale each v j to make one of its entries to be 1. Therefore, the number of variables in {vi }1≤i≤K is K(N − 1).To count the free variables in uk , notice that the condition (6.2) implies that (u1k )H v j = 0 for all j ∈ Ωk . Since span{v j : j ∈ Ωk } has dimension pk , it follows that pk entries of u1k can be written as linear functions of the remaining N − pk entries of u1k (with coefficients being the rational functions of {v j : j ∈ Ωk }). Similarly, pk entries of u2k can be represented as linear functions of the remaining N − pk entries of u2k , with coefficients being some rational functions of {v j : j ∈ Ωk }. Substituting these linear functions into the right hand sides of (58) yields a new representation of each τk j as a rational function of the 2(N − pk ) free variables in u1k , u2k as well as the K(N − 1) variables in v j ’s. Because of the homogeneity of these rational functions over the 2(N − pk ) free entries of uk , we can further scale uk to make one of these entries to be 1. Thus, the number of free K is variables in u1k , u2k is 2(N − pk ) − 1. In summary, the number of free variables in {uk , vk }k=1 X X K(N − 1) + (2(N − pk ) − 1) = K(3N − 2) − 2 pk . k
k
28
For a positive measure of (τk j )(k, j)∈Ωc , the rational system (58) has a solution {u1k , u2k , vk }1≤k≤K such that (u2k )H v j , 0. It follows from Lemma 6.1 that the number of equations should not exceed the number of variables, i.e. X |Ωc | ≤ K(3N − 2) − 2 pk . (59) k
With the bounds on |Ω| and |Ωc |, we can now provide an bound on K. Summing up (59) and (57) yields X X p2k c pk K(K − 1) = |Ω| + |Ω | ≤ 2pk + + K(3N − 2) − 2 4 k k implying
K(K − 1) ≤ K(3N − 2) +
X p2 k 4 k
K ≤ 3N − 1 +
or equivalently
1 X p2k . K k 4
(60)
Thus, if pk ≤ N − 2, ∀ k, then (60) leads to K ≤ 3N − 1 +
(N − 2)2 N2 1 X p2k ≤ 3N − 1 + = 2N + K k 4 4 4
(61)
as desired. It remains to consider the case pk = N − 1 for some k. We need to prove the following claim. Claim 6.1 If there exists k such that pk = N − 1, then K ≤ |Ωk | + 2.
(62)
Proof of Claim 6.1: According to (54), 1 ≤ dim(Uk ) ≤ N − pk = 1, thus dim(Uk ) = 1. Then u1k is parallel to u2k . Without loss of generality, we assume u1k = γu2k ; then Uk = span{u2k }. According to the IA condition (53a), we have 0 = (u1k )H v j + τk j (u2k )H v j = (u2k )H v j (γ + τk j ),
∀ j ∈ {1, . . . , K}\{k}.
Since γ can be equal to at most one −τk j , we have that γ + τk j , 0 for at least K − 2 j’s. Therefore, 0 = (u2k )H v j ⇐⇒ u2k ⊥ v j holds for at least (K − 2) j’s. Hence, Ωk = { j | Uk ⊥ v j } has at least (K − 2) elements, which proves (62). Q.E.D. To complete the induction step, suppose pk = N − 1. Using (62) and (55), we have p2k (N − 1)2 N2 K ≤ |Ωk | + 2 ≤ + 2pk + 2 = + 2(N − 1) + 2 < + 2N 4 4 4 29
2
2
as desired. Combining this with (61) yields K ≤ 2N + N4 , which further implies g(N) ≤ 2N + N4 holds for N. This completes the induction step, so that (51) holds for any N. Finally, combining (51) and 2 Lemma 6.2, we obtain f (N) ≤ g(N) ≤ 2N + N4 . Q.E.D.
Appendix A
Proof of Theorem 3.1 for MIMO IC
In an Mt × Mr MIMO IC with T channel extensions, the channel matrix Hk j = diag(Hk1 j , . . . , HkTj ) is a Mr T × Mt T block diagonal matrix, where each block Hkl j = Hkl j (p, q) is a Mr × Mt matrix. The IA condition is given as follows: ukH Hk j v j = 0, ukH Hkk vk , 0,
∀ 1 ≤ k , j ≤ K,
(63a)
∀ 1 ≤ k ≤ K,
(63b)
where v j ∈ C Mt T ×1 , uk ∈ C Mr T ×1 are beamformers. Fix Mt , Mr and define f (T ) as f (T ) , max K | (63) is solvable for a positive measure of Hkl j (p, q)
. (64)
1≤k, j≤K,1≤l≤T,1≤p≤Mr ,1≤q≤Mt
To prove part (a) of Theorem 3.1, we only need to prove the following bound: f (T ) ≤ (Mt + Mr )T − 1,
∀T ≥ 1.
(65)
We will do so by using an induction analysis. For the basis of the induction (T = 1), f (1) ≤ Mt + Mr − 1 holds according to Proposition 4.1. Now suppose (65) holds for any positive integer that is smaller than T . We will prove (65) holds for T . Suppose K satisfies that (63) has a solution {˜uk , v˜ k }k=1,...,K for a positive measure of Hkl j (p, q) . 1 1 l u˜ k v˜ k u˜ k1 . . . l We divide u˜ k and v˜ k into T blocks: u˜ k = .. , vk = .. , where each block u˜ k = .. ∈ C Mr ×1 , u˜ Tk v˜ Tk u˜ lkMr l 1 v˜ k1 x . . vlk = .. ∈ C Mt ×1 . For a vector with T blocks x = .. , define the block-support of x as v˜ lkMt xT B-supp(x) = {t | xt , 0}. 30
Each Hkl j (p, q) corresponds to one collection of block-supports {B-supp(˜uk ), B-supp(˜vk )}k=1,...,K . There are finitely many possible choices for the collection of block-supports, thus there exist a positive mea sure of Hkl j (p, q) which corresponds to the same collection of block-supports {Rk , S k }k=1,...,K , where Rk , S k ⊆ {1, . . . , T }. Denote this set of Hkl j (p, q) as H which has a positive measure. Define a set of transmitter-receiver pairs as Ω , {(k, j) | 1 ≤ k , j ≤ K, B-supp(˜uk ) ∩ B-supp(˜v j ) = ∅} = {(k, j) | 1 ≤ k , j ≤ K, Rk ∩ S j = ∅} .
(66)
The complement of Ω in {(k, j) | 1 ≤ k , j ≤ K} is Ωc = {(k, j) | 1 ≤ k , j ≤ K, Rk ∩ S j , ∅} . Furthermore, we denote ak , bk as the number of nonzero blocks of u˜ k , v˜ k respectively, i.e. ak , |B-supp(˜uk )| = |Rk |, bk , |B-supp(˜vk )| = |S k |,
k = 1, . . . , K.
Since u˜ k , 0, v˜k , 0, ∀k, it follows that ak ≥ 1 and bk ≥ 1. We will bound |Ωc | by Claim 4.1 and bound |Ω| by the induction hypothesis. We first provide an upper bound on |Ωc |. Since scaling does not affect the solutions of (63), we can scale each u˜ k , v˜ k to make one entry of them to be one. For simplicity, we still denote the scaled version of {˜uk , v˜ k }k=1,...,K as {˜uk , v˜ k }k=1,...,K . After scaling, (63a) becomes a new system of polynomial equations with K(K − 1) equations and K(Mt T + Mr T − 2) variables, denoted as P. Then the system P has a solution {˜uk , v˜ k }k=1,...,K . The beamforming vectors u˜ k , v˜ k , k = 1, . . . , K can be concatenated to form a vector in CK(Mt T +Mr T −2) (after discarding the 2K one’s that are generated by scaling). Let J be the support of this concatenated vector. Then J is a subset of {1, . . . , K(Mt T + Mr T − 2)} determined by the block-supports of uk , vk , k = 1, . . . , K and the positions of the normalized entries. The size of J is X X X |J| = (ak Mr − 1) + (bk Mt − 1) = (ak Mr + bk Mt ) − 2K. (67) k
k
k
Consider the system of equations P J , which is obtained by restricting P to the subset of variables J. We claim that the number of nonzero polynomial equations in P J is exactly |Ωc |, i.e. |P J | = |Ωc |.
31
(68)
Note that ukH Hk j v j =
P
⇐⇒
l H l l l (uk ) Hk j v j .
Then we have
ukH Hk j v j = 0 does not become a zero equation in P J u˜ kH Hk j v˜ j , 0 (since the support of{˜uk , v˜ k } is J)
⇐⇒
∃ l such that (˜ulk )H Hkl j v˜ lj , 0
⇐⇒
∃ l ∈ B-supp(˜uk ) ∩ B-supp(˜v j )
⇐⇒
B-supp(˜uk ) ∩ B-supp(˜v j ) , ∅
⇐⇒
(k, j) ∈ Ωc .
Therefore, |Ωc | equals the number of nonzero equations in P J , which proves (68). Since P J has a solution in (C∗ )|J| , using a similar argument as the proof of (30), we have |P J | ≤ |J|. Plugging (67) and (68) into (69), we obtain an upper bound on |Ωc |: X |Ωc | ≤ (ak Mr + bk Mt ) − 2K.
(69)
(70)
k
Next, we provide an upper bound on |Ω|. Define Ωk = { j | (k, j) ∈ Ω} = { j | j ∈ {1, . . . , K}\{k}, B-supp(˜uk ) ∩ B-supp(˜v j ) = ∅} . S Then Ω = k=1,...,K Ωk . We claim that |Ωk | ≤ f (T − ak ),
k = 1, . . . , K.
(71)
Without loss of generality, assume B-supp(˜uk ) = {1, 2, . . . , ak }. Since B-supp(˜v j ) ∩ B-supp(˜uk ) = ∅, ∀ j ∈ Ωk , we have v˜ 1j = · · · = v˜ aj k = 0, ∀ j ∈ Ωk . (72) Consider a restriction of the beamformers and channel matrices to the lower dimensional space for (T − ak ) channel uses. Specifically, define ak +1 ak +1 v˜ j u˜ j x j = ... , y j = ... , ∀ j ∈ Ωk , T T v˜ j u˜ j a +1 Hˆ i j = diag Hi jk , . . . , HiTj , ∀ i, j ∈ Ωk . It follows from (72) that u˜ iH Hi j v˜ j = xiH Hˆ i j y j , ∀ i, j ∈ Ωk . Since {˜uk , v˜ k }k=1,...,K satisfies the IA condition (63), we have that {xk , yk }k=1,...,K satisfies xiH Hˆ i j y j = 0, xHj Hˆ j j y j , 0,
∀ i , j ∈ Ωk , ∀ j ∈ Ωk . 32
(73)
Note that (73) is the IA condition for a Mt × Mr MIMO IC with (T − ak ) channel extensions and |Ωk | users. For a positive measure of Hkl j (p, q) (in H), (73) has a solution {xk , yk }k=1,...,K . By the definition of f (·) in (64), we have |Ωk | ≤ f (T − ak ), which proves (71). Similar to (35), using induction hypothesis and f (0) = 0, we have f (T − ak ) ≤ (T − ak )(Mr + Mt ),
k = 1, . . . , K.
(74)
Combining (71) and (74), we obtain |Ωk | ≤ (T − ak )(Mr + Mt ),
k = 1, . . . , K.
(75)
|Ωk | ≤ (T − bk )(Mr + Mt ),
k = 1, . . . , K.
(76)
Similarly, we have
Multiplying (75) by
Mr Mt +Mr
and (76) by
Mt Mt +Mr ,
and summing up yields
|Ωk | ≤ T (Mr + Mt ) − (ak Mr + bk Mt ),
k = 1, . . . , K.
(77)
Summing up (77) for k = 1, . . . , K, we obtain |Ω| ≤ KT (Mr + Mt ) −
X (ak Mr + bk Mt ).
(78)
k
Finally, combining the bounds of |Ω| and |Ωc | (c.f. (70) and (78)), we can obtain the following bound on K X X K(K − 1) = |Ωc | + |Ω| ≤ (ak Mr + bk Mt ) − 2K + KT (Mr + Mt ) − (ak Mr + bk Mt ), (79) k
k
which simplifies to K − 1 ≤ −2 + T (Mr + Mt ), or equivalently K ≤ T (Mr + Mt ) − 1. Thus, f (T ) ≤ T (Mr + Mt ) − 1 holds for T. This completes the induction step, so that (65) holds for any T , as desired.
B Proof of Theroem 3.2 for general L We prove Theroem 3.2 for general channel diversity order L. The channel matrix Hi j = τ1i j A1 + · · · + τiLj AL , where A1 , A2 , . . . , AL ∈ Ψ are fixed Nr × N matrices and N ≤ Nr . Denote f (N) as the maximal K such that IA is feasible; more specifically, f (N) = max{K | for a positive measure of τli j , ∃ {vk }k=1,...,K , s.t. IA condition (40) holds }. (80) Note that f (N) depends on the receive dimension Nr and the building blocks A1 , . . . , AL . What we need to prove is: for any Nr and A1 , . . . , AL ∈ Ψ as defined in (1), we have f (N) ≤ LN + 33
N2 . 4
(81)
Define a lifted IA condition 1 τkk vk . .. y Lv τkk k
τ1k j v j .. , ∀k, | j ∈ {1, 2, . . . , K}\{k} span . τL v j kj
(82)
vk ∈ CN×1 \{0}, ∀k.
Define g(N) as the maximal K for the lifted IA condition (82) to hold; more specifically, g(N) = max{K | for a positive measure of τli j , ∃ {vk }k=1,...,K , s.t. IA condition (82) holds }. (83)
Note that g(N) does not depend on Nr , A1 , . . . , AL .
Lemma B.1 For any Nr , N and any Nr × N building blocks A1 , . . . , AL ∈ Ψ as defined in (1), we have f (N) ≤ g(N).
(84)
K satisfies the IA condition Proof of Lemma B.1: We only need to prove: for any given τli j , if {vk }k=1 K satisfies the lifted IA condition (82). We prove by contradiction. Assume a set of (40), then {vk }k=1 K does not satisfy (82), then the independence condition does not hold for some nonzero vectors {vk }k=1 k, i.e. X X L ∃λ1 , . . . , λK , s.t. τ1kk vk = λ j τ1k j v j , . . . , τkk vk = λ j τkLj v j . (85) j,k
j,k
Left multiplying the lth equation of (85) by Al and sum up these L equations, we obtain X Hkk vk = λ j Hk j v j , j,k
where we have used the equation Hi j = satisfies the IA condition (40). Q.E.D.
τ1i j A1
+ · · · + τiLj AL . This contradicts the assumption that {vk }
According to Lemma B.1, we only need to prove N2 . (86) 4 We prove (86) by induction on N. For the basis of the induction, we consider the case N = 1. We need 2 to prove K ≤ LN + N4 = L + 14 or equivalently K ≤ L. Assume K ≥ L + 1 and for a positive measure K . Since scaling each v by a nonzero factor of (τk j ), the lifted IA condition (82) has a solution {vk }k=1 j 1 does not affect the linear independence condition, we have vj 1 τ1k j v j τkk vk . . . y span . | j ∈ {1, 2, . . . , K}\{k} , for k = 1, . . L L τ vj τkk vk kj 1 1 τ1 τ12 τ11 1,L+1 . . . , . . . , .. . (87) ⇐⇒ .. y span . . L L τL τ1,L+1 τ11 12 g(N) ≤ LN +
34
The linearly independence relation (87) does not hold for a positive measure of τli j , which is a contradiction. Therefore, (86) holds for N = 1. Suppose (86) holds for 1, . . . , N − 1. We will prove that (86) holds for N. To this end, suppose K satisfies that for a positive measure of τli j , ∃ vk ∈ CN×1 , k = 1, . . . , K, such that the lifted IA condition (82) holds. 1 uk . We introduce the receive beamformer uk = .. ∈ CLN×1 , where u1k , . . . , ukL ∈ CN×1 , and transform ukL the lifted IA condition (82) to the following zero-forcing type IA condition: 1 τk j v j . ukH .. = 0 L τk j v j 1 τkk vk . H uk .. , 0, Lv τkk k
⇐⇒
τ1k j (u1k )H v j + · · · + τkLj (ukL )H v j = 0,
∀ k , j,
∀ k.
(88a)
(88b)
Define a set of transmitter-receiver pairs as n o Ω = (k, j) | 1 ≤ k , j ≤ K, (u1k )H v j = · · · = (ukL )H v j = 0 . The complement of Ω with respect to {(k, j) : 1 ≤ k , j ≤ K} is given by o n Ωc = (k, j) | 1 ≤ k , j ≤ K, ∃ l, s.t. (ulk )H v j , 0 . In addition, define n o Ωk = { j | (k, j) ∈ Ω} = j | j ∈ {1, . . . , K}\{k}, (u1k )H v j = · · · = (ukL )H v j = 0 . Then Ω =
S
1≤k≤K
Ωk .
Notice that for a positive measure of τli j , there exists {uk , vk }k=1,...,K that satisfies the lifted zero forcing type IA condition (88). In other words, each τli j corresponds to at least one solution {uk , vk }k=1,...,K which defines a collection of sets {Ωk }k=1,...,K ⊆ {1, . . . , K}K . There are finitely many subsets of {1, . . . , K}K , thus there exists one collection of sets {Ωk }k=1,...,K , such that a positive measure of τli j K . Denote this set of τl as H which has a positive measure. correspond to {Ωk }k=1 0 ij Consider the collection of sets {Ωk }1≤k≤K that corresponds to τli j ∈ H0 . We will bound |Ω| by the induction hypothesis and bound |Ωc | by Lemma 6.1. Specifically, to derive an upper bound on |Ω|, we use the definition of Ωk to obtain Uk , span{u1k , . . . , ukL } ⊥ Vk , span{v j | j ∈ Ωk }, 35
∀ 1 ≤ k ≤ K,
(89)
which implies dim(Vk ) + dim(Uk ) ≤ N, ∀ 1 ≤ k ≤ K. 1 uk . Since uk = .. is a nonzero vector, dim(Uk ) ≥ 1. Then ukL pk , dim(Vk ) ≤ N − dim(Uk ) ≤ N − 1,
(90)
∀ 1 ≤ k ≤ K.
Although all (τi j ) in H0 correspond to the same collection of sets {Ωk }k=1,...,K , they may correspond to different sets of dimensions {pk }k=1,...,K . Using a similar argument as before, we can show that there exist a positive measure of (τi j ) in H0 which corresponds to the same set of dimensions {pk }k=1,...,K . These (τi j ) form a subset of H0 , denoted as H. Consider the collection of dimensions {pk }k=1,...,K that corresponds to (τi j ) ∈ H. We prove that |Ωk | ≤ g(pk ) ≤ Lpk +
p2k , 4
∀ 1 ≤ k ≤ K.
(91)
Let D ∈ CN×pk be the basis matrix of Vk , and suppose v j = D¯v j , ∀ j ∈ Ωk , where v¯ j ∈ C pk \{0}. From the lifted IA condition (82), we have 1 τ1i j v j τii vi . . . y span . | j ∈ Ωk \{i} , ∀ i ∈ Ωk , . . L L τ vj τii vi ij 1 τ1i j D¯v j τii D¯vi . . . , ∀ i ∈ Ωk , (92a) | j ∈ Ω \{i} ⇐⇒ .. y span k . τL D¯v j τiiL D¯vi ij 1 1 τi j v¯ j τii v¯ i . . .. | j ∈ Ωk \{i} =⇒ .. y span , ∀ i ∈ Ωk . (92b) τL v¯ j τiiL v¯ i ij
The last step can be proved by contradiction. If (92b) does not hold, then the LHS (left-hand side) of (92b) belongs to the space on the RHS (right-hand side) of (92b). Multiply both sides by D 0 · · · 0 0 D · · · 0 LN×Lpk , we obtain that the LHS of (92a) belongs to the space on the RHS of .. ∈ C .. .. . . . . . . 0 0 ··· D (92a), which contradicts (92a). Now for a positive measure of (τli j )i, j∈Ωk ,1≤l≤L , there exist v¯ i ∈ C pk \{0}, i ∈ Ωk that satisfy (92b). By the definition of g(·), |Ωk | ≤ g(pk ). Using the induction hypothesis, we have g(pk ) ≤ Lpk + Thus, (91) is proved. 36
p2k 4 .
Summing up (91) for k = 1, . . . , K, we have K X
2 X p k Lp + . |Ω| = |Ωk | ≤ k 4 k=1 k
(93)
Next, we provide an upper bound on |Ωc |. For each pair (k, j) ∈ Ωc , there exists lk j ∈ {1, . . . , L} H v , 0. Then equation (88a) implies that such that ukl j kj P l H l,lk j τk j ukl v j lk j τk j = − , ∀ (k, j) ∈ Ωc . (94) H uklk j v j The system of equations (94) in variables u1i , . . . , uiL , vi , i = 1, . . . , K (parameterized by τlk j ) has |Ωc | equations. We compute the number of free variables in {uk , vk }1≤k≤K . Since scaling v j does not affect the system of equations (94), we can scale each v j to make one entry of it to be 1. Therefore, the number of variables in {vi }1≤i≤K is K(N − 1). To count the number of free variables in {uk }, notice that the condition (89) implies that u1k , . . . , ukL lie in Vk⊥ (the orthogonal complement of a fixed pk -dim space Vk ). By a similar argument as that in Section 6.2, we can show that the number of free variables in uk K is is L(N − pk ) − 1. In summary, the number of free variables in {uk , vk }k=1 X X K(N − 1) + (L(N − pk ) − 1) = K((L + 1)N − 2) − L pk . k
k
For a positive measure of (τlk j )(k, j)∈Ωc ,1≤l≤L , the polynomial system (94) has a solution {uk , vk }1≤k≤K H v , 0. It follows from Lemma 6.1 that the number of equations should not exceed the such that ukl j kj number of variables, i.e. X pk . (95) |Ωc | ≤ K((L + 1)N − 2) − L k
With the bounds on |Ω| and |Ωc |, we can now provide an bound on K. Summing up (95) and (93) yields X X p2k c pk , K(K − 1) = |Ω| + |Ω | ≤ Lpk + + K((L + 1)N − 2) − L 4 k k implying
X p2 1 X p2k k K(K − 1) ≤ K((L + 1)N − 2) + or equivalently K ≤ (L + 1)N − 1 + .. 4 K k 4 k
(96)
Thus, if pk ≤ N − 2, ∀k, then (96) leads to K ≤ (L + 1)N − 1 +
(N − 2)2 N2 1 X p2k ≤ (L + 1)N − 1 + = LN + , K k 4 4 4
as desired. It remains to consider the case pk = N − 1 for some k. We need to prove the following claim. 37
(97)
Claim B.1 If there exists k such that pk = N − 1, then K ≤ |Ωk | + L.
(98)
Proof of Claim B.1: According to (90), 1 ≤ dim(Uk ) ≤ N − pk = 1, thus dim(Uk ) = 1. Then u1k , . . . , ukL are parallel. Without loss of generality, we assume ukL , 0; then Uk = span{ukL }. Suppose ukt = γt ukL , t = 1, . . . , L − 1; According to IA condition (88a), we have X H 0= + τkLj , ∀ j ∈ {1, . . . , K}\{k}. v j = (ukL )H v j γ1 τ1k j + · · · + γL−1 τkL−1 ukl j l
Since (γ1 , . . . , γL−1 , 1) can be orthogonal to at most L − 1 vectors τ1k j , . . . , τkLj , we have that γ1 τ1k j + + τkLj , 0 for at least (K − 1) − (L − 1) = K − L j’s. Therefore, · · · + γL−1 τkL−1 j 0 = (u2k )H v j ⇐⇒ u2k ⊥ v j holds for at least (K − L) j’s. Hence, Ωk = { j | Uk ⊥ v j } has at least (K − L) elements, which proves (62). Q.E.D. To complete the induction step, suppose pk = N − 1. Using (98) and (91), we have K ≤ |Ωk | + L ≤ Lpk +
p2k (N − 1)2 N2 + L = L(N − 1) + + L < LN + , 4 4 4 2
2
as desired. Combining this with (97) yields K ≤ LN + N4 , which further implies g(N) ≤ LN + N4 holds for N. This completes the induction step, so that (86) holds for any N. Finally, combining (86) 2 and Lemma B.1, we obtain f (N) ≤ g(N) ≤ LN + N4 .
Acknowledgement We would like to thank Guy Bresler and David Tse for many fruitful discussions related to the subject of this paper. We are also grateful to Syed Jafar for the valuable discussions regarding the use of Bernstein’s theorem.
References [1] M.A. Maddah-Ali, A.S. Motahari, and A.K. Khandani, “Communication over MIMO X channels: Interference alignment, decomposition, and performance analysis,” IEEE Transactions on Information Theory, vol. 54, no. 8, pp. 3457– 3470, Aug. 2008. [2] V.R. Cadambe and S.A. Jafar, “Interference alignment and degrees of freedom of the K-user interference channel,” IEEE Transactions on Information Theory, vol. 54, no. 8, pp. 3425–3441, 2008.
38
[3] A. Host-Madsen and A. Nosratinia, “The multiplexing gain of wireless networks,” in International Symposium on Information Theory (ISIT), 2005, pp. 2065–2069. [4] A. S. Motahari, S. O. Gharan, M.-A. Mohammad-Ali, and A. K. Khandani, “Real interference alignment: Exploiting the potential of single antenna systems,” arXiv preprint arXiv:0908.2282, 2009. [5] A. Ghasemi, A.S. Motahari, and A.K. Khandani, “Interference alignment for the K user MIMO interference channel,” in IEEE International Symposium on Information Theory Proceedings (ISIT), Jun. 2010, pp. 360–364. [6] C.M. Yetis, T Gou, S.A. Jafar, and A.H. Kayran, “On feasibility of interference alignment in MIMO interference networks,” IEEE Transactions on Signal Processing, vol. 58, no. 9, pp. 4771–4782, Sep. 2010. [7] M. Razaviyayn, G. Lyubeznik, and Z.-Q. Luo, “On the degrees of freedom achievable through interference alignment in a MIMO interference channel,” IEEE Transactions on Signal Processing, vol. 60, no. 2, pp. 812–821, Feb. 2012. [8] G. Bresler, D. Cartwright, and D. Tse, “Settling the feasibility of interference alignment for the MIMO interference channel: the symmetric square case,” CoRR, vol. abs/1104.0888, 2011. [9] M. Razaviyayn, M. Sanjabi, and Z.-Q. Luo, “Linear transceiver design for interference alignment: Complexity and computation,” IEEE Transactions on Information Theory, vol. 58, no. 5, pp. 2896–2910, 2012. [10] G. Bresler and D. Tse, “3 user interference channel: Degrees of freedom as a function of channel diversity,” in Annual Allerton Conference on Communication, Control, and Computing (Allerton), Oct. 2009, pp. 265–271. [11] S.A. Jafar, “Exploiting channel correlations - simple interference alignment schemes with no CSIT,” in IEEE Global Telecommunications Conference (GLOBECOM), Dec. 2010, pp. 1–5. [12] T. Gou, C. Wang, and S.A. Jafar, “Aiming perfectly in the dark-blind interference alignment through staggered antenna switching,” IEEE Transactions on Signal Processing, vol. 59, no. 6, pp. 2734–2744, Jun. 2011. [13] T. Gou and S.A. Jafar, “Degrees of freedom of the K user MIMO interference channel,” in Asilomar Conference on Signals, Systems and Computers (Asilomar), Oct. 2008, pp. 126–130. [14] K. Gomadam, V.R. Cadambe, and S.A. Jafar, “Approaching the capacity of wireless networks through distributed interference alignment,” in IEEE Global Telecommunications Conference (GLOBECOM), Dec. 2008, pp. 1–6. [15] C. Shi, R.A. Berry, and M.L. Honig, “Interference alignment in multi-carrier interference networks,” in IEEE International Symposium on Information Theory Proceedings (ISIT). IEEE, 2011, pp. 26–30. [16] R. Brandt, P. Zetterberg, and M. Bengtsson, “Interference alignment over a combination of space and frequency,” . [17] S.H. Song, X. Chen, and K. B. Letaief, “Achievable diversity gain of K-user interference channel,” in IEEE International Conference on Communications (ICC). IEEE, 2012, pp. 4197–4201. [18] D. A. Cox, J. B. Little, and D. O’Shea, Using algebraic geometry, Springer New York, 1998. [19] I.Z. Emiris and J.F. Canny, “Efficient incremental algorithms for the sparse resultant and the mixed volume,” Journal of Symbolic Computation, vol. 20, no. 2, pp. 117–149, 1995. [20] B. Sturmfels, Solving systems of polynomial equations, vol. 97, Amer Mathematical Society, 2002. [21] B. Huber and B. Sturmfels, “Bernstein’s theorem in affine space,” Discrete & Computational Geometry, vol. 17, no. 2, pp. 137–141, 1997. [22] J. M. Rojas and X. Wang, “Counting affine roots of polynomial systems via pointed newton polytopes,” Journal of Complexity, vol. 12, no. 2, pp. 116–133, 1996. [23] T. Li and X. Wang, “The BKK root count in Cn ,” Mathematics of Computation of the American Mathematical Society, vol. 65, no. 216, pp. 1477–1484, 1996. [24] P. Morandi, Field and Galois theory.
39