Shannon Capacity of MIMO Systems and Metastability of Random Frustrated Spin Systems Ori Shental and Ido Kanter∗
Abstract A noiseless multiple-input multiple-output (MIMO) channel operating under a strict complexity constraint at the receiver is introduced. According to this constraint, detected bits, obtained by performing hard decisions directly on the channel's matched filter output, must be the same as the transmitted binary inputs. An asymptotic analysis is carried out using mathematical tools originating in the study of metastable states of spin glasses. It is shown that such a complexity-constrained channel exhibits a non-trivial Shannon-theoretic capacity, which is rigorously analyzed and corroborated using finite-size channel simulations.
Index Terms Shannon capacity, metastability, MIMO, large-system analysis, statistical mechanics, spin glass.
∗ Corresponding Author. O. Shental is with the Center for Magnetic Recording Research (CMRR), University of California, San Diego (UCSD), 9500 Gilman Drive, La Jolla, CA 92093, USA (e-mail:
[email protected]). I. Kanter is with the Minerva Center and Department of Physics, Bar-Ilan University, Ramat-Gan 52900, Israel (e-mail:
[email protected]). November 29, 2007
DRAFT
I. INTRODUCTION

Multiple-input multiple-output (MIMO) multi-antenna systems are prominent in modern high data-rate wireless communications. Investigation of reliable (i.e., errorless) communication via the MIMO channel is a long-standing and productive research topic (see, e.g., [1]). In these systems, information is conveyed simultaneously from a group of antennas to another group of antennas over the same physical medium and bandwidth. These transmissions are not orthogonal and interfere with each other. Detrimental effects such as interference and noise can be completely eliminated in theory by adopting optimal detection schemes and sophisticated error-correcting codes [2].

A typical investigation of a MIMO channel often assumes an upper bounded transmission power, but no restrictions on complexity are imposed. In the era of ubiquitous and pervasive communications in, for example, indoor and personal area networks (PAN), there is an emerging interest in a complementary scenario. According to this scenario, the MIMO system operates in a high SNR regime (thus power limitation is less crucial), but is highly restricted by its signal processing complexity. In this contribution such a reduced complexity setting is introduced and analyzed.

This reduced complexity scenario requires that detected bits, sliced at the output of a bank of filters matched to the MIMO channel, be the same as the transmitted binary inputs. This constraint is analogous to the constraint on single-spin (neuron) flip metastable states of the SK [3], [4] (Hopfield [5]) model for spin glasses (neural networks) [6], [7] ([8], [9]). Hence, we term this scheme metastable MIMO, as transmitted bit combinations are preselected so as to comply with the metastability constraint. Such a transmission scheme would result in the appealing use of low-cost receivers, in the context of reduced signal processing and computing.
In a sense, this scheme is equivalent to outsourcing part of the detection complexity back to the transmitters. However, although the receivers need only do minimal signal processing, communication does not conclude until they properly interpret the received bits. This can be done using a decoding table identical to the transmitters’ joint coding table. Although the complexity of this task cannot be outsourced back to the transmitters, it is solved by proper memory allocation [2]. In this contribution, a variant of the Hopfield model is used to compute the Shannon capacity of the noiseless metastable MIMO channel in the many-antennas limit, revealing the cost in ‘information rate’ caused by limiting the channel complexity. Interestingly, we
find that, contrary to intuition, the metastable MIMO channel yields a non-trivial capacity. It is important to note that the metastable MIMO rule employed in this paper is conceptually different from the known technique of multiuser precoding (or pre-equalization) [10]–[12], in which there is no restriction on the input signaling.

At this point, it is worth mentioning the seminal work of Sourlas [13], the first to establish a connection between statistical physics and Shannon capacity. A valuable relation between the metastability of random frustrated systems, which was first developed for the SK model, and advanced communication protocols, like CDMA, was also recently derived [14]–[17].

The paper is organized as follows. The metastable MIMO model is outlined in Section II. In Section III the asymptotic information capacity of the channel is derived. The results are shown and discussed in Section IV. We conclude in Section V.

II. METASTABLE MIMO MODEL

Consider a synchronous noiseless MIMO channel with $K$ transmitting antennas and $N$ receiving antennas. The channel is characterized by an $N \times K$ random matrix $\mathbf{S}$ with independent identically distributed entries with zero mean and variance $1/N$. The channel matrix $\mathbf{S}$ is assumed to be perfectly known at both ends of the channel. The input vector consists of $K$ binary entries which are subject to transformation by the channel matrix $\mathbf{S}$. The length-$N$ received column vector is projected back into the $K$-dimensional information space by passing through a bank of filters matched to the channel matrix $\mathbf{S}$. Thus overall, an input column vector $\mathbf{x} \triangleq (x_1, \ldots, x_K)$ is transformed into an output vector $\mathbf{y} \triangleq (y_1, \ldots, y_K)$ by

$$\mathbf{y} = \mathbf{S}^T \mathbf{S} \mathbf{x}, \tag{1}$$

where the operator $\{\cdot\}^T$ denotes a matrix transpose. The pointwise counterpart is given by

$$y_k = x_k + \sum_{i \neq k} \rho_{ki} x_i, \tag{2}$$

where the $k$'th transmitting antenna's matched filter output, $y_k$, is the designated bit, $x_k$, corrupted by an interference term. This interference term is composed of a summation over (cross-correlation) scaled versions of all other transmitted bits. The set of all cross-correlations $\rho_{ki}$, i.e., the entries of the matrix $\mathbf{S}^T \mathbf{S}$, is hereinafter denoted by $\rho$. In the following asymptotic analysis, we assume that $K \to \infty$, yet the antenna ratio $\beta \triangleq K/N \triangleq \alpha^{-1}$ is kept constant, and the information rate is the same for all transmitting antennas, i.e., $R_k = R$.
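The relation between the matrix form (1) and the pointwise form (2) can be checked numerically. The following is a minimal sketch, assuming binary channel entries $\pm 1/\sqrt{N}$ (one admissible zero-mean, variance-$1/N$ choice, picked here because it makes the diagonal of $\mathbf{S}^T\mathbf{S}$ exactly one); the sizes, seed, and variable names are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 8, 32                      # illustrative sizes, antenna ratio beta = K/N = 0.25

# Assumed channel ensemble: i.i.d. entries +-1/sqrt(N) (zero mean, variance 1/N).
S = rng.choice([-1.0, 1.0], size=(N, K)) / np.sqrt(N)
x = rng.choice([-1.0, 1.0], size=K)          # binary input vector

# Matrix form, eq. (1): matched-filter output y = S^T S x.
y = S.T @ (S @ x)

# Pointwise form, eq. (2): designated bit plus cross-correlation interference.
rho = S.T @ S                                  # cross-correlations rho_ki
off_diagonal = rho - np.diag(np.diag(rho))     # keep only the i != k terms
y_pointwise = np.diag(rho) * x + off_diagonal @ x

print(np.allclose(y, y_pointwise))             # the two forms agree
print(np.allclose(np.diag(rho), 1.0))          # unit diagonal for this ensemble
```

For other zero-mean, variance-$1/N$ ensembles the diagonal of $\mathbf{S}^T\mathbf{S}$ is only close to one for large $N$, so (2) then holds approximately.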
We wish to convey information reliably through the channel (2) under a low-complexity constraint on the receiver. According to this (metastability) constraint, detected bits, $\hat{x}_k$, obtained by performing hard decisions directly on the channel's matched filter output samples, must be the same as the transmitted bits. Explicitly,

$$x_k \equiv \hat{x}_k = \mathrm{Sign}(y_k), \tag{3}$$
where $\mathrm{Sign}(\cdot)$ is the trivial signum function. This complexity constraint turns out to be similar to the fixed-point equation determining the metastable states of the Hopfield model, which remain unchanged under single spin-flip dynamics. Under the constraints outlined above it is clear that not all combinations of input symbols will result in errorless communication. Thus, the capacity of the channel can be obtained by evaluating the number of codewords that ensure errorless detection. In the following, we prove that this metastable MIMO channel setting yields non-trivial capacity, which, again, corresponds to the expectation of the number of single spin-flip metastable states of a version of the Hopfield model.

III. SHANNON CAPACITY

In the following capacity analysis, the assumption is made that as the dimensions of the system go to infinity at constant ratio ($K, N \to \infty$; $K/N = \beta$), the microstate (probability) distribution converges to its own average and fluctuations away from the most likely macrostate vanish. This dominant macrostate is equivalent to what in information theory is known as the set of typical sequences. This entails that the distribution of the total number $\mathcal{N}$ of input symbols that obey the metastable MIMO rule is sharply peaked around $\langle \mathcal{N} \rangle$, where $\langle \cdot \rangle$ denotes the configurational average with respect to all channel microstates $\mathbf{S}$. This observation allows us to simplify the problem by taking the so-called annealed approximation and directly performing the average of $\mathcal{N}$ over the quenched channel states $\mathbf{S}$. This assumption is not uncommon in the study of various spin and neural systems exhibiting quenched interactions (see, e.g., [8], [9], [18]–[24]). The annealed approximation is for now accepted, and its validity will be further addressed in Section IV, where it will be discussed and tested against finite-size simulations.

A binary codeword $\mathbf{x}^c \triangleq \{x_1^c, \ldots, x_K^c\}$, composed of all $K$ transmitting antennas' bits at a given channel use, for which the channel constraints hold, satisfies the condition

$$\int_0^\infty \prod_{k=1}^{K} d\lambda_k\, \delta(y_k - \lambda_k x_k^c) = \int_0^\infty \prod_{k=1}^{K} d\lambda_k\, \delta(\alpha y_k - \lambda_k x_k^c) = 1, \tag{4}$$
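where $\delta(\cdot)$ is the Dirac delta function. Let the random variable $\mathcal{N}(K, \beta, \rho)$ be (an upper bound on) the number of codewords, i.e.,

$$\mathcal{N}(K, \beta, \rho) \triangleq \sum_{\mathbf{x}} \int_{-\alpha}^{\infty} \prod_k d\lambda_k \prod_k \delta\Big( \sum_{i \neq k} \alpha \rho_{ki} x_i - \lambda_k x_k \Big), \tag{5}$$

where $\sum_{\mathbf{x}}$ corresponds to a sum over all the possible values of the transmitted input symbols.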
Assuming equal information rates per transmitting antenna, the corresponding asymptotic Shannon capacity of the channel is defined [2], in bit information units, as

$$C_\infty(\beta) \triangleq \lim_{K \to \infty} \frac{\log_2 \mathcal{N}(K, \beta, \rho)}{K}. \tag{6}$$
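For small $K$ the count entering (6) can be obtained by brute force. The sketch below enumerates all $2^K$ inputs, keeps those satisfying the hard-decision constraint (3), and contrasts the quenched average $\langle \log_2 \mathcal{N} \rangle / K$ with the annealed average $\log_2 \langle \mathcal{N} \rangle / K$. It assumes binary $\pm 1/\sqrt{N}$ channel entries, and all function names are hypothetical helpers introduced for illustration, not the authors' simulation code.

```python
import numpy as np

def count_metastable(S):
    """Count inputs x in {-1,+1}^K with Sign((S^T S x)_k) == x_k for every k."""
    K = S.shape[1]
    G = S.T @ S                                   # cross-correlation matrix
    codes = np.arange(2 ** K)[:, None]            # every K-bit integer once
    X = 1 - 2 * ((codes >> np.arange(K)) & 1)     # rows: all +-1 input vectors
    Y = X @ G                                     # matched-filter outputs (G is symmetric)
    return int(np.sum(np.all(X * Y > 0, axis=1)))

def random_channel(N, K, rng):
    """Assumed ensemble: i.i.d. +-1/sqrt(N) entries (zero mean, variance 1/N)."""
    return rng.choice([-1.0, 1.0], size=(N, K)) / np.sqrt(N)

def finite_size_capacities(K, N, trials, rng):
    """Return (quenched, annealed) finite-K capacity estimates in bit/symbol/tx."""
    counts = np.array([count_metastable(random_channel(N, K, rng))
                       for _ in range(trials)])
    quenched = np.mean(np.log2(counts)) / K       # average of the logarithm
    annealed = np.log2(np.mean(counts)) / K       # logarithm of the average
    return quenched, annealed

rng = np.random.default_rng(1)
quenched, annealed = finite_size_capacities(K=10, N=20, trials=50, rng=rng)
print(quenched, annealed)   # annealed is never below quenched (Jensen's inequality)
```

The enumeration costs $O(2^K K^2)$ per channel realization, so it is only practical for a few tens of antennas at most.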
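According to the self-averaging property [25], in the large-system limit, $K \to \infty$, the number of successful codewords $\mathcal{N}(K, \beta, \rho)$ is equal to its expectation with respect to (w.r.t.) the distribution of $\rho$, i.e.,

$$\lim_{K \to \infty} \mathcal{N}(K, \beta, \rho) = \mathcal{N}(\beta) = \lim_{K \to \infty} \sum_{\mathbf{x}} \Big\langle \int_{-\alpha}^{\infty} \prod_k d\lambda_k \prod_k \delta\Big( \sum_{i \neq k} \alpha \rho_{ki} x_i - \lambda_k x_k \Big) \Big\rangle_\rho, \tag{7}$$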
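where $\mathcal{N}(\beta)$ and $\langle \cdot \rangle_\rho$ denote the average and the averaging operation, respectively. Representing the delta function by the inverse Fourier transform of an exponent and substituting $x_k \omega_k$ for the angular frequency of the Fourier transform $\omega_k$, expression (7) can be rewritten as

$$\mathcal{N}(\beta) = \lim_{K \to \infty} \int_{-\alpha}^{\infty} \prod_k d\lambda_k\, \frac{1}{(2\pi)^K} \int_{-\infty}^{\infty} \prod_k d\omega_k \sum_{\mathbf{x}} \exp\Big( j \sum_k \omega_k \lambda_k \Big) \cdot E, \tag{8}$$

where $j \triangleq \sqrt{-1}$ and

$$E \triangleq \Big\langle \prod_k \exp\Big( -\frac{j}{K} \sum_{\mu=1}^{N} \sum_{i \neq k} s_k^\mu s_i^\mu x_i x_k \omega_k \Big) \Big\rangle_\rho. \tag{9}$$

Here $s_k^\mu \triangleq \sqrt{N}\, S_{\mu k}$ denotes the normalized channel entries, so that $\alpha \rho_{ki} = K^{-1} \sum_\mu s_k^\mu s_i^\mu$.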
The expectation $E$ can also be written as

$$E = \exp\Big( j\alpha \sum_k \omega_k \Big) \Big\langle \exp\Big( -\frac{j}{K} \sum_\mu \Big( \sum_k s_k^\mu x_k \omega_k \Big) \Big( \sum_k s_k^\mu x_k \Big) \Big) \Big\rangle_\rho. \tag{10}$$
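The step from (9) to (10) amounts to completing the double sum with its diagonal term. A short worked version follows, assuming normalized entries with $(s_k^\mu)^2 = 1$ (our assumption, consistent with the cosine that appears in (11)) and using $x_k^2 = 1$ and $N/K = \alpha$:

```latex
\begin{align*}
\sum_k \sum_{i \neq k} s_k^\mu s_i^\mu x_i x_k \omega_k
  &= \Big(\sum_k s_k^\mu x_k \omega_k\Big)\Big(\sum_i s_i^\mu x_i\Big)
     - \sum_k (s_k^\mu)^2 x_k^2\, \omega_k \\
  &= \Big(\sum_k s_k^\mu x_k \omega_k\Big)\Big(\sum_i s_i^\mu x_i\Big)
     - \sum_k \omega_k , \\
-\frac{j}{K} \sum_{\mu=1}^{N} \Big(-\sum_k \omega_k\Big)
  &= j\,\frac{N}{K} \sum_k \omega_k
   = j\alpha \sum_k \omega_k .
\end{align*}
```

The last line is the deterministic prefactor $\exp(j\alpha \sum_k \omega_k)$ that is pulled out of the average in (10).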
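Using a transformation [24, eq. (2.14)], the expectation becomes

$$E = \exp\Big( j\alpha \sum_k \omega_k \Big) \int_{-\infty}^{\infty} \prod_\mu \frac{da_\mu}{(2\pi/K)^{1/2}} \int_{-\infty}^{\infty} \prod_\mu \frac{db_\mu}{(2\pi/K)^{1/2}} \exp\Big( j \frac{K}{2} \sum_\mu (a_\mu^2 - b_\mu^2) \Big) \exp\Big( \sum_{k,\mu} \log \cos(c_{k,\mu}) \Big), \tag{11}$$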
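where

$$c_{k,\mu} \triangleq \frac{1}{\sqrt{2}} \big[ \omega_k (a_\mu + b_\mu) + (a_\mu - b_\mu) \big]. \tag{12}$$

Since $\sum_k s_k^\mu x_k$ in (10) is $O(\sqrt{K})$ for an overwhelming majority of codewords, for the expectation $E$ to be finite, $a_\mu$ and $b_\mu$ must be $O(1/\sqrt{K})$. Hence, expanding the $\log \cos(\cdot)$ term in the exponent of (11) and neglecting terms of order $1/K$ and higher, we get

$$E = \exp\Big( j\alpha \sum_k \omega_k \Big) \int_{-\infty}^{\infty} \prod_\mu \frac{da_\mu}{(2\pi/K)^{1/2}} \int_{-\infty}^{\infty} \prod_\mu \frac{db_\mu}{(2\pi/K)^{1/2}} \exp\Big( j \frac{K}{2} \sum_\mu (a_\mu^2 - b_\mu^2) \Big) \exp\Big( -\frac{1}{4} \sum_{k,\mu} \hat{c}_{k,\mu} \Big), \tag{13}$$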
where

$$\hat{c}_{k,\mu} \triangleq \omega_k^2 (a_\mu + b_\mu)^2 + 2\omega_k (a_\mu^2 - b_\mu^2) + (a_\mu - b_\mu)^2. \tag{14}$$
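The passage from (11) to (13) uses the small-argument expansion of $\log \cos$. As a worked check (the series is the standard Taylor expansion; the order estimates are ours and assume $\omega_k = O(1)$):

```latex
\begin{align*}
\log \cos c &= -\frac{c^2}{2} - \frac{c^4}{12} - \dots , \\
c_{k,\mu}^2 &= \tfrac{1}{2}\big[\omega_k (a_\mu + b_\mu) + (a_\mu - b_\mu)\big]^2 \\
            &= \tfrac{1}{2}\big[\omega_k^2 (a_\mu + b_\mu)^2
               + 2\omega_k (a_\mu^2 - b_\mu^2) + (a_\mu - b_\mu)^2\big]
             = \tfrac{1}{2}\,\hat{c}_{k,\mu} , \\
\sum_{k,\mu} \log \cos c_{k,\mu} &\approx -\frac{1}{4} \sum_{k,\mu} \hat{c}_{k,\mu} .
\end{align*}
```

With $a_\mu, b_\mu = O(1/\sqrt{K})$, each quadratic term is $O(1/K)$ and the $KN$ terms sum to $O(K)$, whereas the quartic terms are $O(1/K^2)$ each and sum only to $O(1)$, which is why they can be dropped in the large-system limit.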
Now, the multi-dimensional integral (13) is solved using the following mathematical recipe. New variables are introduced,

$$a \triangleq \frac{1}{2\alpha} \sum_\mu (a_\mu + b_\mu)^2, \qquad b \triangleq \frac{j}{2\alpha} \sum_\mu (a_\mu^2 - b_\mu^2) + 1. \tag{15}$$
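Equations (15) can be reformulated via the integral representation of a delta function using the corresponding angular frequencies $A$ and $B$, respectively,

$$\int_{-\infty}^{\infty} \frac{da\, dA}{2\pi/(K\alpha)} \exp\Big( jKA \Big( \alpha a - \sum_\mu \frac{(a_\mu + b_\mu)^2}{2} \Big) \Big) = 1, \tag{16}$$

$$\int_{-\infty}^{\infty} \frac{db\, dB}{2\pi/(K\alpha)} \exp\Big( jKB \Big( \alpha b - j \sum_\mu \frac{(a_\mu^2 - b_\mu^2)}{2} - \alpha \Big) \Big) = 1. \tag{17}$$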
Substituting these (unity) integrals into the expectation expression (13) and rewriting it using $a$ and $b$, the integrations over $a_\mu$ and $b_\mu$ are decoupled and can be performed easily. Next, for the asymptotics $K \to \infty$, the integration over the frequencies $A$ and $B$ can be performed algebraically by the saddle-point method [25]. According to this method, the main contribution to the integral comes from values of $A$ and $B$ in the vicinity of the maximum of the exponent's argument. Finally, the $E$ term boils down to

$$E = \int_{-\infty}^{\infty} \frac{da\, db}{4\pi/(K\alpha)} \exp\Big( K\alpha \Big( b - \frac{1}{2} + \frac{(1 - b)^2}{2a} + \frac{1}{2} \log a \Big) \Big) \exp\Big( -\frac{1}{2} \alpha a \sum_k \omega_k^2 + j\alpha b \sum_k \omega_k \Big). \tag{18}$$
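Substituting the expectation term (18) back into (8), the integrand in the latter becomes independent of $\mathbf{x}$, so the sum $\sum_{\mathbf{x}}$ can be replaced by multiplication with the scalar $2^K$, and the resulting $\omega$-dependent integrand is a Gaussian function. Thus, performing the Gaussian integration and exploiting the symmetry in the $K$-dimensional space, we get

$$\mathcal{N}(\beta) = \lim_{K \to \infty} \frac{1}{\pi^K} \int_{-\infty}^{\infty} \frac{da\, db}{4\pi/(K\alpha)} \exp\Big( K\alpha \Big( b - \frac{1}{2} + \frac{(1 - b)^2}{2a} + \frac{1}{2} \log a \Big) \Big) \exp\Big( K \log \Big( \sqrt{\frac{2\pi}{\alpha a}} \int_{-\alpha}^{\infty} d\lambda\, \exp\Big( -\frac{(\alpha b + \lambda)^2}{2\alpha a} \Big) \Big) \Big). \tag{19}$$

Using the rescaling $(\alpha b + \lambda)/\sqrt{\alpha a} \to \lambda$, the integral (19) becomes

$$\mathcal{N}(\beta) = \lim_{K \to \infty} \int_{-\infty}^{\infty} \frac{da\, db}{4\pi/(K\alpha)} \exp\big( K g(a, b, \beta) \big), \tag{20}$$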
where the function $g(a, b, \beta)$ is defined by

$$g(a, b, \beta) \triangleq \frac{1}{\beta} \Big( b - \frac{1}{2} + \frac{(1 - b)^2}{2a} + \frac{1}{2} \log a \Big) + \log 2Q(t). \tag{21}$$
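The $\log 2Q(t)$ term in (21) collects the per-antenna prefactors of (19). A worked version of that step, using the standard Gaussian Fourier integral together with the definitions of $t$ and $Q(\cdot)$ given in the text, and assuming real $a > 0$:

```latex
\begin{align*}
\int_{-\infty}^{\infty} d\omega\,
  \exp\Big(-\tfrac{1}{2}\alpha a\,\omega^2 + j(\alpha b + \lambda)\omega\Big)
  &= \sqrt{\frac{2\pi}{\alpha a}}\,
     \exp\Big(-\frac{(\alpha b + \lambda)^2}{2\alpha a}\Big) , \\
\int_{-\alpha}^{\infty} d\lambda\,
  \exp\Big(-\frac{(\alpha b + \lambda)^2}{2\alpha a}\Big)
  &= \sqrt{\alpha a} \int_{t}^{\infty} d\lambda'\, e^{-\lambda'^2/2}
   = \sqrt{2\pi \alpha a}\; Q(t) ,
  \qquad t = \frac{\sqrt{\alpha}\,(b - 1)}{\sqrt{a}} , \\
\frac{1}{\pi} \sqrt{\frac{2\pi}{\alpha a}}\, \sqrt{2\pi \alpha a}\; Q(t)
  &= 2Q(t) .
\end{align*}
```

The factor $1/\pi$ per antenna comes from $2^K/(2\pi)^K$, i.e., the sum over $\mathbf{x}$ combined with the Fourier normalization in (8).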
The definitions of the auxiliary variable $t \triangleq \sqrt{\alpha}\,(b - 1)/\sqrt{a}$ and the Gaussian tail function $Q(x) \triangleq \frac{1}{\sqrt{2\pi}} \int_x^\infty dy\, \exp(-y^2/2)$ are used. Again, for $K \to \infty$, the double integral in (20) can be evaluated by the saddle-point method. Hence, we find

$$\mathcal{N}(\beta) \propto \lim_{K \to \infty} \exp\big( K g(a^*, b^*, \beta) \big), \tag{22}$$
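where $a^*$ and $b^*$ are found from the saddle-point conditions, which yield the following equations (the first written up to the factor $-1/(2a)$):

$$\frac{\partial g(a, b, \beta)}{\partial a} \propto \beta^{-1} \Big( \frac{(1 - b)^2}{a} - 1 \Big) + t\, \frac{Q'(t)}{Q(t)} = 0, \tag{23}$$

$$\frac{\partial g(a, b, \beta)}{\partial b} = \beta^{-1} \Big( 1 - \frac{1 - b}{a} \Big) + \frac{1}{\sqrt{a\beta}}\, \frac{Q'(t)}{Q(t)} = 0. \tag{24}$$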
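A minimal numerical sketch of solving (23)-(24) follows. It relies on our own elimination of $a$ and $b$, which is not given in the text: writing $h(t) = -Q'(t)/Q(t)$, the two conditions reduce to $t^2 - t\,h(t) = 1/\beta$ with $t < 0$, together with $b = 0$ and $a = 1 - h(t)/t$. The function names and the bisection solver are illustrative assumptions rather than the authors' iterative scheme, and `residuals` lets the reduction be checked directly against (23)-(24).

```python
import math

def Q(t):
    """Gaussian tail function Q(t) = (1/sqrt(2*pi)) * integral_t^inf exp(-y^2/2) dy."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def hazard(t):
    """h(t) = -Q'(t)/Q(t), using Q'(t) = -exp(-t^2/2)/sqrt(2*pi)."""
    return math.exp(-0.5 * t * t) / (math.sqrt(2.0 * math.pi) * Q(t))

def g(a, b, beta):
    """Exponent g(a, b, beta) of eq. (21), in nats."""
    t = (b - 1.0) / math.sqrt(a * beta)          # t = sqrt(alpha)(b-1)/sqrt(a)
    bracket = b - 0.5 + (1.0 - b) ** 2 / (2.0 * a) + 0.5 * math.log(a)
    return bracket / beta + math.log(2.0 * Q(t))

def residuals(a, b, beta):
    """Left-hand sides of the saddle-point conditions (23) and (24)."""
    t = (b - 1.0) / math.sqrt(a * beta)
    r23 = ((1.0 - b) ** 2 / a - 1.0) / beta - t * hazard(t)
    r24 = (1.0 - (1.0 - b) / a) / beta - hazard(t) / math.sqrt(a * beta)
    return r23, r24

def saddle_point(beta, iterations=200):
    """Return (a*, b*, t*) via bisection on t^2 - t*h(t) = 1/beta for t < 0."""
    alpha = 1.0 / beta
    lo, hi = -(math.sqrt(alpha) + 1.0), 0.0      # left side exceeds alpha at lo, is 0 at hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * mid - mid * hazard(mid) > alpha:
            lo = mid                             # root lies closer to zero
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return 1.0 - hazard(t) / t, 0.0, t           # assumed reduction: b* = 0

for beta in (0.1, 0.5, 1.0):
    a, b, t = saddle_point(beta)
    print(beta, residuals(a, b, beta), g(a, b, beta) / math.log(2.0))
```

Dividing $g(a^*, b^*, \beta)$ by $\log 2$ converts the exponent of (22) from nats to bits, the unit used in (6).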
The operator $Q'$ denotes the derivative of $Q$ w.r.t. its argument. This set of saddle-point equations can be solved numerically (iteratively; the iteration always converges in the examined model [9]) to obtain its fixed points $a^*$, $b^*$ and $t^*$. Finally, substituting (22) into the capacity definition (6), the asymptotic capacity, in nat per symbol per transmitting antenna (tx), is now easily obtained as

$$C_\infty(\beta) = g(a^*, b^*, \beta) = \log 2Q(t^*) + \frac{1}{\beta} \Big( b^* - \frac{1}{2} + \frac{(1 - b^*)^2}{2a^*} + \frac{1}{2} \log a^* \Big), \tag{25}$$
which forms our pivotal result. In Section IV we further discuss the theoretical results and compare them with computer simulations of the complexity-constrained (metastable) MIMO channel.

IV. RESULTS

Fig. 1 displays the asymptotic capacity $C_\infty$ (25), obtained by iteratively solving the saddle-point conditions (23)-(24), as a function of the antenna ratio $\beta$. Interestingly, for small values $\beta \lesssim 0.1$ the trivial 1 bit upper bound (of an optimal receiver, i.e., matrix inversion) is practically achieved by this simple hard decision operation. Nevertheless, even for higher, non-trivial antenna ratios such a complexity-constrained metastable MIMO setting still yields substantial achievable information rates. Note, in passing, that for a heavily antenna-unbalanced system (i.e., $\beta \to \infty$) the decay of the capacity curve coincides with that of the Hopfield model's capacity (see [8, eq. (12)] for an analytical approximation of this capacity decay to zero).

In order to validate the analytically derived asymptotic capacity $C_\infty(\beta)$, we evaluated the capacity $C_K(\beta)$ of a MIMO channel with a large, yet finite, number of transmitting antennas $K$, using exhaustive search simulations. The number of successful binary codewords, maintaining the channel constraints, was obtained by examining all $2^K$ possible codewords. The (quenched) average logarithm of the counted number, normalized by the number of transmitters $K$, gives the capacity $C_K$. Fig. 1 presents the capacity obtained by simulations for $K = 25$. As can be seen, the empirical (quenched) capacity for finite $K$ deviates only slightly from the analytically obtained asymptotic (annealed) capacity. These results substantiate the analysis of the complexity-constrained metastable MIMO system.

As mentioned in Section III, as the total number of metastable MIMO inputs scales exponentially with $K$, the relevant quantity to be computed is the quenched average $\langle \log \mathcal{N} \rangle$. We made use of the annealed approximation and computed instead the annealed average
[Fig. 1: capacity in bit/symbol/tx (vertical axis) versus antenna ratio $\beta$ (horizontal axis, logarithmic, 0.01 to 1); curves $C_\infty$ and $C_K$ for $K = 25$.]

Fig. 1. Asymptotic (annealed) capacity $C_\infty$ (solid line), in terms of bit/symbol/tx, as a function of antenna ratio $\beta$. Also drawn is the finite-size (quenched) simulation-averaged capacity $C_K$ for $K = 25$ (empty squares). Vertical bars denote standard deviation in simulation results.
$\log \langle \mathcal{N} \rangle$ which, as Jensen's inequality tells us, provides an upper bound to the former. In the above experiments, we carried out both annealed and quenched finite-size simulations and found that the results obtained with the two methods are essentially identical within error bars, thus for clarity the annealed-based capacity simulation results were omitted from Fig. 1. An asymptotic calculation of $K^{-1} \langle \log \mathcal{N} \rangle$ would require use of the replica method [26]. Bray and Moore [7], [27] found that for infinite-range spin glasses both $K^{-1} \langle \log \mathcal{N} \rangle$ and $K^{-1} \log \langle \mathcal{N} \rangle$ ($\mathcal{N}$ being the number of metastable states within an infinitesimal energy band) become identically equal for vanishing off-diagonal order parameters in replica space. They also conducted a stability analysis and found that these diagonal solutions are indeed locally stable, which suggests that the number of metastable states is itself self-averaging. The agreement among both finite-size averages in the metastable MIMO channel, as well as
between them and the asymptotic approximation, suggests that a similar conclusion applies to the number of inputs determined by the complexity constraint (3), or at least that the annealed approximation tightly upper bounds the Shannon capacity of the channel.

V. CONCLUDING REMARKS

We have examined the asymptotic capacity of a MIMO system under a strict input metastability routine. With this routine the complexity of detection is partially outsourced from the receivers back to the transmitters. In order to validate the analytically derived results, we evaluated the capacity for a large yet finite number of transmitting antennas. Using exhaustive search simulations we verified that convergence is fast, and the asymptotic approximation reasonable for as few as 25 transmitters. Determining the explicit metastable transmissions in a diagrammatic manner (rather than via brute-force enumeration, which becomes infeasible for large $K$) remains an interesting open research question.

REFERENCES

[1] B. M. Hochwald, T. L. Marzetta, and B. Hassibi, "Space-time autocoding," IEEE Trans. Inform. Theory, vol. 47, pp. 2761–2781, Nov. 2001.
[2] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley and Sons, 1991.
[3] D. Sherrington and S. Kirkpatrick, "Solvable model of a spin glass," Phys. Rev. Lett., vol. 35, pp. 1792–1796, 1975.
[4] D. Sherrington and S. Kirkpatrick, "Infinite-ranged models of spin glasses," Phys. Rev. B, vol. 7, pp. 4384–4403, 1978.
[5] J. J. Hopfield, Proc. Natl. Acad. Sci., vol. 79, 1982.
[6] S. F. Edwards and F. Tanaka, "The ground state of a spin glass," J. Phys. F: Metal Phys., vol. 10, pp. 2471–2476, 1980.
[7] A. J. Bray and M. A. Moore, "Metastable states in spin glasses," J. Phys. C: Solid St. Phys., vol. 13, pp. L469–L476, 1980.
[8] E. J. Gardner, "Structure of metastable states in the Hopfield model," J. Phys. A: Math. Gen., vol. 19, pp. L1047–L1052, 1986.
[9] M. P. Singh, "Hopfield model with self-coupling," Phys. Rev. A, vol. 64, pp. 051912-1–051912-9, 2001.
[10] B. R. Vojčić and W. M. Jang, "Transmitter precoding in synchronous multiuser communications," IEEE Trans. Commun., vol. 46, no. 10, pp. 1346–1355, Oct. 1998.
[11] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, "A vector-perturbation technique for near-capacity multiantenna multiuser communication-part I: channel inversion and regularization," IEEE Trans. Commun., vol. 53, pp. 195–202, Jan. 2005.
[12] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, "A vector-perturbation technique for near-capacity multiantenna multiuser communication-part II: perturbation," IEEE Trans. Commun., vol. 53, pp. 537–544, Mar. 2005.
[13] N. Sourlas, Nature, vol. 338, pp. 693–695, 1989.
[14] O. Shental, I. Kanter, and A. J. Weiss, "Capacity of complexity-constrained noise-free CDMA," IEEE Commun. Lett., vol. 10, pp. 10–12, Jan. 2006.
[15] O. Shental and I. Kanter, "Optimum asymptotic multiuser efficiency of pseudo-orthogonal randomly spread CDMA," in Proc. WIC 27th Symposium on Information Theory in the Benelux (SITB), Noordwijk, The Netherlands, June 2006.
[16] R. de Miguel, O. Shental, R. R. Müller, and I. Kanter, "Outsourcing the complexity of detection in MIMO channels," in Proc. IEEE Information Theory Workshop (ITW), Chengdu, China, Oct. 2006.
[17] R. de Miguel, O. Shental, R. R. Müller, and I. Kanter, "Information and multiaccess interference in a complexity-constrained vector channel," J. Phys. A, vol. 40, pp. 5241–5260, 2007.
[18] F. Tanaka and S. F. Edwards, "Analytic theory of the ground state properties of a spin glass: I. Ising spin glass," J. Phys. F: Metal Phys., vol. 10, pp. 2769–2778, 1980.
[19] F. R. Waugh, C. M. Marcus, and R. M. Westervelt, "Fixed-point attractors in analog neural computation," Phys. Rev. Lett., vol. 64, 1990.
[20] T. Fukai and M. Shiino, "Large suppression of spurious states in neural networks of nonlinear analog neurons," Phys. Rev. A, vol. 42, pp. 7459–7466, 1990.
[21] E. Korutcheva, "The number of metastable states of a simple perceptron with gradient descent learning algorithm," J. Phys. A: Math. Gen., vol. 26, pp. L1021–L1027, 1993.
[22] A. C. C. Coolen and D. Sherrington, "Order-parameter flow in the fully connected Hopfield model near saturation," Phys. Rev. E, vol. 49, pp. 1921–1934, 1994.
[23] M. P. Singh, Z. Chengxiang, and C. Dasgupta, "Fixed points in a Hopfield model with random asymmetric interactions," Phys. Rev. E, vol. 52, pp. 5261–5272, 1995.
[24] A. D. Bruce, E. J. Gardner, and D. J. Wallace, "Dynamics and statistical mechanics of the Hopfield model," J. Phys. A: Math. Gen., vol. 20, pp. 2909–2934, 1987.
[25] R. S. Ellis, Entropy, Large Deviations, and Statistical Mechanics. New York: Springer-Verlag, 1985.
[26] S. F. Edwards and P. W. Anderson, "Theory of spin glasses," J. Phys. F: Metal Phys., vol. 5, pp. 965–974, 1975.
[27] A. J. Bray and M. A. Moore, "Metastable states in spin glasses with short-ranged interactions," J. Phys. C: Solid St. Phys., vol. 14, pp. 1313–1327, 1981.