IEEE COMMUNICATIONS LETTERS, VOL. 16, NO. 6, JUNE 2012
Enhancing the Performance of the SIC-MMSE Iterative Receiver for Coded MIMO Systems via Companding
Xiaoming Dai
Abstract—The iterative detection and decoding (IDD) method based on soft interference cancellation and minimum-mean squared-error filtering (SIC-MMSE) has received considerable attention in recent years due to its good performance-complexity tradeoff for coded multiple-input multiple-output (MIMO) systems. The Gaussianity of the a priori and a posteriori log-likelihood ratios (LLRs) computed at the constituent stages of the SIC-MMSE iterative receiver is a presumption for IDD to work. In this letter, the Gaussianity assumption is first shown to be not tight for high rate coded MIMO systems, which leads to poor performance for such systems. Then a non-linear companding based transformation method is incorporated into the SIC-MMSE iterative receiver to alleviate the non-Gaussianity of the a priori and a posteriori LLRs due to the imperfection of the (high-rate) code and the per-stream approximation. Analytical and numerical results show that the proposed transformed SIC-MMSE iterative receiver achieves significant performance gains over the conventional one for coded MIMO systems, in particular high rate coded ones, with even lower computational complexity.
Index Terms—Companding, iterative detection and decoding (IDD), soft interference cancellation and minimum-mean squared-error filtering (SIC-MMSE).
Manuscript received March 20, 2012. The associate editor coordinating the review of this letter and approving it for publication was J. Choi.
X. Dai is with the Datang Wireless Mobile Innovation Center, China Academy of Telecommunication Technology, Xueyuan Road, 29, Beijing, 100080, P. R. China (e-mail: [email protected]; [email protected]).
Digital Object Identifier 10.1109/LCOMM.2012.042512.120618

I. Introduction

The low-complexity iterative detection and decoding technique based on soft interference cancellation and minimum-mean squared-error filtering (SIC-MMSE) has recently received considerable attention, in academia [1]–[4] as well as in industry [5], due to its good performance-complexity tradeoff for coded MIMO systems. The core idea underlying SIC-MMSE is to compute estimates of the transmitted symbols based on the a priori log-likelihood ratio (LLR) obtained from the channel decoder. The estimates are then utilized to calculate soft symbols and cancel the interference in the received signal vector. The residual interference plus noise (RIN) is equalized using the MMSE filter, based on the tacit assumption that the RIN is Gaussian distributed, followed by a computation of the per-stream a posteriori LLR. The soft-in soft-output minimum-mean squared-error filtering parallel interference cancellation (SISO MMSE-PIC) based iterative receiver has achieved better performance than the (non-iterative) maximum-likelihood (ML) based detector for low rate coded (rate-1/2) MIMO systems [5] with less detection-decoding computational complexity.

In practical commercial wireless systems, various rate codes are normally applied to opportunistically exploit the channel conditions. For example, code rates ranging from 0.1 to 0.93 are specified in the third generation partnership project (3GPP) long-term-evolution (LTE) system [8]. All the cases discussed in [1]–[5] are restricted to low code rates (i.e., lower than or equal to 1/2). For high rate coded MIMO systems, the a priori information is less reliable than that of low rate coded ones, owing to the more prominent imperfection of the high rate codes and the per-stream approximation. This renders the Gaussian distribution assumption (on the a priori and a posteriori LLRs) questionable, an issue that has not been noted before.

In this work, we first show that the distribution of the a priori LLR exhibits higher kurtosis (the fourth cumulant divided by the square of the second cumulant) as the code rate increases. The heavy-tailedness of the a priori LLR is further exacerbated by imperfect channel state information at the receiver side. As a result, the per-stream a posteriori LLR calculated from the a priori LLR and the MMSE filter exhibits significant peakedness and heavy-tailedness for high rate codes, which in turn significantly degrades the subsequent channel decoding performance and leads to a non-Gaussian distributed extrinsic LLR output of the decoder (i.e., the a priori LLR information for the ensuing soft interference cancellation). This vicious cycle continues as the iterative decoding carries on, which has not been recognized before.

Based on the analysis of the distributions of the a priori and a posteriori LLRs, we introduce a companding transformation technique into the SIC-MMSE iterative receiver. By compressing the a priori and a posteriori LLRs in accordance with their statistical characteristics, the inherently imprecise reliability estimates of the detected information bits caused by the imperfection of high rate codes, the per-stream independence approximation, and/or channel estimation error are significantly alleviated without excessively compromising the reliable ones.
Based on the proposed non-linear transformation, the original leptokurtically distributed a priori LLR and per-stream a posteriori LLR are transformed to more mesokurtically distributed ones, thus significantly enhancing the performance of the subsequent channel decoding. The probability density function (pdf) of the a posteriori LLR output of the channel decoder is thus more mesokurtic than that of the non-transformed one. This cycle proceeds as the iterative detection carries on. As a result, the performance of the SIC-MMSE iterative receiver is greatly enhanced, most notably for high rate coded cases with channel estimation errors.

II. Conventional Soft-In-Soft-Output MMSE PIC Iterative Detection and Decoding Algorithm

We consider a space-time bit-interleaved coded modulation (ST-BICM) MIMO multiplexing system with $N_T$ transmit and
$N_R = N_T$ receive antennas in this work. The information bits $\mathbf{b}$ are encoded (e.g., using a turbo code) with code rate $R_c$ and the resulting coded bit stream $\mathbf{c}$ is mapped (using Gray mapping) to a sequence of transmit vectors $\mathbf{s} = [s_1, \cdots, s_{N_T}]^T \in \mathcal{X}^{N_T}$, where $\mathcal{X}$ denotes the set of complex-valued scalar constellation points with size $|\mathcal{X}| = 2^{M_c}$. Slightly abusing common terminology, we denote the entries of $\mathbf{x}$ as $x_{i,b}$ ($i = 1, \cdots, N_T$, $b = 1, \cdots, M_c$), where the indices $i$ and $b$ refer to the $b$-th bit in the label of the constellation point corresponding to the $i$-th entry of $\mathbf{s}$. The standard complex baseband model between the transmitted and received signals is given by $\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{n}$, where $\mathbf{y} \in \mathbb{C}^{N_R \times 1}$ is the received vector, $\mathbf{H} \in \mathbb{C}^{N_R \times N_T}$ denotes the channel matrix, $\mathbf{s} \in \mathbb{C}^{N_T \times 1}$ represents the transmitted vector, and $\mathbf{n} \in \mathbb{C}^{N_R \times 1}$ is the independent and identically distributed (i.i.d.) complex circular Gaussian random noise with zero mean and variance $\sigma^2$.

A. SISO MMSE-PIC Algorithm

1) Computation of Soft-Symbols: In the first step, the soft symbols $\hat{s}_i$ ($i = 1, \cdots, N_T$) for the transmitted symbols $s_i$ are computed according to [2]

$$\hat{s}_i = E[s_i] = \sum_{s \in \mathcal{X}} \mathrm{Prob}[s_i = s]\, s \tag{1}$$

where $\mathrm{Prob}[s_i = s] = \prod_{b=1}^{M_c} \mathrm{Prob}[x_{i,b} = x]$ denotes the a priori probability of the symbol $s \in \mathcal{X}$, with $x_{i,b} = [s]_b$ referring to the $b$-th bit associated with the symbol $s$. The probability of the transmitted bit $x_{i,b}$ is given by

$$\mathrm{Prob}[x_{i,b} = x] = \frac{\exp\left(\frac{1}{2} x L^A_{i,b}\right)}{\exp\left(\frac{1}{2} x L^A_{i,b}\right) + \exp\left(-\frac{1}{2} x L^A_{i,b}\right)}.$$

The a priori information $L^A_{i,b}$ is set to zero in the first iteration, i.e., $L^A_{i,b} = 0$, $\forall i, b$. The error between the transmitted symbol $s_i$ and the soft-symbol $\hat{s}_i$ is defined as $e_i = s_i - \hat{s}_i$. The reliability of each soft-symbol $\hat{s}_i$ is characterized by its variance

$$E_i = \mathrm{Var}[s_i] = E[|e_i|^2]. \tag{2}$$
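The soft-symbol step of (1) and (2) can be sketched numerically as follows. This is a minimal illustration rather than the receiver used in the letter: the QPSK alphabet, its bit labelling, the bipolar convention x ∈ {+1, −1} for the label bits, and the names `QPSK_POINTS`, `QPSK_LABELS`, and `soft_symbols` are all assumptions introduced here.

```python
import numpy as np

# Hypothetical Gray-mapped, unit-energy QPSK alphabet (an assumption).
# Label bits use the bipolar convention x in {+1, -1}, matching the
# exp(x * L / 2) form of the bit probability.
QPSK_POINTS = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
QPSK_LABELS = np.array([[+1, +1], [+1, -1], [-1, +1], [-1, -1]])


def soft_symbols(llr_a, points=QPSK_POINTS, labels=QPSK_LABELS):
    """Return soft symbols s_hat (eq. 1) and their variances E (eq. 2).

    llr_a has shape (N_T, M_c) and holds the a priori LLRs L^A_{i,b}.
    """
    llr_a = np.asarray(llr_a, dtype=float)
    # exp(xL/2) / (exp(xL/2) + exp(-xL/2)) equals (1 + tanh(xL/2)) / 2.
    # Resulting shape: (N_T, |X|, M_c).
    bit_prob = 0.5 * (1.0 + np.tanh(0.5 * labels[None, :, :] * llr_a[:, None, :]))
    sym_prob = bit_prob.prod(axis=2)             # Prob[s_i = s], shape (N_T, |X|)
    s_hat = sym_prob @ points                    # E[s_i]
    energy = sym_prob @ (np.abs(points) ** 2)    # E[|s_i|^2]
    var = energy - np.abs(s_hat) ** 2            # E[|s_i - s_hat_i|^2]
    return s_hat, var


if __name__ == "__main__":
    # First iteration: all a priori LLRs are zero.
    print(soft_symbols(np.zeros((4, 2))))
```

With all-zero a priori LLRs, every constellation point is equally likely, so the soft symbols vanish and each variance equals the symbol energy.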
2) Parallel Interference Cancellation: The PIC process considers each stream separately and cancels the interference in $\mathbf{y}$ induced by all other streams $j \neq i$ as follows:

$$\hat{\mathbf{y}}_i = \mathbf{y} - \sum_{j,\, j \neq i} \mathbf{h}_j \hat{s}_j = \mathbf{h}_i s_i + \tilde{\mathbf{n}}_i \tag{3}$$

where $\mathbf{h}_j$ denotes the $j$-th column of $\mathbf{H}$ and $\tilde{\mathbf{n}}_i = \sum_{j,\, j \neq i} \mathbf{h}_j e_j + \mathbf{n}$ refers to the residual interference plus noise.

3) MMSE Filtering: The $N_T$ MMSE filter vectors are computed based on [3]

$$\tilde{\mathbf{w}}_i^H = \mathbf{h}_i^H \left( \mathbf{H} \tilde{\boldsymbol{\Lambda}} \mathbf{H}^H + \sigma^2 \mathbf{I}_{N_R} \right)^{-1} \tag{4}$$

where $\tilde{\boldsymbol{\Lambda}}$ is an $N_T \times N_T$ diagonal matrix having entries $\tilde{\Lambda}_{j,j} = E_j$ for $j \neq i$ and $\tilde{\Lambda}_{j,j} = 1$ for $j = i$. The $i$-th result of the filtering process is given by

$$\tilde{z}_i = \tilde{\mathbf{w}}_i^H \hat{\mathbf{y}}_i = \tilde{\mu}_i s_i + \tilde{\mathbf{w}}_i^H \tilde{\mathbf{n}}_i \tag{5}$$

where $\tilde{\mu}_i = \tilde{\mathbf{w}}_i^H \mathbf{h}_i$.

4) A Posteriori LLR Computation: The MMSE-PIC iterative receiver approximates the a posteriori LLR by assuming that the $N_T$ single-input single-output systems in (5) are statistically independent and that the weighted RIN term $\tilde{\mathbf{w}}_i^H \tilde{\mathbf{n}}_i$ is Gaussian distributed as

$$L^D_{i,b} \approx \min_{s \in \mathcal{X}^{(0)}_{i,b}} \left( \frac{|\tilde{z}_i - \tilde{\mu}_i s|^2}{\tilde{v}_i^2} \right) - \min_{s \in \mathcal{X}^{(1)}_{i,b}} \left( \frac{|\tilde{z}_i - \tilde{\mu}_i s|^2}{\tilde{v}_i^2} \right) \tag{6}$$

where $\mathcal{X}^{(0)}_{i,b}$ and $\mathcal{X}^{(1)}_{i,b}$ designate the sets of candidate symbol vectors corresponding to $x_{i,b} = 0$ and $x_{i,b} = 1$, respectively, and $\tilde{v}_i^2 = \mathrm{Var}[\tilde{z}_i] = \tilde{\mathbf{w}}_i^H \left( \sum_{j,\, j \neq i} \mathbf{h}_j \mathbf{h}_j^H + \sigma^2 \mathbf{I}_{N_T} \right) \tilde{\mathbf{w}}_i$.

III. Proposed Transformation Method

A. A Closer Look at SISO MMSE-PIC

The premise of (6) is that the weighted residual interference plus noise term $\tilde{\mathbf{w}}_i^H \tilde{\mathbf{n}}_i$ is assumed to be Gaussian distributed. For low rate coded MIMO systems with Gaussian input and ideal channel estimation, the presumption is valid to a larger extent. However, channel estimation error and the imperfection of the high rate code are inevitably present in practical systems, and normally a discrete constellation alphabet is utilized. In this work, the a priori LLR $L^A_{i,b}$ is shown to exhibit significant non-Gaussianess for high rate coded MIMO systems with/without channel estimation error. The a posteriori LLR will also acquire the non-Gaussianess based on (1), (3)–(6). As a result, the channel decoder is plagued by the heavy-tailed a posteriori LLR (i.e., the a priori LLR input to the channel decoder), and generates non-Gaussian distributed extrinsic information output.

We first analyze the distributions of $L^A_{i,b}$ and $L^D_{i,b}$ for high/low rate coded MIMO systems with/without channel estimation error in the next subsections.

B. Analysis of Distribution of $L^A_{i,b}$ and $L^D_{i,b}$ With/Without Channel Estimation Error

With an optimum orthogonal (with respect to time among the transmit antennas) training signal, we may express the estimated channel matrix as $\hat{\mathbf{H}} = \mathbf{H} + \Delta\mathbf{H}$, where $\Delta\mathbf{H}$ is the channel estimation error matrix, which is uncorrelated with $\mathbf{H}$ and has $\mathcal{CN}(0, \sigma^2_{\Delta h})$ entries. The channel estimation error-to-signal ratio (ESR) is defined as the ratio of the energy of the elements of $\Delta\mathbf{H}$ to the energy of the elements of $\mathbf{H}$, and is given by

$$\mathrm{ESR} = \frac{E\{|\Delta H_{i,j}|^2\}}{E\{|H_{i,j}|^2\}}. \tag{7}$$

With channel estimation error, the residual interference plus noise is expressed as

$$\tilde{\mathbf{n}}_i = \sum_{j,\, j \neq i} (\mathbf{h}_j + \Delta\mathbf{h}_j) e_j + \mathbf{n}. \tag{8}$$
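Steps (3)–(6) of the SISO MMSE-PIC algorithm, together with a channel estimate perturbed to a target ESR as in (7), can be sketched as follows. This is an illustrative sketch under stated assumptions, not the author's implementation: the QPSK alphabet and its {0, 1} labelling, the function names, and the use of a sample mean in place of the expectation in (7) are all introduced here. The variance $\tilde{v}_i^2$ is computed exactly as the expression is written in the text.

```python
import numpy as np

# Hypothetical QPSK alphabet; labels in {0, 1} select the sets X^(0), X^(1).
POINTS = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
LABELS = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])


def estimated_channel(H, esr_db, rng):
    """H_hat = H + dH with CN(0, sigma_dh^2) error entries, cf. (7)."""
    # Assumption: E{|H_ij|^2} is replaced by its sample mean over H.
    sigma2_dh = np.mean(np.abs(H) ** 2) * 10.0 ** (esr_db / 10.0)
    dH = np.sqrt(sigma2_dh / 2.0) * (rng.standard_normal(H.shape)
                                     + 1j * rng.standard_normal(H.shape))
    return H + dH


def mmse_pic(y, H, s_hat, var, sigma2):
    """One MMSE-PIC pass over all streams; returns z, mu, v2 of (5)-(6)."""
    n_r, n_t = H.shape
    z = np.empty(n_t, dtype=complex)
    mu = np.empty(n_t)
    v2 = np.empty(n_t)
    for i in range(n_t):
        others = np.arange(n_t) != i
        H_o = H[:, others]
        y_i = y - H_o @ s_hat[others]                  # (3): cancel streams j != i
        lam = np.where(others, var, 1.0)               # diagonal of Lambda
        A = (H * lam) @ H.conj().T + sigma2 * np.eye(n_r)
        w_h = np.linalg.solve(A, H[:, i]).conj()       # (4); valid since A is Hermitian
        z[i] = w_h @ y_i                               # (5)
        mu[i] = np.real(w_h @ H[:, i])
        B = H_o @ H_o.conj().T + sigma2 * np.eye(n_r)
        v2[i] = np.real(w_h @ B @ w_h.conj())          # v_i^2 as written in the text
    return z, mu, v2


def per_stream_llr(z, mu, v2, points=POINTS, labels=LABELS):
    """(6): min over X^(0) minus min over X^(1) of |z - mu s|^2 / v^2."""
    metric = np.abs(z[:, None] - mu[:, None] * points[None, :]) ** 2 / v2[:, None]
    llr = np.empty((z.size, labels.shape[1]))
    for b in range(labels.shape[1]):
        llr[:, b] = (metric[:, labels[:, b] == 0].min(axis=1)
                     - metric[:, labels[:, b] == 1].min(axis=1))
    return llr
```

To emulate imperfect channel estimation, a receiver built on this sketch would pass `estimated_channel(H, -15.0, rng)` to `mmse_pic` in place of `H`, while the received vector is still generated with the true channel.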
For BICM systems, the distribution of the per-stream a posteriori LLR, $\mathrm{Prob}(L^D_{i,b} \,|\, \hat{\mathbf{y}}_i)$, calculated using (6) is mainly determined by the pdf of the error $e_j$. The distribution of $e_j$ depends on the code rate, puncturing pattern, interleaver, generator polynomials, and data block length; it is analytically intractable and beyond the scope of this work. Furthermore, the impulsiveness of the non-Gaussian distributed LLR $\mathrm{Prob}(L^D_{i,b} \,|\, \hat{\mathbf{y}}_i)$ is exacerbated by imperfect channel estimation, which severely degrades the performance of the Gaussianity-based channel decoder. Since the exact distribution of $\mathrm{Prob}(L^D_{i,b} \,|\, \hat{\mathbf{y}}_i)$ is difficult to treat analytically, we utilize a numerical method to analyze the kurtosis of $L^A_{i,b}$ and $L^D_{i,b}$, which characterizes the behavior of their distributions [6].
[Fig. 1 plots not reproduced. Panel (a), A Priori LLR: probability versus $L^A_{i,l}$. Panel (b), A Posteriori LLR: probability versus $L^D_{i,l}$. The legends give the code rate $R_c$, the ESR ($-\infty$ or −15 dB), and the kurtosis of each curve.]
Fig. 1. PDF of the a priori and a posteriori LLRs for a 4-by-4 MIMO multiplexing system with QPSK modulation and code rates 5/6, 2/3, and 1/3.
1) Kurtosis: Kurtosis, a higher-order statistic, measures the deviation of a probability density from a Gaussian density in terms of its peakedness and heavy-tailedness. The kurtosis of a real-valued random variable $\gamma$ is calculated using its second and fourth order central moments, and is expressed as $\mathrm{kurt}(\gamma) = \mu_4 / \mu_2^2 - 3$, where $\mu_i$ denotes the $i$-th central moment (and in particular, $\mu_2$ is the variance). Kurtosis is a measure of the peakedness of a distribution, namely, the higher the kurtosis, the higher the concentration of the density function around its mode. A leptokurtic distribution has a more acute peak around the mean (that is, a higher probability than a normally distributed variable of values near the mean) and fatter tails (that is, a higher probability than a normally distributed variable of extreme values).

2) Kurtosis of $L^A_{i,b}$ and $L^D_{i,b}$ for a 4-by-4 MIMO Multiplexing System: Fig. 1(a) and Fig. 1(b) depict the pdfs of the a priori $L^A_{i,b}$ and the per-stream a posteriori $L^D_{i,b}$, respectively, for a 4-by-4 MIMO multiplexing system with Quadrature Phase Shift Keying (QPSK). The detailed simulation parameters are specified in Section IV. Fig. 1(a) illustrates that the pdf of the a priori LLR for code rate 5/6 is more leptokurtic (kurt = 4.0 versus kurt = 2.3) than that for the low code rate 1/3, due to the more prominent imperfection of the high rate codes and the greater discrepancy caused by the per-stream independence approximation. The distribution of the resulting a posteriori LLR $L^D_{i,b}$ shown in Fig. 1(b) is thus more peaked and heavy-tailed (kurt = 5.9 versus kurt = 2.9). For the same code rate, the pdf of the a priori LLR $L^A_{i,b}$ with ideal channel estimation (ESR = $-\infty$) is more mesokurtic than that with channel estimation error (kurt = 2.9 versus kurt = 4.0 for code rate 5/6, kurt = 1.9 versus kurt = 2.3 for code rate 1/3, etc.) due to less severe residual interference. Thus, the resultant per-stream a posteriori LLR $L^D_{i,b}$ exhibits more normality, i.e., smaller kurtosis values, as shown in Fig. 1(b) (kurt = 4.9 versus kurt = 5.9 for code rate 5/6; kurt = 2.3 versus kurt = 2.9 for code rate 1/3, etc.). As expected, the pdf of $L^A_{i,b}$ for the rate-1/3 code with ideal channel estimation has the lowest kurtosis value, 1.9, and the resulting $L^D_{i,b}$ is the closest to normality (kurt = 2.3), as illustrated in Fig. 1(b).

Based on this analysis, the LLRs $L^A_{i,b}$ and $L^D_{i,b}$ exhibit significant non-Gaussianess for high rate coded MIMO systems, in particular with channel estimation error.
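The kurtosis expression $\mathrm{kurt}(\gamma) = \mu_4/\mu_2^2 - 3$ can be estimated from LLR samples as sketched below. The function name `kurtosis` and the synthetic Gaussian and Laplacian test data are assumptions made for illustration; they are not the LLR samples behind Fig. 1.

```python
import numpy as np


def kurtosis(samples):
    """Sample estimate of kurt = mu_4 / mu_2^2 - 3 from central moments."""
    x = np.asarray(samples, dtype=float)
    centered = x - x.mean()
    mu2 = np.mean(centered ** 2)   # second central moment (variance)
    mu4 = np.mean(centered ** 4)   # fourth central moment
    return mu4 / mu2 ** 2 - 3.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A Gaussian sample should give a value near 0 under this definition,
    # whereas a heavy-tailed (Laplacian) sample gives a clearly larger value.
    print(kurtosis(rng.standard_normal(200_000)))
    print(kurtosis(rng.laplace(size=200_000)))
```

Under this definition a larger value indicates a more peaked and heavier-tailed empirical distribution than the Gaussian reference.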
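[Fig. 2 plots not reproduced. Panel (a), A Priori LLR: probability versus $L^A_{i,l}$. Panel (b), A Posteriori LLR: probability versus $L^D_{i,l}$. The legends give the companding factors ($\lambda = 1$, or $\lambda_{a\,pri} = 0.6$ and $\lambda_{a\,post} = 0.8$), the code rate $R_c$, and the kurtosis of each curve.]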
Fig. 2. PDF of the a priori and a posteriori LLRs of the proposed transformed SIC-MMSE iterative receiver and the conventional one for a 4-by-4 MIMO multiplexing system with QPSK modulation and code rates 2/3 and 1/3.
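The companding transformation $\theta(z) = |z|^{\lambda} \cdot \mathrm{sign}(z)$ of (9), which produces the transformed curves of Fig. 2, can be sketched as follows. The direct form follows (9); the table-lookup variant is only a plausible illustration of a hardware-friendly approach, and its quantization step, saturation level, and the names `compand` and `compand_lut` are assumptions introduced here.

```python
import numpy as np


def compand(llr, lam):
    """theta(z) = |z|^lam * sign(z), applied element-wise, cf. (9)."""
    llr = np.asarray(llr, dtype=float)
    return np.sign(llr) * np.abs(llr) ** lam


def compand_lut(llr, lam, step=0.25, max_abs=64.0):
    """Table-lookup variant: quantize |z| on a uniform grid, then index.

    step and max_abs are hypothetical design parameters; magnitudes above
    max_abs saturate at the last table entry.
    """
    table = np.arange(0.0, max_abs + step, step) ** lam
    llr = np.asarray(llr, dtype=float)
    idx = np.minimum(np.rint(np.abs(llr) / step).astype(int), table.size - 1)
    return np.sign(llr) * table[idx]


if __name__ == "__main__":
    llr_a = np.array([-30.0, -2.0, 0.5, 12.0])
    print(compand(llr_a, 0.6))      # a priori LLRs, lambda = 0.6
    print(compand(llr_a, 0.8))      # a posteriori LLRs, lambda = 0.8
    print(compand_lut(llr_a, 0.6))
```

For $\lambda < 1$ the mapping preserves the sign of each LLR and compresses magnitudes above one, with larger magnitudes compressed more strongly.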
C. Proposed Non-linear Transformation Method

Based on the analysis of the a priori and a posteriori LLRs for different rate codes in Section III-B2, we introduce a non-linear companding based method to transform $L^A_{i,b}$ and $L^D_{i,b}$, given by

$$\theta(z) = |z|^{\lambda} \cdot \mathrm{sign}(z) \tag{9}$$

where $\lambda$ is a companding factor ($\lambda < 1$).

The key idea of the proposed companding technique (for the a priori LLR $L^A_{i,b}$) is to compress (i.e., down-weight) the dubious decoding of the a priori LLR $L^A_{i,b}$, taking into account its statistical characteristics, so that after companding, the estimation error of the unreliable detected information bits due to the high rate code and/or channel estimation error is significantly mitigated without disproportionate compromise of the reliable ones.

The transformation of the a posteriori LLR $L^D_{i,b}$ is mainly aimed at alleviating the non-Gaussianess introduced by the per-stream approximation and the residual non-Gaussianess of the transformed a priori LLR $L^A_{i,b}$ (which is in general less severe than that of the non-transformed a posteriori $L^D_{i,b}$). Therefore, the optimized companding factor $\lambda$ for the a posteriori LLR $L^D_{i,b}$ is expected to be larger than that for the a priori LLR $L^A_{i,b}$, which has been verified by numerical results. In this work, we choose $\lambda_{a\,pri} = 0.6$ and $\lambda_{a\,post} = 0.8$ for the a priori and a posteriori LLR, respectively, to balance robustness against efficiency (i.e., catering to a specific code), based on extensive numerical results (not shown due to space constraints). Overall, the performance degradation due to mismatch of the optimized companding factor for various code rates is marginal.

By employing the proposed companding method, the original leptokurtically distributed a priori and a posteriori LLRs are transformed to more Gaussian distributed ones, as illustrated in Fig. 2.

IV. Simulation Comparison

In this section, we evaluate the performance of the proposed transformed MMSE-PIC iterative receiver and the conventional one over the Extended Vehicular A (EVA) model [7]. The simulation parameters closely conform to the 3GPP standard. We consider a turbo-coded $N_T = N_R = 4$ MIMO-OFDM multiplexing system with a 1024-point fast Fourier transform (FFT) and 15 kHz subcarrier spacing. At the transmitter,
binary information data bits of length 3200 are first encoded by turbo coding with the original code rate $R_c = 1/3$ (generator polynomials $[13, 15]_8$) and then punctured according to code rate $R_c = 5/6$ as specified in [8]. Imperfect channel estimation with ESR = −SNR is assumed. The complexity of the MMSE-PIC receiver depends on the number of iterations in the interference cancellation stage as well as on the number of turbo decoding iterations. Note that Out-it in Fig. 3 stands for the MMSE-PIC detection iterations and In-it corresponds to the inner iterations of the turbo decoder.

Fig. 3(a) and Fig. 3(b) show that the proposed transformed iterative detector with Out-it = 1 and In-it = 2 outperforms its conventional counterpart (i.e., $\lambda = 1$) by about 1 dB and 0.2 dB at BLER = $2 \cdot 10^{-2}$ for code rates 5/6 and 1/3, respectively. The more pronounced performance gains for the high rate codes over the low rate ones are due to the more prominent imperfection of high rate codes, which is consistent with the analysis in Section III-B2. The proposed transformed iterative detector with Out-it = 2 even performs better than the conventional one with Out-it = 4 by about 0.4 dB at BLER = $1 \cdot 10^{-2}$ for code rate 5/6, as shown in Fig. 3(a). This translates to reduced latency, which is appealing for delay-stringent applications.

The non-linear transformation of the a priori and a posteriori LLRs [cf. (9)] can be implemented efficiently in hardware through a table lookup operation. The calculation cost of the table lookup operation is six real additions based on the results of [9], which is less than 6% of that of a detection-decoding cycle of the SIC-MMSE iterative receiver [3], [10]. (The BCJR based turbo decoder requires $25 \times 2^3 - 3 = 197$ real additions per bit per iteration [10]. The detailed computational complexity is not provided in this work due to lack of space.)

[Fig. 3 plots not reproduced. Panel (a): QPSK, $R_c = 5/6$. Panel (b): QPSK, $R_c = 1/3$. Both panels show BLER versus $E_b/N_0$ (dB) for the proposed receiver ($\lambda_{a\,pri} = 0.6$, $\lambda_{a\,post} = 0.8$) and the conventional one ($\lambda = 1$) with Out-it = 1, 2, and 4 and In-it = 2.]
Fig. 3. BLER comparisons of the proposed transformed SIC-MMSE iterative receiver and the conventional one for a 4-by-4 QPSK modulated MIMO multiplexing system with code rates 1/3 and 5/6.

V. Conclusion
In this work, we first showed that the generally accepted assumption that the residual interference plus noise of the SIC-MMSE iterative receiver is Gaussian distributed is not accurate enough for high rate coded MIMO systems with/without channel estimation error. We then proposed a non-linear companding based transformation technique for the SIC-MMSE iterative receiver to alleviate the non-Gaussianess of the a priori and a posteriori LLRs. Based on the proposed non-linear transformation method, the original leptokurtically distributed a priori LLR and per-stream a posteriori LLR are transformed to more mesokurtically distributed ones, thus significantly enhancing the performance of the subsequent channel decoding, most notably for high rate coded MIMO systems with channel estimation error (which is particularly appealing for practical applications). The greatly enhanced performance for high rate coded MIMO systems makes the low-complexity MMSE-PIC iterative receiver an appealing solution for future wireless systems.

References
[1] X. Wang and H. V. Poor, “Iterative (turbo) soft-interference cancellation and decoding for coded CDMA,” IEEE Trans. Commun., vol. 47, no. 7, pp. 1046–1061, July 1999.
[2] M. Tüchler, A. C. Singer, and R. Koetter, “Minimum mean squared error equalization using a priori information,” IEEE Trans. Signal Process., vol. 50, no. 3, pp. 673–683, Mar. 2002.
[3] M. Witzke, S. Bäro, F. Schreckenbach, and J. Hagenauer, “Iterative detection of MIMO signals with linear detectors,” in Proc. 2002 Asilomar Conf. on Signals, Systems and Computers, pp. 289–293.
[4] M. Sellathurai and S. Haykin, “Turbo-BLAST for wireless communications: theory and experiments,” IEEE Trans. Signal Process., vol. 50, pp. 2538–2546, Oct. 2002.
[5] Nokia Siemens Networks, Nokia, R1-084319, “Considerations on SC-FDMA and OFDMA for LTE-Advanced Uplink,” Meeting #55, Nov. 2008.
[6] D. Zwillinger and S. Kokoska, Standard Probability and Statistics Tables and Formulae, 2nd edition. CRC, 1999.
[7] 3rd Generation Partnership Project, “Evolved Universal Terrestrial Radio Access (E-UTRA); User Equipment (UE) Radio Transmission and Reception,” 3GPP TS 36.101 V8.0.0, Sep. 2007.
[8] 3rd Generation Partnership Project, “Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Multiplexing and channel coding,” 3GPP TS 36.212 V9.1.0, Mar. 2010.
[9] 3rd Generation Partnership Project, R1-031301, “Complexity comparison of OFDM HS-DSCH receivers and advanced receivers for HSDPA and associated text proposal,” Meeting #35, Nov. 2003.
[10] I. A. Chatzigeorgiou, M. R. D. Rodrigues, I. J. Wassell, and R. A. Carrasco, “Comparison of convolutional and turbo coding for broadband FWA systems,” IEEE Trans. Broadcast., pp. 494–503, June 2007.