IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 7, NO. 11, NOVEMBER 2008
Low Complexity Adaptive Turbo Frequency-Domain Channel Estimation for Single-Carrier Multi-User Detection

Ye Wu, Student Member, IEEE, Xu Zhu, Member, IEEE, and Asoke K. Nandi, Senior Member, IEEE
Abstract: Adaptive Turbo frequency-domain channel estimation is incorporated with low complexity Turbo space-frequency equalization (TSFE) for single-carrier (SC) multi-user detection. The simplified Turbo recursive least squares (RLS) channel estimation algorithm provides nearly the same performance as its full complexity version, with a large complexity reduction (around 33 times in the configuration studied). With PSK modulations, the simplified Turbo RLS channel estimation reduces to Turbo least mean squares (LMS) channel estimation. With a low training overhead (below 4%), the simplified Turbo RLS channel estimation provides a performance comparable to the case with perfect channel state information (CSI).

Index Terms: Channel estimation, frequency domain equalization (FDE), multiple-user detection, Turbo equalization.
Manuscript received June 3, 2007; revised October 28, 2007, March 5, 2008, and May 20, 2008; accepted May 30, 2008. The associate editor coordinating the review of this paper and approving it for publication was G. Vitetta. This work was supported by the Overseas Research Students Award Scheme (ORSAS), UK. The authors are with the Signal Processing and Communications Group, Department of Electrical Engineering and Electronics, The University of Liverpool, Liverpool L69 3GJ, UK (e-mail: {ye.wu, xuzhu}@liv.ac.uk). Digital Object Identifier 10.1109/T-WC.2008.070570

I. INTRODUCTION

Both single carrier (SC) frequency-domain equalization (FDE) [1] and Turbo (iterative) equalization [2] have been shown to be effective for frequency selective fading channels. Compared to orthogonal frequency division multiplexing (OFDM), SC-FDE has a similar structure, but a lower peak-to-average power ratio [1]. Turbo equalization employs joint equalization and decoding with iterative information exchange between the equalizer and the decoder. In [3], a linear minimum mean square error (MMSE) based Turbo equalizer was proposed, which performs close to the optimal maximum a posteriori probability (MAP) based Turbo equalizer [2]. By combining the advantages of Turbo equalization and SC-FDE, an MMSE based Turbo space-frequency equalization (TSFE) structure was proposed for multiple-input multiple-output (MIMO) systems in [4], which significantly outperforms the previous Turbo time-domain equalization (TTDE) [5] with a much lower complexity. An MMSE based Turbo FDE structure was proposed in [6], which, however, assumes that the frequency bins are uncorrelated and minimizes the cost function on each frequency bin independently. Moreover, most existing work on Turbo FDE assumes symbol-spaced sampling [4], [6]. In practice, oversampling of the received signals is critical to avoid information loss when discretizing the received signals. In [7], an oversampled frequency-domain Turbo linear equalization (FD-TLE) algorithm was proposed, whose complexity is, however, impractically high due to the inversion of large matrices. FD-TLE also gains little performance as the number of iterations increases, as it utilizes only the first-order statistics of signals in Turbo equalization.

In practice, channels are time-varying and unknown to receivers. Therefore, adaptive channel estimation is desirable. Adaptive frequency-domain channel estimation was proposed in [8] for single-input multiple-output systems with SC-FDE, and was extended to the case of MIMO systems [9]. However, the updates of channel estimation in [8] and [9] were based on hard decisions of signals, which introduce significant error propagation. Turbo channel estimation [10], [11] was shown to be more robust to channel variations than hard decision based channel estimation, by using soft decisions on signals from the decoder/equalizer iteratively. In [12], a modified Turbo recursive least squares (RLS) channel estimation scheme was proposed, which outperforms the Turbo least mean squares (LMS) channel estimation at a comparable complexity when phase shift keying (PSK) modulation is employed. However, most previous work on Turbo channel estimation is performed in the time domain, which incurs a very high complexity for highly dispersive channels. Frequency-domain channel estimation generally converges faster than time-domain channel estimation [13]. In [14], a Turbo frequency-domain channel estimation scheme based on Slepian expansion [15] was proposed for multi-carrier (MC) code division multiple access (CDMA) systems. Since the energy of each symbol is distributed over the whole signal bandwidth for SC-FDE, channel estimation for MC systems cannot be directly incorporated with SC-FDE [8]. To the best of our knowledge, no work has been reported in the literature on Turbo frequency-domain channel estimation for SC systems.

In this letter, we investigate adaptive Turbo frequency-domain channel estimation for SC multi-user detection.
Compared to Turbo time-domain channel estimation [12], which is performed on each symbol, our proposed Turbo frequency-domain channel estimation is performed on each block, and its complexity for each block is comparable to the complexity of the former algorithm for each symbol, thanks to the fast Fourier transform (FFT) and inverse FFT (IFFT). Compared to Turbo frequency-domain channel estimation for MC systems [14], our proposed channel estimation scheme for SC systems is performed on each independent block instead of hundreds of blocks. Therefore, the proposed Turbo frequency-domain channel estimation for SC systems requires a much lower complexity than both Turbo time-domain channel estimation [12] and Turbo frequency-domain channel estimation for MC systems [14]. It is also incorporated with a low complexity oversampled TSFE structure, where oversampling of the received signals is employed to prevent the loss of useful energy. The complexity of TSFE increases linearly with the number of samples per block, and is much lower than the complexities of TTDE [5] and FD-TLE [7], which increase nonlinearly with the number of samples per block. In particular, a simplified Turbo RLS channel estimation algorithm is proposed, which provides nearly the same performance as Turbo RLS channel estimation, with a large complexity reduction. In the scenario of PSK modulations, the simplified Turbo RLS channel estimation requires the same complexity as Turbo LMS channel estimation. The simplified Turbo RLS channel estimation requires a low training overhead to achieve a performance comparable to the case with perfect channel state information (CSI).

1536-1276/08$25.00 © 2008 IEEE

We reserve $\lfloor \cdot \rfloor$ for integer flooring. Let $(\cdot)^T$ and $(\cdot)^H$ denote the transpose and complex-conjugate transpose of a matrix/vector, respectively. $E(X)$ and $\mathrm{Cov}(X, Y) = E(XY^H) - E(X)E(Y^H)$ respectively denote the expectation and covariance operators. Diagonal and block diagonal matrices are denoted by $\mathrm{diag}(\cdot)$ and $\mathrm{DIAG}(\cdot)$, respectively, with elements/matrices on the diagonal listed in the parentheses. Also define operators vec and mat as:

$$\mathrm{vec}(\mathbf{a}_1 \cdots \mathbf{a}_Q) = \left[\mathbf{a}_1^T \cdots \mathbf{a}_Q^T\right]^T$$

$$\mathrm{mat}(\mathbf{a}_1 \cdots \mathbf{a}_Q) = \begin{bmatrix} \mathbf{a}_1 & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{a}_2 & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{a}_Q \end{bmatrix}$$
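The two operators can be illustrated with a minimal Python sketch. Representing column vectors as plain lists and matrices as lists of rows is an assumption made here for illustration only; the function names simply mirror the operator names.

```python
def vec(*columns):
    """Stack the column vectors a_1, ..., a_Q into one long column (flat list)."""
    return [entry for column in columns for entry in column]


def mat(*columns):
    """Put a_1, ..., a_Q on the block diagonal; the result has Q columns."""
    num_cols = len(columns)
    rows = []
    for q, column in enumerate(columns):
        for entry in column:
            row = [0] * num_cols
            row[q] = entry  # only the q-th block column is non-zero here
            rows.append(row)
    return rows


stacked = vec([1, 2], [3, 4])
block_diag = mat([1, 2], [3, 4])
```

With two vectors of length 2, `vec` returns a length-4 column and `mat` returns a 4 x 2 matrix, which matches the dimensions used for the stacked frequency-domain quantities later in the letter.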
In the definitions of vec and mat above, $\mathbf{a}_q$ ($q = 1, \cdots, Q$) denotes a column vector.

II. SYSTEM MODEL

We consider a time division multiple access (TDMA) based SC system with $N_t$ users and an $N_r$-antenna receiver, as depicted in Fig. 1. For the $n$th ($n = 1, \cdots, N_t$) user, an information bit sequence $\mathbf{b}_n$ is encoded into a terminated recursive systematic convolutional (RSC) code sequence $\mathbf{c}_n$ with a memory $M_c$ ($M_c$ bits are tailed to the information bit sequence $\mathbf{b}_n$, which forces the encoder to the all-zero state using the encoder circuit). Each binary code sequence $\mathbf{c}_n$ is interleaved and mapped to a block of $M$ $S$-ary data symbols $d_n^i$ ($i = 0, \cdots, M-1$) according to the symbol alphabet $\alpha = \{\alpha_1, \cdots, \alpha_S\}$, where $\alpha_s$ ($s = 1, \cdots, S$) has unit symbol energy and a symbol period of $T$. The overall channel memory is assumed to be $N$, lumping the effects of the transmit filter, receive filter and physical channel. To implement the SC-FDE block transmission, each data block is prepended with a cyclic prefix (CP), which is the replica of the last $N$ symbols in the block, and is discarded at the receiver to prevent inter-block interference.

Assuming $N_s$ samples per symbol period, the received signals at each receive antenna are arranged in blocks with each consisting of $N_s M$ samples. Within each block, the $m$th ($m = 0, \cdots, N_s M - 1$) sample at the $l$th receive antenna can be expressed as

$$x_l^m = \sum_{n=1}^{N_t} \sum_{i=0}^{N_s(N+1)-1} h_{ln}^i \, d_n^{\lfloor (m-i)/N_s \rfloor} + n_l^m \tag{1}$$

where $h_{ln}^i$ denotes the $i$th ($i = 0, \cdots, N_s(N+1)-1$) sample of the continuous-time channel path gain at time instant $iT/N_s$ between the $n$th transmit antenna and the $l$th receive antenna. $n_l^m$ is the additive white Gaussian noise (AWGN) at the $m$th sampling time with the single-sided power spectral density $N_0$.

The oversampled received signals are transferred into the frequency domain by FFT. The signal on the $m$th ($m = 0, \cdots, N_s M - 1$) frequency bin at the $l$th receive antenna is given by

$$X_l^m = \sum_{n=1}^{N_t} H_{ln}^m D_n^m + N_l^m \tag{2}$$

where

$$X_l^m = \sum_{i=0}^{N_s M-1} x_l^i \, e^{-j2\pi mi/(N_s M)}, \qquad H_{ln}^m = \sum_{i=0}^{N_s(N+1)-1} h_{ln}^i \, e^{-j2\pi mi/(N_s M)},$$

$$D_n^m = \sum_{i=0}^{M-1} d_n^i \, e^{-j2\pi mi/M}, \qquad N_l^m = \sum_{i=0}^{N_s M-1} n_l^i \, e^{-j2\pi mi/(N_s M)}.$$

Furthermore, we define

$$\mathbf{X}^m = \sum_{n=1}^{N_t} \mathbf{H}_n^m D_n^m + \mathbf{N}^m \tag{3}$$

where $\mathbf{X}^m = \left[X_1^m \cdots X_{N_r}^m\right]^T$, $\mathbf{H}_n^m = \left[H_{1n}^m \cdots H_{N_r n}^m\right]^T$, and $\mathbf{N}^m = \left[N_1^m \cdots N_{N_r}^m\right]^T$.

At the receiver, the mean $\mu_n^i$ and variance $v_n^i$ of $d_n^i$ are computed before equalization, using the a priori information $P(d_n^i = \alpha_s)$ [3]:
$$\mu_n^i = E(d_n^i) = \sum_{\alpha_s \in \alpha} \alpha_s P(d_n^i = \alpha_s) \tag{4}$$

$$v_n^i = \mathrm{Cov}(d_n^i, d_n^i) = \sum_{\alpha_s \in \alpha} |\alpha_s|^2 P(d_n^i = \alpha_s) - |\mu_n^i|^2 \tag{5}$$

III. OVERSAMPLED TURBO SPACE-FREQUENCY EQUALIZATION

The iterative receiver with oversampled TSFE and adaptive Turbo channel estimation is depicted in the right part of Fig. 1. The equalizer output signals $\tilde{d}_n^i$ ($n = 1, \cdots, N_t$; $i = 0, \cdots, M-1$) are passed to the Gaussian log-likelihood ratio (LLR) estimator [3] for estimation of the extrinsic LLRs, which are deinterleaved and then input into $N_t$ decoders as the a priori information. The estimate of information bit sequence $\hat{\mathbf{b}}_n$ ($n = 1, \cdots, N_t$) is generated by the $n$th decoder. The decoder outputs are interleaved to produce the intrinsic LLRs, which are fed back to the channel estimator and equalizer iteratively.

Fig. 1. Block diagram of the SC MIMO system with oversampled TSFE and adaptive Turbo channel estimation at the receiver.

We define $\mathbf{X} = \mathrm{mat}(\mathbf{X}^0 \cdots \mathbf{X}^{N_s M-1})$, $\mathbf{D}_n = \mathrm{diag}(D_n^0 \cdots D_n^{N_s M-1})$, and $\hat{\mathbf{H}}_n = \mathrm{mat}(\hat{\mathbf{H}}_n^0 \cdots \hat{\mathbf{H}}_n^{N_s M-1})$, where $\hat{\mathbf{H}}_n^m$ ($m = 0, \cdots, N_s M-1$) denotes the estimate of $\mathbf{H}_n^m$. The frequency-domain equalizer weight vector with respect to $d_k^i$ is denoted by $\mathbf{U}_k^i = \mathrm{vec}(\mathbf{W}_k^{i,0} \cdots \mathbf{W}_k^{i,N_s M-1})$, where $\mathbf{W}_k^{i,m}$ ($m = 0, \cdots, N_s M-1$) is a weight vector of size $N_r \times 1$ on the $m$th frequency bin. The equalizer output signal with respect to $d_k^i$ is given by:

$$\tilde{d}_k^i = \frac{1}{N_s M}\, \mathbf{U}_k^{iH} \left[\mathbf{X} - \sum_{n=1}^{N_t} \hat{\mathbf{H}}_n E(\mathbf{D}_n)\right] \mathbf{f}^i \tag{6}$$

where $\mathbf{f}^i = \left[e^{j2\pi 0 i/(N_s M)} \cdots e^{j2\pi (N_s M-1) i/(N_s M)}\right]^T$, and $E(\mathbf{D}_n) = \mathrm{diag}\left(E(D_n^0) \cdots E(D_n^{N_s M-1})\right)$, with $E(D_n^m) = \sum_{i=0}^{M-1} \mu_n^i e^{-j2\pi mi/M}$ ($m = 0, \cdots, N_s M-1$) denoting the DFT of $\mu_n^i$. Different from the Turbo FDE in [6], which assumes uncorrelated frequency bins, the proposed TSFE utilizes the correlation between frequency bins to improve the performance. The equalizer coefficients are derived based on the MMSE criterion, by minimizing the MSE cost function given by

$$J_k^i = E\left(\left|\tilde{d}_k^i - d_k^i\right|^2\right) \tag{7}$$

The linear weight vector $\mathbf{U}_k^i$ is expressed as

$$\mathbf{U}_k^i = \left[1 + \frac{1 - v_k^i}{N_s M}\, \mathbf{f}^{0H} \hat{\mathbf{H}}_k^H \left(\mathbf{\Omega}^i\right)^{-1} \hat{\mathbf{H}}_k \mathbf{f}^0\right]^{-1} \left(\mathbf{\Omega}^i\right)^{-1} \hat{\mathbf{H}}_k \mathbf{f}^0 \tag{8}$$

where

$$\mathbf{\Omega}^i = \frac{1}{N_s M} \sum_{n=1}^{N_t} \sum_{m=0}^{M-1} v_n^m\, \hat{\mathbf{H}}_n \mathbf{f}^{i - N_s m} \mathbf{f}^{i - N_s m\, H} \hat{\mathbf{H}}_n^H + N_0 \mathbf{I} \tag{9}$$
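The prior statistics in (4) and (5), which enter the equalizer through $E(D_n^m)$ in (6) and the variances in (8) and (9), can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the QPSK alphabet and the function names `prior_mean_var` and `dft_of_means` are assumptions introduced here.

```python
import cmath


def prior_mean_var(alphabet, probs):
    """Mean (4) and variance (5) of one symbol given P(d = alpha_s)."""
    mean = sum(a * p for a, p in zip(alphabet, probs))
    power = sum(abs(a) ** 2 * p for a, p in zip(alphabet, probs))
    return mean, power - abs(mean) ** 2


def dft_of_means(means, ns):
    """E(D^m) for m = 0..ns*M-1, using the kernel exp(-j*2*pi*m*i/M)."""
    m_sym = len(means)
    return [
        sum(mu * cmath.exp(-2j * cmath.pi * m * i / m_sym) for i, mu in enumerate(means))
        for m in range(ns * m_sym)
    ]


# Unit-energy QPSK alphabet (an illustrative choice).
QPSK = [cmath.exp(1j * cmath.pi * (2 * s + 1) / 4) for s in range(4)]

no_prior = prior_mean_var(QPSK, [0.25, 0.25, 0.25, 0.25])  # first iteration
sure_prior = prior_mean_var(QPSK, [1.0, 0.0, 0.0, 0.0])    # decoder is certain
flat_bins = dft_of_means([1, 1, 1, 1], ns=2)
```

With uniform priors the mean is zero and the variance is one, so the soft interference term in (6) vanishes; with a certain prior the variance drops to zero. Because the DFT kernel has period $M$ in $m$, the $N_s M$ values of $E(D_n^m)$ repeat every $M$ bins.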
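Since the frequency-domain linear filter weight $\mathbf{U}_k^i$ in (8) is different for each symbol $d_k^i$ ($i = 0, \cdots, M-1$) within a data block, the complexity for weight calculation is prohibitive. To reduce the computational burden, a direct and effective approach is to implement the block processing, i.e., to make $\mathbf{U}_k^i$ independent of the time index $i$. This can be achieved by replacing $v_n^i$ ($n = 1, \cdots, N_t$; $i = 0, \cdots, M-1$) in (8) and (9) by $\bar{v}_n = \frac{1}{M} \sum_{i=0}^{M-1} v_n^i$, which is the average of $v_n^i$ within a block [3], [4]. Thus, $\mathbf{\Omega}^i$ in (9) reduces to a block diagonal matrix as $\mathbf{\Omega} = \mathrm{DIAG}\left(\mathbf{R}^0 \cdots \mathbf{R}^{N_s M-1}\right)$, where

$$\mathbf{R}^m = \sum_{n=1}^{N_t} \bar{v}_n \hat{\mathbf{H}}_n^m \hat{\mathbf{H}}_n^{mH} + N_0 \mathbf{I} \tag{10}$$

As a result, $\mathbf{U}_k^i$ reduces to $\mathbf{U}_k = \mathrm{vec}(\mathbf{W}_k^0 \cdots \mathbf{W}_k^{N_s M-1})$, where

$$\mathbf{W}_k^m = \frac{\left(\mathbf{R}^m\right)^{-1} \hat{\mathbf{H}}_k^m}{1 + \dfrac{1 - \bar{v}_k}{N_s M} \displaystyle\sum_{m'=0}^{N_s M-1} \hat{\mathbf{H}}_k^{m'H} \left(\mathbf{R}^{m'}\right)^{-1} \hat{\mathbf{H}}_k^{m'}} \tag{11}$$

It was shown in [4] that the above approximation has very little impact on performance. The resulting equalizer output $\tilde{d}_k^i$ in (6) can be expressed as

$$\tilde{d}_k^i = \frac{1}{N_s M}\, \mathbf{U}_k^H \left[\mathbf{X} - \sum_{n=1}^{N_t} \hat{\mathbf{H}}_n E(\mathbf{D}_n)\right] \mathbf{f}^i + \frac{1}{N_s M}\, \mathbf{U}_k^H \hat{\mathbf{H}}_k \mathbf{f}^0 \mu_k^i \tag{12}$$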
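To make the block-processing weights concrete, the sketch below evaluates (10) and (11) in the special case $N_t = N_r = 1$, where $\mathbf{R}^m$ becomes a scalar and no matrix inversion is needed. The single-user, single-antenna restriction and the function name `block_weights` are assumptions made for illustration; the general case requires one $N_r \times N_r$ inverse per frequency bin.

```python
def block_weights(h_hat, v_bar, n0):
    """Per-bin weights W^m from (10)-(11) for N_t = N_r = 1.

    h_hat : channel estimates on all N_s*M frequency bins (complex)
    v_bar : block-averaged symbol variance (1.0 means no prior information)
    n0    : noise power spectral density, assumed strictly positive
    """
    num_bins = len(h_hat)
    # (10) with scalar quantities: R^m = v_bar * |H^m|^2 + N_0
    r = [v_bar * abs(h) ** 2 + n0 for h in h_hat]
    # Sum over bins of conj(H^m) * (R^m)^-1 * H^m, shared by every bin
    gain = sum(abs(h) ** 2 / r_m for h, r_m in zip(h_hat, r))
    denom = 1.0 + (1.0 - v_bar) / num_bins * gain
    # (11): W^m = (R^m)^-1 H^m / denom
    return [h / (r_m * denom) for h, r_m in zip(h_hat, r)]


h_example = [1 + 1j, 2 + 0j, 0.5j, -1 + 0j]
w_first_pass = block_weights(h_example, v_bar=1.0, n0=0.1)
w_perfect_prior = block_weights([1 + 0j] * 4, v_bar=0.0, n0=1.0)
```

With `v_bar = 1.0` the denominator equals one and the weights reduce to the familiar linear MMSE frequency-domain weights $H^m / (|H^m|^2 + N_0)$. The denominator is computed once per block, which is where the saving over the per-symbol weights in (8) comes from.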
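IV. ADAPTIVE TURBO FREQUENCY-DOMAIN CHANNEL ESTIMATION

In this section, we propose adaptive Turbo frequency-domain channel estimation, which is incorporated with TSFE in Section III. All the elements operate in the frequency domain. Assume that each data frame consists of a training sequence of $n_{\mathrm{Train}}$ blocks and a data sequence of $n_{\mathrm{Data}}$ blocks, and all blocks are synchronized. The receiver first operates in the training mode to obtain the initial channel estimates. Letting $q$ denote the block index, in the code-aided channel estimation mode, Turbo channel estimation is based on the soft decisions $E(D_n^{m\,(q)})$ on the LLRs of signals from each iteration, and the estimates are passed to the next iteration of TSFE. The updates of channel estimation can be based on the LMS or RLS criterion.

Define $\boldsymbol{\Gamma}_l^{(q)} = \left[\boldsymbol{\Gamma}_l^{0\,(q)} \cdots \boldsymbol{\Gamma}_l^{N_s M-1\,(q)}\right]$, where $\boldsymbol{\Gamma}_l^{m\,(q)} = \left[H_{l1}^{m\,(q)} \cdots H_{lN_t}^{m\,(q)}\right]$ ($l = 1, \cdots, N_r$; $m = 0, \cdots, N_s M-1$) denotes the channel frequency response vector with respect to the $l$th receive antenna and the $m$th frequency bin that consists of elements from all transmit antennas. Let

$$\boldsymbol{\tau}_l^{(q)} = \left[h_{l1}^{0\,(q)} \cdots h_{l1}^{N_s(N+1)-1\,(q)} \;\cdots\; h_{lN_t}^{0\,(q)} \cdots h_{lN_t}^{N_s(N+1)-1\,(q)}\right]$$

denote the channel impulse response vector of length $N_t N_s (N+1)$, with respect to the $l$th receive antenna. Define $\tilde{\mathbf{F}} = \left[\mathbf{F}^0 \cdots \mathbf{F}^{N_s M-1}\right]$, where $\mathbf{F}^m = \mathrm{mat}(\mathbf{O}^m \cdots \mathbf{O}^m)$ is an $N_t N_s (N+1) \times N_t$ block Toeplitz matrix with $\mathbf{O}^m = \left[e^{-j2\pi 0 m/(N_s M)} \cdots e^{-j2\pi [N_s(N+1)-1] m/(N_s M)}\right]^T$. Thus, $\boldsymbol{\Gamma}_l^{(q)}$ can be expressed as:

$$\boldsymbol{\Gamma}_l^{(q)} = \boldsymbol{\tau}_l^{(q)} \tilde{\mathbf{F}} \tag{13}$$

Letting $\mathbf{X}_l^{(q)} = \left[X_l^{0\,(q)} \cdots X_l^{N_s M-1\,(q)}\right]$, $\mathbf{N}_l^{(q)} = \left[N_l^{0\,(q)} \cdots N_l^{N_s M-1\,(q)}\right]$, and $\tilde{\mathbf{D}}^{(q)} = \mathrm{mat}(\mathbf{D}^{0\,(q)} \cdots \mathbf{D}^{N_s M-1\,(q)})$, where $\mathbf{D}^{m\,(q)} = \left[D_1^{m\,(q)} \cdots D_{N_t}^{m\,(q)}\right]^T$, we have

$$\mathbf{X}_l^{(q)} = \boldsymbol{\tau}_l^{(q)} \tilde{\mathbf{F}} \tilde{\mathbf{D}}^{(q)} + \mathbf{N}_l^{(q)} \qquad (l = 1, \cdots, N_r) \tag{14}$$

The channel estimation accuracy can be measured by the normalized MSE between the channel frequency response $\boldsymbol{\Gamma}_l^{(q)}$ and its estimate $\hat{\boldsymbol{\Gamma}}_l^{(q)}$ for each block, expressed as:

$$\mathrm{MSE}^{(q)} = \frac{\sum_{l=1}^{N_r} E\left\|\boldsymbol{\Gamma}_l^{(q)} - \hat{\boldsymbol{\Gamma}}_l^{(q)}\right\|^2}{\sum_{l=1}^{N_r} E\left\|\boldsymbol{\Gamma}_l^{(q)}\right\|^2} \tag{15}$$

A. Turbo LMS Channel Estimation

We first investigate the Turbo LMS channel estimation, which is an extension of the hard-decision based LMS channel estimation in [9]. In the training mode, Turbo LMS channel estimation is to minimize:

$$J(\hat{\boldsymbol{\tau}}_l^{(q)}) = E\left\|\mathbf{X}_l^{(q)} - \hat{\boldsymbol{\tau}}_l^{(q)} \tilde{\mathbf{F}} \tilde{\mathbf{D}}^{(q)}\right\|^2 \tag{16}$$

with respect to $\hat{\boldsymbol{\tau}}_l^{(q)}$ that is the estimate of $\boldsymbol{\tau}_l^{(q)}$. This produces

$$\hat{\boldsymbol{\tau}}_l^{(q)} = \hat{\boldsymbol{\tau}}_l^{(q-1)} + \mu\, \mathbf{e}_l^{(q)} \tilde{\mathbf{D}}^{(q)H} \tilde{\mathbf{F}}^H \tag{17}$$

where $\mu$ is the step size, and

$$\mathbf{e}_l^{(q)} = \mathbf{X}_l^{(q)} - \hat{\boldsymbol{\tau}}_l^{(q-1)} \tilde{\mathbf{F}} \tilde{\mathbf{D}}^{(q)} \tag{18}$$

In the code-aided channel estimation mode, the cost function is still given by (16) and its solution is given by:

$$\hat{\boldsymbol{\tau}}_l^{(q)} = \hat{\boldsymbol{\tau}}_l^{(q-1)} + \mu\, \bar{\mathbf{e}}_l^{(q)} E(\tilde{\mathbf{D}}^{(q)})^H \tilde{\mathbf{F}}^H \tag{19}$$

where $E(\tilde{\mathbf{D}}^{(q)}) = \mathrm{mat}\left(E(\mathbf{D}^{0\,(q)}) \cdots E(\mathbf{D}^{N_s M-1\,(q)})\right)$ with $E(\mathbf{D}^{m\,(q)}) = \left[E(D_1^{m\,(q)}) \cdots E(D_{N_t}^{m\,(q)})\right]^T$, and $\bar{\mathbf{e}}_l^{(q)}$ is given by

$$\bar{\mathbf{e}}_l^{(q)} = \mathbf{X}_l^{(q)} - \hat{\boldsymbol{\tau}}_l^{(q-1)} \tilde{\mathbf{F}} E(\tilde{\mathbf{D}}^{(q)}) \tag{20}$$

B. Turbo RLS Channel Estimation

To enhance the convergence behavior of channel estimation, we now investigate the RLS channel estimation, which aims to minimize the following cost function in the training mode:

$$J(\hat{\boldsymbol{\tau}}_l^{(q)}) = \sum_{i=1}^{q} \lambda^{q-i} \left\|\mathbf{X}_l^{(i)} - \hat{\boldsymbol{\tau}}_l^{(q)} \tilde{\mathbf{F}} \tilde{\mathbf{D}}^{(i)}\right\|^2 \qquad (l = 1, \cdots, N_r) \tag{21}$$

with respect to $\hat{\boldsymbol{\tau}}_l^{(q)}$. $\lambda$ denotes the forgetting factor. The solution to $\hat{\boldsymbol{\tau}}_l^{(q)}$ is given by

$$\hat{\boldsymbol{\tau}}_l^{(q)} = \mathbf{\Psi}_l^{(q)H} \left(\mathbf{\Phi}^{(q)}\right)^{-1} \tag{22}$$

where

$$\mathbf{\Phi}^{(q)} = \lambda \mathbf{\Phi}^{(q-1)} + \tilde{\mathbf{F}} \tilde{\mathbf{D}}^{(q)} \tilde{\mathbf{D}}^{(q)H} \tilde{\mathbf{F}}^H \tag{23}$$

$$\mathbf{\Psi}_l^{(q)} = \lambda \mathbf{\Psi}_l^{(q-1)} + \tilde{\mathbf{F}} \tilde{\mathbf{D}}^{(q)} \mathbf{X}_l^{(q)H} \tag{24}$$

In the code-aided channel estimation mode, the cost function is given by

$$J(\hat{\boldsymbol{\tau}}_l^{(q)}) = \sum_{i=1}^{q} \lambda^{q-i} E\left\|\mathbf{X}_l^{(i)} - \hat{\boldsymbol{\tau}}_l^{(q)} \tilde{\mathbf{F}} \tilde{\mathbf{D}}^{(i)}\right\|^2 \qquad (l = 1, \cdots, N_r) \tag{25}$$

The solution to $\hat{\boldsymbol{\tau}}_l^{(q)}$ is still given by (22) with

$$\mathbf{\Phi}^{(q)} = \lambda \mathbf{\Phi}^{(q-1)} + \tilde{\mathbf{F}} E\left(\tilde{\mathbf{D}}^{(q)} \tilde{\mathbf{D}}^{(q)H}\right) \tilde{\mathbf{F}}^H \tag{26}$$

$$\mathbf{\Psi}_l^{(q)} = \lambda \mathbf{\Psi}_l^{(q-1)} + \tilde{\mathbf{F}} \left[E\left(\tilde{\mathbf{D}}^{(q)} \tilde{\mathbf{D}}^{(q)H}\right) \tilde{\mathbf{F}}^H \hat{\boldsymbol{\tau}}_l^{(q-1)H} + E(\tilde{\mathbf{D}}^{(q)})\, \bar{\mathbf{e}}_l^{(q)H}\right] \tag{27}$$

The above Turbo RLS channel estimation can be simplified by assuming that $E\left(\tilde{\mathbf{D}}^{(q)} \tilde{\mathbf{D}}^{(q)H}\right)$ in (26) and (27) is a diagonal matrix, since the off-diagonal elements in $E\left(\tilde{\mathbf{D}}^{(q)} \tilde{\mathbf{D}}^{(q)H}\right)$ average out to zero so long as $\lambda$ is close to 1 [12]. In the scenario of PSK modulations, $E\left(\tilde{\mathbf{D}}^{(q)} \tilde{\mathbf{D}}^{(q)H}\right) \approx M\mathbf{I}$, regardless of the LLRs. Hence, (26) and (27) respectively reduce to:

$$\mathbf{\Phi}^{(q)} \approx \frac{M}{1-\lambda} \tilde{\mathbf{F}} \tilde{\mathbf{F}}^H = \frac{N_s M^2}{1-\lambda} \mathbf{I} \tag{28}$$

$$\mathbf{\Psi}_l^{(q)} \approx \lambda \mathbf{\Psi}_l^{(q-1)} + \tilde{\mathbf{F}} \left[M \tilde{\mathbf{F}}^H \hat{\boldsymbol{\tau}}_l^{(q-1)H} + E(\tilde{\mathbf{D}}^{(q)})\, \bar{\mathbf{e}}_l^{(q)H}\right] \tag{29}$$

Using (22), (28) and (29), we obtain

$$\hat{\boldsymbol{\tau}}_l^{(q)} = \hat{\boldsymbol{\tau}}_l^{(q-1)} + \frac{1-\lambda}{N_s M^2}\, \bar{\mathbf{e}}_l^{(q)} E(\tilde{\mathbf{D}}^{(q)})^H \tilde{\mathbf{F}}^H \tag{30}$$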
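The update in (30) can be sketched for a single user and a single receive antenna, where $\tilde{\mathbf{D}}^{(q)}$ is diagonal and $\tilde{\mathbf{F}}$ is an $N_s(N+1) \times N_s M$ partial DFT matrix. The sketch below is illustrative only and rests on several assumptions not taken from the letter: a noise-free channel, known QPSK symbols standing in for the soft decisions $E(D^m)$, toy dimensions ($N_s = 2$, $M = 8$, four taps), and the hypothetical names `dft_matrix_rows` and `simplified_rls_step`.

```python
import cmath
import random


def dft_matrix_rows(num_taps, num_bins):
    """F~ for one user: entry [i][m] is exp(-j*2*pi*i*m/num_bins)."""
    return [[cmath.exp(-2j * cmath.pi * i * m / num_bins) for m in range(num_bins)]
            for i in range(num_taps)]


def simplified_rls_step(tau_hat, x, d_soft, f_tilde, lam, ns, m_sym):
    """One block update of (30) for N_t = 1; d_soft[m] plays the role of E(D^m)."""
    num_taps, num_bins = len(tau_hat), len(x)
    # Frequency response implied by the previous estimate, as in (13)
    gamma = [sum(tau_hat[i] * f_tilde[i][m] for i in range(num_taps))
             for m in range(num_bins)]
    # Soft-decision error (20): e^m = X^m - Gamma^m * E(D^m)
    err = [x[m] - gamma[m] * d_soft[m] for m in range(num_bins)]
    step = (1.0 - lam) / (ns * m_sym ** 2)
    # tau <- tau + step * e * E(D)^H * F~^H
    return [tau_hat[i] + step * sum(err[m] * d_soft[m].conjugate() * f_tilde[i][m].conjugate()
                                    for m in range(num_bins))
            for i in range(num_taps)]


rng = random.Random(0)
ns, m_sym, lam = 2, 8, 0.94
num_bins = ns * m_sym
tau_true = [0.8, 0.4 - 0.2j, 0.1j, -0.05]          # toy impulse response
f_tilde = dft_matrix_rows(len(tau_true), num_bins)
h_true = [sum(t * f_tilde[i][m] for i, t in enumerate(tau_true)) for m in range(num_bins)]
qpsk = [cmath.exp(1j * cmath.pi * (2 * k + 1) / 4) for k in range(4)]

tau_hat = [0j] * len(tau_true)
for _ in range(300):
    d = [qpsk[rng.getrandbits(2)] for _ in range(m_sym)]
    d_freq = [sum(d[i] * cmath.exp(-2j * cmath.pi * m * i / m_sym) for i in range(m_sym))
              for m in range(num_bins)]
    x = [h_true[m] * d_freq[m] for m in range(num_bins)]  # noise-free version of (14)
    tau_hat = simplified_rls_step(tau_hat, x, d_freq, f_tilde, lam, ns, m_sym)

max_tap_error = max(abs(a - b) for a, b in zip(tau_hat, tau_true))
```

The update has the same form as the LMS recursion (19) with the step size fixed by the forgetting factor. In this toy setting the estimation error shrinks by a factor of roughly $\lambda$ per block on average, so the taps are recovered closely after a few hundred blocks.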
Comparing (19) with (30), it can be deduced that with PSK modulations the simplified Turbo RLS channel estimation reduces to Turbo LMS channel estimation with step size $\mu = (1-\lambda)/(N_s M^2)$. For non-PSK modulations, $E\left(\tilde{\mathbf{D}}^{(q)} \tilde{\mathbf{D}}^{(q)H}\right) \approx M\mathbf{I}$ no longer holds; however, the assumption that $E\left(\tilde{\mathbf{D}}^{(q)} \tilde{\mathbf{D}}^{(q)H}\right)$ is a diagonal matrix still helps reduce the complexity of Turbo RLS channel estimation.

V. COMPLEXITY ANALYSIS

We first demonstrate the normalized complexity of the proposed simplified Turbo RLS channel estimation (which is equivalent to Turbo LMS channel estimation in the code-aided channel estimation mode) in terms of the number of complex multiplications used for a data frame (which accounts for the training mode and code-aided channel estimation mode), compared to the complexity of the standard Turbo RLS channel estimation. We employ a sampling rate of $N_s = 2$ (i.e., 2 samples per symbol period), a block size of $M = 128$, $N_t = 4$ users, $N_r = 4$ receive antennas, an overall channel memory of $N = 25$, QPSK modulation and 5 iterations. Each data frame comprises a training sequence of $n_{\mathrm{Train}} = 12$ blocks and a data sequence of $n_{\mathrm{Data}} = 300$ blocks for both channel estimation schemes. The updates of $\mathbf{\Phi}^{(q)}$ in (26) and $\mathbf{\Psi}_l^{(q)}$ in (27) account for most of the complexity of the Turbo RLS channel estimation. Compared to the standard Turbo RLS channel estimation described in Subsection IV-B, whose complexity is on the order of $M^3$, the simplified Turbo RLS channel estimation has a complexity on the order of $M$ due to the assumption that $E\left(\tilde{\mathbf{D}}^{(q)} \tilde{\mathbf{D}}^{(q)H}\right)$ in (26) and (27) is diagonal. For this configuration, the simplified Turbo RLS channel estimation achieves a complexity reduction of around 33 times over Turbo RLS channel estimation.

We also investigate the complexity of the proposed oversampled TSFE, compared to the complexities of FD-TLE [7] and TTDE [5]. The solution of the equalizer coefficients dominates the overall complexity. TSFE calculates the inverses of $N_s M$ matrices of size $N_r \times N_r$, and therefore requires approximately $N_s M N_r^3/3$ complex multiplications. FD-TLE, in contrast, performs iterative FDE by minimizing a cost function which involves $N_s M$ samples, and therefore requires approximately $N_s^3 M^3 N_r^3/3$ complex multiplications for matrix inversion. TTDE uses a time-domain linear filter spanning $(N+1)$ symbol periods, and therefore on the order of $N_s^3 (N+1)^3 N_r^3/3$ complex multiplications are needed for calculating the equalizer coefficients. A numerical example of the normalized complexity is shown in Table I, with the same configuration as in the demonstration of the channel estimation algorithms. Thanks to block processing, the low complexity TSFE with 5 iterations reduces the complexity by factors of around 2000 and 9 relative to FD-TLE and TTDE, respectively.

TABLE I
NORMALIZED COMPLEXITY OF TSFE, FD-TLE AND TTDE WITH M = 128 SYMBOLS PER BLOCK, Ns = 2 SAMPLES PER SYMBOL PERIOD, Nt = 4 USERS, Nr = 4 RECEIVE ANTENNAS, CHANNEL MEMORY N = 25 AND QPSK MODULATION

Receiver      1 iteration   2 iterations   5 iterations
TSFE          1             2.1            5.1
FD-TLE [7]    2084          4171           10431
TTDE [5]      7.9           16.8           44.1

VI. SIMULATION RESULTS

Simulations were carried out with the same setup as in Table I and a symbol rate of 5 MBaud (i.e., a symbol period of $T = 0.2\ \mu$s). We choose a rate 1/2, memory $M_c = 2$ terminated RSC encoder with generator $(1 + D + D^2, 1 + D^2)$ to generate the error-correcting code (ECC) bits. Both the transmit and receive filters use a raised-cosine pulse with a roll-off factor of 0.35. The physical channel is modeled by following the exponential power delay profile [16] with a root mean square (RMS) delay spread of $\sigma = 1\ \mu$s (i.e., $\sigma = 5T$). The overall channel is of memory $N = 25$. The linear filter of TTDE [5] has a decision delay $N_d = 1$, which is optimized by using the scheme in [16]. The signal-to-noise ratio (SNR) is defined as the spatial average ratio of the received signal power to noise power. Due to QPSK modulation, the proposed simplified Turbo RLS channel estimation is equivalent to Turbo LMS channel estimation in the code-aided channel estimation mode.

The bit error rate (BER) performance of the oversampled TSFE is shown in Fig. 2, assuming perfect channel state information (CSI) and no Doppler effect. With 5 iterations, TSFE has a performance gain of around 2 dB over FD-TLE [7] at BER = $10^{-3}$, with a complexity reduction of around 2000 times over the latter as shown in Table I. This is because FD-TLE utilizes only the first-order statistics of signals for Turbo equalization, while TSFE benefits from both the first-order and second-order statistics. TSFE also outperforms TTDE due to the higher frequency diversity achieved by TSFE, while requiring around 9 times less complexity than the latter with 5 iterations.

Fig. 2. Performance of TSFE, FD-TLE, and TTDE with M = 128 symbols per block, Ns = 2 samples per symbol period, Nt = 4 users, Nr = 4 receive antennas, an RMS delay of σ = 5T, perfect CSI and no Doppler effect. [Plot: average BER versus SNR (dB) from 0 to 5 dB, for 1 and 5 iterations; curves: TSFE, FD-TLE [7], TTDE [5].]

Fig. 3. Performance of adaptive Turbo frequency-domain channel estimation based oversampled TSFE with M = 128 symbols per block, Ns = 2 samples per symbol period, Nt = 4 users, Nr = 4 receive antennas, an RMS delay of σ = 5T and a Doppler spread of fd = 200 Hz. [Plot: average BER versus SNR (dB) from 0 to 5 dB, for 1 and 5 iterations; curves: Perfect CSI, Turbo RLS, Simplified Turbo RLS.]

Fig. 3 demonstrates the BER performance of the Turbo RLS channel estimation schemes incorporated with oversampled TSFE. The channel is assumed to be block fading, i.e., the CSI is constant over a block, and all the channel taps are mutually independent with a Gaussian Doppler spectrum [17]. We assume a Doppler spread of $f_d = 200$ Hz (i.e., $f_d T = 4 \times 10^{-5}$) here. Both Turbo RLS and simplified Turbo RLS channel estimation schemes use a forgetting factor of $\lambda = 0.94$, and $n_{\mathrm{Train}} = 12$ training blocks and $n_{\mathrm{Data}} = 300$ data blocks are employed in a frame, leading to a training overhead of only 3.8%. The simplified Turbo RLS channel estimation scheme provides nearly the same performance as Turbo RLS channel estimation, with a complexity reduction of around 33 times over the latter as discussed in Section V.
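The derived simulation parameters quoted above follow from simple arithmetic, sketched here as a quick check. The variable names are illustrative, and the overhead is assumed to be measured as training blocks over all blocks in a frame, which is the reading consistent with the quoted 3.8%.

```python
n_train, n_data = 12, 300
symbol_rate = 5e6                                   # 5 MBaud
symbol_period = 1.0 / symbol_rate                   # T in seconds

training_overhead = n_train / (n_train + n_data)    # fraction of the frame
normalized_doppler = 200.0 * symbol_period          # f_d * T for f_d = 200 Hz
rms_delay_in_symbols = 1e-6 / symbol_period         # sigma / T for sigma = 1 us

print(f"T = {symbol_period * 1e6:.1f} us")
print(f"training overhead = {100 * training_overhead:.1f} %")
print(f"f_d * T = {normalized_doppler:.1e}")
print(f"sigma / T = {rms_delay_in_symbols:.1f}")
```

The small normalized Doppler value is consistent with the block-fading assumption, since the channel changes very little within one block of 128 symbols.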
Compared to the case with perfect CSI, the simplified Turbo RLS channel estimation has a performance loss of only 1 dB at BER = $10^{-4}$ with 5 iterations.

Fig. 4 shows the convergence performance of the two RLS channel estimation algorithms in terms of the normalized channel estimation MSE defined in (15) versus the number of blocks. The same configuration as in Fig. 3 is employed except that SNR = 5 dB. A so-called LMS structured channel estimation (LMS-SCE) algorithm [9] with a step size of $\mu = 2 \times 10^{-6}$ is used for comparison, which was proven to have a faster convergence speed than other frequency-domain channel estimation algorithms [9]. The RLS channel estimation schemes converge faster, with only 12 training blocks needed to achieve the steady state, while LMS-SCE [9] requires 24 training blocks. The simplified Turbo RLS channel estimation provides nearly the same steady-state MSE as Turbo RLS channel estimation, which is around 2 dB lower than that achieved by LMS-SCE.

Fig. 4. Learning curves of adaptive Turbo frequency-domain channel estimation with M = 128 symbols per block, 5 iterations, Ns = 2 samples per symbol period, Nt = 4 users, Nr = 4 receive antennas, SNR = 5 dB, an RMS delay of σ = 5T and a Doppler spread of fd = 200 Hz. [Plot: channel estimation MSE (dB) versus number of blocks from 0 to 100; curves: Hard LMS-SCE [9], Turbo RLS, Simplified Turbo RLS.]

VII. CONCLUSION

We have proposed adaptive Turbo frequency-domain channel estimation schemes for SC multi-user detection, incorporated with low complexity TSFE. The simplified Turbo RLS channel estimation provides nearly the same performance as Turbo RLS channel estimation, while achieving a complexity reduction of around 33 times over the latter. It requires a low training overhead below 4% to provide a performance comparable to the case with perfect CSI. Turbo RLS channel estimation also outperforms the so-called LMS-SCE channel estimation [4] in terms of the convergence speed. The complexity of TSFE increases linearly with the number of samples per symbol period and the number of subcarriers, which is much lower than the complexities of FD-TLE [7] and TTDE [5].

REFERENCES

[1] H. Sari, G. Karam, and I. Jeanclaude, "Transmission techniques for digital terrestrial TV broadcasting," IEEE Commun. Mag., vol. 33, pp. 100–109, Feb. 1995.
[2] C. Douillard, M. Jezequel, C. Berrou, A. Picart, P. Didier, and A. Glavieux, "Iterative correction of intersymbol interference: Turbo equalization," Eur. Trans. Telecommun., vol. 6, pp. 507–511, Sept./Oct. 1995.
[3] M. Tüchler, A. C. Singer, and R. Koetter, "Minimum mean squared error equalization using a priori information," IEEE Trans. Signal Processing, vol. 50, pp. 673–683, Mar. 2002.
[4] Y. Wu, X. Zhu, and A. K. Nandi, "Low complexity adaptive Turbo space-frequency equalization for single-carrier multi-input multi-output systems," IEEE Trans. Wireless Commun., vol. 7, pp. 2050–2056, June 2008.
[5] M. S. Yee, M. Sandell, and Y. Sun, "Comparison study of single-carrier and multi-carrier modulation using iterative based receiver for MIMO system," in Proc. IEEE VTC'04 Spring, vol. 3, Milan, Italy, May 2004, pp. 1275–1279.
[6] M. Tüchler and J. Hagenauer, "Linear time and frequency domain Turbo equalization," in Proc. IEEE VTC'01 Fall, vol. 4, Oct. 2001, pp. 2773–2777.
[7] F. Pancaldi and G. M. Vitetta, "Block channel equalization in the frequency domain," IEEE Trans. Commun., vol. 53, pp. 463–471, Mar. 2005.
[8] M. Morelli, L. Sanguinetti, and U. Mengali, "Channel estimation for adaptive frequency-domain equalization," IEEE Trans. Wireless Commun., vol. 4, pp. 2508–2518, Sept. 2005.
[9] Y. Wu, X. Zhu, and A. K. Nandi, "Adaptive layered space-frequency equalization for MIMO frequency selective channels," in Proc. EUSIPCO'05, Antalya, Turkey, Sept. 2005.
[10] S. Song, A. C. Singer, and K.-M. Sung, "Turbo equalization with an unknown channel," in Proc. IEEE ICASSP'02, vol. 3, Orlando, FL, May 2002, pp. 2805–2808.
[11] M. Tüchler, R. Otnes, and A. Schmidbauer, "Performance of soft iterative channel estimation in turbo equalization," in Proc. IEEE ICC'02, vol. 3, May 2002, pp. 1858–1862.
[12] R. Otnes and M. Tüchler, "Iterative channel estimation for Turbo equalization of time-varying frequency-selective channels," IEEE Trans. Wireless Commun., vol. 3, pp. 1918–1923, Nov. 2004.
[13] J. J. Shynk, "Frequency-domain and multirate adaptive filtering," IEEE Trans. Signal Processing, vol. 9, pp. 14–35, Jan. 1992.
[14] T. Zemen, F. Mecklenbräuker, J. Wehinger, and R. R. Müller, "Iterative joint time-variant channel estimation and multi-user detection for MC-CDMA," IEEE Trans. Wireless Commun., vol. 5, pp. 1469–1478, June 2006.
[15] D. Slepian, "Prolate spheroidal wave functions, Fourier analysis, and uncertainty-V: the discrete case," The Bell System Technical J., vol. 57, pp. 1371–1430, May/June 1978.
[16] X. Zhu and R. D. Murch, "Layered space-time equalization for wireless MIMO systems," IEEE Trans. Wireless Commun., vol. 2, pp. 1189–1203, Nov. 2003.
[17] W. N. Furman and J. W. Nieto, "Understanding HF channel simulator requirements in order to reduce HF modem performance measurement variability," in Proc. 6th Nordic Shortwave Conf. HF, vol. 3, Aug. 2001, pp. 6.4.1–6.4.13.