Iterative Equalization and Decoding for Unsynchronized OFDMA Uplink Transmissions
Man-On Pun†, Michele Morelli and C.-C. Jay Kuo
Abstract: For the uplink of a coded OFDMA system, we present an iterative receiver that performs joint frequency offset acquisition, channel estimation and maximum a posteriori (MAP)-based decoding for each active user. The proposed receiver separates the users' combined signals by resorting to the space-alternating generalized expectation-maximization (SAGE) algorithm. Each separated user's signal is then passed to an expectation-conditional maximization (ECM)-based processor that jointly performs frequency acquisition, channel estimation and decoding at each iteration. Compared with conventional OFDMA receivers that employ hard-decision equalization and decoding, the proposed receiver exploits the soft-decision feedback derived from the MAP decoder to provide more reliable synchronization, channel estimation and interference suppression. Simulations indicate that the proposed scheme provides accurate decoding for unsynchronized OFDMA uplink transmissions over frequency-selective fading channels.
M.O. Pun and C.-C. Jay Kuo are with the Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089, USA. M. Morelli is with the Department of Information Engineering, University of Pisa, 56126 Pisa, Italy. † Author for all correspondence. email: [email protected].
1-4244-0063-5/06/$20.00 ©2006 IEEE

I. INTRODUCTION
Orthogonal frequency-division multiple-access (OFDMA) has recently attracted much attention as a promising multiplexing technique for future broadband wireless communications, including the fourth generation (4G) cellular networks. In an OFDMA system, several users simultaneously transmit their own data by modulating an exclusive set of orthogonal subcarriers. Two critical issues in the design of an OFDMA uplink system are frequency synchronization and channel estimation. Similarly to OFDM, OFDMA is sensitive to carrier frequency offsets (CFOs) caused by oscillator mismatches and/or Doppler shifts. Inaccurate CFO estimation results in a loss of orthogonality among subcarriers, thereby leading to severe performance degradation. In addition, knowledge of the channel response of each user is indispensable for coherent detection of the transmitted data. Frequency and channel estimation is particularly challenging in the uplink of an OFDMA system due to the existence of multiple CFOs and transmission channels.
The CFO estimation problem for OFDMA uplink transmissions has recently been studied in [1]–[6]. In the conventional approach, the base station (BS) performs CFO estimation using one of the methods in [1]–[4] while leaving the frequency correction task to the mobile terminals. The reason is that the correction of one user's offset would misalign the others [3]. In practice, the estimates computed by the BS are fed back to the corresponding senders through
a downlink control channel, and they are exploited by users to adjust their transmit carrier frequencies. After all users have been synchronized, the BS starts to detect the users’ data. The main drawback of this approach is that in a time-varying scenario the BS must periodically provide users with updated CFO estimates, which may result in an excessive amount of transmission overhead and some outdated information due to feedback delay. A promising alternative to the conventional approach is achieved by the use of advanced signal processing techniques that perform frequency correction directly at the BS, i.e., without the need of returning frequency estimates back to active users [5], [6]. Clearly, these schemes require the knowledge of users’ CFOs and channel responses at the BS. The latter are assumed perfectly known in [5] while they are estimated through EM techniques in [6]. It is worth noting that the receivers in [5], [6] are specifically designed only for uncoded OFDMA uplink transmissions. Since most practical systems employ error correction codes, it is of paramount interest to extend the work in [6] to coded OFDMA systems. A straightforward way to implement such coded OFDMA uplink receivers is to perform hard-decision equalization followed by conventional data decoding. However, this approach is expected to perform poorly as it does not exploit any information regarding the likelihood of each detected symbol (also referred to as soft information). Inspired by the turbo principle, a number of turbo processing techniques have recently been developed to improve channel estimation [7] or interference suppression [8] by taking advantage of the soft information associated with the decoded data. 
Here, we adopt a similar strategy and propose an iterative receiver for OFDMA uplink transmissions where the soft-decision feedback from a MAP decoder is exploited to perform frequency synchronization, channel estimation and soft multiple access interference (MAI) suppression jointly. Simulations are used to highlight the effectiveness of the proposed scheme.

II. SIGNAL MODEL FOR OFDMA UPLINK TRANSMISSIONS
We consider the uplink of an OFDMA network, in which $K$ mobile terminals (MTs) simultaneously communicate with the BS. The structure of the $k$th MT is depicted in Fig. 1. The binary data stream sent by the $k$th MT, $a_k$, $k = 1, 2, \dots, K$, is trellis-encoded into code bits $b_k$. Block interleaving is then applied to $b_k$ to prevent burst errors. The adjacent $\vartheta$ bits of the interleaved data $c_k$ are mapped to a symbol of $s_k$ taken from a $2^{\vartheta}$-point modulation constellation. Finally, the resulting symbol stream $s_k$ is input into an OFDM modulator.
Fig. 1. The block diagram of the kth MT structure. [Signal flow: $a_k$ → Convolutional Encoder → $b_k$ → Block Interleaver → $c_k$ → Mapper → $s_k$ → OFDM Modulator → $u_k$ → to channel.]
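The interleaving and mapping stages of Fig. 1 can be illustrated with a minimal sketch. It assumes the 8 × 8 block interleaver and Gray-mapped QPSK used later in the simulations; the row-wise write and column-wise read order, the bit-to-sign convention (bit 1 maps to +1) and all function names are illustrative assumptions rather than details given in the paper.

```python
import math

def block_interleave(bits, rows=8, cols=8):
    """Write the block row-wise into a rows x cols array and read it column-wise."""
    if len(bits) != rows * cols:
        raise ValueError("block length must equal rows * cols")
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]

def block_deinterleave(bits, rows=8, cols=8):
    """Inverse of block_interleave: put each column-wise sample back in its row-wise slot."""
    out = [0] * (rows * cols)
    idx = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = bits[idx]
            idx += 1
    return out

def qpsk_gray_map(bits):
    """Map bit pairs to unit-energy QPSK symbols.

    Assumed convention: the first bit of each pair sets the real part and the
    second bit the imaginary part, with bit 1 -> +1 and bit 0 -> -1, so that
    neighbouring constellation points differ in exactly one bit (Gray mapping).
    """
    scale = 1 / math.sqrt(2)
    return [complex(2 * b0 - 1, 2 * b1 - 1) * scale
            for b0, b1 in zip(bits[0::2], bits[1::2])]
```

With 32 subcarriers per user and two bits per QPSK symbol, one OFDMA block carries 64 coded bits, which matches the 8 × 8 interleaver size.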
We use $N$ and $N_k$ to denote the total number of subcarriers and the number of subcarriers assigned to the $k$th user, respectively. The OFDM modulator first divides $s_k$ into segments of length $N_k$. Then, each segment is modulated onto an OFDMA block of length $N$. We call $s_k(n)$ the $n$th block of frequency-domain symbols sent by the $k$th MT. The $j$th entry of $s_k(n)$, say $s_{k,j}(n)$, is non-zero if and only if the $j$th subcarrier is modulated by the $k$th MT, with $j \in \{0, 1, \dots, N-1\}$. This means that $s_k(n)$ has only $N_k$ non-zero elements. The corresponding time-domain vector is given by

$$x_k(n) = F^H s_k(n), \tag{1}$$

where $F$ is the $N$-point discrete Fourier transform (DFT) matrix with entries $[F]_{p,q} = \frac{1}{\sqrt{N}} \exp\left(\frac{-j2\pi pq}{N}\right)$ for $0 \le p, q \le N-1$ and $(\cdot)^H$ denotes the Hermitian transposition. A cyclic prefix (CP) of length $N_g$ is appended in front of $x_k(n)$ to eliminate the interblock interference (IBI) effect. The resulting vector $u_k(n)$ (of length $N_B = N + N_g$) is then transmitted over the channel.
For simplicity, the channel impulse response (CIR) is assumed to be static over an OFDMA block, even though it may vary from block to block. Then, we call $\xi_k(n) \stackrel{\text{def}}{=} [\xi_k(n,0), \xi_k(n,1), \dots, \xi_k(n,L_k-1)]^T$ the discrete-time baseband CIR of the $k$th user during the $n$th block under the assumption that the channel length $L_k$ remains constant over all blocks. Since $L_k$ is usually unknown, we replace $\xi_k(n)$ by the following $L_\xi$-dimensional vector in practice,

$$\xi'_k(n) \stackrel{\text{def}}{=} \left[\xi_k^T(n) \;\; 0^T_{(L_\xi - L_k)}\right]^T, \tag{2}$$

where $L_\xi \ge \max_k\{L_k\}$ is a design parameter that depends on the maximum expected channel delay spread.
The waveform arriving at the BS is the superposition of signals from all active users. The discrete-time input of the BS receive filter is divided into adjacent segments of length $N_B$, each corresponding to a received OFDMA block (in the BS time reference). The samples belonging to the $n$th block are serial-to-parallel (S/P) converted to form $r(n)$. Next, the CP is removed and the remaining samples are collected into an $N$-dimensional vector $y(n)$.
We consider a quasi-synchronous system where each user achieves timing and frequency acquisition through a downlink synchronization channel before initiating the uplink transmission [3]. In this way, frequency errors in the uplink are mainly due to Doppler shifts and/or estimation errors occurring at the MTs, while timing errors only result from the (two-way) line-of-sight propagation delay and are limited to $\Delta\tau_{\max} = 2R/c$, where $R$ is the cell radius and $c$ the speed of light. In the following, we use $\Delta\tau_k$ to denote the timing error (with respect to the BS time reference) of the $k$th user and call $\Delta f_k(n)$ the $k$th CFO (normalized to the subcarrier spacing) during the $n$th OFDM block. For convenience, we decompose $\Delta\tau_k$ into an integer part and a fractional part with respect to the sampling period $T_s$, i.e.,

$$\Delta\tau_k = (\mu_k + \delta_k) T_s,$$

where $\mu_k = \mathrm{int}\{\Delta\tau_k / T_s\}$ and $0 \le \delta_k < 1$. As explained in [3], the fractional part can be incorporated into the CIR and, accordingly, it is not considered further. Without loss of generality, we concentrate on the $n$th received block and omit the temporal index $n$ for notational simplicity. Then, letting $\varepsilon_k = 2\pi\Delta f_k / N$ and assuming $N_g \ge L_\xi + \mu_{\max}$ (with $\mu_{\max} = \mathrm{int}\{2R/(cT_s)\}$) so as to avoid IBI, we have

$$y = \sum_{k=1}^{K} \Gamma(\varepsilon_k) F^H D(s_k) W h_k + v, \tag{3}$$
where
• $\Gamma(\varepsilon_k) = \mathrm{diag}\{1, e^{j\varepsilon_k}, \dots, e^{j(N-1)\varepsilon_k}\}$;
• $D(s_k) = \mathrm{diag}\{s_{k,0}, s_{k,1}, \dots, s_{k,N-1}\}$ is a diagonal matrix with $s_k$ on its main diagonal;
• $h_k = \left[0^T_{\mu_k} \;\; \xi'^T_k \;\; 0^T_{L_h - L_\xi - \mu_k}\right]^T$ is a vector of dimension $L_h = L_\xi + \mu_{\max}$ that encapsulates both the timing error and the $k$th channel response;
• $W$ is an $N \times L_h$ matrix with elements $[W]_{p,q} = e^{-j2\pi pq/N}$, for $0 \le p \le N-1$ and $0 \le q \le L_h - 1$ (in practice, the columns of $W$ are a scaled version of the first $L_h$ columns of $F$);
• $v$ is circularly symmetric white Gaussian noise with zero mean and covariance matrix $\sigma_v^2 I_N$.

III. ITERATIVE DETECTION AND FREQUENCY SYNCHRONIZATION
Since the timing errors $\mu_k$ do not appear explicitly in the signal model of Eq. (3), timing estimation is not strictly necessary in the system. Thus, we only concentrate on the joint estimation of $\varepsilon = [\varepsilon_1, \varepsilon_2, \dots, \varepsilon_K]^T$, $h = [h_1^T, h_2^T, \dots, h_K^T]^T$ and $s = [s_1^T, s_2^T, \dots, s_K^T]^T$ based on the received vector $y$ in the sequel. Unfortunately, the joint maximum likelihood (ML) estimation of $\varepsilon$, $h$ and $s$ turns out to be prohibitively complex in practical implementations [6]. To circumvent this obstacle, we follow the same approach as [6] and propose an iterative scheme where a space-alternating generalized expectation-maximization (SAGE)-based processor is first used to extract the contribution of each user, say $\hat{y}_k$ ($k = 1, 2, \dots, K$), from the received vector $y$. Each $\hat{y}_k$ is then exploited to estimate $\varepsilon_k$, $h_k$ and $s_k$ in a joint fashion using an expectation-conditional maximization (ECM) approach.

A. SAGE-Based Signal Decomposition
The SAGE algorithm is applied in such a way that the parameters of a single user are updated at a time. This leads to a procedure consisting of iterations and cycles. In particular, $K$ cycles make an iteration and each cycle updates the parameters of a given user. To illustrate the proposed procedure, we call $\hat\varepsilon_k^{(i)}$, $\hat h_k^{(i)}$ and $\hat s_k^{(i)}$ the estimates of $\varepsilon_k$, $h_k$ and $s_k$ after the $i$th iteration, respectively. Given initial estimates $\hat\varepsilon_k^{(0)}$, $\hat h_k^{(0)}$ and $\hat s_k^{(0)}$, we compute

$$\hat z_k^{(0)} = \Gamma\big(\hat\varepsilon_k^{(0)}\big) F^H D\big(\hat s_k^{(0)}\big) W \hat h_k^{(0)}, \qquad k = 1, 2, \dots, K. \tag{4}$$
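The per-user term of Eq. (3), which Eq. (4) evaluates at the initial estimates, can be sketched in a few lines. This is a minimal illustration that uses only the Python standard library and direct $O(N^2)$ sums instead of an FFT; the function name `user_contribution` is hypothetical, and `eps` denotes the CFO in radians per sample as in the definition of $\Gamma(\cdot)$.

```python
import cmath
import math

def user_contribution(s, h, eps):
    """Return z = Gamma(eps) F^H D(s) W h as a list of N complex samples.

    s   : length-N frequency-domain symbols (zero on unassigned subcarriers)
    h   : length-L_h channel vector (timing offset and CIR taps)
    eps : CFO in radians per sample
    """
    N = len(s)
    # W h: channel frequency response, [W]_{p,q} = exp(-j 2 pi p q / N).
    H = [sum(h[q] * cmath.exp(-2j * math.pi * p * q / N) for q in range(len(h)))
         for p in range(N)]
    # D(s) W h: per-subcarrier product of symbol and channel gain.
    X = [s[p] * H[p] for p in range(N)]
    # F^H: unitary inverse DFT, [F^H]_{n,p} = exp(+j 2 pi n p / N) / sqrt(N).
    x = [sum(X[p] * cmath.exp(2j * math.pi * n * p / N) for p in range(N)) / math.sqrt(N)
         for n in range(N)]
    # Gamma(eps): linear phase ramp exp(j n eps) introduced by the CFO.
    return [x[n] * cmath.exp(1j * n * eps) for n in range(N)]
```

Calling the same function with estimated quantities in place of the true ones gives the reconstructed contribution $\hat z_k$ that the SAGE processor subtracts from $y$.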
Fig. 2. The block diagram of the ECM-based MAP decoder. [Blocks: CFO Compensation, DFT, Data Detector, Block De-Interleaver, MAP Decoder, Block Interleaver, Soft Symbol Estimator, CFO & Channel Estimation.]
Then, during the $m$th cycle of the $i$th iteration (with $m = 1, 2, \dots, K$), the estimated contribution of the $m$th user to the received vector $y$ is obtained as [6]

$$\hat y_m^{(i)} = y - \sum_{k=1}^{m-1} \hat z_k^{(i)} - \sum_{k=m+1}^{K} \hat z_k^{(i-1)}, \tag{5}$$

where $\sum_{l}^{u}$ is zero if $u < l$. Substituting Eq. (3) into Eq. (5) yields

$$\hat y_m^{(i)} = \Gamma(\varepsilon_m) F^H D(s_m) W h_m + \eta_m^{(i)}, \tag{6}$$

where

$$\eta_m^{(i)} = v + \sum_{k=1}^{m-1} \left[z_k - \hat z_k^{(i)}\right] + \sum_{k=m+1}^{K} \left[z_k - \hat z_k^{(i-1)}\right], \tag{7}$$

and $z_k = \Gamma(\varepsilon_k) F^H D(s_k) W h_k$ is the signal received from the $k$th user. Note that $\eta_m^{(i)}$ is a disturbance term that accounts for thermal noise and residual MAI after the $i$th SAGE iteration. Assuming that users' symbols are independent and identically distributed with zero mean, it follows from the central limit theorem that the entries of $\eta_m^{(i)}$ are nearly Gaussian distributed with zero mean and variance $\sigma_\eta^2(i) = \sigma_v^2 + \sigma_{\mathrm{MAI}}^2(i)$, where $\sigma_{\mathrm{MAI}}^2(i)$ is the total power of the last two terms on the RHS of Eq. (7).

B. ECM-based MAP Decoder
The ML estimates of $\varepsilon_m$, $h_m$ and $s_m$ can be obtained from the observations $\hat y_m^{(i)}$ by resorting to the ECM algorithm [6]. For this purpose, we use $\theta_m \stackrel{\text{def}}{=} \left[s_m^T \;\; \varepsilon_m\right]^T$ to denote the parameters to be estimated and $\hat\theta_m^{(i,c)} = \left[\hat s_m^{(i,c)T} \;\; \hat\varepsilon_m^{(i,c)}\right]^T$ the estimate of $\theta_m$ at the $c$th ECM and the $i$th SAGE iterations. After initializing $\hat s_m^{(i,0)} = \hat s_m^{(i-1)}$ and $\hat\varepsilon_m^{(i,0)} = \hat\varepsilon_m^{(i-1)}$, the ECM-based MAP decoder proceeds as follows [6], [7].
The estimated CFO, $\hat\varepsilon_m^{(i,c)}$, is first used to compute the vector

$$\hat Y_m^{(i)} = F \, \Gamma^H\big(\hat\varepsilon_m^{(i,c)}\big) \, \hat y_m^{(i)}. \tag{8}$$

Then, we use $c_{m,j}^d$ ($d = 0, 1, \dots, \vartheta - 1$) to denote the $(d+1)$th code bit mapped onto symbol $s_{m,j}$ (taken from a $2^{\vartheta}$-point modulation constellation). Recalling that the entries of $\eta_m^{(i)}$ are nearly Gaussian distributed, the log-likelihood ratio (LLR) of $\hat Y_{m,j}^{(i)}$ conditioned on $c_{m,j}^d$ is given as

$$L\big(\hat Y_{m,j}^{(i)} \,\big|\, c_{m,j}^d\big) = \log \frac{\Pr\big(\hat Y_{m,j}^{(i)} \,\big|\, c_{m,j}^d = +1\big)}{\Pr\big(\hat Y_{m,j}^{(i)} \,\big|\, c_{m,j}^d = -1\big)} = \log \frac{\displaystyle\sum_{\forall \tilde s_{m,j} \in S_{+1}^d} \exp\left(-\frac{\big|\hat Y_{m,j}^{(i)} - \hat H_{m,j}\big(\hat\theta_m^{(i,c)}\big)\tilde s_{m,j}\big|^2}{\sigma_\eta^2(i)}\right)}{\displaystyle\sum_{\forall \tilde s_{m,j} \in S_{-1}^d} \exp\left(-\frac{\big|\hat Y_{m,j}^{(i)} - \hat H_{m,j}\big(\hat\theta_m^{(i,c)}\big)\tilde s_{m,j}\big|^2}{\sigma_\eta^2(i)}\right)}, \tag{9}$$
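where $S_\alpha^d$ ($\alpha = +1, -1$) is the entire set of constellation symbols corresponding to $c_{m,j}^d = \alpha$, while $\hat H_m\big(\hat\theta_m^{(i,c)}\big)$, whose $j$th entry is $\hat H_{m,j}\big(\hat\theta_m^{(i,c)}\big)$, is the least-squares (LS) estimate of the channel frequency response given $\hat\theta_m^{(i,c)}$, which reads [6]

$$\hat H_{m,LS}\big(\hat\theta_m^{(i,c)}\big) = W \hat h_{m,LS}\big(\hat\theta_m^{(i,c)}\big), \tag{10}$$

where

$$\hat h_{m,LS}\big(\hat\theta_m^{(i,c)}\big) = P^{-1}\big(\hat s_m^{(i,c)}\big) W^H D^H\big(\hat s_m^{(i,c)}\big) \hat Y_m^{(i)}, \tag{11}$$

$$P\big(\hat s_m^{(i,c)}\big) = W^H E_m\big(\hat s_m^{(i,c)}\big) W, \tag{12}$$

and $E_m\big(\hat s_m^{(i,c)}\big) = \mathrm{diag}\left\{\big|\hat s_{m,j}^{(i,c)}\big|^2;\; j = 0, 1, \dots, N-1\right\}$.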
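To reduce the computational complexity, Eq. (9) can be evaluated using the max-log approximation [7]

$$L\big(\hat Y_{m,j}^{(i)} \,\big|\, c_{m,j}^d\big) \approx \max_{\forall \tilde s_{m,j} \in S_{+1}^d} \left\{-\big|\hat Y_{m,j}^{(i)} - \hat H_{m,j}\big(\hat\theta_m^{(i,c)}\big)\tilde s_{m,j}\big|^2\right\} - \max_{\forall \tilde s_{m,j} \in S_{-1}^d} \left\{-\big|\hat Y_{m,j}^{(i)} - \hat H_{m,j}\big(\hat\theta_m^{(i,c)}\big)\tilde s_{m,j}\big|^2\right\}. \tag{13}$$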
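As a minimal sketch of Eq. (13) for a single subcarrier, the following function evaluates the max-log LLR for QPSK. The constellation labelling (code bit $c^0$ on the real axis, $c^1$ on the imaginary axis, with $c = +1$ mapped to the positive half-plane) and the names `QPSK` and `max_log_llr` are assumptions made for illustration only.

```python
import math

SCALE = 1 / math.sqrt(2)

# Assumed Gray-mapped QPSK labelling: (c0, c1) in {+1, -1}^2, with c0 setting
# the sign of the real part and c1 the sign of the imaginary part.
QPSK = {(c0, c1): complex(c0, c1) * SCALE for c0 in (+1, -1) for c1 in (+1, -1)}

def max_log_llr(Y, H, d):
    """Max-log LLR of code bit d (0 or 1) given DFT output Y and channel gain H."""
    def best_metric(alpha):
        # Largest (least negative) metric over the symbols whose bit d equals alpha.
        return max(-abs(Y - H * sym) ** 2
                   for bits, sym in QPSK.items() if bits[d] == alpha)
    return best_metric(+1) - best_metric(-1)
```

For example, a noise-free observation of the symbol labelled $(+1, -1)$ over a unit-gain channel gives a positive LLR for $c^0$ and a negative LLR for $c^1$.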
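Note that the $\sigma_\eta^2(i)$ term has been dropped in Eq. (13) since the frequent re-normalization process during the MAP decoding removes the effect of any common factors in Eq. (13). The output of the data detector, $L\big(\hat Y_{m,j}^{(i)} \,\big|\, c_{m,j}^d\big)$, is then de-interleaved to yield $L\big(\hat Y_{m,j}^{(i)} \,\big|\, b_{m,j}^d\big)$. Upon receiving $L\big(\hat Y_{m,j}^{(i)} \,\big|\, b_{m,j}^d\big)$, the MAP decoder generates the conditional LLRs $L\big(b_{m,j}^d \,\big|\, \hat Y_{m,j}^{(i)}\big)$ and $L\big(a_{m,j} \,\big|\, \hat Y_{m,j}^{(i)}\big)$ by resorting to the BCJR algorithm [9], where $a_{m,j}$ is a vector that collects the uncoded bits mapped onto $s_{m,j}$. Then, $L\big(b_{m,j}^d \,\big|\, \hat Y_{m,j}^{(i)}\big)$ is interleaved and exploited to compute the expected value of $s_{m,j}$. Assuming that $s_{m,j}$ is taken from a QPSK constellation ($d = 0, 1$), $\hat s_{m,j}^{(i,c+1)}$ is updated with the expected value of $s_{m,j}$ given by [7]

$$\hat s_{m,j}^{(i,c+1)} = E\{s_{m,j}\} = \frac{1}{\sqrt{2}} \left[\frac{\exp\big(L\big(c_{m,j}^0 \,\big|\, \hat Y_{m,j}^{(i)}\big)\big) - 1}{\exp\big(L\big(c_{m,j}^0 \,\big|\, \hat Y_{m,j}^{(i)}\big)\big) + 1} + j\, \frac{\exp\big(L\big(c_{m,j}^1 \,\big|\, \hat Y_{m,j}^{(i)}\big)\big) - 1}{\exp\big(L\big(c_{m,j}^1 \,\big|\, \hat Y_{m,j}^{(i)}\big)\big) + 1}\right]. \tag{14}$$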
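The soft symbol estimate of Eq. (14) can be sketched as follows. The sketch relies on the identity $(e^{L} - 1)/(e^{L} + 1) = \tanh(L/2)$, which gives the same value as Eq. (14) while avoiding overflow of the exponential for large LLR magnitudes; the function name `soft_qpsk_symbol` is hypothetical.

```python
import math

def soft_qpsk_symbol(llr0, llr1):
    """Expected QPSK symbol given the a posteriori LLRs of its two code bits.

    tanh(L / 2) equals (exp(L) - 1) / (exp(L) + 1), the per-bit term of Eq. (14).
    """
    return complex(math.tanh(llr0 / 2), math.tanh(llr1 / 2)) / math.sqrt(2)
```

When both LLRs are zero the estimate collapses to 0, so an unreliable symbol contributes nothing to the reconstructed interference, whereas large LLR magnitudes drive the estimate towards a hard QPSK decision.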
Next, the estimate $\hat s_{m,j}^{(i,c+1)}$ is employed to update the CFO estimate as [6]

$$\hat\varepsilon_m^{(i,c+1)} = \hat\varepsilon_m^{(i,c)} + \frac{\hat y_m^{(i)H} \, \Gamma'\big(\hat\varepsilon_m^{(i,c)}\big) F^H D\big(\hat s_m^{(i,c+1)}\big) \hat H_{m,LS}\big(\hat\theta_m^{(i,c)}\big)}{\hat y_m^{(i)H} \, \Gamma''\big(\hat\varepsilon_m^{(i,c)}\big) F^H D\big(\hat s_m^{(i,c+1)}\big) \hat H_{m,LS}\big(\hat\theta_m^{(i,c)}\big)}, \tag{15}$$

where $\Gamma'\big(\hat\varepsilon_m^{(i,c)}\big) = \Psi \Gamma\big(\hat\varepsilon_m^{(i,c)}\big)$, $\Gamma''\big(\hat\varepsilon_m^{(i,c)}\big) = \Psi^2 \Gamma\big(\hat\varepsilon_m^{(i,c)}\big)$ and $\Psi = \frac{2\pi}{N} \cdot \mathrm{diag}\{0, 1, \dots, N-1\}$.
Finally, $\hat\varepsilon_m^{(i,c+1)}$ and $\hat s_{m,j}^{(i,c+1)}$ are substituted into Eqs. (10) and (11) to update the channel estimates. After $C$ iterations, where $C$ is a design parameter, we terminate the ECM process and update the SAGE processor with

$$\left(\hat\varepsilon_m^{(i)}, \hat h_m^{(i)}, \hat s_m^{(i)}\right) = \left(\hat\varepsilon_m^{(i,C)}, \hat h_{m,LS}\big(\hat\theta_m^{(i,C)}\big), \hat s_m^{(i,C)}\right). \tag{16}$$

C. Initialization
It is well known that a good initialization scheme is essential to EM-type algorithms. Hence, the problem arises of obtaining the initial estimates $\hat\varepsilon_k^{(0)}$, $\hat h_k^{(0)}$ and $\hat s_k^{(0)}$ before the SAGE procedure starts. In our simulations, initial frequency acquisition is performed as discussed in [4], where an OFDMA training block is placed at the beginning of each uplink frame. The CFO estimates $\hat\varepsilon^{(0)}$ are then employed to restore orthogonality among subcarriers by resorting to the LS scheme proposed by Cao, Tureli, Yao and Honan (CTYH) in [5]. The initial channel estimates $\hat h_k^{(0)}$ ($k = 1, 2, \dots, K$) are obtained using the pilot-aided estimator described in [10], where eight pilots are assumed to be uniformly placed within each subchannel at a distance of $1/(8T_s)$ from each other. Initial data decisions are eventually obtained by Eqs. (13) and (14).

IV. SIMULATION RESULTS
A. System Parameters
The simulated system has $N = 128$ subcarriers and a signal bandwidth of 1.429 MHz, corresponding to a sampling period of $T_s = 0.7$ µs. The useful part of each OFDMA block has length $T = N T_s = 89.6$ µs while the inter-carrier spacing is $1/T = 11.16$ kHz. We consider an interleaved carrier assignment scheme (CAS) where each user is provided with a set of 32 subcarriers (called a subchannel) uniformly spaced over the signal bandwidth. In this way, the maximum number of active users in each OFDMA block is $K_{\max} = 4$. We assume a fully-loaded system where $K = K_{\max}$. Users' CFOs are $\varepsilon = \rho \cdot [1, -1, 1, -1]^T$, where $\rho$ is modeled as a deterministic parameter belonging to
interval [0, 0.5] (referred to as the CFO attenuation factor). We use a rate-1/2 convolutional code with the generator polynomials 5, 7 (in hexadecimal). An 8 × 8 block interleaver is employed to scramble the coded bits belonging to the same OFDM block. Unless otherwise specified, interleaved bits are mapped onto QPSK symbols through a Gray map.
We assume a cell radius of $R = 0.3$ km so that the maximum two-way propagation delay (normalized to $T_s$) is $\mu_{\max} = \mathrm{int}\{2R/(cT_s)\} = 3$. Delays $\mu_k$ are independently generated at the beginning of each uplink frame. They are taken from the set $\{0, 1, 2, 3\}$ with equal probability and kept constant over the frame. The channel responses $\xi_k(n)$ have length $L_\xi = 5$, corresponding to a CIR duration of 3.5 µs. This means that each $h_k(n)$ has dimension $L_h = L_\xi + \mu_{\max} = 8$. A cyclic prefix of length $N_g = 8$ is used to avoid IBI so that the duration of an extended OFDMA block (including the cyclic prefix) is $T_B = (N + N_g) T_s = 95.2$ µs. Channel taps $\xi_k(n, l)$ are modeled as statistically independent narrow-band Gaussian processes with zero mean and autocorrelation function

$$E\{\xi_k(n, l)\, \xi_k^*(n + m, l)\} = \sigma_{\xi_k}^2(l)\, J_0(2\pi m B_D T_B), \tag{17}$$

where $l \in \{0, 1, \dots, 4\}$, $B_D$ is the Doppler bandwidth, $J_0(x)$ is the zeroth-order Bessel function of the first kind and

$$\sigma_{\xi_k}^2(l) = E\left\{|\xi_k(n, l)|^2\right\} = \beta_k \cdot \exp(-l). \tag{18}$$

In Eq. (18), $\beta_1$ is chosen such that the signal power of user #1 is normalized to unity, i.e., $E\{\|\xi_1\|^2\} = 1$, while the parameters $\beta_k$ ($k \ge 2$) affect the signal-to-interference ratio. The Doppler bandwidth is related to the carrier frequency $f_0$ and the mobile velocity $v$ through $B_D = f_0 v / c$. Letting $f_0 = 2$ GHz and $v = 60$ km/h, we obtain $B_D \approx 110$ Hz, corresponding to 1% of the subcarrier spacing.
Unless otherwise specified, the number $C$ of ECM iterations is set to 3 while the number $N_i$ of SAGE iterations is varied throughout the simulation to assess its impact on the system performance. Without loss of generality, we only provide results for user #1.

B. Performance Assessment
Case 1: BER performance for coded QPSK
Fig. 3 shows the BER of the proposed receiver as a function of $E_b/N_0$, where users have equal power with $\rho = 0.3$ and the number of iterations is $N_i = 5$. The curve labeled "Ideal" is obtained with perfect knowledge of CFOs and channel responses, i.e., we let $\hat\varepsilon_m = \varepsilon_m$ and $\hat h_{m,LS} = h_m$ for $m = 1, 2, 3, 4$. This provides a benchmark for the BER performance since users' signals at the DFT output are perfectly orthogonal and no interference is present in this case. At an error rate of $10^{-3}$, the performance gain of the proposed receiver after five iterations ($N_i = 5$) over the initial data decisions is more than 4 dB while a loss of 4 dB is incurred with respect to the ideal system. Since the CTYH method is used for the initialization purpose, the BER measured over the initial data decisions actually corresponds to the error rate of CTYH. For comparison, we also simulate the performance
of a hard-decision EM-based receiver (HDEMBR) that makes hard decisions in Eq. (14) followed by hard-decision Viterbi decoding. As shown in Fig. 3, HDEMBR performs poorly since the hard-decision making process incurs a loss of information.

Fig. 3. The BER performance vs. Eb/N0 for coded QPSK and ρ = 0.3. [Plot: coded BER versus Eb/N0 (dB); curves: Ideal, Proposed (Initial), Proposed (Ni=5), HDEMBR (Ni=5).]

Fig. 5. The MSE of the frequency estimate as a function of attenuation factor ρ. [Plot: frequency estimate MSE versus ρ; curves: Proposed (Initial), Proposed (Ni=1), Proposed (Ni=5).]
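The derived quantities quoted in Section IV-A follow from the basic parameters by simple arithmetic, which the sketch below reproduces. Two assumptions are made here and are not stated in the paper: the speed of light is taken as $c = 3 \times 10^8$ m/s, and $\mathrm{int}\{\cdot\}$ is read as rounding to the nearest integer, since that reading reproduces $\mu_{\max} = 3$. The function name and dictionary keys are hypothetical.

```python
def system_parameters(N=128, Ts=0.7e-6, Ng=8, L_xi=5, R=300.0,
                      f0=2e9, v_kmh=60.0, c=3e8):
    """Recompute the derived simulation parameters from the basic ones (SI units)."""
    T = N * Ts                            # useful block duration
    mu_max = round(2 * R / (c * Ts))      # assumption: int{.} rounds to nearest integer
    L_h = L_xi + mu_max                   # dimension of each h_k
    return {
        "T": T,
        "spacing": 1 / T,                 # subcarrier spacing in Hz
        "mu_max": mu_max,
        "L_h": L_h,
        "Ng": Ng,
        "T_B": (N + Ng) * Ts,             # extended block duration, including the CP
        "B_D": f0 * (v_kmh / 3.6) / c,    # Doppler bandwidth in Hz
    }

params = system_parameters()
# The IBI-free condition N_g >= L_xi + mu_max holds with equality for these values.
cp_is_sufficient = params["Ng"] >= params["L_h"]
```

Under these assumptions the Doppler bandwidth evaluates to roughly 111 Hz, i.e., about 1% of the 11.16 kHz subcarrier spacing, in line with the values quoted in the text.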
Case 2: Performance with different CFO attenuation factors Next, we assess the performance of the proposed receiver as a function of attenuation factor ρ. The following MSE indicators have been used.
$$\mathrm{MSE}_\varepsilon^{(i)} = E\left\{\big|\hat\varepsilon_1^{(i,C)} - \varepsilon_1\big|^2\right\}, \tag{19}$$

$$\mathrm{MSE}_h^{(i)} = E\left\{\big\|\hat h_{1,LS}\big(\hat\theta_1^{(i,C)}\big) - h_1\big\|^2\right\}, \tag{20}$$

with $C = 3$. Figs. 4 and 5 illustrate $\mathrm{MSE}_h^{(i)}$ and $\mathrm{MSE}_\varepsilon^{(i)}$, respectively, for $i = 1$ and 5, where users have equal power with $E_b/N_0 = 15$ dB. We see that the performance deteriorates as $\rho$ increases due to the increased amount of ICI and MAI. Note that the initial frequency and channel estimates labeled "Proposed (Initial)" are obtained using the data-aided schemes in [4] and [10], respectively. A considerable improvement is observed by passing from $i = 1$ to 5, especially in the frequency estimation accuracy.
Fig. 4. The MSE of the channel estimate as a function of attenuation factor ρ. [Plot: channel estimate MSE versus ρ; curves: Proposed (Initial), Proposed (Ni=1), Proposed (Ni=5).]
V. CONCLUSION
An iterative receiver architecture for coded OFDMA uplink transmissions was proposed in this work, where EM-type algorithms are employed to perform frequency synchronization, channel estimation and MAP decoding in a joint fashion. The interference induced by residual CFOs is first reconstructed based on the soft-decision feedback derived from a MAP decoder and then mitigated directly at the BS, without the need to return CFO estimates to the mobile units to achieve frequency synchronization. Simulations indicated that the proposed receiver remains robust for CFO values of up to half the subcarrier spacing and significantly outperforms conventional iterative receivers employing hard-decision feedback.

REFERENCES
[1] J.J. van de Beek, P.O. Borjesson, M.L. Boucheret, D. Landstrom, J.M. Arenas, O. Odling, M. Wahlqvist, and S.K. Wilson, "A time and frequency synchronization scheme for multiuser OFDM", IEEE JSAC, vol. 17, no. 11, pp. 1900–1914, Nov. 1999.
[2] Z. Cao, U. Tureli, and Y. D. Yao, "Efficient structure-based carrier frequency offset estimation for interleaved OFDMA uplink", in Proc. of ICC 2003, pp. 3361–3365, 2003.
[3] M. Morelli, "Timing and frequency synchronization for the uplink of an OFDMA system", IEEE Trans. Commun., vol. 52, no. 2, pp. 296–306, Feb. 2004.
[4] M.O. Pun, S.H. Tsai, and C.-C. Jay Kuo, "Joint maximum likelihood estimation of carrier frequency offset and channel for uplink OFDMA systems", in Proc. of Globecom 2004, Dallas, TX, vol. 6, pp. 3748–3752, Nov. 2004.
[5] Z. Cao, U. Tureli, Y. D. Yao, and P. Honan, "Frequency synchronization for generalized OFDMA uplink", in Proc. of Globecom 2004, Dallas, TX, pp. 1071–1075, 2004.
[6] M.O. Pun, M. Morelli, and C.-C. Jay Kuo, "A novel iterative receiver for uplink OFDMA", in Proc. of Globecom 2005, St. Louis, MO, vol. 5, pp. 2669–2673, Nov. 2005.
[7] S.Y. Park, Y.G. Kim, and C.G. Kang, "Iterative receiver for joint detection and channel estimation in OFDM systems under mobile radio channels", IEEE Trans. Vehicular Technology, vol. 53, no. 2, pp. 450–460, Mar. 2004.
[8] A.S. Gallo, G.M. Vitetta, and E. Chiavaccini, "A BEM-based algorithm for soft-in soft-output detection of co-channel signals", IEEE Trans. Wireless Commun., vol. 3, no. 5, pp. 1533–1542, Sep. 2004.
[9] L.R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, "Optimal decoding of linear codes for minimizing symbol error rate", IEEE Trans. Inform. Theory, vol. IT-20, pp. 284–287, Mar. 1974.
[10] M. Morelli and U. Mengali, "A comparison of pilot-aided channel estimation methods for OFDM systems", IEEE Trans. Signal Processing, vol. 49, no. 12, pp. 3065–3073, Dec. 2001.