
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 4, NO. 5, SEPTEMBER 2005

Channel Estimation for Adaptive Frequency-Domain Equalization

Michele Morelli, Member, IEEE, Luca Sanguinetti, Student Member, IEEE, and Umberto Mengali, Life Fellow, IEEE

Abstract—Frequency-domain equalization (FDE) is an effective technique for high-rate wireless communications because of its reduced complexity compared to conventional time-domain equalization (TDE). In this paper, we consider adaptive FDE for single-carrier (SC) systems with explicit channel and noise-power estimation. The channel response is estimated in the frequency domain following two different approaches. The first operates independently on each frequency bin, while the second exploits the fading correlation across the signal bandwidth. Least-mean-square (LMS) and recursive-least-square (RLS) algorithms are employed to update the channel estimates. The noise power is estimated using a low-complexity algorithm based on ad hoc reasoning. Compared to other existing receivers employing adaptive FDE, the proposed schemes have better error-rate performance and can be used even in the presence of relatively fast fading.

Index Terms—Channel estimation, channel tracking, frequency-domain equalization (FDE).

Manuscript received November 24, 2003; revised April 14, 2004 and July 28, 2004; accepted August 22, 2004. The editor coordinating the review of this paper and approving it for publication is H. Li. This work was supported by the Istituto di Elettronica e di Ingegneria dell’Informazione e delle Telecomunicazioni (IEIIT) of the Italian National Research Council (CNR). The authors are with the Department of Information Engineering, University of Pisa, I-56126 Pisa, Italy (e-mail: [email protected]). Digital Object Identifier 10.1109/TWC.2005.853896

I. INTRODUCTION

FUTURE wireless communication systems are expected to support high-speed and high-quality multimedia services. In these applications, the received signal is typically affected by frequency-selective fading, and channel equalization is required to mitigate the resulting intersymbol interference (ISI) [1]. The classical approach in single-carrier (SC) systems is time-domain equalization (TDE) [2]. However, the number of operations per signaling interval grows linearly with the number of interfering symbols or, equivalently, with the data rate. As a result, conventional time-domain equalizers are not suitable for high-speed transmissions with channel delay spreads extending over tens of symbol intervals. A promising alternative to TDE is frequency-domain equalization (FDE) [3]–[6]. Using the fast Fourier transform (FFT) in conjunction with FDE leads to substantial computational savings with respect to conventional TDE. Also, adaptive algorithms generally converge faster and are more stable in the frequency domain [4]. Compared to orthogonal frequency division multiplexing (OFDM), SC systems with FDE (SC-FDE) have similar performance and complexity [5]–[7], but they are less sensitive to carrier-frequency uncertainties and nonlinear distortions, thereby allowing the use of low-cost power amplifiers. SC-FDE systems equipped with multiple receive antennas have been discussed in [3] and [8], while the possibility of achieving transmit diversity using Alamouti’s space–time block coding [9] has been explored in [10]. Finally, novel FDE structures for SC multiple-input multiple-output (MIMO) systems are investigated in [11].

Mobile communication systems operating over time-varying fading channels require adaptive signal processing to track the channel variations at the receiver. Adaptive FDE schemes based on the minimum mean square error (MMSE) criterion have been investigated by Clark in [3]. They operate according to least-mean-square (LMS) or recursive-least-square (RLS) adaptation rules and do not require explicit channel estimation. Compared with their time-domain counterparts, they have better stability and shorter acquisition times. However, their performance over fast-fading channels is unsatisfactory when a single receive antenna is employed (i.e., without space diversity).

In the present paper, we return to the problem discussed by Clark, but we consider adaptive SC-FDE schemes in which estimates of the channel response are exploited to compute the equalizer coefficients according to the MMSE criterion. The channel response is estimated in the frequency domain using two different approaches. The first assumes independently faded frequency bins and is referred to as unstructured channel estimation (UCE). The second is called structured channel estimation (SCE) and effectively exploits the fading correlation between adjacent bins. Both schemes exploit training symbols to get initial channel estimates, whereas LMS or RLS algorithms are employed to track channel variations. It is worth noting that in SC-FDE systems, the energy of each symbol is distributed over the whole signal bandwidth and pilots cannot be placed on preassigned frequency bins.
Accordingly, channel estimation cannot be accomplished with the same methods employed in OFDM applications, where pilots are typically inserted in both time and frequency dimensions, and channel estimates are obtained by interpolation (see [12] and [13] and references therein). As explained later, computing the equalizer coefficients requires knowledge of the noise power. The latter is estimated in the frequency domain using a maximum likelihood (ML) approach. The resulting scheme has good performance but it is computationally demanding. Therefore, we also consider a simpler solution based on heuristic arguments. Simulation results indicate that the proposed SC-FDE schemes outperform Clark’s adaptive detectors (CADs) (especially in a fast-fading environment) without a significant increase in complexity. Diversity combining using multiple

1536-1276/$20.00 © 2005 IEEE

MORELLI et al.: CHANNEL ESTIMATION FOR ADAPTIVE FREQUENCY-DOMAIN EQUALIZATION

Fig. 1. (a) SC-FDE transmitter. (b) SC-FDE receiver.

receive antennas is also considered. It is shown that this yields dramatic performance improvements.

The rest of the paper is organized as follows. The next section describes the signal model and introduces basic notation. In Section III, the concept of FDE with explicit channel and noise-power estimation is discussed. Several channel-estimation schemes operating in the frequency domain are proposed in Section IV, whereas the problem of noise-power estimation is addressed in Section V. Simulation results are discussed in Section VI and some conclusions are offered in Section VII.

II. SYSTEM MODEL

A. SC-FDE Transmitter

Fig. 1(a) shows the transmitter of the SC-FDE system under investigation. The input symbols, belonging to a phase-shift keying (PSK) or quadrature-amplitude modulation (QAM) constellation, are partitioned into adjacent blocks of length $N$, and each block is preceded by a cyclic prefix longer than the channel impulse response (CIR). The prefix serves to eliminate interblock interference and makes the linear convolution of the symbols with the channel look like a circular convolution, which is essential for FFT-based demodulation. We denote $\mathbf{c}_m = [c_m(0)\; c_m(1)\; \cdots\; c_m(N-1)]^T$ as the $m$th block of symbols [the superscript $(\cdot)^T$ means transpose operation] and assume that $\{c_m(n)\}$ are independent and identically distributed (i.i.d.) with zero mean and unit variance. After insertion of the $N_G$-point cyclic prefix, $\mathbf{c}_m$ is fed to a linear modulator with impulse response $g(t)$ and signaling interval $T$. The complex envelope of the transmitted signal is

$$ s(t) = \sum_{m=-\infty}^{\infty} \sum_{n=-N_G}^{N-1} c_m(n)\, g(t - nT - mT_B) \tag{1} $$

where $m$ counts the transmitted blocks, $n$ counts the data symbols within a block, $T_B = (N + N_G)T$ is the duration of the cyclically extended data block, and $c_m(n) = c_m(n + N)$ for $-N_G \le n \le -1$. We assume that $g(t)$ has a root-raised-cosine Fourier transform with some roll-off $\alpha$.
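The role of the cyclic prefix described above can be checked numerically. The following is a minimal sketch, not the authors' simulator: it assumes a symbol-spaced toy channel (the paper samples at 2/T), QPSK symbols, and hypothetical helper names (`add_cyclic_prefix`, `linear_convolve`, `circular_convolve`, `demo`). It shows that, once the prefix samples are discarded, the linearly convolved block coincides with the circular convolution of the block and the channel.

```python
import random


def add_cyclic_prefix(block, n_g):
    """Prepend the last n_g symbols, mirroring c_m(n) = c_m(n + N) for -N_G <= n <= -1."""
    return block[-n_g:] + block


def linear_convolve(x, h):
    """Plain linear convolution of two complex sequences."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def circular_convolve(x, h):
    """len(x)-point circular convolution; h is implicitly zero-padded."""
    n = len(x)
    return [sum(h[l] * x[(k - l) % n] for l in range(len(h))) for k in range(n)]


def demo(n=16, n_g=4, seed=0):
    """Return (received block after prefix removal, circular convolution)."""
    rng = random.Random(seed)
    # Unit-variance QPSK symbols (illustrative constellation choice).
    block = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / 2 ** 0.5
             for _ in range(n)]
    # Toy channel with n_g + 1 taps, so its memory does not exceed the prefix.
    h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_g + 1)]
    rx = linear_convolve(add_cyclic_prefix(block, n_g), h)
    return rx[n_g:n_g + n], circular_convolve(block, h)
```

Calling `demo()` returns two sequences that agree to within rounding error, which is the property that lets a DFT turn the channel into one complex gain per frequency bin.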

B. SC-FDE Receiver

The receiver has $P$ diversity branches and its block diagram is sketched in Fig. 1(b). The complex envelope of the received waveform at the $p$th antenna is denoted $r^{(p)}(t)$ and is expressed by

$$ r^{(p)}(t) = s_R^{(p)}(t) + \eta^{(p)}(t) \tag{2} $$

where $s_R^{(p)}(t)$ is the signal component and $\eta^{(p)}(t)$ is thermal noise. The latter is modeled as a circularly symmetric Gaussian process with two-sided power spectral density $2N_0^{(p)}$ (possibly different from branch to branch). As in [3], we assume negligible channel variations over a block (slow fading). Then, denoting $h_m^{(p)}(t)$ as the CIR at the $p$th antenna (encompassing the transmit filter and the physical channel) during the $m$th data block, we have

$$ s_R^{(p)}(t) = \sum_{m=-\infty}^{\infty} \sum_{n=-N_G}^{N-1} c_m(n)\, h_m^{(p)}(t - nT - mT_B). \tag{3} $$

In order to produce a discrete-time signal, the waveform from each antenna is fed to a low-pass filter (LPF) and is sampled at a rate of $2/T$ to avoid aliasing distortion. For the sake of simplicity, the LPF is taken with a brick-wall transfer function of bandwidth $1/T$. Note that the rectangular shape is not strictly necessary and could easily be made more realistic [14]. For example, we may employ a root-raised-cosine function with a suitable roll-off such that the signal component is passed undistorted and the noise samples at the filter output are uncorrelated.

After carrier-frequency and block-timing synchronization [not shown in Fig. 1(b)], the cyclic prefix is discarded and the received samples are arranged in blocks of $2N$ elements. We denote $\mathbf{x}_m^{(p)} = [x_m^{(p)}(0)\; x_m^{(p)}(1)\; \cdots\; x_m^{(p)}(2N-1)]^T$ as the $m$th block of samples at the $p$th antenna and assume that $h_m^{(p)}(t)$ has support $(0, LT)$, with $L \le N_G$. Then, the entries of $\mathbf{x}_m^{(p)}$ are found to be

$$ x_m^{(p)}(k) = \sum_{n=1-L}^{N-1} c_m(n)\, h_m^{(p)}(k - 2n) + w_m^{(p)}(k), \qquad k = 0, 1, \ldots, 2N-1 \tag{4} $$

where $h_m^{(p)}(\ell)$ is the sample of $h_m^{(p)}(t)$ at $t = \ell T/2$ and $w_m^{(p)}(k)$ is white Gaussian noise with variance $\sigma^{2(p)} = 4N_0^{(p)}/T$.

Each block $\mathbf{x}_m^{(p)}$ $(p = 1, 2, \ldots, P)$ is transformed into the frequency domain using a $2N$-point discrete Fourier transform (DFT) unit, and the DFT outputs are passed to the channel equalizer/combiner to form the $N$-dimensional vector $\mathbf{Y}_m = [Y_m(0)\; Y_m(1)\; \cdots\; Y_m(N-1)]^T$ (details on the channel equalizer/combiner are given in the next section). After an $N$-point inverse DFT (IDFT), the time-domain samples $\mathbf{y}_m = [y_m(0)\; y_m(1)\; \cdots\; y_m(N-1)]^T$ are finally fed to a threshold device that delivers the data decisions $\hat{\mathbf{c}}_m = [\hat{c}_m(0)\; \hat{c}_m(1)\; \cdots\; \hat{c}_m(N-1)]^T$ corresponding to the $m$th block.

C. Channel Model

We consider a multipath channel with $N_p$ distinct paths and we assume that the $P$ receive antennas are arranged in a uniform linear array (ULA) with interelement spacing $d$. Then, the baseband impulse response of the system at the $p$th antenna takes the form

$$ \xi^{(p)}(t, \tau) = \sum_{\ell=1}^{N_p} a_\ell(t)\, e^{j(p-1)\omega_\ell(t)}\, \delta(\tau - \tau_\ell(t)) \tag{5} $$

where $\delta(t)$ is the Dirac delta function, $\tau_\ell(t)$ is the delay of the $\ell$th path, $a_\ell(t)$ is the corresponding complex amplitude, and $\omega_\ell(t)$ is defined as

$$ \omega_\ell(t) = \frac{2\pi d}{\lambda} \sin[\varphi_\ell(t)]. \tag{6} $$

In the above equation, $\lambda$ is the free-space wavelength and $\varphi_\ell(t)$ is the direction of arrival (DOA) of the $\ell$th path. In the following, we assume that the path delays and the DOAs do not change significantly with time, i.e., we set $\tau_\ell(t) \approx \tau_\ell$ and $\varphi_\ell(t) \approx \varphi_\ell$. Conversely, the path gains are modeled as narrowband, independent, and complex-valued Gaussian processes with zero mean and average power $\sigma_\ell^2 = E\{|a_\ell(t)|^2\}$.

The CIR at the $p$th antenna is the convolution of $\xi^{(p)}(t, \tau)$ with the transmit pulse $g(t)$. Recalling that the gains $\{a_\ell(t)\}$ are practically constant over a block, we have

$$ h_m^{(p)}(t) = \sum_{\ell=1}^{N_p} a_\ell(mT_B)\, e^{j(p-1)\omega_\ell}\, g(t - \tau_\ell). \tag{7} $$

Note that the length of $h_m^{(p)}(t)$ (expressed in symbol intervals) is $L = \mathrm{int}\{(\tau_{\max} + T_g)/T\}$, where $T_g$ is the duration of $g(t)$, $\tau_{\max} = \max_\ell\{\tau_\ell\}$ is the maximum path delay, and $\mathrm{int}(x)$ denotes the maximum integer not exceeding $x$. Since $\tau_{\max}$ is usually unknown, in practice $L$ is computed from the maximum expected value of $\tau_{\max}$.

Fig. 2. Frequency-domain equalizer/combiner with explicit channel and noise-power estimation.

III. FREQUENCY-DOMAIN EQUALIZATION (FDE)

Fig. 2 illustrates an FDE scheme with explicit channel and noise-power estimation. The DFT output at the $p$th branch is denoted $\mathbf{X}_m^{(p)} = [X_m^{(p)}(0)\; X_m^{(p)}(1)\; \cdots\; X_m^{(p)}(2N-1)]^T$, with

$$ X_m^{(p)}(n) = \frac{1}{\sqrt{2N}} \sum_{k=0}^{2N-1} x_m^{(p)}(k)\, e^{-j2\pi nk/(2N)}. \tag{8} $$

Substituting (4) into (8) and bearing in mind that $h_m^{(p)}(t)$ has duration $LT$ produces

$$ X_m^{(p)}(n) = C_m(n) H_m^{(p)}(n) + W_m^{(p)}(n) \tag{9} $$

where $H_m^{(p)}(n)$ is the DFT of the channel response

$$ H_m^{(p)}(n) = \frac{1}{\sqrt{2N}} \sum_{\ell=0}^{2L-1} h_m^{(p)}(\ell)\, e^{-j2\pi n\ell/(2N)}, \qquad 0 \le n \le 2N-1 \tag{10} $$

while $C_m(n)$ is defined as

$$ C_m(n) = \sum_{k=0}^{N-1} c_m(k)\, e^{-j2\pi nk/N}, \qquad 0 \le n \le 2N-1. \tag{11} $$

Finally, the quantity

$$ W_m^{(p)}(n) = \frac{1}{\sqrt{2N}} \sum_{k=0}^{2N-1} w_m^{(p)}(k)\, e^{-j2\pi nk/(2N)}, \qquad 0 \le n \le 2N-1 \tag{12} $$

is additive white Gaussian noise (AWGN) with variance $\sigma^{2(p)}$. Bearing in mind that $g(t)$ [and hence, $h_m^{(p)}(t)$] is bandlimited to $|f| \le (1 + \alpha)/2T$, it is seen that $H_m^{(p)}(n)$ is 0 for $N_\alpha \le n \le 2N - N_\alpha$, with $N_\alpha = 1 + \mathrm{int}[N(1 + \alpha)/2]$.

TABLE I. COMPUTATIONAL COMPLEXITY PER DETECTED SYMBOL

Vector $\mathbf{X}_m^{(p)}$ is fed to the $p$th channel equalizer (a bank of $2N$ complex-valued multipliers $\{F_m^{(p)}(n);\; 0 \le n \le 2N-1\}$) and is then combined with the other branch outputs to form

$$ Z_m(n) = \sum_{p=1}^{P} X_m^{(p)}(n)\, F_m^{(p)}(n), \qquad 0 \le n \le 2N-1. \tag{13} $$

Computing the $2N$-point IDFT of $Z_m(n)$ produces the equalized sequence in the time domain

$$ z_m(k) = \sum_{n=0}^{2N-1} Z_m(n)\, e^{j2\pi nk/(2N)}, \qquad 0 \le k \le 2N-1 \tag{14} $$

from which the decision statistics $\mathbf{y}_m = [y_m(0)\; y_m(1)\; \cdots\; y_m(N-1)]^T$ are obtained by decimation, i.e., taking $y_m(k) = z_m(2k)$ for $k = 0, 1, \ldots, N-1$. As shown in Fig. 2, this is tantamount to passing the samples $\{Z_m(n)\}$ to an aliasing operator that produces the quantities [3]

$$ Y_m(n) = Z_m(n) + Z_m(n + N), \qquad 0 \le n \le N-1 \tag{15} $$

and computing $\mathbf{y}_m$ as the $N$-point IDFT of $\{Y_m(n)\}$. This approach can be explained by observing that decimating in the time domain corresponds to an aliasing operation in the frequency domain.

The MSE at the input of the decision device is $E\{|y_m(k) - c_m(k)|^2\}$, where the expectation is taken over the transmitted data sequence and additive noise (i.e., the MSE is defined for a static channel). Assuming i.i.d. data symbols with zero mean and unit variance, the optimum equalizer coefficients minimizing the MSE are computed with ordinary manipulations and read [2]

$$ F_m^{(p)}(n) = \frac{N \left[H_m^{(p)}(n)\right]^{*} / \sigma^{2(p)}}{1 + N \displaystyle\sum_{\ell=1}^{P} \sum_{i} \frac{\left|H_m^{(\ell)}(n + iN)\right|^2}{\sigma^{2(\ell)}}}, \qquad 0 \le n \le 2N-1; \quad p = 1, 2, \ldots, P. \tag{16} $$

Note that the denominator in (16) is independent of $p$, so that the equalizer coefficients are proportional to $[H_m^{(p)}(n)]^{*}/\sigma^{2(p)}$. Therefore, from (13), it is seen that $Z_m(n)$ is a maximum ratio combination of $\{X_m^{(p)}(n);\; p = 1, 2, \ldots, P\}$.

From (16), we see that computing $F_m^{(p)}(n)$ requires knowledge of the channel response and the noise power at each branch. In practice, these quantities are unknown. A way out is discussed in [3], where the equalizer coefficients are updated in the frequency domain using LMS or RLS algorithms without explicit channel and noise-power estimation. As indicated in Fig. 2, here we propose an alternative approach in which the estimates of $\sigma^{2(p)}$ and $H_m^{(p)}(n)$, say $\hat{\sigma}^{2(p)}$ and $\hat{H}_m^{(p)}(n)$, are employed in (16) to approximate the equalizer coefficients. Some methods to compute $\hat{\sigma}^{2(p)}$ and $\hat{H}_m^{(p)}(n)$ are discussed in the next section.

The computational load of the FDE is assessed as follows. The DFT operator in (8) needs $N \log_2(2N)$ complex products and $2N \log_2(2N)$ complex additions for each diversity branch. Also, a total of $2NP$ complex multiplications and $2NP - N$ complex additions are involved in the computation of $Z_m(n)$ and $Y_m(n)$ in (13) and (15), respectively. The IDFT of $\{Y_m(n)\}$ needs $(N/2) \log_2(N)$ complex products and $N \log_2(N)$ complex additions. Finally, computing the equalizer coefficients in (16) requires $5NP$ real products and $4NP$ real additions. The overall operations per detected symbol are summarized in the first line of Table I. In writing these figures, we have borne in mind that a complex product amounts to four real products plus two real additions, while a complex addition is equivalent to two real additions.

The results of Table I indicate that the computational load involved in the FDE is proportional to $P \log_2 N$. For comparison, we recall that the complexity of a time-domain equalizer with $P$ diversity branches is on the order of $PL$ [3]. Since in typical applications the block length $N$ is about $5L$ (corresponding to an overhead of 20%), we see that in highly dispersive channels (where large values of $L$ are expected), FDE may achieve significant computational savings with respect to TDE.

IV. CHANNEL ESTIMATION

We begin by rewriting (9) in matrix form

$$ \mathbf{X}_m^{(p)} = \mathbf{C}_m \mathbf{H}_m^{(p)} + \mathbf{W}_m^{(p)} \tag{17} $$

where $\mathbf{H}_m^{(p)} = [H_m^{(p)}(0)\; H_m^{(p)}(1)\; \cdots\; H_m^{(p)}(2N-1)]^T$, $\mathbf{C}_m$ is a diagonal matrix

$$ \mathbf{C}_m = \mathrm{diag}\{C_m(0)\;\; C_m(1)\;\; \cdots\;\; C_m(2N-1)\} \tag{18} $$

and $\mathbf{W}_m^{(p)} = [W_m^{(p)}(0)\; W_m^{(p)}(1)\; \cdots\; W_m^{(p)}(2N-1)]^T$ is a Gaussian vector with zero mean and covariance matrix $\mathbf{C}_W^{(p)} = \sigma^{2(p)} \mathbf{I}_{2N}$ ($\mathbf{I}_{2N}$ is the identity matrix of order $2N$). From (10), we see that $\mathbf{H}_m^{(p)}$ can also be written as

$$ \mathbf{H}_m^{(p)} = \mathbf{F} \mathbf{h}_m^{(p)} \tag{19} $$

where $\mathbf{h}_m^{(p)} = [h_m^{(p)}(0)\; h_m^{(p)}(1)\; \cdots\; h_m^{(p)}(2L-1)]^T$ is the CIR vector at the $p$th antenna and $\mathbf{F}$ is a $2N \times 2L$ matrix with entries

$$ [\mathbf{F}]_{n,\ell} = \frac{1}{\sqrt{2N}}\, e^{-j2\pi n\ell/(2N)}, \qquad 0 \le n \le 2N-1; \quad 0 \le \ell \le 2L-1. \tag{20} $$

In the following, we discuss two iterative schemes for estimating the channel frequency response at each diversity branch. The first, termed UCE, is based on model (17) and considers the entries of $\mathbf{H}_m^{(p)}$ as unknown independent parameters. The second takes advantage of the correlation between adjacent frequency bins and effectively exploits the structure of $\mathbf{H}_m^{(p)}$ shown in (19). For this reason, it is called the SCE. As we shall see, both UCE and SCE require knowledge of the transmitted data symbols. To this end, we assume that the data blocks are organized in frames, and each frame is preceded by some training blocks. During the data section of the frame, the channel estimators are switched to a decision-directed mode and the transmitted symbols are replaced by data decisions.

A. LMS Unstructured Channel Estimation (LMS-UCE)

LMS-UCE employs the LMS algorithm to minimize the cost function

$$ J_{\text{LMS-UCE}}\big(\tilde{\mathbf{H}}_m^{(p)}\big) = E\Big\{ \big\| \mathbf{X}_m^{(p)} - \mathbf{C}_m \tilde{\mathbf{H}}_m^{(p)} \big\|^2 \Big\}, \qquad p = 1, 2, \ldots, P \tag{21} $$

with respect to $\tilde{\mathbf{H}}_m^{(p)}$ ($\|\cdot\|$ denotes Euclidean norm). This produces

$$ \hat{\mathbf{H}}_{m+1}^{(p)} = \hat{\mathbf{H}}_m^{(p)} + \mu\, \mathbf{e}_m^{(p)}, \qquad p = 1, 2, \ldots, P \tag{22} $$

where $\hat{\mathbf{H}}_m^{(p)}$ is the estimate of $\mathbf{H}_m^{(p)}$, $\mu$ is the step size, and $\mathbf{e}_m^{(p)}$ is given by

$$ \mathbf{e}_m^{(p)} = \mathbf{C}_m^H \big( \mathbf{X}_m^{(p)} - \mathbf{C}_m \hat{\mathbf{H}}_m^{(p)} \big) \tag{23} $$

with $(\cdot)^H$ denoting Hermitian transpose.

The performance of LMS-UCE over a static channel (i.e., $\mathbf{H}_m^{(p)} = \mathbf{H}^{(p)}$) is assessed in Appendix A, assuming i.i.d. data symbols with zero mean and unit variance. It turns out that $\hat{\mathbf{H}}_m^{(p)}$ is unbiased and has the following mean square estimation error (MSEE)

$$ E\Big\{ \big\| \hat{\mathbf{H}}_m^{(p)} - \mathbf{H}^{(p)} \big\|^2 \Big\} = 4\sigma^{2(p)} B_L T_B \tag{24} $$

where $B_L T_B = \mu N/[2(2 - \mu N)]$ is the noise equivalent bandwidth [15, p. 126] of the recursion (22), normalized to $1/T_B$.

The complexity of LMS-UCE is assessed as follows. In the decision-directed mode, the entries of $\mathbf{C}_m$ are computed from $\hat{\mathbf{c}}_m$ through an $N$-point FFT involving $(N/2) \log_2 N$ complex products and $N \log_2 N$ complex additions. Also, evaluating $\mathbf{C}_m^H \mathbf{C}_m \hat{\mathbf{H}}_m^{(p)}$ in the right-hand side (RHS) of (23) needs $6N$ real products plus $N$ real additions. Completing the computation of $\mathbf{e}_m^{(p)}$ requires $2N$ complex products and additions. Finally, updating the channel estimates in the RHS of (22) needs $2N$ complex additions. The overall operations per detected symbol are summarized in the second line of Table I.

B. RLS Unstructured Channel Estimation (RLS-UCE)

The RLS-UCE aims at minimizing the exponentially weighted sum

$$ J_{\text{RLS-UCE}}\big(\tilde{\mathbf{H}}^{(p)}\big) = \sum_{i=0}^{m} \lambda^{m-i} \big\| \mathbf{X}_i^{(p)} - \mathbf{C}_i \tilde{\mathbf{H}}^{(p)} \big\|^2, \qquad p = 1, 2, \ldots, P \tag{25} $$

where $0 < \lambda < 1$ is the forgetting factor. The minimum is achieved for $\tilde{\mathbf{H}}^{(p)} = \hat{\mathbf{H}}_m^{(p)}$, with $\hat{\mathbf{H}}_m^{(p)}$ satisfying the recursive equation

$$ \hat{\mathbf{H}}_{m+1}^{(p)} = \hat{\mathbf{H}}_m^{(p)} + \mathbf{K}_m \mathbf{e}_m^{(p)}, \qquad p = 1, 2, \ldots, P \tag{26} $$

where $\mathbf{e}_m^{(p)}$ is defined in (23) and $\mathbf{K}_m = \mathrm{diag}\{K_m(0)\; K_m(1)\; \cdots\; K_m(2N-1)\}$. The term $K_m(n)$ is the Kalman gain over the $n$th frequency bin and it is expressed by

$$ K_m(n) = \frac{S_m(n)}{\lambda + |C_m(n)|^2 S_m(n)}, \qquad 0 \le n \le 2N-1 \tag{27} $$

with $S_m(n)$ satisfying the recursion

$$ S_{m+1}(n) = \frac{1}{\lambda}\, S_m(n) \left[ 1 - K_m(n)\, |C_m(n)|^2 \right], \qquad 0 \le n \le 2N-1. \tag{28} $$

Note that $K_m(n)$ and $S_m(n)$ do not depend on the index $p$, which means that they are the same at each antenna. The overall operations required by RLS-UCE per detected symbol are shown in the third line of Table I. In writing these figures, we have taken into account that $K_m(n)$ and $S_m(n)$ are real quantities that need to be computed only for $0 \le n \le N-1$. This is a consequence of the identities $K_m(n + N) = K_m(n)$ and $S_m(n + N) = S_m(n)$, which are easily derived from (11), (27), and (28).
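The two unstructured trackers can be sketched per frequency bin as follows. This is an illustrative, single-antenna sketch of the recursions (22), (23) and (26)–(28), not the authors' code: the function names, the noise-free static channel, the constant-modulus training spectrum, and the initial value of $S_m(n)$ are all assumptions chosen to keep the demonstration short.

```python
import cmath
import math
import random


def error_term(c, x, h_hat):
    """Per-bin e_m(n) = C_m(n)^* [X_m(n) - C_m(n) H_hat_m(n)], as in (23)."""
    return [cn.conjugate() * (xn - cn * hn) for cn, xn, hn in zip(c, x, h_hat)]


def lms_uce_step(h_hat, c, x, mu):
    """One LMS-UCE update, as in (22)."""
    e = error_term(c, x, h_hat)
    return [hn + mu * en for hn, en in zip(h_hat, e)]


def rls_uce_step(h_hat, s, c, x, lam):
    """One RLS-UCE update: gain (27), estimate (26), and S recursion (28)."""
    e = error_term(c, x, h_hat)
    k = [sn / (lam + abs(cn) ** 2 * sn) for sn, cn in zip(s, c)]
    h_new = [hn + kn * en for hn, kn, en in zip(h_hat, k, e)]
    s_new = [sn * (1.0 - kn * abs(cn) ** 2) / lam for sn, kn, cn in zip(s, k, c)]
    return h_new, s_new


def demo(n_bins=8, n_blocks=100, mu=0.1, lam=0.5, seed=1):
    """Track a static, noise-free channel; return (LMS error, RLS error)."""
    rng = random.Random(seed)
    h_true = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_bins)]
    h_lms = [0j] * n_bins
    h_rls = [0j] * n_bins
    s = [100.0] * n_bins  # assumed large initial value for S_0(n)
    for _ in range(n_blocks):
        # Assumed training spectrum with |C_m(n)|^2 = 4 and random phase.
        c = [2.0 * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
             for _ in range(n_bins)]
        x = [cn * hn for cn, hn in zip(c, h_true)]  # model (9) without noise
        h_lms = lms_uce_step(h_lms, c, x, mu)
        h_rls, s = rls_uce_step(h_rls, s, c, x, lam)

    def max_err(est):
        return max(abs(a - b) for a, b in zip(est, h_true))

    return max_err(h_lms), max_err(h_rls)
```

In this toy setting both errors shrink geometrically: the LMS error contracts by $1 - \mu |C_m(n)|^2$ per block, while the RLS gain adapts to $|C_m(n)|^2$ on its own, which is why only the forgetting factor has to be chosen.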

C. LMS Structured Channel Estimation (LMS-SCE)

LMS-SCE aims at estimating $\mathbf{h}_m^{(p)}$ by looking for the minimum of

$$ J_{\text{LMS-SCE}}\big(\tilde{\mathbf{h}}_m^{(p)}\big) = E\Big\{ \big\| \mathbf{X}_m^{(p)} - \mathbf{C}_m \mathbf{F} \tilde{\mathbf{h}}_m^{(p)} \big\|^2 \Big\}, \qquad p = 1, 2, \ldots, P \tag{29} $$

with respect to $\tilde{\mathbf{h}}_m^{(p)}$. This leads to the recursion

$$ \hat{\mathbf{h}}_{m+1}^{(p)} = \hat{\mathbf{h}}_m^{(p)} + \mu\, \mathbf{F}^H \mathbf{e}_m^{(p)}, \qquad p = 1, 2, \ldots, P \tag{30} $$

where $\mathbf{e}_m^{(p)}$ is defined in (23) and $\hat{\mathbf{h}}_m^{(p)}$ is the CIR estimate at the $m$th step. Premultiplying both sides of (30) by $\mathbf{F}$ and bearing in mind (19) produces

$$ \hat{\mathbf{H}}_{m+1}^{(p)} = \hat{\mathbf{H}}_m^{(p)} + \mu\, \mathbf{F} \mathbf{F}^H \mathbf{e}_m^{(p)}, \qquad p = 1, 2, \ldots, P. \tag{31} $$

Following the same arguments as in Appendix A, it can be shown that LMS-SCE is unbiased and its MSEE is given by

$$ E\Big\{ \big\| \hat{\mathbf{H}}_m^{(p)} - \mathbf{H}^{(p)} \big\|^2 \Big\} = \frac{4L\, \sigma^{2(p)} B_L T_B}{N}. \tag{32} $$

Note that the only difference between LMS-SCE and LMS-UCE is the presence of the matrix $\mathbf{F}\mathbf{F}^H$ in (31), which performs a better noise filtering by taking into account that $h_m^{(p)}(t)$ has the duration $L < N$. This leads to a reduction of the MSEE by a factor $N/L$, as seen by comparing (32) with (24). For $L = N$, LMS-SCE boils down to LMS-UCE since, in this case, we have $\mathbf{F}\mathbf{F}^H = \mathbf{I}_{2N}$.

The third line in Table I shows the overall operations involved in LMS-SCE. In writing this line, we have borne in mind that $\mathbf{F}\mathbf{F}^H \mathbf{e}_m^{(p)}$ is efficiently computed by feeding $\mathbf{e}_m^{(p)}$ to a $2N$-point IDFT unit, setting to 0 the last $2N - 2L$ outputs, and finally passing the resulting vector to a $2N$-point DFT.

D. RLS Structured Channel Estimation (RLS-SCE)

In this case, we look for the minimum of

$$ J_{\text{RLS-SCE}}\big(\tilde{\mathbf{h}}^{(p)}\big) = \sum_{i=0}^{m} \lambda^{m-i} \big\| \mathbf{X}_i^{(p)} - \mathbf{C}_i \mathbf{F} \tilde{\mathbf{h}}^{(p)} \big\|^2, \qquad p = 1, 2, \ldots, P \tag{33} $$

with respect to $\tilde{\mathbf{h}}^{(p)}$. As shown in Appendix B, this leads to the recursion

$$ \hat{\mathbf{h}}_{m+1}^{(p)} = \hat{\mathbf{h}}_m^{(p)} + \mathbf{R}_m^{-1} \mathbf{F}^H \mathbf{e}_m^{(p)}, \qquad p = 1, 2, \ldots, P \tag{34} $$

where $\mathbf{e}_m^{(p)}$ is still given in (23) and $\mathbf{R}_m \in \mathbb{C}^{2L \times 2L}$ is defined as

$$ \mathbf{R}_m = \sum_{i=0}^{m} \lambda^{m-i}\, \mathbf{F}^H \mathbf{C}_i^H \mathbf{C}_i \mathbf{F}. \tag{35} $$

Bearing in mind that $\hat{\mathbf{H}}_m^{(p)} = \mathbf{F} \hat{\mathbf{h}}_m^{(p)}$, from (34), we get

$$ \hat{\mathbf{H}}_{m+1}^{(p)} = \hat{\mathbf{H}}_m^{(p)} + \mathbf{F} \mathbf{R}_m^{-1} \mathbf{F}^H \mathbf{e}_m^{(p)}, \qquad p = 1, 2, \ldots, P. \tag{36} $$

Note that, although $\mathbf{R}_m$ can be computed recursively as

$$ \mathbf{R}_m = \lambda \mathbf{R}_{m-1} + \mathbf{F}^H \mathbf{C}_m^H \mathbf{C}_m \mathbf{F} \tag{37} $$

there seems to be no recursive way to compute $\mathbf{R}_m^{-1}$ in (36). This makes RLS-SCE prohibitively complex and, for this reason, it is not considered in the sequel.

V. NOISE-POWER ESTIMATION

We assume that the noise power is constant over a frame and, therefore, the noise variance $\sigma^{2(p)}$ at each antenna can be estimated frame by frame, exploiting the available training blocks. In the sequel, we discuss two methods for estimating $\sigma^{2(p)}$. The first is based on ML reasoning while the second is derived from an ad hoc argument.

A. ML-Based Estimation (MLBE)

Collecting (17) and (19) yields

$$ \mathbf{X}_m^{(p)} = \mathbf{C}_m \mathbf{F} \mathbf{h}_m^{(p)} + \mathbf{W}_m^{(p)} \tag{38} $$

where $\mathbf{W}_m^{(p)}$ is Gaussian distributed with zero mean and covariance matrix $\sigma^{2(p)} \mathbf{I}_{2N}$. Recalling that $\mathbf{C}_m$ is known (training block), the joint ML estimates of $\sigma^{2(p)}$ and $\mathbf{h}_m^{(p)}$, based on the observation of $\mathbf{X}_m^{(p)}$, are found by maximizing the log-likelihood function

$$ \Lambda\big(\tilde{\sigma}^{2(p)}, \tilde{\mathbf{h}}_m^{(p)}\big) = -2N \ln\big(\pi \tilde{\sigma}^{2(p)}\big) - \frac{1}{\tilde{\sigma}^{2(p)}} \big\| \mathbf{X}_m^{(p)} - \mathbf{C}_m \mathbf{F} \tilde{\mathbf{h}}_m^{(p)} \big\|^2 \tag{39} $$

with respect to the trial values $\tilde{\sigma}^{2(p)}$ and $\tilde{\mathbf{h}}_m^{(p)}$. Keeping $\tilde{\sigma}^{2(p)}$ fixed and maximizing $\Lambda(\tilde{\sigma}^{2(p)}, \tilde{\mathbf{h}}_m^{(p)})$ with respect to $\tilde{\mathbf{h}}_m^{(p)}$ produces

$$ \hat{\mathbf{h}}_m^{(p)} = \big( \mathbf{D}_m^H \mathbf{D}_m \big)^{-1} \mathbf{D}_m^H \mathbf{X}_m^{(p)} \tag{40} $$

with $\mathbf{D}_m = \mathbf{C}_m \mathbf{F}$. Next, substituting (40) into (39) and maximizing with respect to $\tilde{\sigma}^{2(p)}$ gives the ML estimate of $\sigma^{2(p)}$

$$ \hat{\sigma}_{\text{ML}}^{2(p)} = \frac{1}{2N} \big\| \mathbf{D}_m^{\perp} \mathbf{X}_m^{(p)} \big\|^2 \tag{41} $$

where $\mathbf{D}_m^{\perp} = \mathbf{I}_{2N} - \mathbf{D}_m (\mathbf{D}_m^H \mathbf{D}_m)^{-1} \mathbf{D}_m^H$ is the projector onto the orthogonal complement of the range of $\mathbf{D}_m$. Note that $\mathbf{D}_m^{\perp}$ depends on the training symbols (through $\mathbf{C}_m$) and can be precomputed and stored in the receiver. It can be shown that

$$ E\big\{ \hat{\sigma}_{\text{ML}}^{2(p)} \big\} = \frac{N - L}{N}\, \sigma^{2(p)} \tag{42} $$

which means that $\hat{\sigma}_{\text{ML}}^{2(p)}$ is a biased estimator. On the other hand, averaging (41) over the available training blocks (to improve the estimation accuracy) produces the following unbiased estimate

$$ \hat{\sigma}_{\text{MLBE}}^{2(p)} = \frac{1}{2N_T (N - L)} \sum_{m=0}^{N_T - 1} \big\| \mathbf{D}_m^{\perp} \mathbf{X}_m^{(p)} \big\|^2 \tag{43} $$

with $N_T$ being the number of training blocks. In the sequel, (43) is referred to as the MLBE. Its variance is computed using ordinary manipulations and reads

$$ \mathrm{var}\big\{ \hat{\sigma}_{\text{MLBE}}^{2(p)} \big\} = \frac{\big[\sigma^{2(p)}\big]^2}{2N_T (N - L)}. \tag{44} $$

The overall operations required by MLBE for each frame are shown in the first line of Table II. The major complexity arises from the computation of vectors $\{\mathbf{D}_m^{\perp} \mathbf{X}_m^{(p)}\}$ in (43), which is cumbersome for large values of $N$. For this reason, it is worth looking for a simpler suboptimal solution.

TABLE II. COMPUTATIONAL COMPLEXITY OF THE NOISE-POWER ESTIMATOR PER FRAME

B. Ad Hoc Estimation

Returning to (9) and recalling that $H_m^{(p)}(n) = 0$ for $n \in I = \{N_\alpha, N_\alpha + 1, \ldots, 2N - N_\alpha\}$ yields

$$ X_m^{(p)}(n) = W_m^{(p)}(n), \qquad n \in I; \quad p = 1, 2, \ldots, P \tag{45} $$

where $W_m^{(p)}(n)$ are statistically independent Gaussian random variables with zero mean and variance $\sigma^{2(p)}$. Inspection of (45) suggests the following estimate of $\sigma^{2(p)}$

$$ \hat{\sigma}_{\text{AHE}}^{2(p)} = \frac{1}{N_T N_I} \sum_{m=0}^{N_T - 1} \sum_{n \in I} \big| X_m^{(p)}(n) \big|^2 \tag{46} $$

with $N_I$ being the cardinality of $I$. Since (46) is not based on an optimality criterion, it is referred to as ad hoc estimator (AHE) in the sequel. With standard calculations, it is found that AHE is unbiased, with variance

$$ \mathrm{var}\big\{ \hat{\sigma}_{\text{AHE}}^{2(p)} \big\} = \frac{\big[\sigma^{2(p)}\big]^2}{N_T N_I}. \tag{47} $$

To compare the accuracy of AHE and MLBE, we introduce the ratio $\Gamma = \mathrm{var}\{\hat{\sigma}_{\text{AHE}}^{2(p)}\} / \mathrm{var}\{\hat{\sigma}_{\text{MLBE}}^{2(p)}\}$. Recalling that $N_I = 2(N - N_\alpha) + 1$ and $N_\alpha \approx N(1 + \alpha)/2$, from (44) and (47), we have

$$ \Gamma \approx \frac{2\left(1 - \dfrac{L}{N}\right)}{1 - \alpha}. \tag{48} $$

In practical applications, $L$ is smaller than $N$ (say $L \approx N/5$) and MLBE outperforms the AHE since $\Gamma$ is greater than unity. However, AHE is much simpler to implement (the overall operations are shown in the second line of Table II). Note that $\Gamma$ grows to infinity as $\alpha$ approaches unity, meaning that AHE fails with a large excess bandwidth. In these circumstances, the MLBE is indispensable.

VI. SIMULATION RESULTS

Computer simulations have been run to assess the performance of an SC-FDE receiver employing the proposed channel and noise-power-estimation schemes. The system parameters are as follows.

A. System Parameters

The transmitted symbols belong to a quaternary PSK (QPSK) constellation and are related to the information bits through a Gray map. The modulation pulse $g(t)$ is a root-raised-cosine function with roll-off $\alpha = 0.35$ and duration $T_g = 6T$. Assuming a maximum expected path delay of 10 µs, this corresponds to a CIR length of $L = \mathrm{int}(10R + 6)$, where $R = 1/T$ is the signaling rate in megabaud. The length $N_G$ of the cyclic prefix is set equal to $L$ and results in an effective bit rate of $R_b = 2RN/(L + N)$ (ignoring the overhead due to training blocks). The carrier frequency is $f_0 = 2$ GHz (corresponding to a wavelength $\lambda = 15$ cm) and the interelement spacing in the antenna array is $d = 2\lambda$. Each frame is made of 100 blocks [3] and is preceded by a preamble of five blocks for channel and noise-power estimation.

As indicated in (7), the CIR at the $p$th antenna is generated with six paths ($N_p = 6$). At the start of each frame, a new set of path delays, complex gains, and DOAs is randomly generated. The path delays and DOAs are uniformly distributed within [0, 10 µs] and [−60°, 60°], respectively, and are kept constant over a frame. The path gains with power $\sigma_\ell^2 = \exp(-\ell/2)$ $(0 \le \ell \le 5)$ vary independently of each other within a frame. They are generated by passing complex-valued and statistically independent white Gaussian processes through a third-order low-pass Butterworth filter. The 3-dB bandwidth of the filter is taken as a measure of the Doppler rate $f_D = f_0 v/c$, where $v$ is the mobile speed and $c$ denotes the speed of light. As in a well-designed system the channel coherence time is much larger than the block duration, we assume that the path gains are static within a single block [3].

Simulation results are given for $R = 2$ Mbaud, $L = 26$, and $N = 128$. The mobile velocity and the number of diversity branches are given different values to assess their impact on the system performance. The optimal selection of the step size $\mu$ and the forgetting factor $\lambda$ for the adaptive channel estimators depends on the fading rate. Simulations indicate that for mobile speeds between 25 and 140 km/h, a good choice of the adaptation parameters is $\mu = 3 \times 10^{-3}$ for LMS-UCE, $\mu = 6 \times 10^{-3}$ for LMS-SCE, and $\lambda = 0.5$ for RLS-UCE. As mentioned earlier, RLS-SCE is not considered due to its complexity. The noise power spectral density is the same at each diversity branch (i.e., we set $N_0^{(p)} = N_0$ for $p = 1, 2, \ldots, P$).
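The derived quantities quoted above follow from simple arithmetic, which the short sketch below reproduces. The helper names are hypothetical and the formulas are taken directly from the text: $L = \mathrm{int}(10R + 6)$, $R_b = 2RN/(L + N)$, $N_\alpha = 1 + \mathrm{int}[N(1 + \alpha)/2]$, $N_I = 2(N - N_\alpha) + 1$, and the variance ratio $\Gamma$ obtained by dividing (47) by (44).

```python
def cir_length(max_delay_us, rate_mbaud, pulse_symbols):
    """L = int(tau_max * R + T_g / T), with tau_max in microseconds and R in Mbaud."""
    return int(max_delay_us * rate_mbaud + pulse_symbols)


def effective_bit_rate(rate_mbaud, n, l, bits_per_symbol=2):
    """R_b = bits_per_symbol * R * N / (L + N), in Mb/s (prefix length equal to L)."""
    return bits_per_symbol * rate_mbaud * n / (l + n)


def n_alpha(n, alpha):
    """First frequency bin where the channel response vanishes."""
    return 1 + int(n * (1 + alpha) / 2)


def noise_only_bins(n, alpha):
    """N_I, the number of bins in the set I that carry noise only."""
    return 2 * (n - n_alpha(n, alpha)) + 1


def gamma_exact(n, l, alpha):
    """Ratio var(AHE) / var(MLBE) from (47) and (44): 2 (N - L) / N_I."""
    return 2 * (n - l) / noise_only_bins(n, alpha)


def gamma_approx(n, l, alpha):
    """Closed-form approximation (48)."""
    return 2 * (1 - l / n) / (1 - alpha)


if __name__ == "__main__":
    L = cir_length(10, 2, 6)
    print("L =", L)
    print("R_b [Mb/s] =", effective_bit_rate(2, 128, L))
    print("N_alpha =", n_alpha(128, 0.35), "N_I =", noise_only_bins(128, 0.35))
    print("Gamma exact / approx =", gamma_exact(128, L, 0.35), gamma_approx(128, L, 0.35))
```

With $R = 2$ Mbaud, $N = 128$, and $\alpha = 0.35$, this gives $L = 26$ (the value used in the simulations), a bit rate of about 3.32 Mb/s, $N_\alpha = 87$, $N_I = 83$, and $\Gamma \approx 2.46$, so under these parameters the AHE variance is roughly two and a half times that of the MLBE.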

Fig. 3. Performance of the noise-power estimators.

Fig. 4. BER versus Eb/N0 with a single-antenna receiver and v = 25 km/h.

Fig. 5. BER versus Eb/N0 with a single-antenna receiver and v = 70 km/h.

Finally, bearing in mind that hm (t) is bandlimited to (p) ˆm |f | ≤ (1 + α)/2T , H (n) is forced to 0 for n ∈ I. B. Performance Assessment We begin by comparing the performance of the noise-power estimators. Fig. 3 shows the accuracy of AHE and MLBE versus 1/σ 2 in the case of a single-antenna receiver. Marks indicate simulations while solid lines represent analytical results as given by (44) and (47). Good agreement is observed between simulations and theory. As expected, MLBE gives the best results. However, extensive simulations (not shown for space limitations) indicate that using MLBE instead of AHE does not produce significant improvements in the error-rate performance. For this reason, MLBE is not considered further due to its complexity. The system performance has been assessed in terms of bit error rate (BER) versus Eb /N0 , where Eb is the energy per bit. Fig. 4 illustrates the BER of a receiver employing the proposed channel estimators. The mobile velocity is 25 km/h (corresponding to fD = 47 Hz) and the receiver is equipped with a single antenna (P = 1). The curve labeled ICI (ideal channel information) corresponds to a perfect knowledge of the channel response and noise power and serves as a benchmark. The performance of CADs [3], using either the LMS (LMS-CAD) or RLS (RLS-CAD) adaptation rules, is also shown for comparison. Finally, the curve labeled LMS-TDE indicates the performance of a conventional T /2-spaced timedomain equalizer employing the LMS algorithm. For an error probability of 10−3 , LMS-UCE and RLS-UCE have similar performance and are approximately 3.5 dB from ICI. LMS-SCE gives the best results while LMS-TDE and LMS-CAD have poor performance due to their limited tracking capabilities. It is likely that the convergence rate of LMS-CAD can be improved by using a different step size for each frequency bin, as suggested in [4]. Figs. 5 and 6 show analogous results with mobile speeds of 70 and 140 km/h. 
Note that the simulation results with ICI do not depend on the fading rate, as the channel is assumed constant within each block. We see that the BER deteriorates as the mobile speed increases and all detectors exhibit an error floor. LMS-SCE is always superior but it becomes unsatisfactory as the mobile speed increases. The performance of LMS-TDE (not shown in the figures) is similar to that of LMS-CAD.

Figs. 7 and 8 illustrate simulations obtained under the same operating conditions as Figs. 5 and 6, except that four antennas are now employed. We see that the multiple antennas dramatically improve the system performance. For an error probability of $10^{-3}$ and a mobile speed of 70 km/h, the loss of LMS-SCE with respect to ICI is 1.5 dB while it is 5 dB with either LMS-UCE or RLS-UCE. When the mobile speed increases to 140 km/h, LMS-SCE is 2.5 dB from ICI while both LMS-UCE and RLS-UCE exhibit a floor. Clark’s detectors and LMS-TDE cannot track fast fading and have the worst performance.

Fig. 9 shows the learning curves of a receiver employing the proposed channel estimators. The MSE at the detector input is computed by averaging over 1000 simulation runs. The mobile speed is 70 km/h and $E_b/N_0$ is set to 15 dB. Four antennas are employed at the receiver. We see that RLS-UCE achieves an

Fig. 6. BER versus $E_b/N_0$ with a single-antenna receiver and $v = 140$ km/h.

Fig. 7. BER versus $E_b/N_0$ with four receiving antennas and $v = 70$ km/h.

Fig. 8. BER versus $E_b/N_0$ with four receiving antennas and $v = 140$ km/h.

Fig. 9. Learning curves with four receiving antennas, $v = 70$ km/h, and $E_b/N_0 = 15$ dB.

MSE of approximately $2 \times 10^{-2}$ after only two training blocks while LMS-UCE takes more than ten blocks to converge. LMS-SCE has the lowest MSE in the steady state, but its acquisition time is longer than that of RLS-UCE.

Fig. 10 shows the computational complexity of the various detection and channel-estimation schemes expressed in millions of floating-point operations per second (FLOPS) versus the bit rate $R_b$ (expressed in megabits per second). The curves are computed from Tables I and II with $P = 1$ (single-antenna receiver), assuming that AHE is employed for noise-power estimation. We see that FDE affords substantial computational savings with respect to a conventional TDE, especially at high bit rates. Also, the frequency-domain equalizer with LMS-SCE is only slightly more complex than the other schemes.

VII. CONCLUSION

We have discussed three channel-estimation schemes for adaptive FDE in SC systems. They exploit a sequence of training blocks placed at the beginning of each data frame and

Fig. 10. Complexity of the proposed schemes.

operate in an iterative fashion. Two of them, LMS-UCE and RLS-UCE, assume independently faded frequency bins while the third scheme, LMS-SCE, uses a structured approach that improves the quality of the channel estimates. In addition to channel-state information, frequency-domain MMSE equalization requires knowledge of the noise power. For this purpose, a simple algorithm based on ad hoc reasoning has been proposed. The performance of all these schemes has been investigated analytically and by simulation. It has been found that LMS-SCE outperforms the other methods. The price to pay is a slight increase in complexity, which, however, is still much smaller than that of a conventional TDE. Compared with other existing schemes based on adaptive FDE, the proposed methods have better performance due to their enhanced tracking capabilities. In particular, a four-branch receiver employing LMS-SCE can handle a mobile speed of 140 km/h with only a 3-dB loss with respect to an ideal system with perfect channel knowledge.
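The dependence of MMSE equalization on the noise power can be illustrated with a small Python sketch. It uses the textbook one-tap MMSE coefficient for a symbol-spaced, single-branch SC-FDE receiver with unit-power symbols, $G(n) = H^*(n) / (|H(n)|^2 + \sigma^2)$. This is a simplification assumed here for illustration and is not the $T/2$-spaced, multi-branch equalizer of the paper. The function names are hypothetical.

```python
import numpy as np


def mmse_coeffs(H: np.ndarray, sigma2_assumed: float) -> np.ndarray:
    """One-tap MMSE coefficients computed with the noise power the receiver believes."""
    return np.conj(H) / (np.abs(H) ** 2 + sigma2_assumed)


def symbol_mse(H: np.ndarray, G: np.ndarray, sigma2_true: float) -> float:
    """Per-symbol MSE: bin average of residual-ISI power plus filtered-noise power."""
    return float(np.mean(np.abs(G * H - 1.0) ** 2 + sigma2_true * np.abs(G) ** 2))


def random_channel(n_bins: int = 64, n_taps: int = 8, seed: int = 0) -> np.ndarray:
    """Frequency response of a random n_taps-tap channel (illustrative model only)."""
    rng = np.random.default_rng(seed)
    h = (rng.standard_normal(n_taps) + 1j * rng.standard_normal(n_taps)) / np.sqrt(2 * n_taps)
    return np.fft.fft(h, n_bins)


if __name__ == "__main__":
    H = random_channel()
    sigma2 = 0.1
    for assumed in (0.001, 0.1, 1.0):
        mse = symbol_mse(H, mmse_coeffs(H, assumed), sigma2)
        print(f"assumed noise power {assumed:6.3f} -> MSE {mse:.4f}")
```

In this model the MSE is minimized bin by bin when the assumed noise power equals the true one, so any mismatch can only increase it. This is one way to see why a noise-power estimate is needed alongside the channel estimate.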

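APPENDIX A

In this appendix, we highlight the major steps leading to the performance of LMS-UCE. For simplicity, we assume that the channel is static and we drop the superscript $(\cdot)^{(p)}$ designating the diversity branch. We begin by computing the conditional expectation $E\{\mathbf{e}_m \mid \hat{\mathbf{H}}_m\}$. To this purpose, we substitute (17) into (23) to obtain

$$\mathbf{e}_m = \mathbf{C}_m^H \mathbf{C}_m \Delta\hat{\mathbf{H}}_m + \mathbf{C}_m^H \mathbf{W}_m \tag{A1}$$

where $\Delta\hat{\mathbf{H}}_m = \mathbf{H} - \hat{\mathbf{H}}_m$ is the estimation error at the $m$th step and $\{\mathbf{W}_m\}$ are statistically independent Gaussian vectors with zero mean and covariance matrix $\sigma^2 \mathbf{I}_{2N}$. Then, using the identity $E\{\mathbf{C}_m^H \mathbf{C}_m\} = N \cdot \mathbf{I}_{2N}$ (which is valid for i.i.d. data symbols with zero mean and unit variance) produces

$$E\{\mathbf{e}_m \mid \hat{\mathbf{H}}_m\} = N \cdot \Delta\hat{\mathbf{H}}_m. \tag{A2}$$

From above, we see that $\mathbf{e}_m$ may be thought of as the sum of $N \cdot \Delta\hat{\mathbf{H}}_m$ plus some zero-mean disturbance term $\boldsymbol{\eta}_m$. Accordingly, recursion (22) may be rewritten as

$$\Delta\hat{\mathbf{H}}_{m+1} = (1 - \mu N)\,\Delta\hat{\mathbf{H}}_m - \mu \boldsymbol{\eta}_m \tag{A3}$$

with $\boldsymbol{\eta}_m = (\mathbf{C}_m^H \mathbf{C}_m - N \cdot \mathbf{I})\,\Delta\hat{\mathbf{H}}_m + \mathbf{C}_m^H \mathbf{W}_m$. Since in the steady state $\hat{\mathbf{H}}_m \approx \mathbf{H}$ (i.e., $\Delta\hat{\mathbf{H}}_m \approx \mathbf{0}$), it is reasonable to approximate $\boldsymbol{\eta}_m$ as

$$\boldsymbol{\eta}_m \approx \mathbf{C}_m^H \mathbf{W}_m. \tag{A4}$$

Inspection of (A3) reveals that $\Delta\hat{\mathbf{H}}_m$ may be viewed as the response to $\boldsymbol{\eta}_m$ of a digital filter with impulse response

$$p_k = \begin{cases} -\mu (1 - \mu N)^{k-1}, & k \ge 1 \\ 0, & \text{otherwise.} \end{cases} \tag{A5}$$

Thus, (A3) becomes

$$\Delta\hat{\mathbf{H}}_m = \sum_i p_i \boldsymbol{\eta}_{m-i}. \tag{A6}$$

Recalling that $\boldsymbol{\eta}_m$ has zero mean, from (A6), we see that $E\{\Delta\hat{\mathbf{H}}_m\} = \mathbf{0}$, meaning that $\hat{\mathbf{H}}_m$ is an unbiased estimate of $\mathbf{H}$. Returning to (A4), we observe that vectors $\{\boldsymbol{\eta}_m\}$ are independent for different values of $m$ and have covariance matrix $\mathbf{C}_\eta = \sigma^2 N \cdot \mathbf{I}_{2N}$. Putting these facts together, from (A6), we have

$$E\{\Delta\hat{\mathbf{H}}_m \Delta\hat{\mathbf{H}}_m^H\} = \sigma^2 N \left( \sum_i p_i^2 \right) \mathbf{I}_{2N}. \tag{A7}$$

Next, substituting (A5) into (A7) and using the identity $\|\Delta\hat{\mathbf{H}}_m\|^2 = \mathrm{tr}\{\Delta\hat{\mathbf{H}}_m \Delta\hat{\mathbf{H}}_m^H\}$ produces

$$E\{\|\Delta\hat{\mathbf{H}}_m\|^2\} = \frac{2\mu N}{2 - \mu N}\,\sigma^2. \tag{A8}$$

At this stage, we introduce the noise equivalent bandwidth of the filter $p_k$ [15, p. 126]

$$B_L = \frac{\mu N}{2(2 - \mu N)\,T_B}. \tag{A9}$$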
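The steady-state result (A8) can be checked numerically with a short Monte Carlo sketch in Python. Several ingredients are assumptions made for illustration: the signal model is taken as $\mathbf{X}_m = \mathbf{C}_m \mathbf{H} + \mathbf{W}_m$ and the update as $\hat{\mathbf{H}}_{m+1} = \hat{\mathbf{H}}_m + \mu \mathbf{e}_m$, which are the forms suggested by (A1) and (A3), and the diagonal entries of $\mathbf{C}_m$ are given constant modulus $\sqrt{N}$ so that $\mathbf{C}_m^H \mathbf{C}_m = N \cdot \mathbf{I}$ holds exactly. The function names and parameter values are hypothetical.

```python
import numpy as np


def lms_steady_state_mse(mu, n_block, sigma2, n_iter=20000, burn_in=2000, seed=0):
    """Simulate the LMS recursion on a static channel and average ||H - H_hat||^2."""
    rng = np.random.default_rng(seed)
    dim = 2 * n_block  # 2N frequency bins
    h_true = (rng.standard_normal(dim) + 1j * rng.standard_normal(dim)) / np.sqrt(2)
    h_hat = np.zeros(dim, dtype=complex)
    acc = 0.0
    for m in range(n_iter):
        # Diagonal of C_m: constant modulus sqrt(N), random phase (assumption).
        c = np.sqrt(n_block) * np.exp(2j * np.pi * rng.random(dim))
        # Complex Gaussian noise with variance sigma2 per bin.
        w = np.sqrt(sigma2 / 2) * (rng.standard_normal(dim) + 1j * rng.standard_normal(dim))
        x = c * h_true + w                # assumed signal model
        e = np.conj(c) * (x - c * h_hat)  # error vector, consistent with (A1)
        h_hat = h_hat + mu * e            # assumed LMS update
        if m >= burn_in:
            acc += np.sum(np.abs(h_true - h_hat) ** 2)
    return acc / (n_iter - burn_in)


def predicted_mse(mu, n_block, sigma2):
    """Closed-form steady-state error power from (A8)."""
    return 2 * mu * n_block * sigma2 / (2 - mu * n_block)


if __name__ == "__main__":
    n_block, sigma2 = 32, 0.1
    mu = 0.2 / n_block  # gives mu * N = 0.2
    print("simulated:", lms_steady_state_mse(mu, n_block, sigma2))
    print("predicted:", predicted_mse(mu, n_block, sigma2))
```

With constant-modulus entries the approximation (A4) is exact, so the simulated and predicted values should agree to within Monte Carlo error. With data-dependent $|\mathbf{C}_m|$ the agreement is expected only for small step sizes.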

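Then, collecting (A8) and (A9) yields (24) in the text.

APPENDIX B

In this appendix, we derive an iterative procedure to minimize

$$J_{\text{RLS-SCE}}(\tilde{\mathbf{h}}) = \sum_{i=0}^{m} \lambda^{m-i} \left\| \mathbf{X}_i - \mathbf{C}_i \mathbf{F} \tilde{\mathbf{h}} \right\|^2 \tag{B1}$$

with respect to $\tilde{\mathbf{h}}$. We begin by setting the gradient of $J_{\text{RLS-SCE}}(\tilde{\mathbf{h}})$ to zero and solving for $\tilde{\mathbf{h}} = \hat{\mathbf{h}}_{m+1}$. This produces

$$\mathbf{R}_m \hat{\mathbf{h}}_{m+1} = \mathbf{d}_m \tag{B2}$$

where

$$\mathbf{R}_m = \sum_{i=0}^{m} \lambda^{m-i} \mathbf{F}^H \mathbf{C}_i^H \mathbf{C}_i \mathbf{F} \tag{B3}$$

$$\mathbf{d}_m = \sum_{i=0}^{m} \lambda^{m-i} \mathbf{F}^H \mathbf{C}_i^H \mathbf{X}_i. \tag{B4}$$

Next, we observe that $\mathbf{R}_m$ and $\mathbf{d}_m$ may be computed iteratively as

$$\mathbf{R}_m = \lambda \mathbf{R}_{m-1} + \mathbf{F}^H \mathbf{C}_m^H \mathbf{C}_m \mathbf{F} \tag{B5}$$

$$\mathbf{d}_m = \lambda \mathbf{d}_{m-1} + \mathbf{F}^H \mathbf{C}_m^H \mathbf{X}_m. \tag{B6}$$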
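The equivalence between the weighted sums (B3) and (B4) and the recursions (B5) and (B6) can be confirmed numerically. The Python sketch below is illustrative only. It assumes that $\mathbf{F}$ is a partial DFT matrix mapping a short impulse response to the frequency bins, that $\mathbf{C}_i$ is diagonal, and that the recursions start from $\mathbf{R}_{-1} = \mathbf{0}$ and $\mathbf{d}_{-1} = \mathbf{0}$. None of these details is taken from this section, and the function names are hypothetical.

```python
import numpy as np


def partial_dft(n_bins, n_taps):
    """Assumed form of F: the first n_taps columns of an n_bins-point DFT matrix."""
    n = np.arange(n_bins)[:, None]
    k = np.arange(n_taps)[None, :]
    return np.exp(-2j * np.pi * n * k / n_bins)


def batch_stats(F, c_list, x_list, lam):
    """Direct evaluation of the weighted sums (B3) and (B4)."""
    m = len(c_list) - 1
    n_taps = F.shape[1]
    R = np.zeros((n_taps, n_taps), dtype=complex)
    d = np.zeros(n_taps, dtype=complex)
    for i, (c, x) in enumerate(zip(c_list, x_list)):
        A = c[:, None] * F  # C_i F with C_i diagonal
        R += lam ** (m - i) * (A.conj().T @ A)
        d += lam ** (m - i) * (A.conj().T @ x)
    return R, d


def recursive_stats(F, c_list, x_list, lam):
    """Recursions (B5) and (B6), started from zero."""
    n_taps = F.shape[1]
    R = np.zeros((n_taps, n_taps), dtype=complex)
    d = np.zeros(n_taps, dtype=complex)
    for c, x in zip(c_list, x_list):
        A = c[:, None] * F
        R = lam * R + A.conj().T @ A
        d = lam * d + A.conj().T @ x
    return R, d


def demo(n_bins=16, n_taps=4, n_blocks=6, lam=0.9, seed=0):
    """Noise-free blocks X_i = C_i F h, so solving (B2) should return h itself."""
    rng = np.random.default_rng(seed)
    F = partial_dft(n_bins, n_taps)
    h_true = rng.standard_normal(n_taps) + 1j * rng.standard_normal(n_taps)
    c_list = [rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)
              for _ in range(n_blocks)]
    x_list = [c * (F @ h_true) for c in c_list]
    Rb, db = batch_stats(F, c_list, x_list, lam)
    Rr, dr = recursive_stats(F, c_list, x_list, lam)
    h_hat = np.linalg.solve(Rr, dr)  # solves R_m h = d_m as in (B2)
    return Rb, db, Rr, dr, h_hat, h_true


if __name__ == "__main__":
    Rb, db, Rr, dr, h_hat, h_true = demo()
    print("recursion matches batch sums:", np.allclose(Rb, Rr) and np.allclose(db, dr))
    print("noise-free estimate equals true response:", np.allclose(h_hat, h_true))
```

The recursive form stores only the current $\mathbf{R}_m$ and $\mathbf{d}_m$, which is what makes a block-by-block update practical.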

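Then, replacing $\hat{\mathbf{h}}_{m+1}$ with $[\hat{\mathbf{h}}_{m+1} - \hat{\mathbf{h}}_m] + \hat{\mathbf{h}}_m$ in (B2) and using (B5) and (B6) yields

$$\mathbf{R}_m [\hat{\mathbf{h}}_{m+1} - \hat{\mathbf{h}}_m] + \left( \lambda \mathbf{R}_{m-1} + \mathbf{F}^H \mathbf{C}_m^H \mathbf{C}_m \mathbf{F} \right) \hat{\mathbf{h}}_m = \lambda \mathbf{d}_{m-1} + \mathbf{F}^H \mathbf{C}_m^H \mathbf{X}_m. \tag{B7}$$

Finally, bearing in mind that $\mathbf{R}_{m-1} \hat{\mathbf{h}}_m = \mathbf{d}_{m-1}$, (B7) reduces to

$$\mathbf{R}_m [\hat{\mathbf{h}}_{m+1} - \hat{\mathbf{h}}_m] = \mathbf{F}^H \mathbf{C}_m^H [\mathbf{X}_m - \mathbf{C}_m \mathbf{F} \hat{\mathbf{h}}_m] \tag{B8}$$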

from which (34) in the text follows easily.

REFERENCES

[1] J. G. Proakis, Digital Communications, 2nd ed. New York: McGraw-Hill, 1989.
[2] S. U. H. Qureshi, “Adaptive equalization,” Proc. IEEE, vol. 73, no. 9, pp. 1349–1387, Sep. 1985.
[3] M. V. Clark, “Adaptive frequency-domain equalization and diversity combining for broadband wireless communications,” IEEE J. Sel. Areas Commun., vol. 16, no. 8, pp. 1385–1395, Oct. 1998.
[4] J. J. Shynk, “Frequency-domain and multirate adaptive filtering,” IEEE Signal Process. Mag., vol. 9, no. 1, pp. 14–35, Jan. 1992.
[5] D. Falconer, S. L. Ariyavisitakul, A. Benyamin-Seeyar, and B. Eidson, “Frequency domain equalization for single-carrier broadband wireless systems,” IEEE Commun. Mag., vol. 40, no. 4, pp. 58–66, Apr. 2002.
[6] H. Sari, G. Karam, and I. Jeanclaude, “Frequency domain equalization of mobile radio and terrestrial broadcast channels,” in Proc. Global Telecommunications (GLOBECOM), San Francisco, CA, Nov.–Dec. 1994, pp. 1–5.
[7] A. Czylwik, “Comparison between adaptive OFDM and single carrier modulation with frequency domain equalization,” in Proc. IEEE Vehicular Technology Conf. (VTC), New York, Spring 1998, vol. 2, pp. 865–869.
[8] G. Kadel, “Diversity and equalization in frequency domain—A robust and flexible receiver technology for broadband mobile communication systems,” in Proc. Vehicular Technology Conf. (VTC), Phoenix, AZ, May 1997, vol. 2, pp. 894–898.
[9] S. Alamouti, “A simple transmit diversity technique for wireless communications,” IEEE J. Sel. Areas Commun., vol. 16, no. 8, pp. 1451–1458, Oct. 1998.
[10] N. Al-Dhahir, “Single-carrier frequency-domain equalization for space-time-coded transmissions over broadband wireless channels,” in Proc. Personal Indoor and Mobile Radio Communications (PIMRC), San Diego, CA, Sep./Oct. 2001, pp. B143–B146.
[11] X. Zhu and R. D. Murch, “Novel frequency-domain equalization architectures for a single-carrier wireless MIMO system,” in Proc. Vehicular Technology Conf. (VTC), Vancouver, BC, Canada, Sep. 2002, pp. 874–878.
[12] Y. Li, L. J. Cimini, Jr., and N. R. Sollenberger, “Robust channel estimation for OFDM systems with rapid dispersive fading channels,” IEEE Trans. Commun., vol. 46, no. 7, pp. 902–915, Jul. 1998.
[13] Y. Li, “Pilot-symbol-aided channel estimation for OFDM in wireless systems,” IEEE Trans. Veh. Technol., vol. 49, no. 4, pp. 1207–1215, Jul. 2000.
[14] H. Meyr, M. Oerder, and A. Polydoros, “On sampling rate, analog prefiltering and sufficient statistics for digital receivers,” IEEE Trans. Commun., vol. 42, no. 12, pp. 3208–3214, Dec. 1994.
[15] U. Mengali and A. N. D’Andrea, Synchronization Techniques for Digital Receivers. New York: Plenum, 1997.

Michele Morelli (M’04) received the Laurea degree (cum laude) in electrical engineering and the “Premio di Laurea SIP” degree from the University of Pisa, Pisa, Italy, in 1991 and 1992, respectively, and the Ph.D. degree in electrical engineering from the Department of Information Engineering, University of Pisa, in 1995. In September 1996, he was a Research Assistant at the Centro Studi Metodi e Dispositivi per Radiotrasmissioni (CSMDR), Italian National Research Council (CNR), Pisa, Italy. Since 2001, he has been with the Department of Information Engineering, University of Pisa, where he is currently an Associate Professor of Telecommunications. His research interests are in wireless communication theory, with emphasis on equalization, synchronization, and channel estimation in multiple-access communication systems.

Luca Sanguinetti (S’04) received the Laurea degree (cum laude) in information engineering from the University of Pisa, Pisa, Italy, in 2002, and is currently working toward the Ph.D. degree in information engineering in the Department of Information Engineering, University of Pisa. In 2004, he was a Visiting Ph.D. Student at the German Aerospace Center (DLR), Oberpfaffenhofen, Germany. His research interests span the areas of communications and signal processing, estimation, and detection theory. Current research topics focus on transmitter and receiver diversity techniques for single- and multiuser fading communication channels, antenna array processing, channel estimation and equalization, multiple-input multiple-output (MIMO) systems, multicarrier systems, and linear and nonlinear prefiltering for interference mitigation in multiuser environments.

Umberto Mengali (M’69–SM’85–F’90–LF’03) received the degree in electrical engineering from the University of Pisa, Pisa, Italy, and the Libera Docenza degree in telecommunications from the Italian Education Ministry, Italy, in 1971. Since 1963, he has been with the Department of Information Engineering, University of Pisa, where he is a Professor of Telecommunications. In 1994, he was a Visiting Professor at the University of Canterbury, New Zealand, as an Erskine Fellow. His research interests are in digital communications and communication theory, with emphasis on synchronization methods and modulation techniques. He coauthored the book Synchronization Techniques for Digital Receivers (Plenum Press, 1997). Prof. Mengali is a member of the Communication Theory Committee and was the Editor of the IEEE TRANSACTIONS ON COMMUNICATIONS from 1985 to 1991 and of the European Transactions on Telecommunications from 1997 to 2000. He has served on the technical program committees of several international conferences and was Co-Chair of the 2004 International Symposium on Information Theory and Applications (ISITA).