IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 6, NO. 11, NOVEMBER 2007
Doubly-Selective Channel Estimation Using Data-Dependent Superimposed Training and Exponential Basis Models

Jitendra K. Tugnait, Fellow, IEEE, and Shuangchi He
Abstract—Channel estimation for single-user frequency-selective time-varying channels is considered using superimposed training. The time-varying channel is assumed to be well-approximated by a complex exponential basis expansion model (CE-BEM). A periodic (non-random) training sequence is arithmetically added (superimposed) at low power to the information sequence at the transmitter before modulation and transmission. In existing first-order statistics-based channel estimators, the information sequence acts as interference, resulting in a poor signal-to-noise ratio (SNR). In this paper a data-dependent superimposed training sequence is used to cancel out, at the receiver, the effects of the unknown information sequence on channel estimation. A performance analysis is presented. We also consider the issue of superimposed training power allocation. Several illustrative computer simulation examples are presented.

Index Terms—Channel estimation, doubly-selective channels, ISI channels, superimposed training.
I. INTRODUCTION

Consider a doubly-selective (time- and frequency-selective) SIMO (single-input multiple-output) FIR (finite impulse response) linear channel with N outputs, discrete-time impulse response {h(n; l)} and scalar input {s(n)}. Then the symbol-rate, noise-free channel output x(n) and noisy output y(n) are given by (n = 0, 1, ···, T−1)

  x(n) := Σ_{l=0}^{L} h(n; l) s(n−l),  y(n) := x(n) + v(n).  (1)

In a CE-BEM representation [1], [7] it is assumed that

  h(n; l) = Σ_{q=−(Q−1)/2}^{(Q−1)/2} h_q(l) e^{jω_q n},  ω_q := 2πq/T,  (2)

  L := ⌈τ_d/T_s⌉,  Q := 2⌈f_d T T_s⌉ + 1,
where, for an observation record length of T T_s sec. with symbol interval T_s and T symbols in the given block, the underlying continuous-time channel has a delay spread of τ_d sec. and a Doppler spread of f_d Hz. In (2), the h_q(l)'s are fixed over the data block of T symbols.

In conventional time-multiplexed (TM) training-based approaches to channel estimation for time-varying channels, one has to send a training signal frequently and periodically to keep up with the changing channel [6]. This wastes resources. In superimposed training one takes

  s(n) = b(n) + c(n),  (3)

where {b(n)} is the information sequence and {c(n)} is a training (pilot) sequence added (superimposed) at low power to the information sequence at the transmitter before modulation and transmission over the same channel. There is no loss in data transmission rate, unlike TM training, but some useful power is wasted on superimposed training. Periodic superimposed training has been discussed in [2], [9], [11] for time-invariant channels, and in [8] and [10] for time-varying (CE-BEM based) channels. The CE-BEM representation of doubly-selective channels has been used in [1], [5]-[7], among others.

Objectives and Contributions: An approach followed in [10] and [8] is to first estimate the channel h(n; l) from the noisy observations using the first-order statistics of the data, the CE-BEM, and knowledge of the superimposed training, and then use the estimated channel to detect the information sequence. In this approach, the information sequence acts as interference, resulting in a poor signal-to-noise ratio (SNR). In this letter, inspired by the time-invariant channel results of [2], we consider a data-dependent superimposed training sequence to cancel out, at the receiver, the effects of the unknown information sequence on channel estimation.

Notation: Superscripts H, ∗, † and T denote the complex conjugate transpose, complex conjugation, Moore-Penrose pseudo-inverse and transpose operations, respectively. δ(τ) is the Kronecker delta and I_N is the N × N identity matrix. The symbol ⊗ denotes the Kronecker product, and tr(A) is the trace of matrix A. The symbol 0_L denotes a null column of size L. The notation y = O(x) means that there exists some finite real number b > 0 such that |y/x| ≤ b.

[Manuscript received May 10, 2006; revised December 30, 2006 and May 31, 2007; accepted July 16, 2007. The associate editor coordinating the review of this letter and approving it for publication was G. Vitetta. This work was supported by the NSF under Grant ECS-0424145. A preliminary version of this paper was presented at the 2006 Conf. on Information Sciences & Systems, Princeton University, Princeton, NJ, March 2006. The authors are with the Department of Electrical & Computer Engineering, 200 Broun Hall, Auburn University, Auburn, AL 36849 USA (e-mail: {heshuan, tugnajk}@eng.auburn.edu). Digital Object Identifier 10.1109/TWC.2007.060246.]

II. DATA-DEPENDENT SUPERIMPOSED TRAINING BASED SOLUTION

We first consider the approach of [10] and assume the following:

(H1) The time-varying channel {h(n; l)} satisfies (2) with ω_q = 2πq/T. The information sequence {b(n)} is zero-mean, white, with E{|b(n)|²} = σ_b². The measurement noise {v(n)} is zero-mean, white, uncorrelated with {b(n)}, with E{v(n+τ)[v(n)]^H} = σ_v² I_N δ(τ). The superimposed training sequence c(n) = c(n+mP) ∀m, n is a non-random periodic sequence with period P.
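As a concrete illustration, the signal model (1)-(3) under (H1) can be simulated directly. The following is a minimal NumPy sketch; all parameter values are illustrative (not the paper's simulation settings), and the chirp training sequence of [9] used later in Section V is assumed here for {c(n)}:

```python
import numpy as np

rng = np.random.default_rng(0)
T, P = 56, 7                  # block length (a multiple of P) and training period
Q, L = 3, 2                   # BEM order and channel memory; N = 1 receiver
n = np.arange(T)
qs = np.arange(-(Q - 1)//2, (Q - 1)//2 + 1)

# CE-BEM channel (2): h(n; l) = sum_q h_q(l) e^{j 2 pi q n / T}
h_q = (rng.standard_normal((Q, L + 1)) + 1j*rng.standard_normal((Q, L + 1)))/np.sqrt(2)
h = np.einsum('ql,qn->nl', h_q, np.exp(2j*np.pi*qs[:, None]*n[None, :]/T))

# superimposed training (3): s(n) = b(n) + c(n), periodic chirp c(n) as in [9]
b = rng.choice([-1.0, 1.0], size=T)            # BPSK information sequence
sigma_c = np.sqrt(0.1)                         # TIR = 0.1
c = sigma_c*np.exp(1j*np.pi*n*(n + 1)/P)       # period P (P odd, nu = 1)

# noisy output (1); a cyclic prefix makes the convolution circular over the block
s = b + c
x = sum(h[:, l]*np.roll(s, l) for l in range(L + 1))   # roll(s, l)[n] = s[(n-l) mod T]
y = x + 0.05*(rng.standard_normal(T) + 1j*rng.standard_normal(T))
```

Note that for P odd and ν = 1 the chirp is exactly P-periodic, as (H1) requires.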
Then c(n) = Σ_{m=0}^{P−1} c_m e^{jα_m n} ∀n, where c_m := (1/P) Σ_{n=0}^{P−1} c(n) e^{−jα_m n} and α_m := 2πm/P. It follows that

  E{y(n)} = Σ_{q=−(Q−1)/2}^{(Q−1)/2} Σ_{m=0}^{P−1} [ Σ_{l=0}^{L} c_m h_q(l) e^{−jα_m l} ]_{=: d_mq} e^{j(ω_q+α_m)n}.  (4)
Define

  D_m := [d_{m(−(Q−1)/2)}^T, d_{m(−(Q−3)/2)}^T, ···, d_{m((Q−1)/2)}^T]^T,  (5)

  H_l := [h_{−(Q−1)/2}^T(l), h_{−(Q−3)/2}^T(l), ···, h_{(Q−1)/2}^T(l)]^T,  (6)

  H := [H_0^H, H_1^H, ···, H_L^H]^H,  D := [D_0^H, D_1^H, ···, D_{P−1}^H]^H,  (7)

  V := [ 1  1         ···  1
         1  e^{−jα_1}  ···  e^{−jα_1 L}
         ⋮  ⋮          ⋱   ⋮
         1  e^{−jα_{P−1}} ··· e^{−jα_{P−1} L} ],  (8)

and C := ([diag{c_0, c_1, ···, c_{P−1}}] V) ⊗ I_{NQ}. It then follows that CH = D. It is shown in [10] that if P ≥ L+1, then rank(C) = NQ(L+1); hence, we can determine the h_q(l)'s uniquely. In [10] an estimate d̂_mq of d_mq is given as

  d̂_mq = (1/T) Σ_{n=0}^{T−1} y(n) e^{−j(ω_q+α_m)n}.  (9)

Define

  D̂_m := [d̂_{m(−(Q−1)/2)}^T, d̂_{m(−(Q−3)/2)}^T, ···, d̂_{m((Q−1)/2)}^T]^T

and define D̂ as in (7) with the D_m's replaced with the D̂_m's. Then we have the channel coefficient estimate Ĥ = C†D̂ = (C^H C)^{−1} C^H D̂. The channel estimate is then given by

  ĥ(n; l) = Σ_{q=−(Q−1)/2}^{(Q−1)/2} ĥ_q(l) e^{jω_q n}.  (10)

Now we motivate our data-dependent superimposed training scheme. Consider (9). It has contributions from the information sequence {b(n)}, unknown at the receiver, the superimposed training {c(n)}, known at the receiver, and the noise v(n). Our aim is to null out the contribution of {b(n)}_{n=0}^{T−1} to d̂_mq for 0 ≤ m ≤ P−1 and −(Q−1)/2 ≤ q ≤ (Q−1)/2. For a given value of P, we pick the record length T such that T/P = K ≥ Q, where K > 0 is an integer. It then follows that (0 ≤ m_1, m_2 ≤ P−1, −(Q−1)/2 ≤ q_1, q_2 ≤ (Q−1)/2)

  (1/T) Σ_{n=0}^{T−1} e^{j(−α_{m_1}+α_{m_2}−ω_{q_1}+ω_{q_2})n} = δ(m_1−m_2) δ(q_1−q_2)  (11)

since (α_m + ω_q) = (α_n + ω_k) iff m = n and q = k. It follows from (1)-(3) and (9) that (T = KP)

  d̂_mq = (1/T) Σ_{n=0}^{T−1} { E{y(n)} + Σ_{l=0}^{L} h(n; l) b(n−l) + v(n) } e^{−j(ω_q+α_m)n}.  (12)

By (4) and (11), (1/T) Σ_{n=0}^{T−1} E{y(n)} e^{−j(ω_q+α_m)n} = d_mq. Define w_mq := (1/T) Σ_{n=0}^{T−1} v(n) e^{−j(ω_q+α_m)n}. Then by (H1) and (11),

  E{w_mq} = 0,  E{w_mq w_{m_1 q_1}^H} = T^{−1} σ_v² I_N δ(m−m_1) δ(q−q_1),  (13)

and

  d̂_mq = d_mq + w_mq + s_mq,  s_mq := (1/T) Σ_{n=0}^{T−1} [ Σ_{l=0}^{L} h(n; l) b(n−l) ] e^{−j(ω_q+α_m)n}.  (14)

Clearly, the information sequence's contribution s_mq interferes with the estimation of d_mq from d̂_mq, hence with channel estimation from the observations. Consider the discrete Fourier transform (DFT) of {b(n)}:

  b(n) = Σ_{r=0}^{KP−1} b_r e^{jω_r n},  b_r := (1/(KP)) Σ_{n=0}^{KP−1} b(n) e^{−jω_r n},  ω_r := 2πr/(KP).  (15)

Then s_mq can be expressed as

  s_mq = Σ_{q_1=−(Q−1)/2}^{(Q−1)/2} Σ_{l=0}^{L} Σ_{r=0}^{KP−1} h_{q_1}(l) e^{−jω_r l} b_r A(q_1, r, q, m)  (16)

where

  A(q_1, r, q, m) := (1/T) Σ_{n=0}^{T−1} e^{j(ω_{q_1}+ω_r−ω_q−α_m)n} = δ((q_1 + r − q − mK) mod T).  (17)

Therefore, if we can make b_r = 0 for r = q + mK − q_1, −(Q−1)/2 ≤ q, q_1 ≤ (Q−1)/2, m = 0, 1, ···, P−1, then s_mq = 0. We do so by modifying {c(n)} based on {b(n)} (at the transmitter). Define the set

  Ω := {r | −(Q−1) ≤ r − Km ≤ Q−1, m = 0, 1, ···, P−1}.

Define a data-dependent superimposed training c̃(n) over the block {n = 0, 1, ···, T−1} such that

  c̃(n) := c(n) − b_e(n),  b_e(n) := Σ_{r=0, r∈Ω}^{KP−1} b_r e^{j2πrn/(KP)}.  (18)
Note that c̃(n) is no longer periodic with period P. Transmit c̃(n) + b(n) = c(n) + [b(n) − b_e(n)]. The model (1)-(3) holds with c(n) replaced with c̃(n). By construction, the DFT of b(n) − b_e(n) over the block n = 0, 1, ···, T−1 vanishes at the frequencies in the set Ω. The DFT of b(n−l) − b_e(n−l) over the block n = 0, 1, ···, T−1 also vanishes at the frequencies in the set Ω, provided that a cyclic prefix of length L_p ≥ L is used. A cyclic prefix of length L_p is added at the transmitter by choosing s(−i) = s(T−i), i = 1, 2, ···, L_p ≥ L, where
s(i) = c̃(i) + b(i). This allows the linear convolution in (1) to equal circular convolution (implicit in the DFT operation) over the block length n = 0, 1, ···, T−1 = KP−1. We summarize our channel estimation solution in Table I.

Data Detection: Now the "information sequence" is {b(n) − b_e(n)} whereas we are interested in {b(n)}. We follow an iterative solution, similar to the time-invariant results of [2]. The first step in our solution is to use the estimated channel to detect {b(n)} via the Viterbi algorithm (ignoring b_e(n) but accounting for the known {c(n)}). Use the detected {b(n)} to estimate {b_e(n)}, and iterate the detection procedure (but not the channel estimation) with known {c(n)} and the estimate {b̂_e(n)} from the previous iteration.

III. PERFORMANCE ANALYSIS

We now analyze the performance of the channel estimator presented in Section II. We note that for an arbitrary channel h(n; l), the following is always true:

  h(n; l) = Σ_{q=−(T−1)/2}^{(T−1)/2} h_q(l) e^{jω_q n},  n = 0, 1, ···, T−1,  (19)

where h_q(l) = (1/T) Σ_{n=0}^{T−1} h(n; l) e^{−jω_q n}. Our model (2) is then an approximation to (19). Let

  h_BEM(n; l) := Σ_{q=−(Q−1)/2}^{(Q−1)/2} h_q(l) e^{jω_q n},  (20)

so that

  e_BEM(n; l) := h(n; l) − h_BEM(n; l) = Σ_{q=−(T−1)/2}^{−(Q+1)/2} h_q(l) e^{jω_q n} + Σ_{q=(Q+1)/2}^{(T−1)/2} h_q(l) e^{jω_q n}.  (21)

Then we only estimate the h_BEM(n; l) part as ĥ_BEM(n; l) using (10). The mean-square error in channel estimation is

  MSE := (1/T) Σ_{n=0}^{T−1} Σ_{l=0}^{L} E{‖h_BEM(n; l) − ĥ_BEM(n; l) + e_BEM(n; l)‖²} =: MSE1 + MSE2,  (22)

where

  MSE2 := (1/T) Σ_{n=0}^{T−1} Σ_{l=0}^{L} E{e_BEM^H(n; l) e_BEM(n; l)},  (23)

  MSE1 := (1/T) Σ_{n=0}^{T−1} Σ_{l=0}^{L} E{‖h_BEM(n; l) − ĥ_BEM(n; l)‖²}.  (24)

By the results of Section II, we have E{Ĥ} = H and

  cov{Ĥ} := E{[Ĥ − H][Ĥ − H]^H} = C† cov{D̂} C†^H.  (25)

Using (12)-(14) and the fact that s_mq = 0 by design, it can be shown that

  cov{Ĥ} = (σ_v²/T)(C^H C)^{−1} =: Σ_h,  which leads to  MSE1 = tr cov{Ĥ} = tr Σ_h.  (26)

In order to evaluate MSE2 further, we make the following assumption:

(H2) The time-varying channel {h(n; l)} is zero-mean, complex Gaussian with E{h(n; l) h^H(n_1; l_1)} = R_h(n−n_1; l) δ(l−l_1).

Let r_h(n−n_1; l) := tr R_h(n−n_1; l). Assumption (H2) represents a WSSUS (wide-sense stationary uncorrelated scattering) fading channel. Under (H2), MSE2 can be simplified as

  MSE2 = (1/T) Σ_{l=0}^{L} { (T−Q) r_h(0; l) − Σ_{τ=1}^{T−1} (1 − τ/T) [r_h(τ; l) + r_h(−τ; l)] · sin(πτQ/T)/sin(πτ/T) }.  (27)

For the Jakes' channel model (used in the simulations), we have

  r_h(n_1−n_2; l) = N σ_h²(l) J_0(2πf_d T_s (n_1−n_2)),  σ_h²(l) = N^{−1} r_h(0; l),  (28)

where J_0(·) denotes the zeroth-order Bessel function of the first kind.

IV. TRAINING POWER ALLOCATION

In this section we consider the issue of superimposed training power allocation under (H1) and (H2). Corresponding results for time-invariant channels are in [11]. Removing the estimated time-varying mean from the received data, define ỹ(n) := y(n) − Σ_{l=0}^{L} ĥ_BEM(n; l) c(n−l). Then we have (29) (given after Table I). The power of the "signal" part x_s(n) at time n is

  σ_{x_s}²(n) := E{‖x_s(n)‖²} = σ_b² Σ_{l=0}^{L} E{‖ĥ_BEM(n; l)‖²} + O(1/T²),  (30)

where the O(1/T²) term accounts for the dependence between ĥ_q(l) and {b(n)} (see [4]). Furthermore,

  σ_b² Σ_{l=0}^{L} E{‖ĥ_BEM(n; l)‖²} = σ_b² [ MSE1 + Σ_{l=0}^{L} E{‖h_BEM(n; l)‖²} ].  (31)

Taking its time-average we can show that

  σ̄_{x_s}² := (1/T) Σ_{n=0}^{T−1} σ_{x_s}²(n) = σ_b² [ MSE1 + N Σ_{l=0}^{L} σ_h²(l) − MSE2 ] + O(1/T²).  (32)
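To make (27) concrete, the following sketch evaluates MSE2 numerically for a user-supplied correlation trace r_h(τ; l) (for the Jakes' model (28) one would supply r_h(τ; l) = N σ_h²(l) J_0(2πf_d T_s τ), e.g. via scipy.special.j0). The two sanity checks use correlations whose MSE2 values follow analytically from (27): a time-invariant channel, for which the BEM is exact and MSE2 = 0, and a single Doppler tone at DFT bin 3 with Q = 3, which lies entirely outside the BEM band so each of the L+1 unit-power taps is lost:

```python
import math

def mse2(r_h, T, Q, L):
    """BEM truncation error MSE2 of (27); r_h(tau, l) is the correlation trace."""
    total = 0.0
    for l in range(L + 1):
        acc = (T - Q)*r_h(0, l)
        for tau in range(1, T):
            # Dirichlet kernel sin(pi tau Q / T) / sin(pi tau / T)
            d = math.sin(math.pi*tau*Q/T)/math.sin(math.pi*tau/T)
            acc -= (1 - tau/T)*(r_h(tau, l) + r_h(-tau, l))*d
        total += acc
    return total/T

T, Q, L = 64, 3, 1
flat = mse2(lambda tau, l: 1.0, T, Q, L)                      # time-invariant: 0
tone = mse2(lambda tau, l: math.cos(2*math.pi*3*tau/T), T, Q, L)  # tone at bin 3: L+1
```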
TABLE I
SUMMARY OF THE PROPOSED CHANNEL ESTIMATOR

1) At the transmitter, we are given b(n) for 0 ≤ n ≤ T−1 with T chosen as T = KP, K ≥ Q. Calculate the DFT b_r := (1/(KP)) Σ_{n=0}^{KP−1} b(n) e^{−j2πrn/(KP)}, 0 ≤ r ≤ KP−1.

2) To eliminate interference with channel estimation at the receiver, we need to set the b_r's to zero for r ∈ Ω := {r | −(Q−1) ≤ r − Km ≤ Q−1, m = 0, 1, ···, P−1}. Define the data-dependent superimposed training c̃(n) as in (18). Use a cyclic prefix of length L_p ≥ L and transmit.

3) The channel estimation part given by Ĥ = C†D̂ in Section II stays the same for data-dependent superimposed training because we still use the periodic {c(n)} at the receiver, and we do not know b_e(n) or b(n) at the receiver. It is easily established that now there is no contribution of {b(n)} to d̂_mq for 0 ≤ m ≤ P−1 and −(Q−1)/2 ≤ q ≤ (Q−1)/2.

  ỹ(n) = Σ_{l=0}^{L} ĥ_BEM(n; l) b(n−l) + Σ_{l=0}^{L} [h(n; l) − ĥ_BEM(n; l)][b(n−l) + c(n−l)] − Σ_{l=0}^{L} h(n; l) b_e(n−l) + v(n) =: x_s(n) + w(n),  (29)

where x_s(n) denotes the first sum and w(n) collects the remaining terms.
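Steps 1)-3) of Table I, together with the first-order estimate (9), can be prototyped end to end. The sketch below uses illustrative parameters (N = 1, noiseless, cyclic prefix realized as circular convolution; not the paper's simulation settings) and verifies the key property: with c̃(n) transmitted, the statistic (9) returns d_mq exactly, with no contribution from {b(n)}. The set Ω is built directly from the requirement b_r = 0 at r = (q + mK − q_1) mod T:

```python
import numpy as np

rng = np.random.default_rng(1)
P, Q, L = 7, 3, 2
K = 8                                   # T/P = K >= Q
T = K*P
n = np.arange(T)
qs = np.arange(-(Q - 1)//2, (Q - 1)//2 + 1)

# CE-BEM channel (2) and signals, N = 1 receiver
h_q = (rng.standard_normal((Q, L + 1)) + 1j*rng.standard_normal((Q, L + 1)))/np.sqrt(2)
h = np.einsum('ql,qn->nl', h_q, np.exp(2j*np.pi*qs[:, None]*n[None, :]/T))
c = np.sqrt(0.1)*np.exp(1j*np.pi*n*(n + 1)/P)   # periodic chirp training, period P
b = rng.choice([-1.0, 1.0], size=T)             # BPSK information block

# Steps 1)-2): null the DFT of b on Omega = {(q + mK - q1) mod T}
Omega = sorted({(q + m*K - q1) % T for m in range(P) for q in qs for q1 in qs})
mask = np.isin(np.arange(T), Omega)
b_e = np.fft.ifft(np.where(mask, np.fft.fft(b), 0))   # b_e(n) of (18)
s = (b - b_e) + c                                     # transmit c~(n) + b(n)

# noiseless received block; cyclic prefix makes the convolution circular
x = sum(h[:, l]*np.roll(s, l) for l in range(L + 1))

# Step 3): first-order statistics estimate (9) vs. the true d_mq of (4)
alpha = 2*np.pi*np.arange(P)/P
c_m = np.array([np.mean(c[:P]*np.exp(-1j*a*np.arange(P))) for a in alpha])
d_true = np.array([[c_m[m]*np.sum(h_q[qi]*np.exp(-1j*alpha[m]*np.arange(L + 1)))
                    for qi in range(Q)] for m in range(P)])
d_hat = np.array([[np.mean(x*np.exp(-2j*np.pi*(qs[qi] + m*K)*n/T))
                   for qi in range(Q)] for m in range(P)])

# with data-dependent training, {b(n)} contributes nothing to d_hat
assert np.allclose(d_hat, d_true)
```

The final assertion holds exactly (up to floating point) because the DFT of b − b_e vanishes on Ω while the periodic c contributes only at multiples of K.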
The time-averaged noise power in (29) is defined as

  σ̄_w² := (1/T) Σ_{n=0}^{T−1} E{‖w(n)‖²}.  (33)

Its expression turns out to be (neglecting O(1/T²) terms)

  σ̄_w² = σ_b² (MSE1 + MSE2) + N σ_v² + (N M σ_b²/T) Σ_{l=0}^{L} σ_h²(l) + σ_v²(L+1)NQ/T + σ_c² MSE2 − (2M σ_b²/T) MSE2,  (34)

where M := |Ω| is the number of elements in the set Ω. We define the training power overhead β as

  β := [(1/P) Σ_{n=1}^{P} |c(n)|²] [(1/P) Σ_{n=1}^{P} E{|s(n)|²}]^{−1} = σ_c² [σ_b² + σ_c²]^{−1}.  (35)

Then the equalization SNR of (29) as a function of β is (implicitly) obtained as

  SNR_d(β) = σ̄_{x_s}²(β) [σ̄_w²(β)]^{−1}.  (36)

Our objective is to maximize SNR_d(β) w.r.t. β under the constraint of a fixed transmitted power: P_T := σ_b² + σ_c² is fixed. Then σ_c² = P_T β and σ_b² = P_T(1−β). Incorporating these constraint-carrying variables in (36) via (32) and (34), we obtain the unconstrained cost

  SNR_d(β) = [f_1 β² + f_2 β + f_3] [g_1 β² + g_2 β + g_3]^{−1},  (37)

  f_1 = [MSE2 − N(Σ_{l=0}^{L} σ_h²(l))] P_T,
  f_2 = N(Σ_{l=0}^{L} σ_h²(l)) P_T − MSE2 · P_T − a_1 σ_v²,
  f_3 = a_1 σ_v²,  a_1 := (σ_c²/σ_v²) MSE1,
  g_1 = −(N M P_T/T)(Σ_{l=0}^{L} σ_h²(l)) + (2M P_T/T) MSE2,
  g_2 = −a_1 σ_v² + MSE2 · P_T + N σ_v² + σ_v²(L+1)NQ/T − g_1,
  g_3 = a_1 σ_v².

We seek the optimum value of β by setting the first derivative of the unconstrained cost to zero:

  (d/dβ)[SNR_d(β)] = [(f_1 g_2 − f_2 g_1) β² + 2(f_1 g_3 − f_3 g_1) β + f_2 g_3 − f_3 g_2] / (g_1 β² + g_2 β + g_3)² = 0.

The above quadratic in β has two roots, of which the root lying in [0, 1] is given in (38).
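Since (37) is a ratio of quadratics in β, the stationary point in [0, 1] can be picked directly from the quadratic above. A minimal sketch follows; the parameter values are assumptions chosen only for illustration (the closed form a_1 = (L+1)NQ/T used below is a consequence of (26) for the constant-modulus chirp training of [9], for which C^H C = σ_c² I, and is not stated as such in the text):

```python
import numpy as np

# illustrative parameters (assumed values, for demonstration only)
P_T, sigv2 = 1.0, 0.1        # total transmitted power, noise variance
N, L, Q, T, M = 1, 2, 3, 400, 35
sum_sh2 = 1.0                # sum over l of sigma_h^2(l)
MSE2 = 0.01
a1 = (L + 1)*N*Q/T           # (sigma_c^2/sigma_v^2) MSE1 for chirp training (derived)

f1 = (MSE2 - N*sum_sh2)*P_T
f2 = N*sum_sh2*P_T - MSE2*P_T - a1*sigv2
f3 = a1*sigv2
g1 = -(N*M*P_T/T)*sum_sh2 + (2*M*P_T/T)*MSE2
g2 = -a1*sigv2 + MSE2*P_T + N*sigv2 + sigv2*(L + 1)*N*Q/T - g1
g3 = a1*sigv2

snr = lambda beta: (f1*beta**2 + f2*beta + f3)/(g1*beta**2 + g2*beta + g3)

# stationary points: (f1 g2 - f2 g1) b^2 + 2 (f1 g3 - f3 g1) b + (f2 g3 - f3 g2) = 0
roots = np.roots([f1*g2 - f2*g1, 2*(f1*g3 - f3*g1), f2*g3 - f3*g2])
beta_opt = next(r.real for r in roots if abs(r.imag) < 1e-12 and 0 <= r.real <= 1)
```

For these values the selected root lies near β ≈ 0.11 and is a local maximum of SNR_d(β), consistent with the small optimal training overheads seen in Fig. 5.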
  β_opt = [ −f_1 g_3 + f_3 g_1 − √( −f_1 f_2 g_2 g_3 − 2 f_1 f_3 g_1 g_3 − f_2 f_3 g_1 g_2 + f_2² g_1 g_3 + f_1 f_3 g_2² + f_1² g_3² + f_3² g_1² ) ] / (f_1 g_2 − f_2 g_1).  (38)

V. SIMULATION EXAMPLES

We consider several different simulation examples.

Channel 1. We took N = 1, 2, or 3, and L = 5 (6 taps) in (1). For different l's, the h(n; l)'s are mutually independent and spatially white, and for a given l, we follow the modified Jakes' model [12] to generate h(n; l) with a specified Doppler spread f_d and symbol interval T_s. We scale {h(n; l)}_n to achieve an exponential power delay profile given by E{|h(n; l)|²} = e^{−0.2l}/(L+1). We consider a system with a carrier frequency of 2 GHz and symbol interval T_s = 25 μs.

Channel 2. Here we take N = 1, L = 2 (3-tap channel), a uniform power delay profile, and T_s = 25 μs. The rest is as for Channel 1.

Channel 3. Here we take N = 1, L = 1 (2-tap channel), a uniform power delay profile, and T_s = 200 μs; the rest is as for Channels 1 and 2. For f_d = 100 Hz, the normalized Doppler spread is f_d T_s = 0.02, and for f_d = 250 Hz, it is 0.05.

Additive noise in each example was zero-mean complex white Gaussian. The (receiver) SNR refers to the energy per bit over one-sided noise spectral density, with both the information and the superimposed training sequences counting toward the bit energy. The information sequence was BPSK (binary). We took the superimposed training sequence period P = 7, 4, or 2 (for Channels 1, 2, and 3, respectively) in (H1); it is given by c(n) = σ_c e^{jπn(n+ν)/P}, where ν = 1 if P is odd and ν = 2 if P is even, as in [9]. The average transmitted power σ_c² in c(n) was 0.1 of the power in b(n): a training-to-information power ratio (TIR) of 0.1. We also provide comparisons with the approach of [6], where s, a column composed of {s(n)}_{n=0}^{T−1}, is arranged as

  s := [b_1^T, c_1^T, b_2^T, c_2^T, ···, b_J^T, c_J^T]^T  (39)

where b_j is a column of N_b information symbols and c_j is a column of N_c training symbols, leading to T = J(N_b + N_c). It is shown in [6] that (39) is an optimum structure with N_c = 2L+1, J = Q and c_j = [0_L^T, γ, 0_L^T]^T, γ > 0. For a fair comparison, for a given T, we keep the training-to-information rate overhead of (2L+1)J[T − (2L+1)J]^{−1} = N_c/N_b, as well as the training-to-information power ratio of γ²Jσ_b^{−2}[T − (2L+1)J]^{−1} = γ²[σ_b² N_b]^{−1} (with J = Q whenever possible) pertaining to the TM training scheme of [6], equal to the training-to-information power ratio of σ_c²/σ_b² for the superimposed training-based schemes, where σ_c² := P^{−1} Σ_{n=1}^{P} |c(n)|².

Example 1. Here we consider Channel 1 with T = 840 (or 847 for TM) bits and f_d = 100 or 200 Hz; the results are shown in Fig. 1 based on 500 Monte Carlo runs. For superimposed training we picked TIR = σ_c²/σ_b² = 0.1, and for TM training we picked N_c = 11 = 2L+1, J = 7, and N_b = 110. The normalized channel mean-square error (NCMSE) in channel estimation is

  NCMSE := [ M_r^{−1} Σ_{i=1}^{M_r} Σ_{n=0}^{T−1} Σ_{l=0}^{L} |ĥ^{(i)}(n; l) − h^{(i)}(n; l)|² ] / [ M_r^{−1} Σ_{i=1}^{M_r} Σ_{n=0}^{T−1} Σ_{l=0}^{L} |h^{(i)}(n; l)|² ]

where h^{(i)}(n; l) is the true channel, ĥ^{(i)}(n; l) is the estimated channel at the i-th Monte Carlo run, and there are M_r runs in total. The corresponding detection results are based on the Viterbi algorithm utilizing the estimated channel. It is seen that the proposed data-dependent (DD) superimposed training yields superior results compared to the non-DD (NDD) superimposed training, and furthermore it is competitive with TM training without incurring the 10% training overhead penalty resulting in a data transmission rate loss. For f_d = 100 and 200 Hz, we pick Q = 7 or 11, respectively, per (2). When Q = 11, for TM training, we cannot satisfy N_c/N_b = TIR = 0.1 for J ≥ Q; we had to settle for J = 7, leading to a loss of parameter identifiability (more unknowns than equations). In Fig. 1 we also show the results for J = 11, leading to a reduced N_b with N_c/N_b = 0.167. The performance clearly improves and is better than that of DD superimposed training, but at the cost of a 16.7% reduction in the transmission rate. Fig. 2 shows the detection results (based on the estimated channel and the Viterbi algorithm) for multiple receivers when f_d = 100 Hz: N = 1, 2, 3. Again we see that DD superimposed training is better than TM training.

Fig. 1. NCMSE (normalized channel mean-square error) or BER vs. SNR for f_d = 100 or 200 Hz. 6-tap Jakes' channel with exponential power delay profile; record length = 840 bits; Viterbi detector with N = 1, L = 5, P = 7, TIR = 0.1, 500 runs. TM: time-multiplexed training of [6]; DD: proposed data-dependent superimposed training; NDD: non-data-dependent superimposed training of [10].

Fig. 2. BER vs. SNR for the 6-tap Jakes' channel with exponential power delay profile with f_d = 100 Hz and N = 1, 2, or 3 (L = 5, P = 7, TIR = 0.1, T = 840, 500 runs).

Example 2: Performance Analysis. Here we consider Channel 2. In Fig. 3 we show the channel MSE versus Doppler spread, where we compare our theoretical expressions with simulation-based MSE results and ±σ bounds. The mean-square channel estimation error for NDD training also has two components, MSE1′ and MSE2′, where MSE2′ = MSE2 and (following [3]) MSE1′ = [σ_b²(Σ_{l=0}^{L} σ_h²(l)) + σ_v²] T^{−1} tr{(C̃^H C̃)^{−1}}. The agreement between the theoretical and simulation-based results is good.

Fig. 3. Theoretical and simulation-based channel MSEs: Channel 2; varying Q with f_d following (2) (N = 1, L = 2, T = 400, P = 4, SNR = 25 dB, Q = 2⌈f_d T T_s⌉ + 1).

Example 3. Now we consider Channel 3 with a normalized Doppler spread of 0.02 or 0.05 and T = 840 (or somewhat higher for TM training); the corresponding values of Q are 35 and 85, respectively. Now the performance of all the schemes is worse because of the large number of unknowns to be estimated; however, DD superimposed training still outperforms TM training when we enforce the constraint N_c N_b^{−1} = TIR = 0.1, because we could not get J ≥ Q. With N_c = 2L+1 = 3, we also show in Fig. 4 the results for J = 35 and 85 for f_d = 100 and 250 Hz, respectively, leading to reduced N_b's with N_c N_b^{−1} = 0.143 or 0.429; the performance clearly improves and is better than that of DD superimposed training, but at the cost of a 14.3% or 42.9% reduction in the transmission rate.

Fig. 4. BER vs. SNR for the 2-tap Jakes' Channel 3 with uniform power delay profile with f_d = 100 or 250 Hz and N = 1 (L = 1, P = 2, TIR = 0.1, 500 runs).

Example 4: Training Power Allocation. Here the channel is as in Example 3 with f_d = 100 Hz, and our objective is optimum training power allocation for DD superimposed training following the results of Section IV. We consider N = 1 and T = 420 bits. With total transmitted power σ_c² + σ_b² = 1, we vary β (defined in (35)) to maximize the "equalization" SNR defined in (36). Fig. 5 shows the optimal β obtained from simulation results (based on 500 runs) and from the analytical expression (38): the two show good agreement. Note that the empirical β is a random variable. To show agreement with (38) we also plot ±σ bounds obtained from the simulations. Because the BER vs. β curves (not shown) for this example turn out to be quite "flat," the σ values are relatively large even at high SNRs.

Fig. 5. Channel 3: optimal theoretical and empirical βs versus received SNR (N = 1, L = 1, T = 420, f_d = 100 Hz, T_s = 200 μs, Q = 19). The curve labeled "analytical" follows (38). The curve labeled "simulated" is based on averaged BER (pick the β corresponding to the lowest average BER), whereas the curve labeled "averaged simulation" is based on the average β (in each run pick the "best" β and then average over 500 runs). Also shown are the ±σ bounds around the "averaged simulation" curve.

VI. CONCLUSIONS

Channel estimation for single-user frequency-selective time-varying CE-BEM channels using superimposed training was
considered. In existing first-order statistics-based channel estimators, the information sequence acts as interference, resulting in a poor signal-to-noise ratio (SNR). In this paper a data-dependent superimposed training sequence is used to cancel out, at the receiver, the effects of the unknown information sequence on channel estimation. Several illustrative computer simulation examples showed that the proposed approach outperforms a non-data-dependent approach.

REFERENCES

[1] G. B. Giannakis and C. Tepedelenlioğlu, "Basis expansion models and diversity techniques for blind identification and equalization of time-varying channels," Proc. IEEE, vol. 86, pp. 1969-1986, Oct. 1998.
[2] M. Ghogho, D. McLernon, E. Alameda-Hernandez, and A. Swami, "Channel estimation and symbol detection for block transmission using data-dependent superimposed training," IEEE Signal Processing Lett., vol. 12, pp. 226-229, Mar. 2005.
[3] S. He and J. K. Tugnait, "On bias-variance trade-off in superimposed training-based doubly selective channel estimation," in Proc. 2006 Conf. Information Sciences Syst., Mar. 2006, pp. 1308-1313.
[4] S. He, J. K. Tugnait, and X. Meng, "On superimposed training for MIMO channel estimation and symbol detection," IEEE Trans. Signal Processing, vol. 55, part 2, pp. 3007-3021, June 2007.
[5] G. Leus, "Semi-blind channel estimation for rapidly time-varying channels," in Proc. 2005 IEEE Int. Conf. Acoust., Speech, Signal Processing, Mar. 2005, vol. III, pp. 773-776.
[6] X. Ma, G. B. Giannakis, and S. Ohno, "Optimal training for block transmissions over doubly selective wireless fading channels," IEEE Trans. Signal Processing, vol. 51, pp. 1351-1366, May 2003.
[7] X. Ma and G. B. Giannakis, "Maximum-diversity transmissions over doubly-selective wireless channels," IEEE Trans. Inform. Theory, vol. 49, pp. 1832-1840, July 2003.
[8] X. Meng and J. K. Tugnait, "Superimposed training-based doubly-selective channel estimation using exponential and polynomial bases models," in Proc. 2004 Conf. Information Sciences Syst., Mar. 2004, pp. 621-626.
[9] A. G. Orozco-Lugo, M. M. Lara, and D. C. McLernon, "Channel estimation using implicit training," IEEE Trans. Signal Processing, vol. 52, pp. 240-254, Jan. 2004.
[10] J. K. Tugnait and W. Luo, "On channel estimation using superimposed training and first-order statistics," in Proc. 2003 IEEE Intern. Conf. Acoustics, Speech, Signal Proc., Apr. 2003, vol. 4, pp. 624-627.
[11] J. K. Tugnait and X. Meng, "On superimposed training for channel estimation: performance analysis, training power allocation, and frame synchronization," IEEE Trans. Signal Processing, vol. 54, pp. 752-765, Feb. 2006.
[12] Y. R. Zheng and C. Xiao, "Simulation models with correct statistical properties for Rayleigh fading channels," IEEE Trans. Commun., vol. 51, pp. 920-928, June 2003.