
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. 59, NO. 3, MARCH 2010


Multiuser/MIMO Doubly Selective Fading Channel Estimation Using Superimposed Training and Slepian Sequences

Jitendra K. Tugnait, Fellow, IEEE, and Shuangchi He

Abstract—We consider doubly selective multiuser/multiple-input–multiple-output (MIMO) channel estimation and data detection using superimposed training. The time- and frequency-selective fading channel is assumed to be well described by a discrete prolate spheroidal basis expansion model (DPS-BEM) using Slepian sequences as basis functions. A user-specific periodic (nonrandom) training sequence is arithmetically added (superimposed) at low power to each user's information sequence at the transmitter before modulation and transmission. A two-step approach is adopted, where, in the first step, we estimate the channel using only the first-order statistics of the observations. In this step, however, the unknown information sequence acts as interference, resulting in a poor signal-to-noise ratio (SNR). We then iteratively reduce the interference in the second step by employing an iterative channel-estimation and data-detection approach, where, by utilizing the detected symbols from the previous iteration, we sequentially improve the multiuser/MIMO channel estimation and symbol detection. Simulation examples demonstrate that, without incurring any transmission data rate loss, the proposed approach is superior to the conventional time-multiplexed (TM) training for uncoordinated users, where the multiuser interference in channel estimation cannot be eliminated, and is competitive with the TM training for coordinated users, where the TM training design allows for multiuser-interference-free channel estimation.

Index Terms—Basis expansion models (BEMs), discrete prolate spheroidal (DPS) sequences, doubly selective fading channels, multiple-input–multiple-output (MIMO) systems, multiuser channel estimation, superimposed training.

I. INTRODUCTION

Manuscript received October 7, 2008; revised December 8, 2009. First published December 18, 2009; current version published March 19, 2010. This work was supported by the National Science Foundation under Grant ECS-0424145 and Grant ECCS-0823987. This paper was presented in part at the IEEE International Conference on Acoustics, Speech, and Signal Processing, Honolulu, HI, April 2007. The review of this paper was coordinated by Dr. K. Hooli. J. K. Tugnait is with the Department of Electrical and Computer Engineering, Auburn University, Auburn, AL 36849 USA (e-mail: tugnajk@eng.auburn.edu). S. He is with the School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: heshuangchi@gatech.edu). Digital Object Identifier 10.1109/TVT.2009.2038786

This paper is concerned with channel estimation and data detection for multiple-input–multiple-output (MIMO) doubly selective (time- and frequency-selective) fading channels using superimposed training for both single and multiple users. The increasing demand for high-speed reliable wireless communications over the limited radio-frequency spectrum has spurred increasing interest in MIMO systems to achieve
higher transmission rates [32]. To exploit the enormous capacity potential of MIMO communications, accurate knowledge of channel state information is often a prerequisite for many physical-layer approaches. In (conventional) time-multiplexed (TM) training-based approaches, training sequences (known to the receiver), with one per transmit (Tx) antenna, are TM with the information sequence and transmitted. This incurs a loss in spectral efficiency, decreasing the effective data transmission rate. At the receiver, one estimates the channel via least-squares and related approaches. For time-varying channels, one has to frequently and periodically send training signals to keep up with the changing channel. This wastes resources. An alternative is to estimate the channel based solely on noisy data exploiting statistical and other properties of the information sequences; this is the blind channel-estimation approach [9]. However, blind estimation typically requires longer data records, entails higher computational complexity, and has more stringent identifiability requirements [47]. More recently, superimposed training-based approaches have been explored, where the training sequence is added (superimposed on) at low power to the information sequence before modulation and transmission [6], [14]. In contrast with TM training, there is no loss in data transmission rate; on the other hand, some useful power is wasted in superimposed training sequences, which could have otherwise been allocated to information sequences. In this paper, we consider doubly selective MIMO channel estimation using superimposed training with K Tx antennas (inputs) and N receive (Rx) antennas (outputs). Our approach applies to both the case of K independent users with one antenna each and the case of one user with coordinated transmissions through K antennas (spatial multiplexing and/or space-time coding) [32]; a combination of the two cases also falls within the scope of this paper. 
In the simulation results presented in this paper, we compare and contrast the results of superimposed training with those of TM training. In the case of TM training, we will distinguish between two cases: 1) the case of “coordinated” users, where the users are allowed to coordinate the placement of the training symbols in the respective transmitted block of symbols, resulting in the elimination of the multiuser interference in channel estimation at the receiver (as in [50]), and 2) the case of the “uncoordinated” users, where the users are not allowed to (or cannot) coordinate the placement of the training symbols, resulting in the presence of multiuser interference in channel estimation at the receiver. The latter case is likely to arise, for example, when there are

0018-9545/$26.00 © 2010 IEEE



multiple independent users with one transmit antenna each. As will become clearer later, such a distinction does not arise in our superimposed training schemes. In communications, channel variations with time arise due to relative motion between the transmitter and the receiver, and due to oscillator drifts and phase noise coupled with multipath effects [33]. These variations can be captured by statistical models where the time-varying finite impulse response (FIR) channels have their channel taps modeled as uncorrelated stationary random processes (Rayleigh or Ricean fading) [33]. Recently, basis expansion models (BEMs) have been widely investigated to represent doubly selective channels in wireless applications [3], [10], [22], [35], [51], where the time-varying taps are expressed as superpositions of time-varying basis functions that model Doppler effects, weighted by time-invariant coefficients. Candidate basis functions include complex exponential (Fourier) functions [10], [17], [22], polynomials [3], and discrete prolate spheroidal (DPS) sequences [51]. This paper is concerned with time-varying channels described by BEMs, particularly BEMs using Slepian sequences (DPS sequences) as basis functions; further details may be found in Section II. An alternative Gauss–Markov model (which is typically a low-order autoregressive (AR) model) has been widely used (see [18], [20], [21], and the references therein), which works well as long as the channel does not fade too fast. When TM periodically transmitted training is used, channel tracking may not perform well during information symbol transmissions (data sessions) since the information data are unknown. During data sessions, channel estimates can only be obtained based on the results from the previous training session [20]. This strategy is not appropriate for a fast-fading channel.
Potential solutions lie in exploiting the detected symbols for channel tracking; for instance, in [18] and [21], joint channel estimation and data detection is implemented via extended Kalman filtering and/or turbo techniques. Although channel tracking can be improved by such means during data sessions, error propagation due to incorrect detections can be pronounced for fast-fading channels. Past work based on AR modeling includes [5], where the performance bounds, including minimum mean square error (MMSE) and bit error rate (BER), of TM or superimposed training-based estimation for time-varying flat-fading channels are considered. Under the same overall power allocation, it was shown in [5] that superimposed training performs better for fast-fading channels, which confirms the intuition that the constant presence of training offers considerable benefit. Similar conclusions have been drawn in [4] via a mutual information and capacity analysis for flat-fading time-selective MIMO channels. It is shown in [4] that, when superimposed training is used, if one reestimates the channel by using the detected symbols, one can achieve a channel capacity that is greater than that possible in the case of conventional TM training. Channel reestimation using detected symbols allows one to alleviate the deleterious effects of the interfering data power during the original estimation of the channel using only superimposed training. A channel reestimation approach is also followed in this paper, except that we also consider frequency-selective channels (not just flat fading) and specific approaches rather

than the theoretical upper bound on performance in the form of channel capacity. Periodic superimposed training has been discussed in [7], [31], and [45] for time-invariant channels, in [44] and [46] for time-varying single-input–multiple-output (SIMO) channels based on the complex exponential BEM (CE-BEM), and in [13] for time-varying SIMO channels based on DPS-BEM, all for single-carrier systems. Superimposed training-based channel estimation has been considered in [8], [14], and [25] for time-invariant MIMO systems and in [26] and [34] for CE-BEM-based time-variant MIMO single-carrier systems. Reference [26] is an earlier conference version of this paper focused on channels modeled by CE-BEM. The first-order statistics-based approach of [26] was derived differently from the one in this paper (it is based on some large-sample considerations), and more significantly, the lemma in [26, p. 409] is incorrect. In this paper, we have followed a different (least-squares) approach to derive the first-order statistics-based channel estimator, and we use the more accurate DPS-BEM. In [34], an iterative turbo algorithm is proposed (following [1]) for coded MIMO systems operating in doubly selective environments and employing superimposed training. Reference [34] uses the less accurate CE-BEM, whereas we exploit the more accurate DPS-BEM. Reference [34] performs iterative channel estimation, equalization, and decoding, whereas in this paper, we only consider iterative channel estimation and equalization without considering channel error-correction coding. No comparisons with TM training-based approaches have been provided in [34], whereas we do so in this paper. The problem of superimposed training design has been addressed in [14] and [30] for time-invariant MIMO systems. Superimposed training for orthogonal frequency-division multiplexing (OFDM) systems has been investigated in [15], [16], [29], and references therein for time-invariant systems.
Iterative approaches using detected symbols to improve performance have also been employed in [15], [16], and [29]. An OFDM modulator usually employs a cyclic prefix or zero padding to mitigate interblock interference. In [28] (and earlier conference papers by the same authors), it was suggested that the zero sequence (or the cyclic prefix) be replaced with a known pseudorandom postfix (PRP) sequence, leading to a PRP-OFDM system. This allows the receiver to exploit an additional piece of information for channel estimation, i.e., prior knowledge of part of the transmitted block. Advantages of using the PRP technique include improved bandwidth efficiency, because the pilot overhead is avoided, and possible use of low-complexity first-order statistics-based channel-estimation approaches. Various aspects of PRP-OFDM channel estimation and equalization, including iterative enhancements and semiblind implementations, have been investigated in [23], [27], [28], [49], and references therein. In [23], it is concluded that the first-order statistics-based channel estimator outperforms the second-order statistics-based channel estimator. In [49], the previously proposed multisymbol encapsulated OFDM approach of [48] has been modified by replacing the traditional cyclic prefix with a pseudorandom sequence to enhance bandwidth efficiency. All these PRP-OFDM schemes can be interpreted as using superimposed training in the frequency domain


since the PRP can be seen as being superimposed on the zero sequence of zero padding. Time-selective channels have not been considered in these papers; in contrast, our contribution considers both frequency- and time-selective channels, albeit for single-carrier systems in the time domain.

Objectives and Contributions

As shown in [51], DPS-BEM outperforms other commonly used BEMs (such as CE-BEM [10], [22], oversampled CE-BEM [19], and polynomial BEM [3]) in approximating a Jakes' channel over a wide range of Doppler spreads for the same number of parameters; hence, we focus on DPS-BEMs in this paper. We first extend the first-order statistics-based approach of [13] pertaining to SIMO doubly selective channels to MIMO systems, where Slepian sequences are used to model the doubly selective channel. To this end, we exploit the superimposed training sequence design of [14] so that the problem of channel estimation is approximately decoupled across various users. This is true, irrespective of any timing synchronization among the training sequences of various users at the transmitting end, in contrast to the uncoordinated users' case for TM training. However, in the first-order statistics-based approach based on superimposed training, the information sequences from all users are viewed as interference in channel estimation. We therefore consider an iterative joint channel estimation and data detection approach where information sequences are exploited to enhance channel estimation and BER performances, instead of being viewed as interference. Two variations are investigated: In deterministic maximum likelihood (DML), we use a Viterbi detector, whereas in another variation, a Kalman detector is used to reduce computational complexity. Such techniques have been considered for the CE-BEM SIMO systems in [24] and the DPS-BEM SIMO systems in [13] using the Viterbi detector; Kalman-filtering-based detectors have not been considered in these papers.
(As noted earlier, iterative approaches have been used by others in various contexts.) Note also that one could view the objective function in (63) as our primary optimization cost function, which we seek to "reliably" initialize via the first-order statistics-based channel estimator of Section III-B.

The rest of this paper is organized as follows: Section II introduces the channel model. The first-order statistics-based channel estimator using superimposed training and DPS-BEM is the subject of Section III. Section IV focuses on performance analysis of this estimator. Iterative joint channel estimation and data detection is discussed in Section V to reduce the information-induced interference. The DML approach via Viterbi detector is outlined in Section V-A, a Kalman detector-based simplification is presented in Section V-B, and computational complexity issues are addressed in Section V-C. Computer simulation examples are presented in Section VI, and Section VII concludes this paper.

Notation

Superscripts H, ∗, T , and † denote the complex conjugate transpose, complex conjugation, transpose, and Moore–Penrose


pseudoinverse operations, respectively. $I_N$ is the $N \times N$ identity matrix, the $(m_1, m_2)$th entry of a matrix $C$ is denoted by $[C]_{m_1,m_2}$, $\mathrm{tr}(A)$ is the trace of a square matrix $A$, $0_{m \times n}$ denotes an $m \times n$ null matrix, and $0_m$ denotes an $m$-column null vector. The symbols $\lceil \cdot \rceil$, $\lfloor \cdot \rfloor$, and $\delta(\cdot)$ stand for integer ceiling, integer floor, and the Kronecker delta function, respectively. The symbol $E\{\cdot\}$ denotes expectation, and $\otimes$ denotes the Kronecker product. $\mathrm{cov}(x, y)$ denotes the covariance matrix of random vectors $x$ and $y$: $\mathrm{cov}(x, y) = E\{xy^H\} - E\{x\}E\{y^H\}$. $\mathrm{Var}(x)$ is the variance of the random variable $x$. The notation $y = O(x)$ means that there exists some finite real number $b > 0$, such that $|y/x| \le b$.

II. CHANNEL MODEL

Consider a doubly selective multiuser/MIMO FIR linear channel with $K$ inputs (users) and $N$ outputs, with the $k$th user's transmitted symbol sequence denoted by $\{s_k(n)\}$ and the $k$th user's discrete-time baseband impulse response denoted by $\{h_k(n; l)\}$. Then, the symbol-rate channel noise-free output $x(n)$ and noisy output $y(n)$ are given by

$$x(n) := \sum_{k=1}^{K} \sum_{l=0}^{L} h_k(n; l)\, s_k(n - l) \tag{1}$$

$$y(n) = x(n) + v(n), \qquad n = 0, 1, \ldots, T - 1. \tag{2}$$

A parsimonious representation of time-varying channels is provided by BEMs, where one assumes that

$$h_k(n; l) = \sum_{q=1}^{Q} h_{qk}(l)\, u_q(n), \qquad n = 0, 1, \ldots, T - 1 \tag{3}$$

where $u_q(\cdot)$ is the scalar $q$th basis function ($q = 1, \ldots, Q$), the $N$-column vectors $h_{qk}(l)$ remain invariant during this data block, and the basis functions $\{u_q(n)\}_{n=0}^{T-1}$ ($q = 1, 2, \ldots, Q$) are common to all users for each block. In the CE-BEM [10], [22], for an observation record length of $T$ symbols with symbol interval $T_s$ s, one chooses $u_q(n) = e^{j\omega_q n}$, $\omega_q := 2\pi[q - (Q + 1)/2]/T$, $L := \lfloor \tau_d / T_s \rfloor$, and $Q \ge \lceil 2 f_d T T_s \rceil + 1$ when the underlying continuous-time channel has a delay spread of $\tau_d$ s and a Doppler spread of $f_d$ Hz. In the DPS-BEM, the $i$th DPS vector $u_i := [u_i(0)\ u_i(1)\ \cdots\ u_i(T - 1)]^T$ (which is called a Slepian sequence in [51]) is the $i$th eigenvector of a matrix $C$ [37]: $C u_i = \lambda_i u_i$, where the $(m_1, m_2)$th entry of the $T \times T$ matrix $C$ is $[C]_{m_1,m_2} = \sin[2\pi(m_1 - m_2) f_d T_s] / [\pi(m_1 - m_2)]$, and $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_T$ are the $T$ eigenvalues of $C$. The DPS sequences are orthonormal over the finite time interval $n = 0, 1, \ldots, T - 1$. In this paper, we will use the Slepian sequences in (3), where one takes [51]

$$Q \ge \lceil 2 f_d T_s T \rceil + 1. \tag{4}$$
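As a minimal numerical sketch of the DPS-BEM construction described above (an illustration, not code from the paper), the Slepian basis can be obtained from the leading eigenvectors of $C$, with $Q$ set to the smallest value allowed by (4). The function name `slepian_basis` and the values of `T` and `fd_Ts` are assumptions chosen for the example; the diagonal of $C$ is taken as the limit $2 f_d T_s$.

```python
import numpy as np

def slepian_basis(T, fd_Ts, Q=None):
    """Return (U, lam): the first Q Slepian sequences as columns of U (T x Q)
    and their eigenvalues, from the eigendecomposition of the matrix C."""
    if Q is None:
        # Smallest number of basis functions permitted by (4).
        Q = int(np.ceil(2 * fd_Ts * T)) + 1
    n = np.arange(T)
    diff = n[:, None] - n[None, :]
    # [C]_{m1,m2} = sin(2*pi*(m1-m2)*fd_Ts) / (pi*(m1-m2)); np.sinc handles m1 == m2.
    C = 2 * fd_Ts * np.sinc(2 * fd_Ts * diff)
    lam, vec = np.linalg.eigh(C)            # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:Q]       # keep the Q largest
    return vec[:, order], lam[order]

# Illustrative values (assumed): block length 200, normalized Doppler 0.012.
T, fd_Ts = 200, 0.012
U, lam = slepian_basis(T, fd_Ts)
print(U.shape)          # number of rows is T, number of columns is Q
print(np.round(lam, 4)) # eigenvalues, largest first
```

The eigenvalue $\lambda_i$ measures the fraction of the energy of the $i$th infinite DPS sequence that falls inside the block, which is why only the leading sequences are retained.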

The Slepian sequences (time-limited DPS sequences) are windowed (using rectangular windows) versions of infinite DPS sequences that are exactly band limited to the frequency range [−fd Ts , fd Ts ] [37], [51]. As shown in [51], DPS-BEM outperforms other commonly used BEMs (such as CE-BEM [10], [22]



and polynomial BEM [3]) in approximating a Jakes' channel over a wide range of Doppler spreads for the same number of parameters; hence, we focus on DPS-BEMs in this paper. Note also that whereas, for DPS-BEM (also true for CE-BEM), one can infer the number of basis functions needed from some physical channel parameters such as the Doppler spread $f_d$ [see (4)], no such relationship exists for polynomial BEMs; this is another point in favor of the use of DPS-BEM. In superimposed training-based approaches, one takes for the $k$th user

$$s_k(n) = b_k(n) + c_k(n) \tag{5}$$

where {ck (n)} is a training (pilot) sequence added (superimposed) at low power to the information sequence {bk (n)} at the transmitter before modulation and transmission. There is no loss in data transmission rate, unlike the conventional TM training. Periodic superimposed training has been discussed in [7], [31], and [45] for time-invariant channels and in [44] and [46] for time-varying SIMO channels based on the CE-BEM. Superimposed training-based channel estimation has been considered in [8], [14], and [25] for time-invariant MIMO systems and in [26] for CE-BEM-based time-variant MIMO systems.
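The signal model in (1)–(3) and (5) can be sketched in a few lines of Python. This is an illustrative simulation under stated assumptions, not the authors' simulation setup: the helper name `channel_output`, the QPSK alphabet, the placeholder orthonormal basis, the training waveform, and all numerical values are hypothetical.

```python
import numpy as np

def channel_output(h_coef, U, s):
    """Noise-free output x(n) of (1) with BEM taps (3).
    h_coef: (K, Q, L+1, N) BEM coefficients h_qk(l)
    U:      (T, Q) basis functions u_q(n)
    s:      (K, T+L) symbols s_k(n) for n = -L, ..., T-1
    """
    K, Q, Lp1, N = h_coef.shape
    T, L = U.shape[0], Lp1 - 1
    x = np.zeros((T, N), dtype=complex)
    for k in range(K):
        for l in range(Lp1):
            h_kl = U @ h_coef[k, :, l, :]        # h_k(n; l), shape (T, N)
            s_delayed = s[k, L - l: L - l + T]   # s_k(n - l), n = 0..T-1
            x += h_kl * s_delayed[:, None]
    return x

rng = np.random.default_rng(0)
K, N, L, T, Q = 2, 2, 1, 64, 3                   # assumed sizes
# Placeholder orthonormal basis; the paper uses Slepian sequences here.
U = np.linalg.qr(rng.standard_normal((T, Q)))[0]
h_coef = (rng.standard_normal((K, Q, L + 1, N))
          + 1j * rng.standard_normal((K, Q, L + 1, N))) / np.sqrt(2)

n_ext = np.arange(-L, T)
# Information symbols b_k(n): unit-power QPSK (assumed alphabet).
b = (rng.choice([-1, 1], (K, T + L)) + 1j * rng.choice([-1, 1], (K, T + L))) / np.sqrt(2)
# Hypothetical low-power periodic training c_k(n) with period 4.
c = 0.3 * np.exp(1j * 2 * np.pi * np.outer(np.arange(K), n_ext) / 4)
s = b + c                                         # superimposed training, (5)

sigma_v = 0.1
v = sigma_v * (rng.standard_normal((T, N)) + 1j * rng.standard_normal((T, N))) / np.sqrt(2)
y = channel_output(h_coef, U, s) + v              # noisy output, (2)
print(y.shape)
```

Because the training is added to the data rather than interleaved with it, every one of the `T` symbol slots still carries an information symbol.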

III. FIRST-ORDER STATISTICS-BASED ESTIMATOR

In this section, we extend the first-order statistics-based approach of [13] using DPS-BEM for single-user systems to multiuser/MIMO doubly selective channels. The main idea is to pick user-specific training sequences so that the problem of channel estimation is approximately decoupled across various users; this allows us to use the SIMO superimposed training-based approach discussed in [13]. The choice of superimposed training sequences follows the time-invariant MIMO results of [14].

A. Superimposed Training Sequences [14]

Following [14], our approach is to assign distinct cycle frequencies of the periodic training sequences to distinct users, so that the problem of channel estimation is decoupled across various users (see Remark 1 later in Section III-B). Suppose that for every user $k$, $\{c_k(n)\}$ is periodic with period $P = \tilde{P} K$, where $\tilde{P}$ is a positive integer. Then, in general

$$c_k(n) = \sum_{m'=0}^{P-1} c_{m'k}\, e^{j(2\pi m'/P)n} \quad \forall n \tag{6}$$

where $c_{m'k} := P^{-1} \sum_{n=0}^{P-1} c_k(n)\, e^{-j(2\pi m'/P)n}$. Pick $\{c_k(n)\}$ so that only $\tilde{P}$ coefficients (out of the total $P$) $c_{m'k}$ that are associated with $\tilde{P}$ distinct frequencies are nonzero. For instance, we may choose

$$c_k(n) = \sum_{m=0}^{\tilde{P}-1} c_{mk}\, e^{j\alpha_{mk} n}, \qquad \alpha_{mk} := 2\pi(Km + k - 1)/P \tag{7}$$

for suitably chosen $c_{mk} \ne 0$ for $1 \le k \le K$ and $0 \le m \le \tilde{P} - 1$. (Under this choice, we have $\alpha_{m_1 k_1} \ne \alpha_{m_2 k_2}$ whenever $k_1 \ne k_2$, for any $m_1, m_2 \in \{0, 1, \ldots, \tilde{P} - 1\}$.) One way to accomplish this goal is to first pick a periodic "base" sequence $\{\bar{c}_o(n)\}$ ($= \{\bar{c}_o(n + k\tilde{P})\}$ for any integers $k$ and $n$) with period $\tilde{P}$ such that

$$\bar{c}_{m0} = \frac{1}{\tilde{P}} \sum_{n=0}^{\tilde{P}-1} \bar{c}_o(n)\, e^{-j(2\pi m/\tilde{P})n} \ne 0 \quad \forall m \tag{8}$$

and $\tilde{P}^{-1} \sum_{n=0}^{\tilde{P}-1} |\bar{c}_o(n)|^2 = 1$. Define $\{\bar{c}_1(n)\}_{n=0}^{P-1}$ as $K$ repetitions of $\{\bar{c}_o(n)\}$. Pick the superimposed training sequence for user $k$ as

$$c_k(n) := \sigma_{ck}\, \bar{c}_1(n)\, e^{j(2\pi/P)(k-1)n} \quad \text{for } k = 1, 2, \ldots, K \tag{9}$$

so that $P^{-1} \sum_{n=0}^{P-1} |c_k(n)|^2 = \sigma_{ck}^2$. Then, by [14], we have

$$c_{m'k} = \frac{1}{P} \sum_{n=0}^{P-1} c_k(n)\, e^{-j(2\pi m'/P)n} = \begin{cases} \sigma_{ck}\, \bar{c}_{m0}, & \text{if } m' = k - 1 + Km,\ m = \dfrac{m' - k + 1}{K} \\ 0, & \text{otherwise.} \end{cases} \tag{10}$$

Thus, the preceding choice satisfies (7) for $k = 1, 2, \ldots, K$. Candidate training sequences include m-sequences (maximal-length pseudorandom sequences) [33] and discrete chirp sequences [31].

B. First-Order Statistics-Based MIMO Channel Estimator Using Superimposed Training

We now extend the first-order statistics-based approach of [13] for single-user systems using DPS-BEM to multiuser/MIMO doubly selective channels. Four model assumptions are made.

H1) The time-varying channel $\{h_k(n; l)\}$ satisfies (3), with $\{u_q(n)\}_{q=1}^{Q}$ being the first $Q$ DPS sequences. In addition, $N \ge 1$.

H2) The information sequences $\{b_k(n)\}$ are of zero mean and finite alphabet, i.i.d. with $E\{|b_k(n)|^2\} = \sigma_{bk}^2$, and mutually independent for $k = 1, 2, \ldots, K$.

H3) The measurement noise $\{v(n)\}$ in (2) is of zero mean and white complex Gaussian and is independent of $\{b_k(n)\}$, with $E\{[v(n + \tau)][v(n)]^H\} = \sigma_v^2 I_N \delta(\tau)$.

H4) The superimposed training sequences $c_k(n) = c_k(n + mP)\ \forall m, n$ are nonrandom periodic sequences with period $P$ and average power $\sigma_{ck}^2 := \sum_{n=0}^{P-1} |c_k(n)|^2 / P$, satisfying (7) such that $c_{mk} \ne 0$ for $1 \le k \le K$ and $0 \le m \le \tilde{P} - 1$, and $\tilde{P}$ is an integer with $P = \tilde{P} K$. The training sequences at the receiver are synchronized with their respective counterparts at the transmitter.

The choice of training sequences discussed in Section III-A satisfies H4. Assumptions H2 and H3 are standard for digital communications signals.
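The training design of Section III-A, i.e., a base sequence of period $\tilde{P}$ repeated $K$ times and shifted to a user-specific cycle frequency as in (9), can be checked numerically. The sketch below is an illustration under assumptions: it uses a discrete chirp as the base sequence (one of the candidates the text mentions), and the function name `superimposed_training` and the values of `K`, `P_tilde`, and `sigma_c` are hypothetical. The user index `k` in the code is 0-based, so it corresponds to $k - 1$ in the text.

```python
import numpy as np

def superimposed_training(K, P_tilde, sigma_c, n):
    """Training sequences c_k(n) of (9) for all K users, shape (K, len(n)).
    Base sequence: unit-modulus discrete chirp with period P_tilde (assumed choice)."""
    P = K * P_tilde
    n = np.asarray(n)
    r = n % P_tilde                                   # enforce period P_tilde
    if P_tilde % 2 == 0:
        base = np.exp(1j * np.pi * r ** 2 / P_tilde)
    else:
        base = np.exp(1j * np.pi * r * (r + 1) / P_tilde)
    k = np.arange(K)[:, None]                         # 0-based user index (k - 1)
    return sigma_c * base[None, :] * np.exp(1j * 2 * np.pi * k * n[None, :] / P)

K, P_tilde, sigma_c = 3, 4, 0.5                       # illustrative values
P = K * P_tilde
c = superimposed_training(K, P_tilde, sigma_c, np.arange(P))

# Fourier coefficients c_{m'k} over one period, as defined below (6).
coef = np.fft.fft(c, axis=1) / P

for k in range(K):
    support = np.flatnonzero(np.abs(coef[k]) > 1e-9)
    print(f"user {k + 1}: nonzero cycle-frequency indices m' = {support.tolist()}")
```

Each user occupies every $K$th cycle frequency with a distinct offset, which is the property that (10) expresses and that keeps the users' supports disjoint.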


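Using (1), (3), (5), and (7) and taking expectation over the $b_k$'s and $v$, we obtain

$$E\{y(n)\} = \sum_{k=1}^{K} \sum_{m=0}^{\tilde{P}-1} \sum_{q=1}^{Q} \underbrace{\left[ \sum_{l=0}^{L} h_{qk}(l)\, c_{mk}\, e^{-j\alpha_{mk} l} \right]}_{=:\, d_{mqk}} u_q(n)\, e^{j\alpha_{mk} n} \quad \forall n. \tag{11}$$

For $0 \le m \le \tilde{P} - 1$, $1 \le k \le K$, and $0 \le l \le L$, we define

$$D_{mk} := \left[\, d_{m1k}^T \ \cdots \ d_{mQk}^T \,\right]^T \tag{12}$$

$$H_{kl} := \left[\, h_{1k}^H(l) \ \cdots \ h_{Qk}^H(l) \,\right]^H \tag{13}$$

$$V_k := \begin{bmatrix} 1 & e^{-j\alpha_{0k}} & \cdots & e^{-j\alpha_{0k} L} \\ 1 & e^{-j\alpha_{1k}} & \cdots & e^{-j\alpha_{1k} L} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & e^{-j\alpha_{(\tilde{P}-1)k}} & \cdots & e^{-j\alpha_{(\tilde{P}-1)k} L} \end{bmatrix}_{\tilde{P} \times (L+1)} \tag{14}$$

$$\tilde{C}_k := \mathrm{diag}\left\{ c_{0k}, c_{1k}, \ldots, c_{(\tilde{P}-1)k} \right\} V_k \tag{15}$$

$$C_k := \tilde{C}_k \otimes I_{NQ} \tag{16}$$

$$H_k = \left[\, H_{k0}^H \ \cdots \ H_{kL}^H \,\right]^H \tag{17}$$

$$D_k = \left[\, D_{0k}^H \ \cdots \ D_{(\tilde{P}-1)k}^H \,\right]^H \tag{18}$$

where $D_{mk}$ is $[NQ] \times 1$, $H_{kl}$ is $[NQ] \times 1$, $\tilde{C}_k$ is $\tilde{P} \times (L + 1)$, $C_k$ is $[\tilde{P}NQ] \times [(L + 1)NQ]$, $H_k$ is $[(L + 1)NQ] \times 1$, and $D_k$ is $[\tilde{P}NQ] \times 1$. By the definition of $d_{mqk}$ in (11), we have

$$C_k H_k = D_k. \tag{19}$$

Since the $\alpha_{mk}$'s are distinct and $c_{mk} \ne 0$ for $0 \le m \le \tilde{P} - 1$ and $1 \le k \le K$, $\mathrm{rank}(C_k) = NQ(L + 1)$ if $\tilde{P} \ge L + 1$; hence, we can uniquely determine the $h_{qk}(l)$'s from (19). This requires knowledge of $D_k$, whose estimation we discuss next. It follows from (1), (3), (5), and (11) that

$$y(n) = E\{y(n)\} + e(n) = \sum_{k=1}^{K} \sum_{q=1}^{Q} \sum_{m=0}^{\tilde{P}-1} d_{mqk}\, u_q(n)\, e^{j\alpha_{mk} n} + e(n) \tag{20}$$

where $\{e(n)\}$ is a zero-mean random sequence. Equation (20) forms the basis for estimating the BEM coefficients $h_{qk}(l)$ from the received signal $y(n)$ by first estimating the $d_{mqk}$'s via a least-squares approach and then using (19) with known superimposed training to estimate the $h_{qk}(l)$'s. Given the received signal over $n = 0, 1, \ldots, T - 1$, define the cost function

$$J := \sum_{n=0}^{T-1} \| e(n) \|^2. \tag{21}$$

Choose the $d_{mqk}$'s to minimize $J$. We must have $(\partial J / \partial d_{mqk}^*)|_{d_{mqk} = \hat{d}_{mqk}} = 0$, which leads to

$$\sum_{k'=1}^{K} \sum_{q'=1}^{Q} \sum_{m'=0}^{\tilde{P}-1} \hat{d}_{m'q'k'} \left[ \sum_{n=0}^{T-1} u_q(n)\, u_{q'}(n)\, e^{j(\alpha_{m'k'} - \alpha_{mk})n} \right] = \underbrace{\sum_{n=0}^{T-1} y(n)\, u_q(n)\, e^{-j\alpha_{mk} n}}_{=:\, g_{mqk}}. \tag{22}$$

Define $\hat{D}_{mk}$ as in (12), with $d_{mqk}$ replaced with $\hat{d}_{mqk}$, and define $\hat{D}_k$ as in (18), with $D_{mk}$ replaced with $\hat{D}_{mk}$; similarly, define $G_{mk}$ as in (12), with $d_{mqk}$ replaced with $g_{mqk}$, and define $G_k$ as in (18), with $D_{mk}$ replaced with $G_{mk}$, i.e., $G_{mk} := [g_{m1k}^T, \ldots, g_{mQk}^T]^T$, etc. Furthermore, define

$$\hat{D} := \begin{bmatrix} \hat{D}_1 \\ \hat{D}_2 \\ \vdots \\ \hat{D}_K \end{bmatrix}_{[K\tilde{P}NQ] \times 1} \qquad G := \begin{bmatrix} G_1 \\ G_2 \\ \vdots \\ G_K \end{bmatrix}_{[K\tilde{P}NQ] \times 1} \tag{23}$$

$$\Psi := \begin{bmatrix} \tilde{\Psi}_{11} & \tilde{\Psi}_{12} & \cdots & \tilde{\Psi}_{1K} \\ \tilde{\Psi}_{21} & \tilde{\Psi}_{22} & \cdots & \tilde{\Psi}_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ \tilde{\Psi}_{K1} & \tilde{\Psi}_{K2} & \cdots & \tilde{\Psi}_{KK} \end{bmatrix}_{[K\tilde{P}Q] \times [K\tilde{P}Q]} \tag{24}$$

where the $\tilde{P}Q \times \tilde{P}Q$ matrix $\tilde{\Psi}_{kk'}$ has entries ($m, m' = 0, 1, \ldots, \tilde{P} - 1$; $q, q' = 1, 2, \ldots, Q$)

$$[\tilde{\Psi}_{kk'}]_{mQ+q,\, m'Q+q'} = \sum_{n=0}^{T-1} u_q(n)\, u_{q'}(n)\, e^{j(\alpha_{m'k'} - \alpha_{mk})n}, \qquad k, k' = 1, 2, \ldots, K. \tag{25}$$

Then, (22) leads to

$$(\Psi \otimes I_N)\hat{D} = G \quad \Rightarrow \quad \hat{D} = (\Psi^{-1} \otimes I_N)\, G. \tag{26}$$

Then, the estimate of $H_k$ is given by

$$\hat{H}_k = \left( C_k^H C_k \right)^{-1} C_k^H \hat{D}_k. \tag{27}$$

Denote the corresponding estimate of $h_{qk}(l)$ as $\hat{h}_{qk}(l)$. Following the DPS-BEM representation (3), the estimate of the time-varying channel is given by

$$\hat{h}_k(n; l) = \sum_{q=1}^{Q} \hat{h}_{qk}(l)\, u_q(n). \tag{28}$$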
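The two-step estimator in (20)–(27) can be sketched end to end on synthetic data. This is a hedged illustration rather than the authors' implementation: to isolate the algebra it uses a noise-free, data-free received signal (so $y(n) = E\{y(n)\}$ and exact recovery is expected), it solves the least-squares problem (21) with `numpy.linalg.lstsq` instead of forming $\Psi^{-1}$ explicitly, and it applies the pseudoinverse of $\tilde{C}_k$ per basis index rather than the stacked form (27). All names and numerical values are assumptions, and `k` is 0-based in the code.

```python
import numpy as np

def slepian_basis(T, fd_Ts, Q):
    """First Q Slepian sequences (columns), from the eigenvectors of C."""
    n = np.arange(T)
    C = 2 * fd_Ts * np.sinc(2 * fd_Ts * (n[:, None] - n[None, :]))
    _, vec = np.linalg.eigh(C)              # ascending eigenvalues
    return vec[:, ::-1][:, :Q]

rng = np.random.default_rng(1)
T, N, K, L, P_tilde, fd_Ts = 400, 2, 2, 2, 4, 0.005   # assumed sizes
P = K * P_tilde
Q = int(np.ceil(2 * fd_Ts * T)) + 1                   # per (4)
U = slepian_basis(T, fd_Ts, Q)
n = np.arange(T)

m = np.arange(P_tilde)
alpha = np.stack([2 * np.pi * (K * m + k) / P for k in range(K)])  # alpha_mk of (7)
# Hypothetical nonzero training coefficients c_mk with equal magnitude.
c_mk = 0.5 * np.exp(1j * rng.uniform(0, 2 * np.pi, (K, P_tilde)))
h_true = (rng.standard_normal((K, Q, L + 1, N))
          + 1j * rng.standard_normal((K, Q, L + 1, N))) / np.sqrt(2)

def c_of(k, idx):
    """Training sequence c_k evaluated at integer times idx, via (7)."""
    return np.exp(1j * np.outer(idx, alpha[k])) @ c_mk[k]

# Received signal with b_k = 0 and v = 0, from (1), (3), (5).
y = np.zeros((T, N), dtype=complex)
for k in range(K):
    for l in range(L + 1):
        y += (U @ h_true[k, :, l, :]) * c_of(k, n - l)[:, None]

# Step 1: least-squares estimate of d_mqk from (20); regressors u_q(n) e^{j alpha_mk n}.
Phi = np.concatenate([U * np.exp(1j * alpha[k, mm] * n)[:, None]
                      for k in range(K) for mm in range(P_tilde)], axis=1)
D_hat = np.linalg.lstsq(Phi, y, rcond=None)[0].reshape(K, P_tilde, Q, N)

# Step 2: invert (19) user by user with the P_tilde x (L+1) matrix C_tilde_k of (15).
h_hat = np.empty_like(h_true)
for k in range(K):
    V_k = np.exp(-1j * np.outer(alpha[k], np.arange(L + 1)))
    C_pinv = np.linalg.pinv(c_mk[k][:, None] * V_k)
    for q in range(Q):
        h_hat[k, q] = C_pinv @ D_hat[k, :, q, :]

print("max coefficient error:", np.max(np.abs(h_hat - h_true)))
```

With data and noise present, the same two steps apply but the recovered coefficients are perturbed by the term $e(n)$ in (20).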

Remark 1: In the succeeding remarks, $m \in \{0, 1, \ldots, \tilde{P} - 1\}$, and $k = 1, 2, \ldots, K$. Following the arguments of [13, Remark 1], $\{u_q(n)\, e^{-j\alpha_{mk} n}\}$ is approximately band limited to

$$\left[ -f_d T_s - \frac{\bar{k}}{T} + \frac{Km + k - 1}{P},\ \ f_d T_s + \frac{\bar{k}}{T} + \frac{Km + k - 1}{P} \right]$$

with the integer $\bar{k}$ approximately as $1 \le \bar{k} \le 3$. Therefore, for $f_d T_s \ll 1/P$, $\{u_q(n)\, e^{-j\alpha_{mk} n}\}$ has (approximately) vanishing zero-frequency content when $Km + k - 1 \ne 0$, leading to

$$\sum_{n=0}^{T-1} u_q(n)\, e^{-j\alpha_{mk} n} \approx 0 \quad \forall \alpha_{mk} \ne 0. \tag{29}$$

By similar arguments, the discrete-time Fourier transforms of $\{u_q(n)\, e^{-j\alpha_{mk} n}\}$ and $\{u_{q'}(n)\, e^{-j\alpha_{m'k'} n}\}$ are nonzero over nonoverlapping frequency bands, leading to

$$\sum_{n=0}^{T-1} u_q(n)\, u_{q'}(n)\, e^{j(\alpha_{m'k'} - \alpha_{mk})n} \approx \delta(k' - k)\, \delta(m' - m)\, \delta(q' - q). \tag{30}$$

In this case, we have $\tilde{\Psi}_{kk'} \approx I_{\tilde{P}Q}\, \delta(k' - k)$ and $\Psi \approx I_{K\tilde{P}Q}$. Then, the estimate $\hat{d}_{mqk}$ of $d_{mqk}$ is

$$\hat{d}_{mqk} \approx \sum_{n=0}^{T-1} y(n)\, u_q(n)\, e^{-j\alpha_{mk} n} \tag{31}$$

which, via (27), leads to channel estimates decoupled across different users. It is not too hard to show that timing synchronization among the superimposed training sequences of various users is not required for (30) to hold true, i.e., (30) holds, even if there is a relative time shift between $c_k(n)$ and $c_{k'}(n)$, because the cycle frequencies of the various users are unchanged, and by choice, they are distinct.

IV. PERFORMANCE ANALYSIS

In this section, we present a performance analysis of the approach discussed in Section III-B. To obtain tractable results, we assume that the MIMO channel is complex Gaussian, satisfying the assumption given here.

H5) The time-varying channels $\{h_k(n; l)\}$ are of zero mean, complex Gaussian with correlation $E\{h_k(n; l)\, h_k^H(n; l)\} = \sigma_{hkl}^2 I_N$, and mutually independent for distinct $l$'s and different users.

This assumption applies to this section only; the algorithms proposed in this paper are not based on this assumption. The widely used wide-sense stationary uncorrelated scattering channel model [33], combined with the independently fading (sub)channel between any transmit–receive antenna pair, satisfies H5. The assumption of independently fading links between any transmit–receive antenna pair has widely been used in MIMO literature (see [32], [43], and references therein). By (22), $g_{mqk}$ has contributions from the information sequences $\{b_k(n)\}$ unknown at the receiver, the superimposed training $\{c_k(n)\}$ known at the receiver, and the measurement noise $v(n)$. It follows from (1)–(5), H2, and (22) that

$$g_{mqk} = \sum_{n=0}^{T-1} \left[ E\{y(n)\} + v(n) + \sum_{k'=1}^{K} \sum_{l=0}^{L} h_{k'}(n; l)\, b_{k'}(n - l) \right] u_q(n)\, e^{-j\alpha_{mk} n}. \tag{32}$$

Then, by (H2), (H3), and (11)

$$E\{g_{mqk}\} = \sum_{n=0}^{T-1} E\{y(n)\}\, u_q(n)\, e^{-j\alpha_{mk} n} =: \tilde{d}_{mqk}. \tag{33}$$

It follows that

$$g_{mqk} = \tilde{d}_{mqk} + s_{mqk} + w_{mqk} \tag{34}$$

where

$$w_{mqk} := \sum_{n=0}^{T-1} v(n)\, u_q(n)\, e^{-j\alpha_{mk} n} \tag{35}$$

$$s_{mqk} := \sum_{n=0}^{T-1} \sum_{k'=1}^{K} \sum_{l=0}^{L} h_{k'}(n; l)\, b_{k'}(n - l)\, u_q(n)\, e^{-j\alpha_{mk} n}. \tag{36}$$

We then have

$$E\left\{ w_{m'q'k'}\, w_{mqk}^H \right\} = \sigma_v^2 \sum_{n=0}^{T-1} u_q(n)\, u_{q'}(n)\, e^{j(\alpha_{mk} - \alpha_{m'k'})n}\, I_N \tag{37}$$

$$E\left\{ s_{m'q'k'}\, s_{mqk}^H \right\} = (L + 1) \sum_{p=1}^{K} \sum_{l=0}^{L} \sigma_{hpl}^2\, \sigma_{bp}^2 \times \sum_{n=0}^{T-1} u_q(n)\, u_{q'}(n)\, e^{j(\alpha_{mk} - \alpha_{m'k'})n}\, I_N. \tag{38}$$

Therefore, it follows that

$$\mathrm{cov}(g_{m'q'k'}, g_{mqk}) = E\left\{ s_{m'q'k'}\, s_{mqk}^H \right\} + E\left\{ w_{m'q'k'}\, w_{mqk}^H \right\}. \tag{39}$$

Hence

$$\mathrm{cov}(\hat{D}, \hat{D}) = (\Psi^{-1} \otimes I_N)\, \mathrm{cov}(G, G) \left[ (\Psi^{-1})^H \otimes I_N \right] \tag{40}$$

where various entries in $\mathrm{cov}(G, G)$ follow from $\mathrm{cov}(g_{m'q'k'}, g_{mqk})$.

A. Simplification Under Remark 1

To obtain a simpler, more "interpretable" covariance expression, we will invoke the conditions of Remark 1. Under (29) and (30), it easily follows that $\hat{d}_{mqk} = g_{mqk}$, $\tilde{d}_{mqk} = d_{mqk}$, $\Psi \approx I_{K\tilde{P}Q}$, and

$$E\left\{ w_{m'q'k'}\, w_{mqk}^H \right\} \approx \sigma_v^2 I_N\, \delta(m' - m)\, \delta(q' - q)\, \delta(k' - k) \tag{41}$$

$$E\left\{ s_{m'q'k'}\, s_{mqk}^H \right\} \approx (L + 1) \sum_{p=1}^{K} \sigma_{hp}^2\, \sigma_{bp}^2\, \delta(m' - m)\, \delta(q' - q)\, \delta(k' - k)\, I_N \tag{42}$$


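where

$$\sigma_{hp}^2 := \sum_{l=0}^{L} E\left\{ \| h_p(n; l) \|^2 \right\} = \sum_{l=0}^{L} \sigma_{hpl}^2. \tag{43}$$

Hence

$$\mathrm{cov}(\hat{D}_k, \hat{D}_k) \approx \left[ (L + 1) \sum_{p=1}^{K} \sigma_{hp}^2\, \sigma_{bp}^2 + \sigma_v^2 \right] I_{NQ\tilde{P}} \tag{44}$$

$$\mathrm{cov}(\hat{D}, \hat{D}) \approx \mathrm{diag}\left\{ \mathrm{cov}(\hat{D}_k, \hat{D}_k),\ k = 1, 2, \ldots, K \right\}. \tag{45}$$

Using (16) and the channel estimator (27), it follows that

$$\mathrm{cov}(\hat{H}_k, \hat{H}_k) \approx \left[ (L + 1) \sum_{p=1}^{K} \sum_{l=0}^{L} \sigma_{hpl}^2\, \sigma_{bp}^2 + \sigma_v^2 \right] \times \left( \tilde{C}_k^H \tilde{C}_k \right)^{-1} \otimes I_{NQ}. \tag{46}$$

The mean square error (MSE) in channel estimation for user $k$ is defined by

$$\mathrm{MSE}_k := \frac{1}{T} \sum_{n=0}^{T-1} \sum_{l=0}^{L} E\left\{ \left\| h_k(n; l) - \hat{h}_k(n; l) \right\|^2 \right\}. \tag{47}$$

If the true channel follows (3), using the orthonormality of the Slepian sequences, we have

$$\mathrm{MSE}_k = \frac{1}{T} \sum_{q=1}^{Q} \sum_{l=0}^{L} E\left\{ \left\| h_{qk}(l) - \hat{h}_{qk}(l) \right\|^2 \right\} \tag{48}$$

$$= \frac{1}{T}\, \mathrm{tr}\left\{ \mathrm{cov}(\hat{H}_k, \hat{H}_k) \right\} \tag{49}$$

$$\approx \frac{1}{T} \left( \sum_{p=1}^{K} \sum_{l=0}^{L} \sigma_{hpl}^2\, \sigma_{bp}^2 + \sigma_v^2 \right) N Q\, \mathrm{tr}\left\{ \left( \tilde{C}_k^H \tilde{C}_k \right)^{-1} \right\}. \tag{50}$$

Thus, the interference from all the users' information sequences $\{b_k(n)\}$ contributes a major part of the MSE in the first-order statistics-based estimator, even when the channel estimation has been decoupled for each user. Following [13, Sec. IV-D1], $\mathrm{tr}\{(\tilde{C}_k^H \tilde{C}_k)^{-1}\}$ is minimized if and only if $\tilde{C}_k^H \tilde{C}_k$ is diagonal, with all its diagonal elements equal. There is no unique choice of superimposed training sequence that minimizes $\mathrm{tr}\{(\tilde{C}_k^H \tilde{C}_k)^{-1}\}$; however, discrete chirp sequences [31] meet the requirement exactly [13], [14], and m-sequences [33] meet the requirement approximately [13]. For such sequences, one obtains $(\tilde{C}_k^H \tilde{C}_k)^{-1} = \sigma_{ck}^{-2} I_{L+1}$ [13, eq. (50)]. Then, the optimized MSE for user $k$ is given by

$$\mathrm{MSE}_{ok} \approx \frac{1}{\sigma_{ck}^2 T} \left( \sum_{p=1}^{K} \sigma_{bp}^2 \sum_{l=0}^{L} \sigma_{hpl}^2 + \sigma_v^2 \right) N Q (L + 1). \tag{51}$$

For a given channel, users' power, and noise power $\sigma_{hpl}^2$, $\sigma_{bp}^2$, and $\sigma_v^2$, respectively, using (4), we may rewrite (51) as

$$\mathrm{MSE}_{ok} = O\left( \frac{Q}{T \sigma_{ck}^2} \right) = O\left( \frac{\lceil 2 f_d T_s T \rceil + 1}{T \sigma_{ck}^2} \right). \tag{52}$$

If $f_d = 0$ (time-invariant channels, as in [14]), one has $\lim_{T \to \infty} \mathrm{MSE}_{ok} = 0$. For time-varying channels with $f_d > 0$, it follows from (52) that

$$\lim_{T \to \infty} \mathrm{MSE}_{ok} \approx O\left( \frac{2 f_d T_s}{\sigma_{ck}^2} \right). \tag{53}$$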
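The dependence of the optimized MSE in (51) on the block length $T$ can be illustrated numerically. The sketch below simply evaluates (51) with $Q$ set to the smallest value allowed by (4); the function name `mse_opt` and every parameter value are assumptions made for illustration, not figures from the paper.

```python
import math

def mse_opt(T, fd_Ts, sigma_c2, sigma_b2, sigma_h2, sigma_v2, N):
    """Evaluate the optimized MSE of (51).
    sigma_b2: per-user information powers (length K)
    sigma_h2: per-user lists of tap variances (K lists of length L+1)
    """
    Q = math.ceil(2 * fd_Ts * T) + 1            # smallest Q allowed by (4)
    L_plus_1 = len(sigma_h2[0])
    interference = sum(sb * sum(sh) for sb, sh in zip(sigma_b2, sigma_h2))
    return (interference + sigma_v2) * N * Q * L_plus_1 / (sigma_c2 * T)

# Illustrative (assumed) parameters: K = 2 users, L = 2, N = 2 receive antennas.
params = dict(sigma_c2=0.2, sigma_b2=[0.8, 0.8],
              sigma_h2=[[1 / 3] * 3, [1 / 3] * 3], sigma_v2=0.1, N=2)

for T in (10**3, 10**4, 10**5, 10**6):
    static = mse_opt(T, 0.0, **params)          # f_d = 0: time-invariant channel
    fading = mse_opt(T, 0.01, **params)         # f_d T_s = 0.01: time-varying channel
    print(f"T = {T:>7d}   MSE(f_d = 0) = {static:.3e}   MSE(f_d T_s = 0.01) = {fading:.3e}")
```

For $f_d = 0$ the value decays as $1/T$, whereas for $f_d > 0$ the ratio $Q/T$ approaches $2 f_d T_s$, so longer blocks do not remove the error floor.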

Thus, whereas the MSE for user $k$ can be decreased by increasing the training power $\sigma_{ck}^2$, doing so does not offer a "practical" solution since, for a fixed transmitted power, increasing $\sigma_{ck}^2$ would imply decreasing the signal power $\sigma_{bk}^2$, which, in turn, would lead to a deteriorated bit error performance. In addition, in accordance with one's intuition, a smaller Doppler spread $f_d$ (slower time variations) leads to a smaller MSE, and vice versa.

V. ITERATIVE JOINT CHANNEL ESTIMATION AND DATA DETECTION

The first-order statistics-based approach of Section III-B views the information sequences from all users as interference. Since the training and information sequences pass through an identical channel, this fact can be exploited to enhance channel estimation performance (and, hence, BER performance). We now consider joint channel and information sequence estimation via an iterative approach. Such techniques have been considered for the CE-BEM SIMO systems in [24] and the DPS-BEM SIMO systems in [13], using the Viterbi detector; the Kalman-filtering-based detectors have not been considered in these papers. One could view the objective function in (63) as our primary optimization cost function, which we seek to "reliably" initialize via the first-order statistics-based channel estimator of Section III-B.

A. Iterative Enhancement via Viterbi Algorithm: DML Approach

In this section, we consider joint channel and information sequence estimation via an iterative DML (since the information sequence is modeled as unknown but deterministic) approach. The cost in (63) is proportional to the negative log-likelihood function for the noisy data jointly conditioned on the channel and the information sequence; hence, one may also call this approach an iterative conditional maximum-likelihood (CML) approach. However, our assumption H2 still holds true although it is not exploited.
The DML formulation has been exploited by various authors in a variety of contexts (see [2], [42], and references therein); it is known to be statistically efficient at high SNRs [42]. If we consider a CML approach conditioned only on the channel, then one gets an intractable optimization problem; this formulation is called statistical ML [42].
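To spell out why the least-squares cost in (63) is proportional to the negative log-likelihood, the following short derivation may help. It assumes that the stacked noise vector in the linear model (60) is circularly symmetric complex white Gaussian with variance $\sigma_v^2$ per entry; the symbol $N_Y$ for the length of the stacked observation vector is introduced here only for brevity.

```latex
\begin{align*}
p(Y \mid \mathcal{H}, s) &= \frac{1}{(\pi\sigma_v^2)^{N_Y}}
   \exp\!\left(-\frac{\|Y - \mathcal{T}(s)\mathcal{H}\|^2}{\sigma_v^2}\right),
   \qquad N_Y := (T-L)N, \\
-\ln p(Y \mid \mathcal{H}, s) &= \frac{1}{\sigma_v^2}\,\|Y - \mathcal{T}(s)\mathcal{H}\|^2
   + N_Y \ln(\pi\sigma_v^2).
\end{align*}
```

The second term does not depend on the channel or on the information sequence, so maximizing the likelihood jointly over both is the same as minimizing $\|Y - \mathcal{T}(s)\mathcal{H}\|^2$.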



Given the received signal $y(n)$ for $n = 0, 1, \ldots, T-1$, we define

$$Y := [\, y^T(T-1) \;\; \cdots \;\; y^T(L) \,]^T \tag{54}$$

$$s := [\, s_1(T-1) \;\; \cdots \;\; s_K(T-1) \;\; s_1(T-2) \;\; \cdots \;\; s_K(0) \,]^T \tag{55}$$

$$\Sigma_n := [\, u_1(n) I_N \;\; \cdots \;\; u_Q(n) I_N \,] \tag{56}$$

$$\mathcal{T}(s) := \begin{bmatrix}
s_1(T-1)\Sigma_{T-1} & \cdots & s_1(T-L-1)\Sigma_{T-1} & \cdots & s_K(T-L-1)\Sigma_{T-1} \\
s_1(T-2)\Sigma_{T-2} & \cdots & s_1(T-L-2)\Sigma_{T-2} & \cdots & s_K(T-L-2)\Sigma_{T-2} \\
\vdots & \ddots & \vdots & \ddots & \vdots \\
s_1(L)\Sigma_{L} & \cdots & s_1(0)\Sigma_{L} & \cdots & s_K(0)\Sigma_{L}
\end{bmatrix} \tag{57}$$

$$\mathcal{H} := [\, \mathcal{H}_1^T \;\; \cdots \;\; \mathcal{H}_K^T \,]^T \tag{58}$$

$$V := [\, v^T(T-1) \;\; \cdots \;\; v^T(L) \,]^T \tag{59}$$

where $Y$ is $[(T-L)N] \times 1$, $s$ is $[KT] \times 1$, $\Sigma_n$ is $N \times [NQ]$, $\mathcal{T}(s)$ is $[(T-L)N] \times [K(L+1)NQ]$, $\mathcal{H}$ is $[K(L+1)NQ] \times 1$, and $V$ is $[(T-L)N] \times 1$. By (1) and (3), we have the linear model

$$Y = \mathcal{T}(s)\mathcal{H} + V. \tag{60}$$

If we further define

$$\mathcal{F}(\mathcal{H}) := \begin{bmatrix}
h_1(T-1;0) & \cdots & h_K(T-1;0) & \cdots & h_K(T-1;L) & & \\
 & \ddots & & \ddots & & \ddots & \\
 & & h_1(L;0) & \cdots & h_K(L;0) & \cdots & h_K(L;L)
\end{bmatrix} \tag{61}$$

where $\mathcal{F}(\mathcal{H})$ is $[(T-L)N] \times [KT]$ (so that it conforms with the $[KT] \times 1$ vector $s$), we obtain another linear model as

$$Y = \mathcal{F}(\mathcal{H})s + V. \tag{62}$$

We consider joint (ML) estimation

$$\{\hat{\mathcal{H}}, \hat{s}\} = \arg\min_{\mathcal{H},\, s \in \mathcal{S}} \|Y - \mathcal{T}(s)\mathcal{H}\|^2 \tag{63}$$

initialized by the first-order statistics-based channel estimator (27), with $\hat{D}_k$ calculated via (31), where $\mathcal{S}$ is the (discrete) domain of $s$. Under a white Gaussian noise assumption, the DML estimators are obtained by the nonlinear least-squares optimization (63). Using (60) and (62), we have a separable nonlinear least-squares problem that can sequentially be solved as follows: At iteration $j$, with an initial guess of the channel $\mathcal{H}^{(j)}$, the algorithm estimates the input sequence $s^{(j)}$ and the channel $\mathcal{H}^{(j+1)}$ for the next iteration by

$$s^{(j)} = \arg\min_{s \in \mathcal{S}} \left\|Y - \mathcal{F}\big(\mathcal{H}^{(j)}\big)s\right\|^2 \tag{64}$$

$$\mathcal{H}^{(j+1)} = \arg\min_{\mathcal{H}} \left\|Y - \mathcal{T}\big(s^{(j)}\big)\mathcal{H}\right\|^2. \tag{65}$$

The optimization in (65) is a linear least-squares problem having the solution

$$\hat{\mathcal{H}}^{(j+1)} = \mathcal{T}^{\dagger}\big(s^{(j)}\big)Y \tag{66}$$

whereas the optimization in (64) can be achieved by using the vector Viterbi algorithm [41, Sec. 7.8.4]. Since the preceding iterative procedure involving (64) and (65) decreases the cost at every iteration, one achieves a local minimum of the nonlinear least-squares cost (local maximum of DML function). If we initialize (64) with our superimposed training-based solution, one expects to reach the global extremum (minimum error probability sequence estimator) if the superimposed training-based solution is “good.”

B. Iterative Enhancement via Linear MMSE Equalization

The computational complexity of the Viterbi algorithm grows exponentially with the length of the channel, the number of users, and the constellation of transmitted symbols [33, p. 681]. We may use other symbol detectors whose computational complexity grows linearly, e.g., the Kalman filter. It can be viewed as a suboptimal approximation to the DML approach, with a much lower computational burden. Define the state vector for the kth user consisting of the transmitted symbols as

$$S_k(n) := [\, s_k(n) \;\; s_k(n-1) \;\; \cdots \;\; s_k(n-d) \,]^T \tag{67}$$

where $d \geq L$ is also the equalization delay, and $S_k(n)$ is $(d+1) \times 1$. The “overall” $K(d+1)$-state vector is defined as

$$S(n) := [\, S_1^T(n) \;\; S_2^T(n) \;\; \cdots \;\; S_K^T(n) \,]^T. \tag{68}$$

Define the $K \times 1$ “input” $w(n)$ as

$$w(n) := [\, s_1(n+1) \;\; s_2(n+1) \;\; \cdots \;\; s_K(n+1) \,]^T. \tag{69}$$

Then, the state and the measurement equations of the state-space model of interest for the received signal are given by

$$S(n+1) = \Phi S(n) + \Gamma w(n) \tag{70}$$

$$y(n) = H(n)S(n) + v(n) \tag{71}$$

respectively, where

$$\Phi := I_K \otimes \begin{bmatrix} 0_{1 \times d} & 0 \\ I_d & 0_{d \times 1} \end{bmatrix}, \qquad \Gamma := I_K \otimes [\, 1 \;\; 0_{1 \times d} \,]^T \tag{72}$$

$$H(n) := [\, H_1(n) \;\; H_2(n) \;\; \cdots \;\; H_K(n) \,] \tag{73}$$

$$H_k(n) := [\, h_k(n;0) \;\; h_k(n;1) \;\; \cdots \;\; h_k(n;L) \;\; 0_{N \times (d-L)} \,] \tag{74}$$

with $\Phi$ being $[K(d+1)] \times [K(d+1)]$, $\Gamma$ being $[K(d+1)] \times K$, $H_k(n)$ being $N \times [d+1]$, and $H(n)$ being $N \times [K(d+1)]$. The Kalman-filtering algorithm for estimating $s_k(n-d)$, given $y(m)$ ($m \leq n$), for a chosen value of equalization delay $d$ based on (71) can be found in [39]. In (73) and (74), we use the estimated channel obtained via (28) from the estimated parameter vector $\hat{\mathcal{H}}$. In (69), for superimposed training, we take $E\{s_k(n)\} = c_k(n)$ and $\mathrm{var}(s_k(n)) = \mathrm{var}(b_k(n))$. For TM training, we take $E\{s_k(n)\} = 0$ and $\mathrm{var}(s_k(n)) = \mathrm{var}(b_k(n))$ if $s_k(n)$ is an information symbol, whereas we take $E\{s_k(n)\} = c_k(n)$ and $\mathrm{var}(s_k(n)) = 0$ if $s_k(n)$ is a training symbol. Compared with the approach of Section V-A, here, iteration (64) is replaced with the hard-quantized Kalman equalizer output with equalization delay $d$, whereas the optimization specified by (65) remains unchanged.

C. Computational Complexity

By [33, p. 681] and [41, Sec. 7.8.4], the computational complexity of the Viterbi algorithm, as applied to our problem, is $O([KM]^L NT)$, where $M$ is the signal constellation size, and as defined earlier, $K$ is the number of users (Tx antennas), $N$ is the number of Rx antennas, $L+1$ is the channel length, and $T$ is the number of measurement samples at the receiver. By [11, Tab. 6.5], the computational complexity of the Kalman filter, as applied to our problem, is $O((K(d+1))^2 NT + K(d+1)N^2 T)$, where we have assumed that the equalization delay $d \geq L$. The computational complexity of the Kalman detector would be much less than that of the Viterbi detector, particularly for higher alphabet size, channel length, and number of Tx antennas. Channel reestimation via (66) has computational complexity $O([K(L+1)NQ]^3 + T)$.

Therefore, the overall complexity of the iterative enhancement would be the number of iterations times the sum of the complexities of the data-detection and channel-estimation steps; comparatively, the first-order statistics-based channel-estimation approach has “negligible” computational requirement.

Comparing TM training-based approaches (see also Section VI-A) with superimposed training-based approaches, we note that, in the TM case, one estimates the channel based solely on training and then detects the data; this is what we consider in Section VI. Thus, the computational complexity of TM training-based approaches is approximately equal to that of the data-detection step, which, in turn, is either $O([KM]^L NT)$ if a Viterbi detector is used or $O((K(d+1))^2 NT + K(d+1)N^2 T)$ if a Kalman detector is used with equalization delay $d$ ($\geq L$). Let $I$ denote the number of iterations executed. Then, approximately, the computational complexity of the proposed superimposed training-based iterative approaches is $I + 1$ times that of the TM case, where we have counted the first-order statistics-based step in addition to the $I$ iterations. In the simulations section (see Section VI), we have taken $I = 3$.
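To get a feel for how the complexity expressions of this subsection scale, the sketch below evaluates them for a two-user, two-antenna, four-level setting while the channel memory L grows. The O(·) expressions hide constants, so these are relative operation counts only. The function names, the choice d = L (the smallest delay allowed by d ≥ L), and Q = 4 are illustrative assumptions.

```python
def viterbi_ops(K, M, L, N, T):
    """Relative count for the vector Viterbi detector, O([KM]^L N T)."""
    return (K * M) ** L * N * T


def kalman_ops(K, d, N, T):
    """Relative count for the Kalman detector,
    O((K(d+1))^2 N T + K(d+1) N^2 T), assuming d >= L."""
    n_states = K * (d + 1)
    return n_states ** 2 * N * T + n_states * N ** 2 * T


def reestimation_ops(K, L, N, Q, T):
    """Relative count for channel re-estimation via (66)."""
    return (K * (L + 1) * N * Q) ** 3 + T


def iterative_ops(detector_ops, reest_ops, n_iter):
    """Number of iterations times (data detection + channel re-estimation)."""
    return n_iter * (detector_ops + reest_ops)


K, M, N, T, Q, n_iter = 2, 4, 2, 420, 4, 3
for L in range(2, 6):
    d = L  # assumed: smallest admissible equalization delay
    vit = viterbi_ops(K, M, L, N, T)
    kal = kalman_ops(K, d, N, T)
    re = reestimation_ops(K, L, N, Q, T)
    print(f"L={L}: Viterbi~{vit}  Kalman~{kal}  re-estimation~{re}  "
          f"iterative(Kalman)~{iterative_ops(kal, re, n_iter)}")
```

With these expressions, each additional channel tap multiplies the Viterbi count by KM, whereas the Kalman count grows only polynomially in K(d+1).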

VI. SIMULATION EXAMPLES

We now illustrate our approaches with two simulation examples dealing with a two-user (K = 2), multiple-receive-antenna scenario, where the received signals from N ≥ 1 receive antennas are jointly processed. We assume that both users have $\sigma_{b1}^2 = \sigma_{b2}^2 = \sigma_b^2$ and $\sigma_{c1}^2 = \sigma_{c2}^2 = \sigma_c^2$ with a training-to-information-power ratio (TIR) $\sigma_c^2/\sigma_b^2$ of 0.3. Considering a random doubly selective Rayleigh fading channel, we take L = 2 in (1) with $h_k(n;l)$ as in H5 and having a uniform power delay profile with $\sigma_{h1}^2 = \sigma_{h2}^2 = 1$, satisfying Jakes’ model (recall that $\sigma_{hk}^2 = \sum_{l=0}^{L} E\{\|h_k(n;l)\|^2\}$). To this end, we simulate each single tap of a doubly selective channel following [52] (with a correction in [51, App.]). We consider both DPS-BEM and the widely used CE-BEM representations of the doubly selective channels. The methods of this paper also apply to the CE-BEMs, provided we set $u_q(n) = (1/\sqrt{T})e^{j\omega_q n}$.

We consider a system with a carrier frequency of 2 GHz, a data rate of 40 kBd (therefore, $T_s = 25\ \mu$s), and a Doppler spread of $f_d$ = 50 or 100 Hz. At the receiver, we take the record length (block size) of T = 420 symbols. We emphasize that the DPS-BEM is used only for processing at the receiver; the random channels are generated by Jakes’ model and not the DPS-BEM in (3). The estimated channel is used in a symbol detector, which could be a Viterbi detector or a Kalman filter in our examples, to calculate the BERs. We assume that the additive noise is zero-mean complex white Gaussian. The (receiver) SNR refers to the energy per bit per user (Tx) per Rx over one-sided noise spectral density, with both information and superimposed training sequence counting toward the bit energy. An m-sequence is used to generate the superimposed training sequence for each user. We hence take $\tilde{P} = 7$ and $P = 14$ in H4.
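The choice of an m-sequence can be checked numerically. The sketch below takes one period of the sequence that appears in (75) and computes its periodic autocorrelation, then forms a small Gram matrix to see how close it is to the scaled identity that Section IV identifies as optimal. The circulant matrix `C` is only a hypothetical stand-in for the training matrix $\tilde{C}_k$ defined earlier in the paper, and the variable names are assumptions.

```python
import numpy as np

# One period (length 7) of the training sequence listed in (75).
m_seq = np.array([1, -1, -1, 1, 1, 1, -1], dtype=float)
P = len(m_seq)

# Periodic autocorrelation r(tau) = sum_n c(n) c((n + tau) mod P).
autocorr = np.array([np.sum(m_seq * np.roll(m_seq, -tau)) for tau in range(P)])
print("periodic autocorrelation:", autocorr)

# Hypothetical stand-in for the training matrix: L+1 cyclic shifts as columns.
L = 2
C = np.column_stack([np.roll(m_seq, l) for l in range(L + 1)])
gram = C.T @ C
trace_inv = np.trace(np.linalg.inv(gram))
print("Gram matrix:\n", gram)
print("tr{(C^T C)^-1} =", trace_inv, " ideal (L+1)/P =", (L + 1) / P)
```

The off-peak autocorrelation values are small but nonzero, which is why the Gram matrix is only approximately a scaled identity.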
The training sequence for the first user is

$$\{c_1(n)\}_{n=0}^{13} = \{1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, 1, 1, -1\} \tag{75}$$

which is two repetitions of an m-sequence of period $\tilde{P} = 7$, and $c_2(n)$ satisfies (7). The sequences $c_k(n)$ are scaled to achieve TIR = 0.3.

A. TM Training

1) Coordinated Users: For comparison, we consider a CE- or DPS-BEM-based periodically placed TM training with zero padding, following the CE-BEM-based design of [50]; we used it for DPS-BEMs as well. In [50], each transmitted block of symbols $\{s_k(n)\}_{n=0}^{T-1}$ from the kth user is segmented into $J$ subblocks of training and information symbols $c_k(n)$ and $b_k(n)$. Each subblock is of equal length with $N_b$ information symbols and $N_c$ training symbols. If $s_k$ denotes a column vector composed of $\{s_k(n)\}_{n=0}^{T-1}$, then $s_k$ is arranged as

$$s_k := [\, b_{k1}^T, c_{k1}^T, b_{k2}^T, c_{k2}^T, \ldots, b_{kJ}^T, c_{kJ}^T \,]^T \tag{76}$$

where $b_{kj}$ is a column of $N_b$ information symbols, and $c_{kj}$ is a column of $N_c$ training symbols. We clearly have $T = J(N_b + N_c)$. For CE-BEM channels, [50] has shown that (76) is an optimum structure with $N_c = (K+1)L + K$, $J = Q$, and $c_{kj} = [0_{k(L+1)-1}^T, \gamma_k, 0_{L+(K-k)(L+1)}^T]^T$ ($\gamma_k > 0$), where $0_L$ denotes a null column of size $L$. (A justification of (76) for DPS-BEM for the single-user case is given in [38].) Thus, given a transmission block of size $T$, $((K+1)L+K)J$ symbols have to be devoted to training, and the remaining $T - ((K+1)L+K)J$ are available for information symbols. Thus, we have a training-to-information rate overhead of $((K+1)L+K)J\,[T - ((K+1)L+K)J]^{-1}$ for the scheme of [50]. Zero padding in the design of [50] allows for multiuser-interference-free channel estimation. The results of [50] extend the single-user results of [22] to the multiuser/MIMO case. The design of [50] is possible only in the case of the coordinated users, as discussed in Section I.

To carry out a fair comparison between superimposed and TM training-based approaches, for a given transmission block size $T$, we keep the training-to-information rate overhead of $((K+1)L+K)J\,[T - ((K+1)L+K)J]^{-1} = N_c/N_b$, as well as the TIR of $\gamma_k^2 J \sigma_{bk}^{-2}[T - ((K+1)L+K)J]^{-1} = \gamma_k^2[\sigma_{bk}^2 N_b]^{-1}$ (with $J = Q$ whenever possible) pertaining to the TM training scheme of [50], to be equal to the TIR of $\sigma_{ck}^2/\sigma_{bk}^2$ for the superimposed training-based schemes. (We take $\gamma_k = \gamma$ ∀k when the channels for the individual users have the same statistics, as in the example considered here.) For the two-user case (K = 2), we take a training session of length eight symbols with the first user’s training sequence $\{0, 0, \sqrt{(K+1)L+K}, 0, 0, 0, 0, 0\}$ and the second user’s $\{0, 0, 0, 0, 0, \sqrt{(K+1)L+K}, 0, 0\}$ so that the training sessions have the same average power as the information data sessions.
An information data session of 27 symbols with unit power is inserted between two such training sessions to form a frame of length 35 symbols. Such a frame is repeated over a record length of 420 symbols (12 frames). Thus, we have a training-to-information bit ratio of about 0.3. Using (3) and the training sequence, we can uniquely determine the $h_q(l)$’s via a least-squares approach. The Viterbi detector or the Kalman detector was used for data detection using the estimated channel.

2) Uncoordinated Users: In the case of the uncoordinated users, there is no coordination among the various users regarding the placement of the training symbols in the transmitted block. Hence, multiuser interference in channel estimation cannot be avoided; therefore, zero padding in the training design does not make sense. For this reason, while the alternating arrangement of the training and information symbols for user k is like that in (76), the placement and the choice of the $c_{kj}$’s will be different: $c_{kj}$ consists of a random binary sequence that is scaled by $\gamma_k$, of length $N_c$ for each k and independent for various k’s and j’s, and for different k’s, their placement may or may not overlap from run to run. As in Section VI-A1, this leads to a training-to-information rate overhead of $N_c/N_b$, and unlike Section VI-A1, the TIR turns out to be $[\gamma_k^2 N_c][\sigma_{bk}^2 N_b]^{-1}$. If $\sigma_{bk}^2 = 1$ and $\gamma_k = 1$ ∀k, then $N_c/N_b$ equal to the TIR of the superimposed training leads to a fair comparison. For the two-user case considered in the simulations, we take $\gamma_k = 1$, $N_c = 8$, and $N_b = 25$ as in Section VI-A1, leading to a training-to-information bit ratio of about 0.3. As before, using (3) and the training sequence, we can uniquely determine the $h_q(l)$’s via a least-squares approach, and then, the Viterbi detector or the Kalman detector can be used for data detection using the estimated channel.

Fig. 1. Example 1. NCMSE: (77) versus SNR, averaged over 500 Monte Carlo runs for CE-BEM- and DPS-BEM-based DML channel estimators, with two users (K = 2), two receive antennas (N = 2), $f_d$ = 50 Hz, Viterbi detector, TIR = 0.3, binary signals, and T = 420 bits. Step 1 refers to the approach of Remark 1 in Section III-B. “3rd iter.” refers to the third iteration of the DML approach. SI-CE, SI-DPS, “TM-CE: coord,” “TM-DPS: coord,” and “TM-DPS: uncoord” refer to superimposed training with CE-BEM representation, superimposed training with DPS-BEM representation, TM training for coordinated users with CE-BEM representation [50], TM training for coordinated users with DPS-BEM representation, and TM training for uncoordinated users with DPS-BEM representation, respectively.

B. Example 1

In this example, a two-user two-receiver (K = N = 2) scenario is considered. For each user, the information sequences are drawn from the binary alphabet {±1}. Both cases, i.e., coordinated users and uncoordinated users, are considered. This affects only the TM training design, as discussed in Section VI-A. A Viterbi detector is used for symbol detection. The BER and normalized channel MSE (NCMSE) results are shown in Figs. 1 and 2 for a Doppler spread of $f_d$ = 50 Hz and in Figs. 3 and 4 for a Doppler spread of $f_d$ = 100 Hz, where the NCMSE is defined as

$$\mathrm{NCMSE} = \frac{(TM_r)^{-1}\sum_{i=1}^{M_r}\sum_{k=1}^{K}\sum_{n=0}^{T-1}\sum_{l=0}^{L}\left\|\hat{h}_k^{(i)}(n;l) - h_k^{(i)}(n;l)\right\|^2}{(TM_r)^{-1}\sum_{i=1}^{M_r}\sum_{k=1}^{K}\sum_{n=0}^{T-1}\sum_{l=0}^{L}\left\|h_k^{(i)}(n;l)\right\|^2} \tag{77}$$

where $h_k^{(i)}(n;l)$ is the true channel, and $\hat{h}_k^{(i)}(n;l)$ is the estimated channel at the ith run, among the total $M_r$ runs. The corresponding detection results are based on the Viterbi algorithm utilizing the estimated channel. The iterations follow our DML approach in Section V-A. For $f_d$ = 50 and 100 Hz, we choose the number of basis functions Q = 3 and 5 by $Q \geq 2 f_d T T_s + 1$ for CE-BEM and Q = 3 and 4 by (4) for DPS-BEM, respectively. The first-order statistics-based channel estimator (27) [see also (28)] was implemented with $\hat{D}_k$ calculated via (31).
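The iterations referred to above alternate between the detection step (64) and the least-squares channel update (66) of Section V-A. The toy sketch below shows that alternation in a heavily simplified setting: a single user, a single receive antenna, a real-valued time-invariant two-tap channel, and an exhaustive search over the symbol domain in place of the vector Viterbi algorithm. The initial guess is a plain training-only least-squares fit, used here as a stand-in for the first-order statistics-based estimator (27). All names and numeric values are assumptions made for illustration.

```python
import itertools
import numpy as np


def conv_matrix(s, L):
    """Toy counterpart of T(s): the row for time n holds s(n), ..., s(n-L)."""
    return np.array([[s[n - l] for l in range(L + 1)] for n in range(L, len(s))])


def cost(y, s, h, L):
    """Least-squares cost ||y - T(s) h||^2 as in (63)."""
    return float(np.sum(np.abs(y - conv_matrix(s, L) @ h) ** 2))


def alternate(y, c, h_init, L, n_iter=3):
    """Alternate symbol detection (64) and least-squares channel update (66)."""
    T = len(c)
    # Discrete domain: s = b + c with b in {+1, -1}^T (exhaustive, so T must be small).
    domain = [np.array(b) + c for b in itertools.product((1.0, -1.0), repeat=T)]
    h, costs = np.asarray(h_init, dtype=float), []
    for _ in range(n_iter):
        s = min(domain, key=lambda cand: cost(y, cand, h, L))   # step (64)
        h = np.linalg.pinv(conv_matrix(s, L)) @ y               # step (66)
        costs.append(cost(y, s, h, L))
    return h, s, costs


rng = np.random.default_rng(1)
T, L = 8, 1
c = 0.5 * np.array([1, -1, -1, 1, 1, 1, -1, 1], dtype=float)   # known training
b = rng.choice([1.0, -1.0], size=T)                             # unknown data
h_true = np.array([1.0, 0.4])
y = conv_matrix(b + c, L) @ h_true + 0.05 * rng.standard_normal(T - L)

# Crude initial guess that treats the data as noise.
h0 = np.linalg.pinv(conv_matrix(c, L)) @ y
h_hat, s_hat, costs = alternate(y, c, h0, L)
print("cost per iteration:", costs)
print("channel estimate:", h_hat)
```

Because each step minimizes the same cost over one block of variables while the other is held fixed, the recorded cost cannot increase from one iteration to the next.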


Fig. 2. As in Fig. 1, except that BER versus SNR is shown.

Fig. 3. As in Fig. 1, except that $f_d$ = 100 Hz.

Fig. 4.


We show the results from the first-order statistics-based estimator using superimposed training (denoted by “step 1” in the figures), the third iteration of the DML approach (denoted by “3rd iter.” in the figures), and the TM training. (In addition, in the figures, “SI-CE” denotes the estimators based on superimposed training and CE-BEM, and “SI-DPS” denotes those based on superimposed training and DPS-BEM; “TM-CE” and “TM-DPS” follow similar terminology.) As noted in [51], the DPS-BEM efficiently reduces the spectral leakage induced by CE-BEM, leading to a much smaller modeling error; the BER and NCMSE curves both exhibit this advantage. It is also seen that the DML algorithm, whether it is CE- or DPS-BEM based, significantly reduces the interference from the information sequence that is present in its first step, i.e., the first-order statistics-based approach. The BER performance after three DML iterations is superior to that due to the TM training for uncoordinated users and is competitive with the TM training for coordinated users, without incurring the 30% training overhead penalty.

1) MIMO versus SISO Comparison: In Fig. 5, we compare the BER performance of the MIMO case (two Tx antennas, two Rx antennas, and spatial multiplexing with “odd” bits sent through Tx 1 and “even” bits through Tx 2) with the SISO case of one Tx antenna and one Rx antenna. All (sub)channels corresponding to each Tx–Rx pair are mutually independent, identically distributed, and as described earlier for Example 1. Note that the total transmit power in both cases is the same, i.e., in the MIMO case, each of the two Tx antennas transmits half the power, compared with the SISO case, for a fair comparison between the MIMO and SISO cases [32]. The SNR shown in Fig. 5 is the total SNR at an Rx antenna resulting from the received signal from both Tx antennas; therefore, we refer to it in Fig. 5 as the total SNR to contrast it with the earlier results, where the SNR includes the signal power from just one Tx antenna, as would be appropriate in a multiuser scenario. Only the DPS-BEM representation is considered. It is seen that the MIMO case outperforms the SISO scenario while having an overall transmission rate that is double that for SISO transmission.

Fig. 5. BER comparison between the MIMO (K = 2, N = 2) and SISO (K = 1, N = 1) systems. The results are based on 500 runs and show the outcome of the third iteration of the DML algorithm.
2) Performance Analysis Verification: For the first-order statistics-based DPS-BEM estimator, we also plotted the theoretical channel MSE of (50), after summing over the two users and normalizing as in (77), in Fig. 6. The theoretical expression and the simulation-based MSE results agree quite well.
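The normalization in (77), which is applied to both the simulated and the analytical curves, can be written compactly as follows. The array layout (runs × users × time × taps × receive antennas) and the function name `ncmse` are assumptions made for this sketch, and the random arrays stand in for true and estimated channels.

```python
import numpy as np


def ncmse(h_true, h_est):
    """Normalized channel MSE in the form of (77).

    Both inputs have shape (Mr, K, T, L+1, N): Monte Carlo runs, users,
    time samples, channel taps, receive antennas (an assumed layout).
    """
    h_true = np.asarray(h_true)
    h_est = np.asarray(h_est)
    err = np.sum(np.abs(h_est - h_true) ** 2)
    ref = np.sum(np.abs(h_true) ** 2)
    # The common factor (T * Mr)^-1 in (77) cancels between the two sums.
    return err / ref


rng = np.random.default_rng(0)
shape = (5, 2, 420, 3, 2)
h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
print(ncmse(h, h))          # a perfect estimate
print(ncmse(h, 0.9 * h))    # a 10% amplitude error on every tap
```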



Fig. 6. Performance analysis comparison. Example 1. “SI-DPS: analytical” shows the theoretical channel MSE of (50) after summing over the two users and normalizing as in (77).

C. Example 2

In this example, we compare the improvement of performance as the number of receive antennas N increases. Only DPS-BEM is considered here. For each user, the information sequences are drawn from the alphabet $\{(\pm 1 \pm j)/\sqrt{2}\}$ of size four. To reduce the computational complexity, we employ a Kalman filter, together with a quantizer, as the symbol detector exploiting the estimated channel, following the model (71), with an equalization delay d = 5 symbols. In Figs. 7 and 8, we compare the DPS-BEM-based estimators using superimposed and TM training (coordinated users only) for $f_d$ = 100 Hz. For N = 2, 3, and 4, the BER performance benefits from employing more receivers. Although the Kalman filter can only offer us an “approximation” of the DML approach, the enhancement after iterations is still significant. As N increases, the gap between the BERs for the TM (coordinated users) and the DPS-BEM-based approaches rapidly narrows, with the DPS-BEM approach being competitive with the TM approach for N = 3 without the 30% data rate loss.

Fig. 7. Example 2. Normalized channel mean-square error [NCMSE: (77)] versus SNR, which is averaged over 1000 Monte Carlo runs, for DPS-BEM-based DML channel estimators, with two users (K = 2), multiple receive antennas (N = 2, 3, or 4), $f_d$ = 100 Hz, Kalman detector with equalization delay d = 5 symbols, TIR = 0.3, four-level complex-valued signals, and T = 420 symbols. Step 1 refers to the approach of Remark 1 in Section III-B. “3rd iter.” refers to the third iteration of the DML approach. SI is the superimposed training with DPS-BEM representation, and TM is the TM training with DPS-BEM representation.

Fig. 8.

VII. CONCLUSION

Channel estimation and symbol detection for multiuser/MIMO doubly selective fading channels using superimposed training and DPS-BEM have been considered. A user-specific periodic training sequence has been superimposed on each user’s information sequence. We have first employed a first-order statistics-based estimator to estimate the channel, where the information sequences from all users act as interference. We have then presented an iterative joint channel estimation and data-detection approach exploiting the detected symbols from the previous iteration to enhance channel estimation. Simulation results have shown that, without incurring any loss in data transmission rate, the proposed approach is superior to the TM training for uncoordinated users, where the multiuser interference in channel estimation cannot be eliminated (as in a general multiuser scenario), and is competitive with the TM training for coordinated users, where the TM training design allows for multiuser-interference-free channel estimation (as in a MIMO scenario).

REFERENCES

[1] T. Abe and T. Matsumoto, “Space time turbo equalization in frequency-selective MIMO channels,” IEEE Trans. Veh. Technol., vol. 52, no. 3, pp. 469–475, May 2003. [2] F. Alberge, M. Nikolova, and P. Duhamel, “Blind identification/equalization using deterministic maximum likelihood and a partial prior on the input,” IEEE Trans. Signal Process., vol. 54, no. 2, pp. 724–737, Feb. 2006. [3] D. K. Borah and B. D. Hart, “Frequency-selective fading channel estimation with a polynomial time-varying channel model,” IEEE Trans. Commun., vol. 47, no. 6, pp. 862–873, Jun. 1999.


[4] M. Coldrey and P. Bohlin, “Training-based MIMO systems—Part II: Improvements using detected symbol information,” IEEE Trans. Signal Process., vol. 56, no. 1, pp. 296–303, Jan. 2008. [5] M. Dong, L. Tong, and B. M. Sadler, “Optimal insertion of pilot symbols for transmissions over time-varying flat fading channels,” IEEE Trans. Signal Process., vol. 52, no. 5, pp. 1403–1418, May 2004. [6] B. Farhang-Boroujeny, “Pilot-based channel identification: Proposal for semi-blind identification of communications channels,” Electron. Lett., vol. 31, no. 15, pp. 1044–1046, Jun. 1995. [7] M. Ghogho, D. McLernon, E. Alameda-Hernandez, and A. Swami, “Channel estimation and symbol detection for block transmission using data-dependent superimposed training,” IEEE Signal Process. Lett., vol. 12, no. 3, pp. 226–229, Mar. 2005. [8] M. Ghogho, D. McLernon, E. Alameda-Hernandez, and A. Swami, “SISO and MIMO channel estimation and symbol detection using data-dependent superimposed training,” in Proc. IEEE ICASSP, Philadelphia, PA, Mar. 2005, pp. 461–464. [9] G. B. Giannakis, Y. Hua, P. Stoica, and L. Tong, Eds., Signal Processing Advances in Wireless & Mobile Communications, vol. 1, Trends in Channel Estimation and Equalization. Upper Saddle River, NJ: Prentice-Hall, 2001. [10] G. B. Giannakis and C. Tepedelenlioğlu, “Basis expansion models and diversity techniques for blind identification and equalization of time-varying channels,” Proc. IEEE, vol. 86, no. 10, pp. 1969–1986, Oct. 1998. [11] M. S. Grewal and A. P. Andrews, Kalman Filtering Theory and Practice. Englewood Cliffs, NJ: Prentice-Hall, 1993. [12] S. He and J. K. Tugnait, “Doubly-selective multiuser channel estimation using superimposed training and discrete prolate spheroidal basis expansion models,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., Honolulu, HI, Apr. 2007, vol. II, pp. 861–864. [13] S. He and J. K.
Tugnait, “On doubly selective channel estimation using superimposed training and discrete prolate spheroidal sequences,” IEEE Trans. Signal Process., vol. 56, pt. 2, no. 7, pp. 3214–3228, Jul. 2008. [14] S. He, J. K. Tugnait, and X. Meng, “On superimposed training for MIMO channel estimation and symbol detection,” IEEE Trans. Signal Process., vol. 55, pt. 2, no. 6, pp. 3007–3021, Jun. 2007. [15] K. Josiam and D. Rajan, “Bandwidth efficient channel estimation using superimposed pilots in OFDM systems,” IEEE Trans. Wireless Commun., vol. 6, no. 6, pp. 2234–2245, Jun. 2007. [16] B. W. Kim, S. Y. Jung, J. Kim, and D. J. Park, “Hidden pilot based precoder design for MIMO-OFDM systems,” IEEE Commun. Lett., vol. 12, no. 9, pp. 657–659, Sep. 2008. [17] H. Kim and J. K. Tugnait, “Doubly-selective MIMO channel estimation using exponential basis models and subblock tracking,” in Proc. 42nd Annu. Conf. Inf. Sci. Syst., Mar. 19–21, 2008, pp. 1258–1261. [18] C. Komninakis, C. Fragouli, A. H. Sayed, and R. D. Wesel, “Multiinput multi-output fading channel tracking and equalization using Kalman estimation,” IEEE Trans. Signal Process., vol. 50, no. 5, pp. 1065–1076, May 2002. [19] G. Leus, “On the estimation of rapidly time-varying channels,” in Proc. Eur. Signal Process. Conf., Vienna, Austria, Sep. 6–10, 2004, pp. 2227–2230. [20] Z. Liu, X. Ma, and G. B. Giannakis, “Space-time coding and Kalman filtering for time-selective fading channels,” IEEE Trans. Commun., vol. 50, no. 2, pp. 183–186, Feb. 2002. [21] X. Li and T. F. Wong, “Turbo equalization with nonlinear Kalman filtering for time-varying frequency-selective fading channels,” IEEE Trans. Wireless Commun., vol. 6, no. 2, pp. 691–700, Feb. 2007. [22] X. Ma, G. B. Giannakis, and S. Ohno, “Optimal training for block transmissions over doubly selective wireless fading channels,” IEEE Trans. Signal Process., vol. 51, no. 5, pp. 1351–1366, May 2003. [23] Y. Ma, N. Yi, and R. 
Tafazolli, “Channel estimation for PRP-OFDM in slowly time-varying channel: First-order or second-order statistics?” IEEE Signal Process. Lett., vol. 13, no. 3, pp. 129–132, Mar. 2006. [24] X. Meng and J. K. Tugnait, “Semi-blind time-varying channel estimation using superimposed training,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Montreal, QC, Canada, May 2004, vol. 3, pp. 797–800. [25] X. Meng and J. K. Tugnait, “MIMO channel estimation using superimposed training,” in Proc. IEEE Int. Conf. Commun., Paris, France, Jun. 2004, pp. 2663–2667. [26] X. Meng and J. K. Tugnait, “Doubly-selective MIMO channel estimation using superimposed training,” in Proc. IEEE Sensor Array Multichannel Signal Process. Workshop, Barcelona, Spain, Jul. 2004, pp. 407–411. [27] M. Muck, M. de Courville, X. Miet, and P. Duhamel, “Iterative interference suppression for pseudo random postfix OFDM based channel estimation,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Philadelphia, PA, Mar. 2004, vol. 3, pp. 765–768.


[28] M. Muck, M. de Courville, X. Miet, and P. Duhamel, “A pseudo random postfix OFDM modulator-semi-blind channel estimation and equalization,” IEEE Trans. Signal Process., vol. 54, no. 3, pp. 1005–1017, Mar. 2006. [29] J. P. Nair and R. V. Raja Kumar, “An iterative channel estimation method using superimposed training in OFDM systems,” in Proc. IEEE VTC—Fall, Calgary, AB, Canada, Dec. 21–24, 2008, pp. 1–5. [30] V. Nguyen, H. D. Tuan, H. H. Nguyen, and N. N. Tran, “Optimal superimposed training design for spatially correlated fading MIMO channels,” IEEE Trans. Wireless Commun., vol. 7, no. 8, pp. 3206–3217, Aug. 2008. [31] A. G. Orozco-Lugo, M. M. Lara, and D. C. McLernon, “Channel estimation using implicit training,” IEEE Trans. Signal Process., vol. 52, no. 1, pp. 240–254, Jan. 2004. [32] A. Paulraj, R. Nabar, and D. Gore, Introduction to Space-Time Wireless Communications. Cambridge, U.K.: Cambridge Univ. Press, 2003. [33] J. G. Proakis, Digital Communications, 4th ed. New York: McGrawHill, 2001. [34] M. Qaisrani and S. Lambotharan, “Estimation of doubly-selective MIMO channels using superimposed training and turbo equalization,” in Proc. IEEE VTC, Singapore, May 2008, pp. 1316–1319. [35] A. M. Sayeed and B. Aazhang, “Joint multipath-Doppler diversity in mobile wireless communications,” IEEE Trans. Commun., vol. 47, no. 1, pp. 123–132, Jan. 1999. [36] N. Seshadri, “Joint data and channel estimation using blind trellis search techniques,” IEEE Trans. Commun., vol. 42, no. 234, pp. 1000–1011, Mar. 1994. [37] D. Slepian, “Prolate spheroidal wave functions, Fourier analysis, and uncertainty—V: The discrete case,” Bell Syst. Tech. J., vol. 57, pp. 1371– 1430, May/Jun. 1978. [38] L. Song and J. K. Tugnait, “On designing time-multiplexed pilots for doubly-selective channel estimation using discrete prolate spheroidal basis expansion models,” in Proc. IEEE ICASSP Conf., Honolulu, HI, Apr. 2007, pp. III-433–III-436. [39] M. D. Srinath, P. K. Rajasekaran, and R. 
Viswanathan, Introduction to Statistical Signal Processing With Applications. Upper Saddle River, NJ: Prentice-Hall, 1996. [40] P. Stoica and R. L. Moses, Introduction to Spectral Analysis. Upper Saddle River, NJ: Prentice-Hall, 1997. [41] G. L. Stuber, Principles of Mobile Communication, 2nd ed. Boston, MA: Kluwer, 2001. [42] L. Tong and S. Perreau, “Multichannel blind identification: From subspace to maximum likelihood methods,” Proc. IEEE, vol. 86, no. 10, pp. 1951– 1968, Oct. 1998. [43] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge, U.K.: Cambridge Univ. Press, 2005. [44] J. K. Tugnait and W. Luo, “On channel estimation using superimposed training and first-order statistics,” in Proc. ICASSP, Hong Kong, Apr. 2003, vol. 4, pp. 624–627. [45] J. K. Tugnait and X. Meng, “On superimposed training for channel estimation: Performance analysis, training power allocation and frame synchronization,” IEEE Trans. Signal Process., vol. 54, no. 2, pp. 752– 765, Feb. 2006. [46] J. K. Tugnait, X. Meng, and S. He, “Doubly-selective channel estimation using superimposed training and exponential bases models,” EURASIP J. Appl. Signal Process.—Special Issue Reliable Commun. Over Rapidly Time-Varying Channels, vol. 2006, p. 252, Jul. 2006. [47] J. K. Tugnait, L. Tong, and Z. Ding, “Single-user channel estimation and equalization,” IEEE Signal Process. Mag., vol. 17, no. 3, pp. 16–28, May 2000. [48] X. Wang, Y. Wu, J.-Y. Chouinard, and H-C. Wu, “On the design and performance analysis of multisymbol encapsulated OFDM systems,” IEEE Trans. Veh. Technol., vol. 55, no. 3, pp. 990–1002, May 2006. [49] X. Wang, Y. Wu, H.-C. Wu, and G. Gagnon, “An MSE-OFDM system with reduced implementation complexity using pseudo random prefix,” in Proc. IEEE GLOBECOM Conf., Washington, DC, Nov. 26–30, 2007, pp. 2836–2840. [50] L. Yang, X. Ma, and G. B. Giannakis, “Optimal training for MIMO fading channels with time- and frequency-selectivity,” in Proc. Int. Conf. 
Acoust, Speech, Signal Process., May 17–21, 2004, vol. 3, pp. 821–824. [51] T. Zemen and C. F. Mecklenbräuker, “Time-variant channel estimation using discrete prolate spheroidal sequences,” IEEE Trans. Signal Process., vol. 53, no. 9, pp. 3597–3607, Sep. 2005. [52] Y. R. Zheng and C. Xiao, “Simulation models with correct statistical properties for Rayleigh fading channels,” IEEE Trans. Commun., vol. 51, no. 6, pp. 920–928, Jun. 2003.



Jitendra K. Tugnait (M’79–SM’93–F’94) was born in Jabalpur, India, on December 3, 1950. He received the B.Sc. degree (with honors) in electronics and electrical communication engineering from the Punjab Engineering College, Chandigarh, India, in 1971, the M.S. and E.E. degrees in electrical engineering from Syracuse University, Syracuse, NY, in 1973 and 1974, respectively, and the Ph.D. degree in electrical engineering from the University of Illinois, Urbana, in 1978. From 1978 to 1982, he was an Assistant Professor of electrical and computer engineering with the University of Iowa, Iowa City. From June 1982 to September 1989, he was with the Long Range Research Division, Exxon Production Research Company, Houston, TX. In September 1989, he joined the Department of Electrical and Computer Engineering, Auburn University, Auburn, AL, as a Professor, where he is currently the James B. Davis Professor. His current research interests are statistical signal processing, wireless and wireline digital communications, multiple-sensor–multiple-target tracking, and stochastic systems analysis. Dr. Tugnait is a past Associate Editor for the IEEE TRANSACTIONS ON AUTOMATIC CONTROL, the IEEE TRANSACTIONS ON SIGNAL PROCESSING, and the IEEE SIGNAL PROCESSING LETTERS. He is currently an Editor for the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS.

Shuangchi He received the B.E. and M.S. degrees in electronic engineering from Tsinghua University, Beijing, China, in 2000 and 2003, respectively, and the Ph.D. degree in electrical engineering from Auburn University, Auburn, AL, in 2007. From August 2003 to August 2007, he was a Graduate Research Assistant and then a Vodafone Fellow with the Department of Electrical and Computer Engineering, Auburn University. He is currently with the School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta. His research interests include channel estimation and equalization, multiuser detection, and statistical and adaptive signal processing and analysis.
