Symbol-Timing Synchronization in Space-Time Coding Systems using Orthogonal Training Sequences

Yik-Chung Wu∗, S. C. Chan† and Erchin Serpedin∗

∗ Department of Electrical Engineering, Texas A&M University, College Station, TX 77843-3128, USA. Email: {ycwu, serpedin}@ee.tamu.edu
† Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong. Email: [email protected]

Abstract: In this paper, a new symbol-timing estimator for space-time coding systems is proposed. It improves the conventional algorithm of Naguib et al. such that accurate timing estimates can be obtained even if the oversampling ratio is small (such as an oversampling ratio of Q=4). The increase in implementation complexity with respect to that of the conventional algorithm is very small. The requirements and the design procedures for the training sequences are discussed. Analytical and simulation results show that the estimation mean square error of the proposed estimator is significantly smaller than that of the conventional algorithm.

I. INTRODUCTION

Space-time processing using space-time (ST) coding has received considerable interest recently as an efficient means for high rate data transmission [1]-[10]. Symbol-timing synchronization is an important issue in ST coding systems because perfect symbol timing information at the receiver is usually assumed. This problem was first studied in [4], where orthogonal training sequences are transmitted at different transmit antennas to simplify the maximization of the oversampled approximated log-likelihood function. The sample having the largest magnitude, the so-called “optimal sample”, is assumed to be closest to the optimum sampling instants (this method will be referred to as the optimum sample selection algorithm in the sequel for convenience). However, it is shown in this paper that the estimation Mean Square Error (MSE) of this algorithm is lower bounded by $1/(12Q^2)$, where $Q$ is the oversampling ratio. As a result, the performance of this timing synchronization method depends strongly on the oversampling ratio. In fact, a relatively high oversampling ratio might be required for accurate symbol-timing estimation.

In this paper, a new symbol-timing estimator for ST coding systems is proposed. It improves the optimum sample selection algorithm in [4] so that accurate timing estimates can be obtained even if the oversampling ratio is small. The increase in implementation complexity with respect to that of the optimum sample selection algorithm is very small. The requirements and the design procedures for the training sequences are discussed. Both analytical and simulation results show that the MSE of the proposed estimator is significantly smaller than that of the optimum sample selection algorithm.

The paper is organized as follows. The system model of the ST coding system is first described in Section II. A brief overview of the optimum sample selection algorithm for symbol-timing synchronization in ST coding systems is given in Section III. Requirements and design of training sequences are discussed in Section IV. The proposed symbol-timing estimator is then presented in Section V. Analytical and simulation results are presented in Section VI, and finally conclusions are drawn in Section VII.

II. SIGNAL MODEL

WCNC 2004 / IEEE Communications Society

Consider an ST coding system with $N$ transmit and $M$ receive antennas operating in a flat fading channel. The received signal at the $j$th receive antenna can be written as

$$r_j(t) = \sqrt{\frac{E_s}{N}} \sum_{i=1}^{N} h_{ij} \sum_{n} d_i(n)\, g(t - nT - \varepsilon T) + n_j(t), \qquad j = 1, 2, \ldots, M, \tag{1}$$

where $E_s/N$ is the symbol energy; $h_{ij}$ is the complex channel coefficient between the $i$th transmit antenna and the $j$th receive antenna, which is assumed to be statistically independent for different $i$ and $j$; $d_i(n)$ is the ST encoded information symbol transmitted from the $i$th transmit antenna; $g(t)$ is the transmit filter, which is assumed to be a root raised cosine pulse; $T$ is the symbol duration; $\varepsilon \in [-0.5, 0.5]$ is the unknown timing offset; and $n_j(t)$ is the complex-valued Gaussian white noise at the $j$th receive antenna, with power density $N_o$. Throughout this paper, it is assumed that the channel is frequency flat and quasi-static.

Let the received signal be sampled at a rate $Q$ times faster than the symbol rate $1/T$. The sampled and matched filtered signal at the $j$th receive antenna is given by

$$r_j(m) = \sqrt{\frac{E_s}{N}} \sum_{i=1}^{N} h_{ij} \sum_{n} d_i(n)\, p(mT/Q - nT - \varepsilon T) + \eta_j(m), \tag{2}$$

where¹ $r_j(m) \triangleq r_j(mT/Q)$, $p(t) \triangleq g(t) \otimes g_r(t)$, $\eta_j(m) \triangleq n_j(t) \otimes g_r(t)|_{t=mT/Q}$, and $g_r(t)$ is the matched filter. The problem under consideration is to estimate the symbol timing delay $\varepsilon$ from the received samples in (2).

¹ Notation $\triangleq$ stands for "defined as" and $\otimes$ denotes convolution.

0-7803-8344-3/04/$20.00 © 2004 IEEE

III. TIMING SYNCHRONIZATION BY OPTIMUM SAMPLES SELECTION

As proposed in [4], orthogonal training sequences can be periodically transmitted in between data symbols to assist the timing synchronization (a two-transmit antenna example is shown in Figure 1). Note that the structure of training sequences in this paper is different from that presented in [4]. In this paper, a cyclic prefix and cyclic suffix, each of length $L$, are included in order to remove the intersymbol interference (ISI) from the random data transmitted before and after the orthogonal training sequences. Since $L$ is usually kept as a small number, the increase in length of training is very small, especially when the length of the orthogonal training sequences is large.

More precisely, let $c_i \triangleq [c_i(0)\ c_i(1)\ \ldots\ c_i(L_t - 1)]$ be the orthogonal training sequence of length $L_t$ to be transmitted from the $i$th transmit antenna. The sampled signal at the $j$th receive antenna can be obtained by replacing $d_i(n)$ in (2) with $c_i(n)$. Further, let $m = lQ + k$ ($l = 0, 1, \ldots, L_t - 1$ and $k = k_o, k_o + 1, \ldots, k_o + Q - 1$, where $k_o = -\lfloor (1/2 - \varepsilon) Q \rfloor$ and $\lfloor x \rfloor$ denotes the nearest integer less than or equal to $x$), so that each sample is indexed by the $l$th training bit and the $k$th phase. In order to maintain the orthogonality between the received training sequences and the local copies, the first phase is taken at $-\lfloor (1/2 - \varepsilon) Q \rfloor$ such that all the $Q$ samples for the $l$th training bit are taken from $-T/2 \le t - lT \le T/2$. Then the received signal $r_j(lQ + k)$ due to the orthogonal training sequences can be rewritten as

$$r_j(lQ + k) = \sqrt{\frac{E_s}{N}} \sum_{i=1}^{N} h_{ij} \sum_{n} c_i(n)\, p(kT/Q + (l - n)T - \varepsilon' T) + \eta_j(lQ + k), \tag{3}$$

where $\varepsilon' \triangleq \varepsilon + k_o/Q$. Note that $k_o$ has been dropped from the index of $\eta_j(lQ + k)$ since a fixed time shift does not affect the noise statistics. In practice, it is sufficient to estimate $\varepsilon'$ only, as it represents the time difference between the first sample of the training sequence and the next nearest optimum sampling instant. Grouping the samples with the same phase and defining $\Upsilon_m(c_i)$ as the cyclic left shift of $c_i$ by $m$ bits, one can form the vector $\mathbf{r}_j(k)$ as follows:

$$\mathbf{r}_j(k) \triangleq [r_j(k)\ r_j(Q + k)\ \ldots\ r_j((L_t - 1)Q + k)]^T \tag{4}$$

$$\phantom{\mathbf{r}_j(k)} = \sqrt{\frac{E_s}{N}} \sum_{i=1}^{N} h_{ij}\, \mathbf{C}_i\, \mathbf{p}(k) + \boldsymbol{\eta}_j(k), \tag{5}$$

where²

$$\mathbf{C}_i \triangleq [\Upsilon_{-L}(c_i)^T\ \Upsilon_{-L+1}(c_i)^T\ \ldots\ \Upsilon_{L}(c_i)^T],$$

$$\mathbf{p}(k) \triangleq [p(kT/Q - LT - \varepsilon' T)\ p(kT/Q - (L - 1)T - \varepsilon' T)\ \ldots\ p(kT/Q + LT - \varepsilon' T)]^T,$$

$$\boldsymbol{\eta}_j(k) \triangleq [\eta_j(k)\ \eta_j(Q + k)\ \ldots\ \eta_j((L_t - 1)Q + k)]^T.$$

Define the sequence $\Psi_{ij}(k) \triangleq c_i^H \mathbf{r}_j(k)$. Since the $c_i$'s are orthogonal to each other when the relative delay is zero, it follows that for $k = 0, 1, \ldots, Q - 1$,

$$\Psi_{ij}(k) = \sqrt{\frac{E_s}{N}}\, h_{ij}\, p(kT/Q - \varepsilon' T)\, \|c_i\|^2 + \sqrt{\frac{E_s}{N}} \sum_{i'=1}^{N} h_{i'j}\, c_i^H \tilde{\mathbf{C}}_{i'}\, \tilde{\mathbf{p}}(k) + c_i^H \boldsymbol{\eta}_j(k), \tag{6}$$

where $\|c_i\|^2 \triangleq c_i^H c_i$ is the squared norm of $c_i$, which is a constant; $\tilde{\mathbf{C}}_i$ is the same as $\mathbf{C}_i$ but with the $(L + 1)$th column removed; and $\tilde{\mathbf{p}}(k)$ is the same as $\mathbf{p}(k)$ but with the $(L + 1)$th entry removed. The second term in (6) represents the ISI that arises if the training sequences from different antennas are not orthogonal when the relative delay is not zero. The last term in (6) is the noise term.

From (6), it can be observed that, if the second and third terms are very small (a training sequence design procedure that makes the second term zero is discussed in the next section; the third term is small at high Signal-to-Noise Ratios (SNRs)), $\Psi_{ij}(k)$ has the same shape as $p(t)$ for $-T/2 \le t \le T/2$, except that it is scaled by a complex channel gain and is corrupted by additive noise. In order to remove the effect of the channel, let us form the sequence $\Lambda_{ij}(k) \triangleq |\Psi_{ij}(k)|^2$. Now, the sequence $\Lambda_{ij}(k)$ should have a similar shape to the function $|p(t)|^2$ for $-T/2 \le t \le T/2$. This is illustrated in Figure 2, where an example sequence of $\Lambda_{ij}(k)$ is shown ($Q$=8, $L_t$=32, $L$=3, in the absence of noise). Note that a scaled version of $|p(t)|^2$ for $-T/2 \le t \le T/2$ is also shown (dotted line) for comparison. It can be seen that the optimum sampling time is at $t$=0 and the sample with maximum amplitude is the one closest to the optimum sampling instant.

A simple symbol-timing synchronization algorithm is to choose the value of $k$ closest to the optimum sampling instants. That is, the optimum sampling phase $k = \hat{k}$ is selected such that it maximizes $\Lambda_{ij}(k)$. For multiple transmit and receive antennas, the average of $\Lambda_{ij}(k)$ over all $i$ and $j$ should be maximized. As mentioned in [4], this is in fact the approximated log-likelihood function for symbol-timing synchronization, when the ISI plus noise term in (6) is assumed to be Gaussian. Therefore, the optimum sampling phase is selected as

$$\hat{k} = \arg \max_{k = 0, 1, \ldots, Q-1} \Lambda_{ML}(k), \tag{7}$$

with³

$$\Lambda_{ML}(k) \triangleq \sum_{j=1}^{M} \sum_{i=1}^{N} \Lambda_{ij}(k). \tag{8}$$

Under the optimistic assumption that the samples closest to the optimum sampling positions are correctly estimated (at high SNR), the estimation error, normalized with respect to the symbol duration, is a uniformly distributed random variable in the range $[-1/(2Q), 1/(2Q)]$. Therefore, the MSE normalized with respect to $T^2$ is $1/(12Q^2)$. That is, if $Q$=4, the MSE is lower bounded by $5.2 \times 10^{-3}$; if $Q$=8, the MSE is lower bounded by $1.3 \times 10^{-3}$; if $Q$=16, the MSE is lower bounded by $3.26 \times 10^{-4}$. Thus, a relatively high oversampling ratio might be required in order to obtain a small MSE.

² Notation $x^T$ denotes the transpose of $x$ and $x^H$ denotes the conjugate transpose of $x$.
³ The scaling factor $1/MN$ is not included in order to preserve a simplified notation.

IV. DESIGN OF TRAINING SEQUENCES

In order to minimize the effect of the ISI term in (6), the training sequences need to be designed such that

$$c_i^H \tilde{\mathbf{C}}_{i'} = \mathbf{0} \tag{9}$$



for all combinations of $i$ and $i'$. Combining this with the fact that sequences from different antennas have to be orthogonal when the relative delay is zero, the problem of training sequence design reduces to finding $N$ sequences such that

$$\mathbf{C}_i^H \mathbf{C}_{i'} = \begin{cases} \|c_i\|^2\, \mathbf{I} & \text{if } i = i' \\ \mathbf{0} & \text{if } i \ne i' \end{cases}$$

where $\mathbf{I}$ denotes the identity matrix. This is exactly the problem of designing multiple $(2L+1)$-perfect sequences [11]-[13]. Here, we only outline the procedure for designing the training sequences; interested readers can refer to the original papers [11]-[13] for details.

1) Construct a sequence $s \triangleq [s(0)\ s(1)\ \ldots\ s(L_t - 1)]$ of length $L_t$ such that all of its out-of-phase periodic autocorrelation terms are equal to zero. One example of this kind of sequence is the Chu sequence [14].
2) Construct another sequence $s' \triangleq [s'(0)\ s'(1)\ \ldots\ s'(L_t + 2NL - 1)]$ of length $L_t + 2NL$ as follows:

$$s' \triangleq [s(0)\ s(1)\ \ldots\ s(L_t - 1)\ s(0)\ s(1)\ \ldots\ s(2NL - 1)]. \tag{10}$$

Note that $L_t \ge 2NL$ must be satisfied. That is, if the number of transmit antennas $N$ is large, we cannot use short training sequences.
3) The orthogonal training sequences are given by

$$c_i = [s'((2i - 1)L)\ \ldots\ s'((2i - 1)L + L_t - 1)]. \tag{11}$$

For example, consider $L_t$=32, $L$=3, $N$=2. First, construct a Chu sequence of length 32. Then cyclically extend the Chu sequence by copying its first $2 \times 2 \times 3 = 12$ bits and appending them at the end. Then $c_1 = [s'(3)\ s'(4)\ \ldots\ s'(34)]$ and $c_2 = [s'(9)\ s'(10)\ \ldots\ s'(40)]$.

V. TIMING SYNCHRONIZATION BY ESTIMATION

In the optimum samples selection algorithm, symbol timing is estimated by maximization of the oversampled approximated log-likelihood function. As the number of samples becomes very large (which requires a large oversampling ratio), the estimate could become accurate. However, noting that the approximated log-likelihood function is 'smooth' (see Figure 2), we expect that the maximization of the log-likelihood function can be done by interpolation based on a few samples, thus keeping the oversampling ratio at a small number.

More precisely, let us construct a periodic sequence $\tilde{\Lambda}_{ML}(m)$ by periodically extending the approximated log-likelihood sequence $\Lambda_{ML}(k)$ in (8). Further, denote by $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$ the continuous and periodic approximated log-likelihood function whose samples are given by $\tilde{\Lambda}_{ML}(m)$. According to the sampling theorem, as long as the sampling frequency $Q/T$ is higher than twice the highest frequency of $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$, $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$ can be represented by its samples $\tilde{\Lambda}_{ML}(m)$ without loss of information. Since $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$ has the same shape as $|p(t)|^2$ for $-T/2 \le t \le T/2$, where $p(t)$ is a raised cosine pulse, the sampling frequency $Q/T$ has to be at least $2 \times 2/T$ (i.e., $Q \ge 4$). The relationship between $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$ and $\tilde{\Lambda}_{ML}(m)$ is then given by

$$\tilde{\Lambda}_{ML}(\hat{\varepsilon}') = \sum_{m=-\infty}^{\infty} \tilde{\Lambda}_{ML}(m)\, \mathrm{sinc}\!\left( \pi\, \frac{\hat{\varepsilon}' T - mT/Q}{T/Q} \right). \tag{12}$$

Now, expand $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$ into a Fourier series

$$\tilde{\Lambda}_{ML}(\hat{\varepsilon}') = \sum_{\ell=-\infty}^{\infty} A_\ell\, e^{j 2\pi \ell \hat{\varepsilon}'}, \tag{13}$$

where

$$A_\ell = \int_0^1 \tilde{\Lambda}_{ML}(\hat{\varepsilon}')\, e^{-j 2\pi \ell \hat{\varepsilon}'}\, d\hat{\varepsilon}'. \tag{14}$$

Substituting (12) into (14) and putting $m = lQ + k$ yields

$$A_\ell = \sum_{k=0}^{Q-1} \Lambda_{ML}(k) \sum_{l=-\infty}^{\infty} \int_0^1 \mathrm{sinc}\!\left( \pi\, \frac{\hat{\varepsilon}' T - lT - kT/Q}{T/Q} \right) e^{-j 2\pi \ell \hat{\varepsilon}'}\, d\hat{\varepsilon}' = \sum_{k=0}^{Q-1} \Lambda_{ML}(k)\, e^{-j 2\pi \ell k/Q}\, \frac{1}{Q}\, \mathcal{F}\{\mathrm{sinc}(\pi \hat{\varepsilon}')\}\big|_{f = \ell/Q},$$

where $\mathcal{F}\{\cdot\}$ denotes the Fourier transform. Without loss of generality, we only consider the case where $Q$ is even, in which case

$$A_\ell = \begin{cases} \dfrac{1}{Q} \displaystyle\sum_{k=0}^{Q-1} \Lambda_{ML}(k)\, e^{-j 2\pi \ell k/Q}, & \ell = -\dfrac{Q}{2}, \ldots, \dfrac{Q}{2} \\ 0 & \text{otherwise.} \end{cases}$$

From (13), it can be seen that once the coefficients $A_\ell$ are determined, the timing delay $\varepsilon'$ can be estimated by maximizing $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$ for $0 \le \hat{\varepsilon}' \le 1$. For efficient implementation, $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$ for $0 \le \hat{\varepsilon}' \le 1$ can be approximated by a $K$-point sequence, denoted as $\Lambda_{ML}(\hat{k})$ for $0 \le \hat{k} \le K - 1$, by zero padding the high-frequency coefficients of $A_\ell$ and performing a $K$-point inverse Discrete Fourier Transform (IDFT). For a sufficiently large value of $K$, $\Lambda_{ML}(\hat{k})$ becomes very close to $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$ for $0 \le \hat{\varepsilon}' \le 1$, and the index with the maximum amplitude can be viewed as an improved estimate of the timing parameter $\varepsilon'$.

To avoid the complexity of performing the $K$-point IDFT, an approximation⁴ is applied to (13). More precisely, extensive

⁴ A similar approximation has been applied in [15], in a different context.

simulations show that $A_{\pm 1}$ are much greater than $A_\ell$ for $|\ell| > 1$; therefore,

$$\tilde{\Lambda}_{ML}(\hat{\varepsilon}') \approx A_0 + 2\,\mathrm{Re}\{A_1 e^{j 2\pi \hat{\varepsilon}'}\} \qquad \text{for } 0 \le \hat{\varepsilon}' \le 1, \tag{15}$$

where $\mathrm{Re}\{x\}$ stands for the real part of $x$. In order to maximize the approximated log-likelihood function $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$, we note that the real part in (15) is largest when

$$\arg(A_1) = -2\pi \hat{\varepsilon}', \tag{16}$$

where $\arg(x)$ denotes the phase of $x$. Or equivalently,

$$\hat{\varepsilon}' = -\frac{1}{2\pi} \arg\left\{ \sum_{k=0}^{Q-1} \Lambda_{ML}(k)\, e^{-j 2\pi k/Q} \right\}. \tag{17}$$

The estimated delay $\hat{\varepsilon}'$ is the time between the first sampling phase and the nearest optimum sampling instant. The calculation within the arg-operation is actually the second output of a $Q$-point Discrete Fourier Transform (DFT) of the sequence $\Lambda_{ML}(k)$ (or the Fourier coefficient at the symbol rate $f = 1/T$). Note that the increase in complexity of the proposed algorithm in (17) with respect to that of the optimum samples selection algorithm is only a $Q$-point DFT and an arg-operation. From the simulation results to be presented in the next section, it is found that an oversampling factor $Q$ of 4 is sufficient to yield good estimates in practical applications. Therefore, the 4-point DFT in (17) can be computed easily without any multiplications. This greatly reduces the arithmetic complexity of implementation.

VI. PERFORMANCE ANALYSIS

A. Analytical Mean Square Error

It is shown in [16] that the MSE of the proposed algorithm for a specific delay $\varepsilon'$ is given by

$$E[(\hat{\varepsilon}' - \varepsilon')^2] = -\left( \frac{1}{2\pi} \right)^2 \frac{\mathrm{Re}\{B\} - D}{\mathrm{Re}\{B\} + D}, \tag{18}$$

where

$$B \triangleq L_t^2\, \frac{1 + MN}{N^2}\, e^{j 4\pi \varepsilon'} (\Xi_{SS})^2 + \left( \frac{E_s}{N_o} \right)^{-1} \frac{2 L_t}{N}\, e^{j 4\pi \varepsilon'}\, \Xi_{SN} + \left( \frac{E_s}{N_o} \right)^{-2} e^{j 4\pi \varepsilon'}\, \Xi_{NN},$$

$$D \triangleq L_t^2\, \frac{1 + MN}{N^2}\, |\Xi_{SS}|^2 + \left( \frac{E_s}{N_o} \right)^{-1} \frac{2 L_t}{N}\, \tilde{\Xi}_{SN} + \left( \frac{E_s}{N_o} \right)^{-2} \tilde{\Xi}_{NN},$$

with

$$\Xi_{SS} \triangleq \sum_{k=0}^{Q-1} p^2(kT/Q - \varepsilon' T)\, e^{-j 2\pi k/Q},$$

$$\Xi_{SN} \triangleq \sum_{k'=0}^{Q-1} \sum_{k''=0}^{Q-1} p(k'T/Q - \varepsilon' T)\, p(k''T/Q - \varepsilon' T)\, \varphi((k' - k'')T/Q)\, e^{-j 2\pi k'/Q}\, e^{-j 2\pi k''/Q},$$

$$\Xi_{NN} \triangleq \sum_{k'=0}^{Q-1} \sum_{k''=0}^{Q-1} \varphi^2((k' - k'')T/Q)\, e^{-j 2\pi k'/Q}\, e^{-j 2\pi k''/Q},$$

$$\tilde{\Xi}_{SN} \triangleq \sum_{k'=0}^{Q-1} \sum_{k''=0}^{Q-1} p(k'T/Q - \varepsilon' T)\, p(k''T/Q - \varepsilon' T)\, \varphi((k' - k'')T/Q)\, e^{j 2\pi k'/Q}\, e^{-j 2\pi k''/Q},$$

$$\tilde{\Xi}_{NN} \triangleq \sum_{k'=0}^{Q-1} \sum_{k''=0}^{Q-1} \varphi^2((k' - k'')T/Q)\, e^{j 2\pi k'/Q}\, e^{-j 2\pi k''/Q},$$

and

$$\varphi(\tau) \triangleq \int_{-\infty}^{\infty} g_r(t)\, g_r^*(t + \tau)\, dt.$$

Since the timing delay is assumed to be uniformly distributed, the average MSE is calculated by numerical integration of (18).

B. Simulation Results

The MSE performances of the synchronization algorithm based on the optimum sample selection algorithm (7) and the proposed algorithm (17) are evaluated by Monte-Carlo simulations, with each point obtained by averaging over $10^5$ estimates. The timing offset $\varepsilon$ is generated to be uniformly distributed in the interval $[-0.5, 0.5]$. The channel coefficients $h_{ij}$ are generated as complex Gaussian random variables with zero mean and a variance of 0.5 per dimension. The raised cosine pulse with excess bandwidth $\alpha = 0.3$ is considered. The training sequences are generated following the procedure in Section IV with $L = 4$.

The MSEs are plotted against $E_s/N_o$ in Figure 3 for $Q$=4, 8 and 16 in a two-transmit, four-receive antenna system with $L_t$=32. The simulation results are shown by the markers, while the solid lines represent the theoretical MSE given in the last subsection. First, we note that the analytical MSE of the proposed algorithm (solid lines in the figure) matches the simulation results very well. Second, we note that for $Q$=8 and 16, the performance of the proposed algorithm is better than that for $Q$=4 at high $E_s/N_o$. This can be explained by the fact that $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$ is a truncated version of $|p(t)|^2$, so $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$ is no longer bandlimited. Therefore, $\tilde{\Lambda}_{ML}(m)$ would, in general, suffer from aliasing from the neighboring spectra. Increasing $Q$ thus reduces the aliasing and improves the performance. Strictly speaking, $Q$ should be at least equal to 16 in order to represent $\tilde{\Lambda}_{ML}(\hat{\varepsilon}')$ using its samples $\tilde{\Lambda}_{ML}(m)$ without loss of information. However, for $Q$=4, the MSE of the proposed algorithm reaches the order of $10^{-5}$ at medium and high $E_s/N_o$, which is a reasonably good performance for most practical applications.
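The comparison between (7) and (17) can be illustrated with a small noise-free sketch. It is not the simulation setup of this section: it assumes a single link with unit channel gain, no noise and no ISI, and it models the metric as $|p(t)|^2$ sampled over one symbol period for a raised cosine pulse with $\alpha = 0.3$. All function names are hypothetical.

```python
import cmath
import math


def sinc(x):
    """Normalised sinc, sin(pi x) / (pi x)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def raised_cosine(t, alpha=0.3):
    """Raised-cosine pulse p(t); t is measured in symbol periods T."""
    denom = 1.0 - (2.0 * alpha * t) ** 2
    if abs(denom) < 1e-12:  # removable singularity at |t| = 1 / (2 alpha)
        return (math.pi / 4.0) * sinc(1.0 / (2.0 * alpha))
    return sinc(t) * math.cos(math.pi * alpha * t) / denom


def wrap(x):
    """Map x to the interval [-0.5, 0.5)."""
    return (x + 0.5) % 1.0 - 0.5


def noise_free_metric(eps, q, alpha=0.3):
    """Assumed noise-free Lambda_ML(k): |p|^2 at k/Q - eps, one symbol wide."""
    return [raised_cosine(wrap(k / q - eps), alpha) ** 2 for k in range(q)]


def select_optimum_sample(metric):
    """Eq. (7): delay k_hat / Q given by the largest sample."""
    q = len(metric)
    return max(range(q), key=lambda k: metric[k]) / q


def dft_estimate(metric):
    """Eq. (17): -arg(second DFT output) / (2 pi), reduced to [0, 1)."""
    q = len(metric)
    bin1 = sum(lam * cmath.exp(-2j * math.pi * k / q)
               for k, lam in enumerate(metric))
    return (-cmath.phase(bin1) / (2.0 * math.pi)) % 1.0


def cyclic_error(estimate, truth):
    """Absolute timing error modulo one symbol period."""
    return abs(wrap(estimate - truth))


if __name__ == "__main__":
    for eps in (0.05, 0.10, 0.30, 0.45):
        lam = noise_free_metric(eps, q=4)
        print(eps,
              cyclic_error(select_optimum_sample(lam), eps),
              cyclic_error(dft_estimate(lam), eps))
```

For $Q$=4 the sum inside `dft_estimate` reduces to $(\Lambda_{ML}(0) - \Lambda_{ML}(2)) + j(\Lambda_{ML}(3) - \Lambda_{ML}(1))$, which is why no multiplications are needed in that case.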
Finally, simulation results show that the MSE of the optimum sample selection algorithm is lower bounded by $1/(12Q^2)$ and is significantly larger than that of the proposed algorithm. For example, for $Q$=4, the MSE of the optimum sample selection algorithm at high $E_s/N_o$ is $5.2 \times 10^{-3}$, which is exactly the lower bound derived in Section III. For the proposed algorithm, the MSE at high $E_s/N_o$ is $3.5 \times 10^{-5}$, which is an improvement of more than two orders of magnitude with respect to the optimum sample selection algorithm.
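The floor values quoted for the optimum sample selection algorithm follow directly from the variance of a uniform random variable. The short check below is only an arithmetic illustration; the function name is hypothetical.

```python
def selection_mse_floor(q):
    """MSE (normalised by T^2) of an error uniform on [-1/(2Q), 1/(2Q)]."""
    width = 1.0 / q            # length of the uniform interval
    return width ** 2 / 12.0   # variance of a zero-mean uniform variable


if __name__ == "__main__":
    for q in (4, 8, 16):
        print(q, selection_mse_floor(q))
```

Dividing the $Q$=4 floor by the reported $3.5 \times 10^{-5}$ gives a ratio above 100, consistent with the "more than two orders of magnitude" statement.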


Further results not presented here show that similar conclusions can be drawn when a different number of antennas or a different training sequence length is used. Detailed performance analysis is reported in [16].

VII. CONCLUSIONS

A new symbol-timing delay estimator for ST coding systems has been proposed. It improves the optimum sample selection algorithm of Naguib et al. [4] such that accurate timing estimates are obtained even if the oversampling ratio is small (such as $Q$=4). The increase in implementation complexity with respect to the optimum sample selection algorithm is very small. The requirements and the design procedure for the training sequences are discussed. It is shown that the analytical MSE expressions for the proposed algorithm match the simulation results very well. Furthermore, analytical and simulation results show that the MSE of the proposed estimator is significantly smaller than that of the optimum sample selection algorithm.

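Fig. 1. Structure of the training sequence for symbol timing synchronization in a two transmit antennas system. [Diagram: for each antenna $i$ = 1, 2, the training sequence $c_i$ of length $L_t$ is placed between two blocks of length $L$, with data before and after.]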
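The three design steps of Section IV can be sketched as follows for the $L_t$=32, $L$=3, $N$=2 example. This is a minimal sketch, not the authors' code: the function names are hypothetical, and the Chu sequence uses the standard form $s(n) = e^{j\pi r n^2/L_t}$ for even $L_t$ (with $n(n+1)$ in place of $n^2$ for odd $L_t$), with the root $r$ = 1 chosen here as an assumption.

```python
import cmath
import math


def chu_sequence(length, root=1):
    """Chu sequence [14]; `root` must be coprime to `length`."""
    if math.gcd(root, length) != 1:
        raise ValueError("root must be coprime to length")
    if length % 2 == 0:
        return [cmath.exp(1j * math.pi * root * n * n / length)
                for n in range(length)]
    return [cmath.exp(1j * math.pi * root * n * (n + 1) / length)
            for n in range(length)]


def training_sequences(lt, guard, n_tx):
    """Steps 1)-3): Chu sequence, cyclic extension (10), windows (11)."""
    if lt < 2 * n_tx * guard:
        raise ValueError("need Lt >= 2*N*L")
    s = chu_sequence(lt)
    s_ext = s + s[:2 * n_tx * guard]
    return [s_ext[(2 * i - 1) * guard:(2 * i - 1) * guard + lt]
            for i in range(1, n_tx + 1)]


def cyclic_left_shift(seq, m):
    """Cyclic left shift by m positions (negative m shifts right)."""
    m %= len(seq)
    return seq[m:] + seq[:m]


def correlation(a, b):
    """Inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def max_unwanted_correlation(seqs, guard):
    """Largest |c_i^H shift_m(c_k)| over |m| <= L, excluding i = k, m = 0."""
    worst = 0.0
    for i, ci in enumerate(seqs):
        for k, ck in enumerate(seqs):
            for m in range(-guard, guard + 1):
                if i == k and m == 0:
                    continue  # the wanted term, equal to ||c_i||^2
                value = abs(correlation(ci, cyclic_left_shift(ck, m)))
                worst = max(worst, value)
    return worst


if __name__ == "__main__":
    seqs = training_sequences(lt=32, guard=3, n_tx=2)
    print(max_unwanted_correlation(seqs, guard=3))
```

Because each $c_i$ is a cyclic shift of the same Chu sequence, every unwanted correlation within $\pm L$ shifts is an out-of-phase autocorrelation term of $s$, which is why the check returns a value at the level of floating-point rounding.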

Fig. 2. An example of $\Lambda_{ij}(k)$ with the scaled version of $|p(t)|^2$ for $-T/2 \le t \le T/2$ (dotted line). [Plot: horizontal axis $t$ (in units of $T$) from -0.5 to 0.5; vertical axis from 0 to 3000.]

Fig. 3. MSE performance for different oversampling ratio $Q$ ($N$=2, $M$=4, $L_t$=32, $\alpha$=0.3). [Plot: MSE on a logarithmic axis from $10^{-7}$ to $10^{-1}$ versus $E_s/N_o$ (dB) from -10 to 40; curves for optimum sample selection and the proposed algorithm at $Q$=4, $Q$=8 and $Q$=16.]

REFERENCES

[1] A. F. Naguib, N. Seshadri and A. R. Calderbank, “Increasing data rate over wireless channels,” IEEE Signal Processing Magazine, vol. 17, pp. 76-92, May 2000.
[2] V. Tarokh, H. Jafarkhani and A. R. Calderbank, “Space-time block coding for wireless communications: performance results,” IEEE J. Select. Areas in Commun., vol. 17, pp. 451-460, Mar. 1999.
[3] S. M. Alamouti, “A simple transmit diversity technique for wireless communications,” IEEE J. Select. Areas in Commun., vol. 16, pp. 1451-1458, Oct. 1998.
[4] A. F. Naguib, V. Tarokh, N. Seshadri and A. R. Calderbank, “A space-time coding modem for high-data-rate wireless communications,” IEEE J. Select. Areas in Commun., vol. 16, pp. 1459-1478, Oct. 1998.
[5] V. Tarokh, N. Seshadri and A. R. Calderbank, “Space-time codes for high rate wireless communication: performance criterion and code construction,” IEEE Trans. Inform. Theory, vol. 44, pp. 744-765, Mar. 1998.
[6] E. G. Larsson, P. Stoica and J. Li, “On the maximum-likelihood detection and decoding for space-time coding system,” IEEE Trans. Signal Processing, vol. 50, pp. 937-944, Apr. 2002.
[7] H. E. Gamal, “On the robustness of space-time coding,” IEEE Trans. Signal Processing, vol. 50, pp. 2417-2428, Oct. 2002.
[8] Z. Liu and G. B. Giannakis, “Space-time block coded multiple access through frequency selective fading channels,” IEEE Trans. Commun., vol. 49, pp. 1033-1045, June 2001.
[9] A. R. Hammons Jr. and H. E. Gamal, “On the theory of space-time codes for PSK modulation,” IEEE Trans. Inform. Theory, pp. 524-542, Mar. 2000.
[10] G. Yi and K. B. Letaief, “Performance evaluation and analysis of space-time coding in unequalized multipath fading links,” IEEE Trans. Commun., vol. 48, pp. 1778-1782, Nov. 2000.
[11] C. Fragouli, N. Al-Dhahir and W. Turin, “Finite-alphabet constant-amplitude training sequence for multiple-antenna broadband transmission,” Proc. of ICC 2002, pp. 6-10.
[12] C. Fragouli, N. Al-Dhahir and W. Turin, “Reduced-complexity training schemes for multiple-antenna broadband transmissions,” Proc. of WCNC 2002, pp. 78-83.
[13] C. Fragouli, N. Al-Dhahir and W. Turin, “Training-based channel estimation for multiple-antenna broadband transmissions,” IEEE Trans. Wireless Commun., vol. 2, pp. 384-391, Mar. 2003.
[14] D. C. Chu, “Polyphase codes with good periodic correlation properties,” IEEE Trans. Inform. Theory, pp. 531-532, Jul. 1972.
[15] M. Morelli, A. N. D’Andrea and U. Mengali, “Feedforward ML-based timing estimation with PSK signals,” IEEE Commun. Letters, vol. 1, pp. 80-82, May 1997.
[16] Y.-C. Wu, S. C. Chan, and E. Serpedin, “Symbol-Timing Estimation in Space-Time Coding Systems based on Orthogonal Training Sequences,” submitted to IEEE Trans. on Wireless Commun. (March 2003).
