IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 4, NO. 4, JULY 2005


Continuous Flat-Fading MIMO Channels: Achievable Rate and Optimal Length of the Training and Data Phases

Volker Pohl, Phuc Hau Nguyen, Volker Jungnickel, Member, IEEE, and Clemens von Helmolt

Abstract: In this paper, we propose a simple framework for evaluating the required repetition rate of channel estimation for multiple-input multiple-output (MIMO) systems in continuous slowly fading radio channels. An analytical formula for the interference due to the temporal variation of the channel coefficients is given and verified by link-level simulations based on synthetic and measured impulse responses. The proposed interference model makes it possible to optimize the lengths of the training and data phases under different performance criteria. As an example, we investigate the bit-error rate performance of MIMO systems with zero-forcing detection and determine the time interval after which the channel has to be estimated again in order to keep the error probability below a desired threshold. In the second part, the lengths of the training and data phases are optimized based on the information-theoretical capacity. It is shown that these lengths depend not only on the Doppler spread of the channel, but also on the antenna configuration and noise power.

Index Terms: Continuous fading channels, estimation parameters, information rates, multiple-input multiple-output (MIMO) systems, wireless communications.

Manuscript received September 4, 2003; revised February 3, 2004; accepted June 1, 2004. The editor coordinating the review of this paper and approving it for publication is D. Gesbert. This work was supported in part by the German Federal Ministry of Education and Research (BMBF) in the HyEff project under Grant 01 BU 150.
V. Pohl, V. Jungnickel, and C. von Helmolt are with the Fraunhofer Institute for Telecommunications–Heinrich-Hertz-Institut, 10587 Berlin, Germany (e-mail: [email protected]).
P. H. Nguyen is with Philips Semiconductors GmbH, 90443 Nürnberg, Germany (e-mail: phuc_hau.nguyen@philips.com).
Digital Object Identifier 10.1109/TWC.2005.850325

I. INTRODUCTION

TO RETRIEVE the transmitted signals in radio links with coherent detection, the channel coefficients have to be known. Often, they are obtained by separate channel estimation. Afterwards, a large data block is transmitted and reconstructed at the receiver based on the channel estimate. Slowly fading channels are often considered as time invariant over a coherence interval T (block fading channels). But in reality, the channel coefficients will change continuously after the initial estimation (continuous fading channels). Thus, the data are reconstructed based on invalid channel coefficients, resulting in a continuous increase of errors in the reconstructed data stream. Therefore, the channel estimation has to be repeated periodically. Estimating the channel too often will decrease the spectral efficiency and too little training, on the other hand, limits the achievable data rate by increasing the number of errors at the end of every data block. Furthermore, the quality of the channel estimate depends on the length of the training sequences and affects the achievable data rate as well. The main objective of this paper is to optimize the lengths of the training and data blocks in order to obtain a maximal throughput.

The time variation entails that even if the channel is perfectly known at the beginning of the transmission phase, it will become more and more unknown as time proceeds. Moreover, the estimation error implies that there is always some uncertainty about the channel state. The general effect of imperfect channel knowledge on the capacity was studied in [1] and [2]. Analysis of the optimal training length for block fading channels was done in [3] and [4], taking into account the estimation error. Continuous fading channels were considered in [5] and [6], allowing for the time variation during the data transmission as well. In [5], the training and data phases were optimized to obtain a maximal throughput for a given bit error rate (BER) in a Vertical Bell Laboratories Layered Space-Time (V-BLAST) system. This was essentially done by means of computer simulations. The achievable information-theoretical capacity of multiple-input multiple-output (MIMO) systems with perfect interleaver was considered in [6]. There, the data rate was optimized with respect to the optimal sampling period of the channel via Monte Carlo simulations. Noncoherent MIMO communication systems in fast-fading channels were investigated in [7].

Our approach is similar to [4], where the estimation error was related to a loss in the SNR. In this paper, we extend this effective SNR by a term reflecting the continuously increasing difference between the channel estimate and the actual channel coefficients. This model for the variance of the channel uncertainty allows us to use the methods of [1], [2] and [4] to assess the impact on the capacity.
Compared to the common block fading-channel model [3], [4], the continuous fading approach optimizes the length of the whole coherence interval $T$ as well.

The paper is organized as follows. Starting from a model of the time-varying radio channel, we derive, in Section II, a formula for the signal-to-interference-and-noise ratio (SINR) as a function of elapsed time since the last channel estimation. Our simulation tool and the channel measurements are briefly described in Section III. The derived SINR model is used in Section IV to investigate the time dependence of the BER of MIMO systems with zero-forcing (ZF) detection. The theoretical results, based on the proposed effective SINR, are compared with link-level simulations based on measured impulse responses. In Section V, we use the effective SINR in conjunction with the information-theoretical capacity to optimize the lengths of the training and data phases, in order to obtain a maximal information rate. It is investigated how these lengths and the achievable rate depend on the Doppler spread, SNR, number of antennas, and channel condition.

II. CHANNEL MODEL AND INTERFERENCE

At the beginning, we derive a formula for the average signal-to-interference ratio (SIR) due to the change in the channel coefficients since the last channel estimation. In the second part of this section, this interference is incorporated into the flat-fading MIMO channel model.

A. SIR Due to Changing Channel Coefficients

The time-varying baseband impulse response between two antennas is modeled as

$$h(t, \tau) = \sum_{i=1}^{S} \alpha_i \exp(j 2\pi f_i t)\, \delta(\tau - \tau_i) \tag{1}$$

where $\alpha_i$, $\tau_i$, and $f_i$ are the complex amplitude, the delay time, and the Doppler frequency of the $i$th multipath component, respectively, and $\delta(t)$ denotes the Kronecker-delta function, which is nonzero only at $t = 0$. We assume that the $\alpha_i$ are independent complex random variables with zero mean and a variance $\sigma_i^2(\tau_i)$ that depends on the delay $\tau_i$. The Doppler frequencies $f_i$ are modeled as independent random numbers uniformly distributed on the interval $[-f_D, f_D]$, where $f_D$ denotes the maximum Doppler frequency (the Doppler spread of the channel). Such a flat Doppler spectrum is regarded as a good model for indoor channels (see, e.g., [8]). We consider only flat and slow-fading channels, i.e., we assume that all multipath components arrive at the receiver within the symbol duration $T_S$ and that $T_S \ll 1/f_D$. The flat fading channel coefficient is thus

$$h(t) = \sum_{i=1}^{S} \alpha_i \exp(j 2\pi f_i t) \tag{2}$$

which is a zero mean random variable with the variance

$$\sigma_h^2 = E\left[|h|^2\right] = \sum_i \sigma_i^2. \tag{3}$$
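As a quick numerical sanity check of (2) and (3), the following Python sketch draws random realizations of the flat-fading coefficient and compares the sample mean of $|h|^2$ with $\sum_i \sigma_i^2$. It is only an illustration under stated assumptions: the amplitudes are taken to be complex Gaussian (the model above only requires zero mean and independence), and the variances, the Doppler spread, the time instant, and all function names are hypothetical choices.

```python
import cmath
import math
import random


def draw_paths(variances, f_d, rng):
    """Draw amplitudes alpha_i and Doppler frequencies f_i for one channel."""
    alphas = []
    for var in variances:
        s = math.sqrt(var / 2.0)  # std per real/imag part, so E|alpha_i|^2 = var
        # Assumption: complex Gaussian amplitudes (not required by the model).
        alphas.append(complex(rng.gauss(0.0, s), rng.gauss(0.0, s)))
    dopplers = [rng.uniform(-f_d, f_d) for _ in variances]
    return alphas, dopplers


def h_flat(alphas, dopplers, t):
    """Flat-fading coefficient h(t) according to (2)."""
    return sum(a * cmath.exp(2j * math.pi * f * t)
               for a, f in zip(alphas, dopplers))


rng = random.Random(1)          # fixed seed for reproducibility
variances = [0.5, 0.3, 0.2]     # hypothetical sigma_i^2 values
f_d, t, runs = 10.0, 0.01, 20000

mean_power = 0.0
for _ in range(runs):
    alphas, dopplers = draw_paths(variances, f_d, rng)
    mean_power += abs(h_flat(alphas, dopplers, t)) ** 2 / runs

print(f"sample E|h|^2 = {mean_power:.3f}, sum of sigma_i^2 = {sum(variances):.3f}")
```

Because the Doppler term only rotates each amplitude in the complex plane, the sample power should match the sum of the variances for any choice of `t`.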

Assume now that the channel is perfectly estimated at $t = 0$. (The inevitable estimation error is taken into account later, in Section II-B, as an additive noiselike term.) The difference between the actual $h(t)$ and the measured $h(t = 0)$ is then

$$e(t) = h(t) - h(0) = \sum_i \alpha_i \left[\exp(j 2\pi f_i t) - 1\right]. \tag{4}$$

The error $e(t)$ can be written as $e(t) = \sum_i \alpha_i z_i$, with the random numbers $z_i = \exp(j 2\pi f_i t) - 1$. All $z_i$ are independent and identically distributed. The corresponding density function can be determined from the given distribution of the Doppler frequencies $f_i$. In the following, only the expectation of $|z|^2$ is needed, which is found to be

$$E\left[|z|^2\right](t) = 2\left[1 - \operatorname{sinc}(2\pi f_D t)\right], \quad \text{for } f_D t \le \frac{1}{2} \tag{5}$$

with the function sinc defined by $\operatorname{sinc}(x) := \sin(x)/x$. For small times $t$, (5) can be approximated by

$$E\left[|z|^2\right](t) \approx \frac{4\pi^2}{3} (f_D t)^2, \quad \text{for } f_D t \ll \frac{1}{2\pi}. \tag{6}$$
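The relation between (5) and (6) can be checked numerically. The sketch below, with hypothetical function names, evaluates the exact expression, the quadratic approximation, and a Monte Carlo average over uniformly distributed Doppler frequencies. It uses the identity $|\exp(j\theta) - 1|^2 = 2 - 2\cos\theta$, which follows directly from the definition of $z_i$.

```python
import math
import random


def sinc(x):
    """sinc(x) = sin(x)/x as defined in the text, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def drift_exact(fd_t):
    """E|z|^2 according to (5); fd_t is the normalized time f_D * t."""
    return 2.0 * (1.0 - sinc(2.0 * math.pi * fd_t))


def drift_approx(fd_t):
    """Quadratic small-time approximation (6)."""
    return 4.0 * math.pi ** 2 * fd_t ** 2 / 3.0


def drift_monte_carlo(fd_t, runs, rng):
    """Average |exp(j 2 pi f t) - 1|^2 over f uniform on [-f_D, f_D]."""
    total = 0.0
    for _ in range(runs):
        u = rng.uniform(-1.0, 1.0)  # u = f / f_D
        total += 2.0 - 2.0 * math.cos(2.0 * math.pi * u * fd_t)
    return total / runs


rng = random.Random(7)
for fd_t in (0.01, 0.05, 0.25):
    print(fd_t, drift_exact(fd_t), drift_approx(fd_t),
          drift_monte_carlo(fd_t, 50000, rng))
```

For $f_D t = 0.05$ the two closed-form expressions differ by about half a percent, while at $f_D t = 0.25$ the quadratic approximation is already visibly too large.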

The received signal $y$ at time $t$ can now be written as

$$y = h x + e(t) x \tag{7}$$

in which $x$ is the transmitted symbol and $h = h(0)$ is the measured channel coefficient. The average power $S$ of the signal component $h \cdot x$ at the receiver is then

$$S = E\left[|h|^2\right] \cdot E\left[|x|^2\right] = \sigma_h^2 P_x \tag{8}$$

where $P_x = E[|x|^2]$ is the average transmit power. For the average power $I$ of the interference $e(t) x$, we obtain

$$I = E\left[|e|^2\right] \cdot P_x = E\left[\sum_i |\alpha_i|^2 |z_i|^2\right] \cdot P_x = \sigma_h^2\, E\left[|z|^2\right] \cdot P_x \tag{9}$$

using the fact that $\alpha_i$ is zero mean, that $\alpha_i$ and $z_i$ are independent, and that $E[|z_i|^2] = E[|z|^2]$ for all $i$. The average SIR as a function of the Doppler spread $f_D$ and time $t$ is then

$$\mathrm{SIR}(f_D t) = \left(E\left[|z|^2\right]\right)^{-1} = \frac{1}{2}\left[1 - \operatorname{sinc}(2\pi f_D t)\right]^{-1} \approx \frac{3}{4\pi^2} \frac{1}{(f_D t)^2}. \tag{10}$$
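To get a feeling for the numbers in (10), the following sketch evaluates the exact and approximate SIR and inverts the approximation to find the normalized time at which the SIR has dropped to a given level. The function names and the 20-dB example are illustrative assumptions, not values taken from the paper.

```python
import math


def sir_exact(fd_t):
    """Average SIR from (10); fd_t = f_D * t must be positive."""
    x = 2.0 * math.pi * fd_t
    return 0.5 / (1.0 - math.sin(x) / x)


def sir_approx(fd_t):
    """Small-time approximation of (10)."""
    return 3.0 / (4.0 * math.pi ** 2 * fd_t ** 2)


def fd_t_for_sir(sir_db):
    """Normalized time at which the approximate SIR equals sir_db (in dB)."""
    sir_lin = 10.0 ** (sir_db / 10.0)
    return math.sqrt(3.0 / (4.0 * math.pi ** 2 * sir_lin))


for fd_t in (0.005, 0.01, 0.02, 0.05):
    print(f"f_D t = {fd_t}: exact {10 * math.log10(sir_exact(fd_t)):.2f} dB, "
          f"approx {10 * math.log10(sir_approx(fd_t)):.2f} dB")

print(f"SIR reaches 20 dB at f_D t = {fd_t_for_sir(20.0):.4f}")
```

Under the approximation, halving the elapsed time raises the SIR by a factor of four (about 6 dB), which is the quadratic growth of the interference stated above.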

Thus, the interference power due to the outdated channel state information grows approximately quadratically with time (compare to [5], where a similar relation for the average SIR over the whole coherence block was found). Note that (10) does not depend on the signal power, so the SIR cannot be improved by simply increasing the transmit power.

The channel coefficient $h$, as well as the interference $e(t)$, depend on the multipath amplitudes $\alpha_i$. Nevertheless, both components can be regarded as independent. If we collect all $\alpha_i$ in a vector $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_S]^T \in \mathbb{C}^S$, and all complex-conjugate $z_i$ in a vector $z = [z_1^*, z_2^*, \ldots, z_S^*]^T \in \mathbb{C}^S$, then $h$ and $e$ can be written as scalar products in $\mathbb{C}^S$: $h = \langle \alpha, \iota \rangle$ and $e = \langle \alpha, z \rangle$, where $\iota = [1, 1, \ldots, 1]^T$ denotes a vector whose components are all one. So, for a given $\alpha$ the random number $e$ is apparently independent of $h$. Thus, we can assume that $h$ is the fading channel coefficient and $e(t) x$ is an independent random interference with a certain distribution. This distribution can be determined from the given channel model. It is not Gaussian, in general. Therefore, the exact calculation of the BER and capacity may become very complicated. However, it is well known that the worst distribution for the capacity that the independent, additive interference can have is Gaussian [1], [4]. Therefore, we model this interference as additive Gaussian with the variance (9). In doing so, at least a worst case assessment of the real performance is obtained. Moreover, in MIMO channels (see the next section), the whole interference at each receive antenna is the sum of independent contributions with the same distribution, so by the central limit theorem it will tend towards a Gaussian as the number of transmit antennas increases.

B. MIMO-Channel Model

Consider a multiantenna system with $N_{\mathrm{Rx}}$ and $N_{\mathrm{Tx}}$ antennas at the receiver (Rx) and transmitter (Tx), respectively (denoted as an $N_{\mathrm{Rx}} \times N_{\mathrm{Tx}}$ system). We assume the following periodic transmission scheme. During a training phase, a different training sequence with $L_\tau \ge N_{\mathrm{Tx}}$ symbols is transmitted from each Tx antenna to obtain an estimate of the flat-fading channel coefficients. The length of this training phase is $T_\tau = L_\tau \cdot T_S$. After the training, the data-transmission phase of length $T_d$ follows, during which $L_d = T_d / T_S$ data symbols are transmitted over each Tx antenna. These data are reconstructed at the Rx based on the estimated channel coefficients. The coherence interval¹ of the system is $T = T_\tau + T_d$. Due to the periodically incorporated training symbols, the spectral efficiency is reduced by the factor $L_d / (L_\tau + L_d)$. The transmission over the flat fading MIMO channel is modeled by

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{E}(t)\mathbf{x} + \mathbf{V}\mathbf{x} + \mathbf{n} \tag{11}$$

in which the column vectors $\mathbf{x}$ and $\mathbf{y}$ contain the $N_{\mathrm{Tx}}$ transmitted and $N_{\mathrm{Rx}}$ received complex symbols, respectively. The $N_{\mathrm{Rx}} \times N_{\mathrm{Tx}}$ matrix $\mathbf{H}$ contains the channel coefficients $h_{j,i}(t = 0)$ according to (2) estimated at $t = 0$. The additive white Gaussian noise (AWGN) is represented by the vector $\mathbf{n}$, which contains $N_{\mathrm{Rx}}$ independent zero-mean complex Gaussian random variables with variance $\sigma_n^2$. The $N_{\mathrm{Rx}} \times N_{\mathrm{Tx}}$ matrix $\mathbf{V}$ describes the estimation errors. We assume that a maximum-likelihood (ML) estimator with orthogonal sequences is used. Then, the $N_{\mathrm{Rx}} \cdot N_{\mathrm{Tx}}$ ML estimates are independent and unbiased with variance [3]

$$\sigma_v^2 = \frac{N_{\mathrm{Tx}}}{\rho_\tau L_\tau}\, \sigma_h^2 \tag{12}$$

in which $\rho_\tau$ is the average SNR at one Rx antenna during the training phase. The worst effect that the estimation error can have is to behave like AWGN [1]. Therefore, we model the entries $v_{j,i}$ of $\mathbf{V}$ as independent AWGN with variance (12). In so doing, a lower bound of the real performance is obtained. The interference due to the time-varying channel coefficients is taken into account by the matrix $\mathbf{E}(t)$, with independent entries $e_{j,i}(t)$ according to (4).


The signal at one Rx antenna $j = 1, \ldots, N_{\mathrm{Rx}}$ can thus be written as

$$y_j = \sum_{i=1}^{N_{\mathrm{Tx}}} \left[ h_{j,i}(t = 0)\, x_i + e_{j,i}(t)\, x_i + v_{j,i}\, x_i \right] + n_j. \tag{13}$$

Since all $h_{j,i}(t)$, $e_{j,i}(t)$, and $v_{j,i}$ are independent, we obtain for the average signal and interference power at one Rx antenna $S = N_{\mathrm{Tx}} \cdot E[|h|^2] \cdot P_x$ and $I = N_{\mathrm{Tx}} \cdot E[|e|^2] \cdot P_x + N_{\mathrm{Tx}} \cdot \sigma_v^2 \cdot P_x$, respectively. Neglecting the interference, the average SNR at one Rx antenna during the data transmission is

$$\rho_d = \frac{N_{\mathrm{Tx}} \cdot \sigma_h^2 \cdot P_x}{\sigma_n^2}. \tag{14}$$

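Therewith, the average SINR at one Rx antenna becomes

$$\rho_{\mathrm{eff}} = \frac{S}{I + \sigma_n^2} = \left[ \frac{1}{\rho_d} + \frac{N_{\mathrm{Tx}}}{\rho_\tau L_\tau} + \frac{4\pi^2}{3} (f_D t)^2 \right]^{-1} \tag{15}$$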
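The three contributions inside the brackets of (15) are easy to compare numerically. The sketch below is a direct transcription of (15); the parameter values (20 dB SNR, 128 training symbols, six Tx antennas) are hypothetical, and the helper `crossover_fd_t` is an added illustration that returns the normalized time at which the time-variation term equals the sum of the other two terms.

```python
import math


def rho_eff(rho_d, rho_tau, l_tau, n_tx, fd_t):
    """Effective SINR (15) at normalized time fd_t = f_D * t (linear scale)."""
    noise_term = 1.0 / rho_d                   # thermal noise
    estimation_term = n_tx / (rho_tau * l_tau)  # channel estimation error (12)
    drift_term = 4.0 * math.pi ** 2 / 3.0 * fd_t ** 2  # outdated channel (6)
    return 1.0 / (noise_term + estimation_term + drift_term)


def crossover_fd_t(rho_d, rho_tau, l_tau, n_tx):
    """Time at which the drift term equals noise plus estimation error."""
    floor = 1.0 / rho_d + n_tx / (rho_tau * l_tau)
    return math.sqrt(3.0 * floor / (4.0 * math.pi ** 2))


# Hypothetical operating point: rho_d = rho_tau = 20 dB, L_tau = 128, N_Tx = 6.
rho = 100.0
for fd_t in (0.0, 0.01, 0.03, 0.1):
    value = rho_eff(rho, rho, 128, 6, fd_t)
    print(f"f_D t = {fd_t}: rho_eff = {10 * math.log10(value):.2f} dB")

print(f"drift dominates beyond f_D t = {crossover_fd_t(rho, rho, 128, 6):.4f}")
```

At the crossover time the effective SINR is exactly 3 dB below its value at $t = 0$, since the bracket in (15) has doubled.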

Here, (12) and the approximation (6) were used. Neglecting the first two terms in (15), due to the noise and the channel estimation errors, gives the SIR due to the time-varying channel alone. This SIR is independent of the antenna configuration. In particular, it is the same as for the single-input single-output (SISO) channel (10).

In the derivation of (12), it is assumed that the channel is invariant during the channel training. Clearly, this is not true in real systems. The corresponding error may also be modeled by an equivalent SIR, which would yield one more term inside the brackets of (15), making the following calculation much more complex. Nevertheless, using the above model for the interference due to the time-varying channel (4), it can be shown (see Appendix I) that this additional error can be neglected as long as

$$f_D T_S < \frac{1}{2\pi} \sqrt{\frac{3}{\rho_\tau L_\tau^3}}. \tag{16}$$

If this condition is fulfilled, the estimation error due to the temporal variation of the channel is smaller than the estimation error due to the noise. Clearly, this condition depends also on the SNR during the training $\rho_\tau$ and on the training length $L_\tau$. In Section V, we will determine the optimal training length as a function of the SNR and the Doppler spread. It was checked that all results satisfy (16). Therefore, the assumption that this error can be neglected is adequate for our intentions.

Before the effective SINR (15) is used to predict the BER in Section IV and to determine the optimal lengths of the training and data phases in Section V, we briefly describe our simulation tool.

III. LINK-LEVEL SIMULATION

¹ The coherence interval of the system must not be confused with the coherence time of the channel, $\tau_c = 1/(2\pi f_D)$.

Data transmission over the MIMO channel was simulated using impulse responses obtained from a MIMO channel measurement campaign, as well as using time-varying impulse responses according to the model given in Section II.

A. Measurements

The measurements were done using a Medav RUSK ATM channel sounder adapted to MIMO [9]. It measures the radio channel between up to 16 Tx and 16 Rx antennas in a total time of about 1 ms. The channel sounder records the frequency response of every channel in the range of 5.14–5.26 GHz. From these frequency responses, we determined the impulse responses by inverse Fourier transform. The bandwidth of 120 MHz results in a time resolution of 8.3 ns. The time between two successive MIMO measurement snapshots was 10 ms. We used so-called "high-capacity antennas." These antennas are characterized by good decorrelation between single antennas. For a detailed description of the measurement set-up and the antenna design, see [10]. The measurements were taken in a fully equipped 5 × 7 × 3 m³ laboratory. To obtain time-varying impulse responses, the Rx antenna array was held at a fixed position while the Tx array was moved at a velocity of about 0.3 m/s.

B. Rayleigh Channel Model

The time-varying impulse response between each pair of Tx and Rx antennas was modeled independently according to (1). An exponential power delay profile with a common decay time $\tau$ for all channels was assumed, i.e., the variances $\sigma_i^2(\tau_i)$ of the amplitudes $\alpha_i$ of the multipath components were chosen as $\sigma_i^2(\tau_i) \sim \exp(-\tau_i/\tau)$. In all simulations $\tau = 25$ ns was used, which is approximately the delay spread found in our measurements. We used the same time resolution for the model impulse responses as given by the measurements: 190 multipath components with equally spaced delays $\tau_i$, with $\Delta\tau = \tau_{i+1} - \tau_i = 8.3$ ns. Simulation results using this model were always averaged over at least 500 random channels.

C. Simulations

The symbol length $T_S$ in the simulated systems was chosen much larger than $\tau$ such that intersymbol interference caused no additional errors.
At the Tx, the data were modulated using multilevel quadrature amplitude modulation (M-QAM). No pulse shaping was used. At the Rx, noise was added and the signals were sampled and reconstructed with an appropriate algorithm (e.g., ZF or V-BLAST [11]). At the end, the data were sent through an M-QAM decoder. No receive filter was employed, but it was checked that such an additional filter has a negligible influence on the results. The channel matrix was determined using training sequences of 128 symbols. Every inphase and quadrature component of these sequences consisted of a different Gold sequence. The channel coefficients were obtained by correlating the received signals with the known training sequences. The channel was estimated in intervals $T_M$ smaller than $1/f_D$. At the time instances in between, the data were reconstructed using the preceding channel estimation. At every simulated time instance, the actual channel was frozen and $10^4$ symbols were transmitted over each Tx antenna to get the instantaneous bit-error probability.

Fig. 1. Increasing BER after the channel estimation. Markers: simulation results using the channel model. Lines: theoretical expectation using the effective SINR (15).

IV. BER ANALYSIS

If the error probability as a function of the SNR is known for a given communication system, the SINR function (15) can be inserted, and so the bit-error probability as a function of the elapsed time $t$ after the channel estimation is obtained. Fig. 1 shows BER simulations based on the channel model with $f_D = 10$ Hz for an 8 × 6 MIMO system with 4-QAM modulation. The simulations were performed with ZF and V-BLAST detection and for different values of the SNR. The lines show theoretical graphs: For BLAST, the effective SINR (15) was inserted into a BER(SNR) function obtained by a normal BER simulation, whereas for ZF, the known analytical result from [13] was used. We observe that the temporal growth of the BER due to the time-varying channel can be predicted very well by the derived effective SINR. Similarly good agreement was also found in other simulations with different Doppler frequencies, modulations, and antenna configurations [12].

At small times $t$, the BER is determined mainly by the noise. As time proceeds, the interference due to the temporal change of the channel coefficients slowly increases the BER. Once the interference becomes larger than the noise, it dominates the BER performance. The channel estimation has to be repeated when the BER reaches a certain threshold $\mathrm{BER}_0$. So, it becomes clear that detection schemes which need a lower SNR to obtain a certain BER have to estimate the channel less frequently than schemes which need a higher SNR. For instance, if we demand that the average BER should remain less than $10^{-3}$ in Fig. 1, the ZF system with an SNR of 20 dB has to estimate the channel in time intervals of $f_D t \approx 0.025$, whereas the corresponding interval for the V-BLAST system is $f_D t \approx 0.058$.

In the following, we investigate MIMO systems with M-QAM modulation and ZF detection using this method.
Since we are mainly interested in the influence of the changing channel coefficients on the BER performance, we neglect the noise and the estimation error in (15) and use only the SIR (10) in the following analysis. We ask for the necessary time interval $t_0$ between the channel estimations such that the BER due to the changing channel coefficients remains less than a given threshold $\mathrm{BER}_0$ of $10^{-5}$. We ascertain how $t_0$ depends on the diversity and the modulation level used.

For MIMO systems with ZF detection operating in Rayleigh-fading channels, the BER as a function of the average SNR is known analytically [13] (at least for BPSK and QPSK modulation). Inserting (10) in this formula, we obtain the BER as a function of time

$$\mathrm{BER} = \left( \frac{1 - \mu}{2} \right)^{L} \sum_{k=0}^{L-1} \binom{L - 1 + k}{k} \left( \frac{1 + \mu}{2} \right)^{k} \tag{17}$$

with

$$\mu = \left\{ 1 + M N_{\mathrm{Tx}} \left[ 1 - \operatorname{sinc}(2\pi f_D t) \right] \right\}^{-1/2} \tag{17a}$$

where $L = N_{\mathrm{Rx}} - N_{\mathrm{Tx}} + 1$ is the diversity order of the system and $M$ is a constant factor dependent on the modulation scheme, reflecting the fact that higher modulations need a better SNR for the same error probability. A good rule of thumb is $M = 2^k$, where $k$ is the number of bits per symbol. For BPSK (2-QAM) and 4-QAM, this rule is exact, i.e., $M_{\mathrm{BPSK}} = 2$ and $M_{\mathrm{4\text{-}QAM}} = 4$, respectively.

Fig. 2. Influence of the modulation on the time behavior of the BER. Markers: simulation results using the channel model with $f_D = 5$ Hz. Lines: theoretical curves (17).

For $f_D t \ll 1/(2\pi)$, (17) can be approximated by

$$\mathrm{BER}(f_D t) \approx \binom{2L - 1}{L} \left( \frac{\pi^2}{6}\, M \cdot N_{\mathrm{Tx}} \right)^{L} (f_D t)^{2L} \tag{18}$$

using (6) and the approximation for the BER given in [13]. So, it appears that the error rate increases with the $2L$th power of time initially after the channel estimation. Equation (18) shows how the BER depends on the Doppler spread $f_D$, on the modulation level $M$, and on the diversity order $L$, in principle.

A. Dependency on Doppler Spread

According to (17) and (18), the Doppler spread $f_D$ scales the time axis $t$. So, all graphs can be plotted over the normalized time $f_D t$. Fig. 1 already showed that the theoretical formula (17) fits the results from the simulations very well over the whole time range.

B. Dependency on Modulation

Fig. 2 shows simulation results (based on the channel model with $f_D = 5$ Hz) for an 8 × 6 MIMO system with different modulations and the corresponding theoretical curves (17). Again, the simulations agree well with the theoretical graphs. We see that all curves have the same slope independent of the modulation. This slope is determined by the exponent $2L$ of the time $t$ in (18). The graphs for higher modulation schemes are shifted towards higher BER values due to the factor $M^L$ in (18). Consequently, the time instance $t_0$ when $\mathrm{BER}(t)$ reaches the threshold $\mathrm{BER}_0$ depends on $M$. From (18), we read: $f_D t_0 \sim 1/\sqrt{M}$. Thus, from the time $t_0$ for one modulation $M_1$, the corresponding time $t_0$ for any other modulation $M_2$ can be determined by

$$t_0\big|_{M_2} = \sqrt{\frac{M_1}{M_2}} \cdot t_0\big|_{M_1}. \tag{19}$$

From Fig. 2, for instance, we obtain $f_D t_0 \approx 0.023$ for BPSK modulation ($M = 2$) and for $\mathrm{BER}_0 = 10^{-5}$. But the curve for 64-QAM ($M = 64$) reaches $\mathrm{BER}_0$ already at $f_D t_0 \approx \sqrt{2/64} \cdot 0.023 \approx 0.004$. Generally, the channel has to be estimated more frequently if a higher modulation scheme is used.

In Fig. 3, results from link-level simulations based on measured time-varying impulse responses are plotted. Simulations were performed over 1 s total time, and channel estimation was done every 100 ms. The resulting ten $\mathrm{BER}(t)$ curves were averaged. The inset of Fig. 3 shows the average Doppler spectrum of the measurement data during this time. From this spectrum, we read the maximum Doppler frequency $f_D$ of about 5 Hz. For comparison, theoretical curves with $f_D = 5$ Hz are also plotted in Fig. 3. We see that the simulated curves agree very well with the theoretical expectation, even though the Doppler frequencies of the measured channel are not uniformly distributed, as assumed in the theoretical derivation, and although only ten channels are averaged.

Fig. 3. Dependency on modulation: simulations with measured impulse responses. The inset shows the Doppler power spectrum of the measurement data.
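The numbers quoted above for the 8 × 6 system can be reproduced from (17), (17a), and (18). The following sketch is an illustration with hypothetical function names: it evaluates both expressions and finds the re-estimation interval $f_D t_0$ by bisection on the exact formula, assuming (as in the text) that noise and estimation errors are neglected.

```python
import math


def sinc(x):
    """sinc(x) = sin(x)/x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def ber_exact(fd_t, n_rx, n_tx, m):
    """BER over normalized time according to (17) and (17a)."""
    order = n_rx - n_tx + 1  # diversity order L
    mu = (1.0 + m * n_tx * (1.0 - sinc(2.0 * math.pi * fd_t))) ** -0.5
    head = ((1.0 - mu) / 2.0) ** order
    tail = sum(math.comb(order - 1 + k, k) * ((1.0 + mu) / 2.0) ** k
               for k in range(order))
    return head * tail


def ber_approx(fd_t, n_rx, n_tx, m):
    """Small-time approximation (18)."""
    order = n_rx - n_tx + 1
    return (math.comb(2 * order - 1, order)
            * (math.pi ** 2 * m * n_tx / 6.0) ** order
            * fd_t ** (2 * order))


def t0_exact(ber0, n_rx, n_tx, m):
    """Normalized time f_D * t_0 at which (17) reaches ber0 (bisection).

    Uses the fact that (17) grows monotonically on 0 <= f_D t <= 1/2.
    """
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if ber_exact(mid, n_rx, n_tx, m) < ber0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


t0_bpsk = t0_exact(1e-5, 8, 6, 2)     # 8 x 6 system, BPSK
t0_64qam = t0_exact(1e-5, 8, 6, 64)   # 8 x 6 system, 64-QAM
print(f"BPSK:   f_D t0 = {t0_bpsk:.4f}")
print(f"64-QAM: f_D t0 = {t0_64qam:.4f}")
print(f"ratio = {t0_64qam / t0_bpsk:.4f}, sqrt(2/64) = {math.sqrt(2 / 64):.4f}")
```

The ratio of the two intervals comes out close to $\sqrt{M_1/M_2}$, as (19) predicts; it is not exactly equal because (19) is derived from the approximation (18) rather than from (17).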


C. Dependency on Antenna Configuration

It is well known that spatial diversity improves the performance of radio systems. Thus, we expect that the time interval $t_0$ is larger for systems with a higher diversity order $L$. Our expectation is confirmed by Fig. 4, in which the BER curves for different antenna configurations are shown. The markers indicate results obtained by link-level simulation with the measured impulse responses already used in Fig. 3. The solid lines represent (17) with $f_D = 5$ Hz and the dashed lines are the corresponding approximations (18). The slopes of the curves in Fig. 4 are determined by the exponent $2L$ of the time $t$ in (18): The BER increases by $2L$ decades if the time increases by one decade. Nevertheless, the BER is smaller for higher values of $L$, in general, since $f_D t \ll 1$. From (18), it is clear that the BER depends on the number of Tx antennas as well. For a fixed diversity order $L$, the BER increases in proportion to the $L$th power of $N_{\mathrm{Tx}}$. So, the necessary time interval scales as $f_D t_0 \sim 1/\sqrt{N_{\mathrm{Tx}}}$. The larger $N_{\mathrm{Tx}}$ is, the more frequently the channel has to be estimated.

Fig. 4. Dependency on antenna configuration: link-level simulations with measured impulse responses (markers), theoretical curves (17) (solid lines), and approximations (18) (dashed lines).

All BER simulation results show a very good agreement with the theoretical curves based on the effective SINR (15). Hence, the proposed model for the interference due to the changing channel coefficients holds very well. Slight discrepancies are only found at small $t$ (see Fig. 2), possibly due to the non-Gaussian distribution of the interference $e(t)$. Since $f_D t$ is limited to a finite region, the error $e(t)$ is also limited to a small range [see (4)]. For example, if this range is smaller than half of the distance between two QAM symbols, no error will appear. This could explain why the BER found in the simulations without noise at small $t$ is slightly less than predicted by the model with the Gaussian distributed interference.

The above analysis of the BER gives a good estimate of the time interval at which the channel has to be measured again to keep the BER below a certain threshold. However, we have neglected the influence of the noise and the channel estimation errors. The time-dependent BER curves start at a certain level $\mathrm{BER}(t = 0)$, which is determined by the SNR and the estimation errors. Using a longer estimation phase will reduce the starting level $\mathrm{BER}(t = 0)$, and so it takes a bit longer to reach the threshold $\mathrm{BER}_0$ (cf. Fig. 1). But if the corresponding gain in data transmission time is offset by the longer training phase, the overall throughput may not be increased. Probably, there is an optimal length for the training block. Of course, the influence of the training length could be included in the BER analysis, but then all results would again be valid only for a special detection scheme (like ZF above). Therefore, we will investigate the optimal training and data lengths based on the information-theoretical capacity in Section V.

V. CAPACITY ANALYSIS

We insert the effective SINR $\rho_{\mathrm{eff}}(t)$ (15) into the capacity formula for MIMO systems with channel state information (CSI) at the Rx and with additive white Gaussian noise [14]

$$C(L_\tau, t) = \sum_i \log_2\left( 1 + \frac{\rho_{\mathrm{eff}}(L_\tau, t)}{N_{\mathrm{Tx}}}\, \lambda_i \right) \tag{20}$$

where $\lambda_i$ are the squares of the singular values of the channel matrix $\mathbf{H}$. Due to $\rho_{\mathrm{eff}}(t)$, the capacity depends on the time $t$ after the last channel estimation, on the number of training symbols $L_\tau$, on the Doppler spread $f_D$, on the SNR during the estimation $\rho_\tau$, and on the SNR during the data transmission $\rho_d$. In general, the powers used during the training and data phases have to be optimized as well. The optimization of $\rho_\tau$ and $\rho_d$ for block fading channels was done in [4]. It was found that it is better to spend as little time as possible on training but to use more power during the training. A similar analysis could be applied here as well, but since the length of the coherence interval is not known at the beginning and since the effective SINR depends on the time as well, this analysis becomes slightly more involved and would go beyond the scope of this paper. Moreover, in practical systems [11], it may be more complicated to change the power between the training and data phases, since this places higher requirements on the linearity of the components. Therefore, we assume throughout this paper that $\rho_\tau = \rho_d$.

Due to the increasing difference between the actual channel coefficients and the CSI obtained by the previous estimation, the capacity (20) decreases monotonically with time. The total amount of information which can be transmitted during the data transmission phase is

$$D(L_\tau, T_d) = \int_0^{T_d} C(L_\tau, t)\, dt. \tag{21}$$

But due to the channel estimation, spectral efficiency is lost such that the average information rate is only

$$R(L_\tau, T_d) = \frac{D(L_\tau, T_d)}{T_\tau + T_d}. \tag{22}$$
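Equations (20) to (22) can be evaluated numerically in a few lines. The sketch below is an illustration only: it measures time in symbol durations (so the Doppler spread enters as the product $f_D T_S$), assumes $\rho_\tau = \rho_d$ as in the text, integrates (21) with a trapezoidal rule of one symbol step, and scans the data length for the largest rate. The SISO operating point (20 dB SNR, $f_D T_S = 10^{-3}$, ten training symbols) and all names are hypothetical.

```python
import math


def rho_eff(t, l_tau, rho, n_tx, fd_ts):
    """Effective SINR (15) with rho_tau = rho_d = rho; t in symbol durations."""
    return 1.0 / (1.0 / rho + n_tx / (rho * l_tau)
                  + 4.0 * math.pi ** 2 / 3.0 * (fd_ts * t) ** 2)


def capacity(t, l_tau, rho, n_tx, fd_ts, lams):
    """Instantaneous capacity (20) for the squared singular values lams."""
    snr = rho_eff(t, l_tau, rho, n_tx, fd_ts)
    return sum(math.log2(1.0 + snr * lam / n_tx) for lam in lams)


def best_data_length(l_tau, max_ld, rho, n_tx, fd_ts, lams):
    """Scan L_d = 1..max_ld and return (R_max, L_d) for the rate (22)."""
    best_rate, best_ld = 0.0, 0
    info = 0.0  # running value of D in (21)
    c_prev = capacity(0.0, l_tau, rho, n_tx, fd_ts, lams)
    for l_d in range(1, max_ld + 1):
        c_now = capacity(float(l_d), l_tau, rho, n_tx, fd_ts, lams)
        info += 0.5 * (c_prev + c_now)  # trapezoidal step, one symbol wide
        c_prev = c_now
        rate = info / (l_tau + l_d)     # average rate (22)
        if rate > best_rate:
            best_rate, best_ld = rate, l_d
    return best_rate, best_ld


# Hypothetical SISO example: lambda_1 = 1, rho = 20 dB, f_D * T_S = 1e-3.
rate, l_d = best_data_length(l_tau=10, max_ld=500, rho=100.0, n_tx=1,
                             fd_ts=1e-3, lams=[1.0])
print(f"best L_d = {l_d}, rate = {rate:.3f} bit per symbol")
print(f"capacity at end of data phase = "
      f"{capacity(l_d, 10, 100.0, 1, 1e-3, [1.0]):.3f}")
```

In this example the rate at the best data length is close to the instantaneous capacity at the end of the data phase.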

In the following, we determine the optimal lengths of the training and data phases such that $R$ is maximized. Optimization of the training length for block fading channels can be found in [3] and [4]. In such channels, it is assumed that the length of the coherence interval $T$ is given such that the question is only how much of $T$ should be spent on training. In continuous fading channels, we do not have this restriction of a fixed coherence interval, i.e., the training and data blocks can have any desired length, at least in principle. As a result of this, we will see that the optimal coherence interval $T$ depends also on the SNR and on the antenna configuration.

A. Optimization Over $T_d$

First, we assume that the number of training symbols $L_\tau$ is given. To find the optimal data length $T_d$, the first derivative of $R$ (22) with respect to $T_d$ is set to zero. This yields a quite intuitive equation for the optimal $T_d(L_\tau)$

$$(T_\tau + T_d) \cdot C(L_\tau, T_d) = D(L_\tau, T_d). \tag{23}$$

It is easily shown (see Appendix II) that the so-obtained $T_d$ is unique and that it really maximizes $R$. The maximal rate is obtained by inserting (23) into (22)

$$R_{\max}(L_\tau) = C\left(L_\tau, T_d(L_\tau)\right) \tag{24}$$

which is still a function of the training length Lτ. It says that at the end of the optimal data phase, the instantaneous capacity is equal to the rate of the whole coherence interval. So the loss of spectral efficiency due to the training phase is compensated by the higher capacity at the beginning of the data block. To reach this Rmax, the data rate would have to be changed continuously during the data transmission phase. Writing the equation for the optimal Td (23) in detail gives

$$ \frac{1}{2}\, k \cdot f_D T_S \cdot L_\tau \sum_{i=1}^{N_\lambda} \ln\!\left(1 + \frac{\rho_{\mathrm{eff}}(L_\tau, T_d)}{N_{Tx}}\, \lambda_i\right) = \sum_{i=1}^{N_\lambda} a_i \arctan\!\left(k\, \frac{f_D T_d}{a_i}\right) - N_\lambda\, b \arctan\!\left(k\, \frac{f_D T_d}{b}\right) \tag{25} $$

with

$$ a_i = \sqrt{\frac{1}{\rho_d} + \frac{N_{Tx}}{\rho_\tau L_\tau} + \frac{\lambda_i}{N_{Tx}}} \quad \text{and} \quad b = \sqrt{\frac{1}{\rho_d} + \frac{N_{Tx}}{\rho_\tau L_\tau}} \tag{25a} $$

and k = 2π/√3. Nλ denotes the number of singular values, i.e., the rank of the channel. Equation (25) can be solved only numerically. The solution depends on the SNR, the normalized Doppler spread fD TS, and the singular values √λi of the channel. Fig. 5(a) shows some solutions of (25) for a SISO channel (λ1 = 1), where Td was converted into "number of symbols" by Ld = Td/TS. In Fig. 5(b), the corresponding maximal rate (24) is plotted over the training length Lτ. Clearly, the optimal number of data symbols increases with the number of training symbols to maintain a high spectral efficiency. Also, it is intuitively clear that in slower fading channels, the data phase can be longer than in fast-fading channels such that the graphs for different Doppler spreads are merely shifted vertically. At small times t, ρeff(t) is determined mainly by the SNR ρd. As time proceeds, ρeff(t) is more and more dominated by the interference due to the time-varying channel, which yields a decrease in capacity. Clearly, at high SNR this interference dominates ρeff(t) earlier than at lower SNR. Therefore, the optimal data phase is shorter at higher SNR. Fig. 5(b) shows that the achievable rate is still a function of the training length Lτ. Apparently, there is an optimal training length at which the rate becomes maximal. It depends on the SNR and on the normalized Doppler spread.

Fig. 5. Optimal number of (a) data symbols and (b) achievable rate for a continuous fading SISO channel versus the number of training symbols.
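Since (25) has no closed-form solution, a small numerical sketch may help to see how it can be solved. The following Python code is only an illustration under stated assumptions, not the procedure used for Fig. 5: it assumes ρd = ρτ (both set to the SNR), it assumes the effective SNR has the form ρeff = 1/(b² + (k fD Td)²), which is consistent with the definitions in (25a), and it uses plain bisection. All function names are hypothetical.

```python
import math

K = 2.0 * math.pi / math.sqrt(3.0)  # constant k from (25a)


def coefficients(rho_d, rho_tau, l_tau, n_tx, lambdas):
    """Return the list of a_i and the scalar b as defined in (25a)."""
    b_sq = 1.0 / rho_d + n_tx / (rho_tau * l_tau)
    a = [math.sqrt(b_sq + lam / n_tx) for lam in lambdas]
    return a, math.sqrt(b_sq)


def effective_snr(l_d, fd_ts, b):
    """ASSUMED effective SNR after l_d data symbols: 1 / (b^2 + (k fD Td)^2)."""
    u = K * fd_ts * l_d
    return 1.0 / (b * b + u * u)


def residual(l_d, fd_ts, rho_d, rho_tau, l_tau, n_tx, lambdas):
    """Left-hand side minus right-hand side of (25), with Td = l_d * TS."""
    a, b = coefficients(rho_d, rho_tau, l_tau, n_tx, lambdas)
    u = K * fd_ts * l_d
    rho_eff = effective_snr(l_d, fd_ts, b)
    lhs = 0.5 * K * fd_ts * l_tau * sum(
        math.log(1.0 + rho_eff * lam / n_tx) for lam in lambdas)
    rhs = sum(a_i * math.atan(u / a_i) for a_i in a) \
        - len(lambdas) * b * math.atan(u / b)
    return lhs - rhs


def optimal_data_length(fd_ts, snr_db, l_tau, n_tx=1, lambdas=(1.0,)):
    """Solve (25) for Ld = Td / TS by bisection (assumes rho_d = rho_tau)."""
    rho = 10.0 ** (snr_db / 10.0)
    args = (fd_ts, rho, rho, l_tau, n_tx, lambdas)
    lo, hi = 0.0, 1.0
    while residual(hi, *args) > 0.0:  # residual is positive left of the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid, *args) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rate_at_optimum(fd_ts, snr_db, l_tau, n_tx=1, lambdas=(1.0,)):
    """Maximal rate (24) in bit/s/Hz: capacity at the end of the data phase."""
    rho = 10.0 ** (snr_db / 10.0)
    _, b = coefficients(rho, rho, l_tau, n_tx, lambdas)
    l_d = optimal_data_length(fd_ts, snr_db, l_tau, n_tx, lambdas)
    rho_eff = effective_snr(l_d, fd_ts, b)
    return sum(math.log2(1.0 + rho_eff * lam / n_tx) for lam in lambdas)
```

For a SISO channel, `optimal_data_length(1e-3, 10.0, 10)` gives the data length for fD TS = 10⁻³, an SNR of 10 dB, and ten training symbols. Under the assumed model, raising the SNR or the Doppler spread shortens the resulting data phase, in line with the discussion above.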

B. Optimization Over Lτ

In order to optimize the maximal information rate with respect to the training length Lτ, (24) must be differentiated with respect to Lτ and set to zero. Since the optimal Td(Lτ) is itself given only by the implicit function (23), this yields complicated implicit expressions for the optimal Lτ. Therefore, we calculated the optimal length Ld and the maximal rate Rmax(Lτ) as functions of Lτ as in Section V-A. From those results, the optimal training length Lτ which maximizes Rmax(Lτ), the corresponding optimal data length Ld, and the maximal rate Rmax were extracted.

First, we again consider only SISO channels. The top graph of Fig. 6 shows the optimal block lengths Ld and Lτ as a function of the SNR. In the bottom diagram, the corresponding maximal achievable rates are shown. As explained above, the optimal number of training and data symbols decreases as the SNR increases. Both lengths are reduced almost equally as the SNR grows, such that the ratio Ld/(Lτ + Ld) is nearly independent of the SNR. Note that at higher SNR values, the optimal training length becomes one, especially in fast fading channels. The training length cannot be reduced further, whereas the optimal data length continues to decrease as the SNR increases (note that in MIMO systems the minimal number of training symbols would be NTx). Therefore, the spectral efficiency becomes worse at high SNR in fast-fading channels. The impact of this behavior on the achievable rate can be studied in the bottom graph of Fig. 6: In a time-invariant channel, the rate increases linearly with the logarithm of the SNR. If the channel is varying in time, the slope decreases due to the loss in spectral efficiency caused by the periodic channel estimation. At some SNR value, the achievable rate begins to saturate due to the further reduced spectral efficiency because the training length has reached its minimal value. Therefore, in time-varying channels, there is an upper bound on the achievable rate even if the SNR approaches infinity. The graphs indicate that for fast-fading channels (fD TS ≳ 10⁻²), other transmission schemes, which account for the temporal variations during the data transmission, are necessary to reach a satisfactory capacity.

Fig. 6. (a) Optimal number of training and data symbols and (b) the maximal achievable rate as function of the SNR for different Doppler spreads.

The upper graph of Fig. 7 shows how the optimal block lengths depend on the normalized Doppler spread fD TS of the channel. As fD TS increases, the optimal training and data lengths are reduced.² But the number of data symbols Ld decreases faster than the number of training symbols Lτ such that the ratio Ld/(Lτ + Ld) becomes worse as fD TS increases. Therefore, the maximal rate, which can be transmitted over a time-varying channel, becomes smaller as the channel fades faster. This is illustrated in Fig. 7(b). Again, we observe that at high Doppler spreads, the optimal training length becomes one, resulting in an even faster decrease in spectral efficiency as fD TS is increased further. Consequently, the achievable rate falls faster beyond this point (cf. Fig. 7). Nevertheless, the relative loss of rate at low Doppler frequencies is almost independent of the SNR: e.g., the rate at SNR = 40 dB drops to 90% of its maximal value at about fD TS ≈ 1 · 10⁻⁴ whereas the rate at SNR = 0 dB reaches 90% of its maximum at fD TS ≈ 3 · 10⁻⁴.

² Interestingly, even though the optimal Lτ and Ld decrease, the block lengths Td = Ld TS and Tτ = Lτ TS increase as fD TS rises.

Fig. 7. (a) Optimal number of training and data symbols and (b) maximal achievable rate for a SISO channel versus the normalized Doppler spread for different SNRs.

Note that the results here are somewhat different from the findings for block fading channels with fixed coherence interval T = Tτ + Td given in [4]. There, it was found that at sufficiently high SNR, the optimal length of the training phase becomes minimal (Lτ = NTx) but as the SNR decreases, an increasing fraction of T has to be used for the training. If finally the SNR becomes sufficiently low, it will be optimal to spend half of T (Tτ = T/2) for training such that 50% of the capacity is lost compared to the case where the Rx knows the channel. Our results show that the optimal coherence interval T = Tτ + Td depends also on the SNR. Both the optimal Tτ and Td increase as the SNR decreases. Therefore, the coherence interval T increases as the SNR decreases, whereas the ratio between the training and data phases remains nearly independent of the SNR (Fig. 6). This ratio is mainly determined by the Doppler spread of the channel (Fig. 7). Therefore, we have (unlike in block fading channels) essentially no loss in the spectral efficiency as the SNR becomes low. For example, we consider the results in Fig. 6 for fD TS = 1 · 10⁻⁵: At SNR = 20 dB, the number of symbols in one coherence block is L = Lτ + Ld = 949, whereas at SNR = 0 dB, the coherence interval is about five times longer (L = 4567). The fraction Ld/L, on the other hand, is almost independent of the SNR (98.5% at SNR = 20 dB and 98.7% at 0 dB).

C. MIMO Channels

Up to now, we considered only SISO channels and investigated the influence of the SNR and the Doppler spread on the achievable rate and the optimal length of the training and data blocks. For MIMO channels, the dependence on the SNR and the Doppler spread is the same, in principle. Therefore, we concentrate in this section on the influence of the number of antennas and of the condition of the channel. The capacity (20) is written as the sum of the capacities of several SISO subchannels, each of which is characterized by a singular value √λi of the MIMO channel. Thus, the length of the optimal training and data blocks may depend on the singular values of the actual channel. To investigate the influence of different channels, we use the following common Ricean model [15], [16] for the flat-fading MIMO channel matrix H

$$ H = \sqrt{\frac{1}{K+1}}\, H_{\mathrm{Ray}} + \sqrt{\frac{K}{K+1}}\, H_{\mathrm{spec}} \tag{26} $$

where HRay represents the Rayleigh (scattered) component. Its entries are independent complex-Gaussian random numbers with zero mean and unit variance. The matrix Hspec represents a specular (deterministic) component. All its entries are equal to one. All entries of H have unit variance. The Ricean factor K (or K-factor) is defined as the ratio of deterministic-to-scattered power. K = 0 is a purely Rayleigh fading channel, whereas K → ∞ models a line-of-sight channel. We used this model to generate, for a given K, a large number (10⁵) of random channel matrices H. For every single H, we determined the squares of the singular values λi. The means of the λi were then used in (25) as a representation of a MIMO channel with the given K-factor. The following results are obtained for "the average channel," but they are not average results for Ricean channels.

Initially, we consider MIMO channels with the same number of antennas at the Tx and Rx (NRx = NTx = N). Fig. 8 shows the optimal number of training and data symbols (top) and the maximal achievable rate (bottom) versus the number of antennas N for an SNR of 10 dB. We observe that the block lengths Lτ and Ld increase slightly with increasing N. Nevertheless, Lτ increases slightly faster than Ld, resulting in a decreasing spectral efficiency. It is also observed that the optimal lengths are nearly independent of the K-factor. The maximal rate scales nearly linearly with the number of antennas N but the slope becomes smaller as the Doppler spread increases as a result of the reduced spectral efficiency at higher N. The principal behavior is the same for different K-factors apart from the well-known fact that the capacity decreases with increasing K-factor.

Fig. 8. N × N MIMO channel with different K-factors. (a) Optimal number of training and data symbols. Graphs for different K-factors (K = 0, 5, 50) lie almost on top of each other. (b) Maximal achievable rate.

Now, we investigate the influence of receive diversity due to additional Rx antennas. Consider a MIMO system with four antennas at the Tx (NTx = 4). Using more Rx antennas (NRx > NTx) increases the receive signal power and the singular values of the channel. By (20), this enhances the capacity of the channel as well. But what influence does the surplus of Rx antennas have on the optimal block lengths and the achievable rate? The upper graph of Fig. 9 shows that the optimal block lengths are nearly independent of the number of Rx antennas. Therefore, the reduced spectral efficiency, due to the necessary channel training, is also independent of NRx. This results in a constantly smaller data rate (compared with a known channel) dependent on the Doppler spread of the channel as illustrated by Fig. 9(b). Again, the K-factor has almost no impact on the principal results.

Fig. 9. MIMO channel with four Tx antennas. (a) Optimal number of training and data symbols. The graphs for different K-factors (K = 0, 5, 50) lie almost on top of each other. (b) Maximal achievable rate.

VI. SUMMARY

We have investigated continuous flat-fading MIMO channels with training-based channel estimation. The continuously increasing difference between the actual channel coefficients and the previous channel estimate can be translated into an effective loss in signal-to-interference ratio. We found that this interference power increases almost quadratically with time and with the Doppler spread of the channel. Based on this interference model, the continuously increasing bit-error probability of MIMO systems after the channel estimation can be predicted very well. This was verified by link level simulations using impulse responses according to a well-established channel model as well as with impulse responses obtained by a MIMO channel-sounding campaign. The time interval after which the channel has to be estimated again in order to keep the BER below a desired threshold strongly depends on the actual system configuration such as modulation scheme, detection algorithm, and number of antennas.

In the second part, we used the proposed interference model for the analytical optimization of the achievable information rate by adjusting the lengths of the training and data phases. These optimal lengths depend not only on the Doppler spread of the channel but also on the noise power and on the number of transmit antennas in the MIMO system. The number of receive antennas and the actual channel condition, on the other hand, only have a negligible influence on the optimal block lengths. It was shown that the optimal length of the coherence interval increases as the SNR decreases whereas, unlike block fading channels, the ratio between the length of the training and data phases is almost independent of the SNR.

Altogether, the optimal block lengths depend on the performance measure used (e.g., bit-error probability, capacity). However, the simple time-dependent SINR model proposed in this paper has successfully been applied under different performance measures and so it may be a useful tool for the optimization of radio systems in continuous fading channels.

APPENDIX I

In this appendix, we give an upper bound for the estimation error due to the time-varying channel and compare it with the estimation error due to the noise. We assume that the receive signal during the training phase at the jth Rx antenna yj can be written as in (13) without the term vj,i · xi, which already models the estimation error. We assume further that orthogonal training sequences si = [si(0), si(1), . . . , si(Lτ − 1)]ᵀ of length Lτ and with siᴴ si = Lτ are transmitted at the same time over the Tx antennas i = 1, . . . , NTx. An estimate of the channel coefficient hj,k is obtained by an ML estimator from the Lτ consecutively received symbols yj(m); m = 0, . . . , Lτ − 1 by

$$ \hat{h}_{j,k} = \frac{1}{L_\tau} \sum_{m=0}^{L_\tau - 1} s_k^*(m)\, y_j(m) = h_{j,k} + \Delta h_{\mathrm{Doppler}} + \Delta h_{\mathrm{Noise}} \tag{27} $$

For the estimation errors ΔhDoppler and ΔhNoise due to the time variation of the channel and due to the thermal noise, respectively, we obtain

$$ \Delta h_{\mathrm{Noise}} = \frac{1}{L_\tau} \sum_{m=0}^{L_\tau - 1} s_k^*(m)\, n_j(m), \qquad \Delta h_{\mathrm{Doppler}} = \frac{1}{L_\tau} \sum_{i=1}^{N_{Tx}} \langle s_k, s_i \rangle_{j,i} \tag{28} $$

with

$$ \langle s_k, s_i \rangle_{j,i} = \sum_{m=0}^{L_\tau - 1} s_k^*(m)\, e_{j,i}(m \cdot T_S)\, s_i(m) \tag{29} $$

using the channel model and the notations of Section II. Note that we omit the subscripts j,k in the notation of ΔhDoppler and ΔhNoise and that ⟨·, ·⟩j,i is only a shorthand notation and does not denote a scalar product. Both errors are independent, and we can calculate the average powers separately. For ΔhNoise, we easily obtain, using (12) in the last step,

$$ \sigma^2_{\Delta \mathrm{Noise}} = E\left[ |\Delta h_{\mathrm{Noise}}|^2 \right] = \frac{1}{L_\tau}\, \sigma_n^2 = \frac{N_{Tx}}{\rho_\tau L_\tau}\, \sigma_h^2. \tag{30} $$
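The first equality in (30) can be checked with a short Monte Carlo experiment. The Python sketch below is a hedged illustration only: it assumes a static channel (so that ΔhDoppler vanishes), unit-modulus orthogonal training sequences, and circularly symmetric complex Gaussian noise of variance σn². The function names, the training sequences, and the channel values are hypothetical.

```python
import random


def estimate_channel(h, training, noise_var, rng):
    """Correlation estimator of (27) for one Rx antenna.

    h         -- true coefficients from each Tx antenna (assumed static here)
    training  -- one sequence per Tx antenna, mutually orthogonal, |s| = 1
    noise_var -- variance of the complex receiver noise
    """
    l_tau = len(training[0])
    sigma = (noise_var / 2.0) ** 0.5  # per real dimension
    received = []
    for m in range(l_tau):
        noise = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        signal = sum(h_i * s[m] for h_i, s in zip(h, training))
        received.append(signal + noise)
    return [sum(s[m].conjugate() * received[m] for m in range(l_tau)) / l_tau
            for s in training]


def noise_error_power(h, training, noise_var, trials=20000, seed=7):
    """Average |h_hat - h|^2 for the first Tx antenna over many noise draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        h_hat = estimate_channel(h, training, noise_var, rng)
        total += abs(h_hat[0] - h[0]) ** 2
    return total / trials


# Illustrative values: two Tx antennas, L_tau = 4 orthogonal training symbols.
TRAINING = [[1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j],
            [1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j]]
H_TRUE = [0.8 + 0.3j, -0.5 + 1.1j]
```

With `noise_error_power(H_TRUE, TRAINING, 0.5)` the measured error power should be close to σn²/Lτ = 0.5/4, up to Monte Carlo fluctuations.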

The calculation for ΔhDoppler is slightly more intricate

$$ \sigma^2_{\Delta \mathrm{Doppler}} = E\left[ |\Delta h_{\mathrm{Doppler}}|^2 \right] = \frac{1}{L_\tau^2} \sum_{i,l=1}^{N_{Tx}} E\left[ \langle s_k, s_i \rangle_{j,i}\, \langle s_k, s_l \rangle^*_{j,l} \right]. \tag{31} $$

The expectations E are always taken over several channel realizations. Because of the statistical independence of the different impulse responses and since E[ej,i(t)] = 0 for all t, we have E[⟨sk, si⟩j,i ⟨sk, sl⟩*j,l] = 0 for i ≠ l. In the case i = l, we obtain

$$ E\left[ \langle s_k, s_i \rangle_{j,i}\, \langle s_k, s_i \rangle^*_{j,i} \right] = \sum_{m,n=0}^{L_\tau - 1} s_k^*(m)\, s_k(n) \cdot E\left[ e_{j,i}(m T_S)\, e^*_{j,i}(n T_S) \right] \cdot s_i(m)\, s_i^*(n) \le E\left[ |e_{j,i}(L_\tau T_S)|^2 \right] \sum_{m,n=0}^{L_\tau - 1} s_k^*(m)\, s_k(n)\, s_i(m)\, s_i^*(n) $$

APPENDIX II

Assume that (23) has two solutions Td1 and Td2 with Td2 > Td1. Subtracting both (23) from one another gives

$$ T_\tau \left[ C(T_{d2}) - C(T_{d1}) \right] = \int_{T_{d1}}^{T_{d2}} C(t)\, dt + T_{d1} C(T_{d1}) - T_{d2} C(T_{d2}) = \left[ T_{d2} - T_{d1} \right] C(T_M) + T_{d1} C(T_{d1}) - T_{d2} C(T_{d2}) \tag{34} $$

where the second equation was obtained by the mean value theorem with Td1 ≤ TM ≤ Td2. The left-hand side of this equation is strictly negative (because C is strictly monotonically decreasing in t) whereas the right-hand side is nonnegative. This can be seen by replacing C(Td1) by the smaller value C(Td2)

$$ \left[ T_{d2} - T_{d1} \right] C(T_M) + T_{d1} C(T_{d1}) - T_{d2} C(T_{d2}) \ge \left[ T_{d2} - T_{d1} \right] \cdot \left[ C(T_M) - C(T_{d2}) \right] \ge 0 \tag{35} $$

where the last inequality holds because C is monotonically decreasing. Therefore, (34) is a contradiction for Td2 > Td1. This proves that (23) has no more than one solution. The second derivative of the rate R (22), with respect to Td at the local extremum (23), gives

$$ \frac{\partial^2 R}{\partial T_d^2} = \frac{1}{T_\tau + T_d}\, \frac{\partial C}{\partial T_d} $$

which is negative because C is strictly decreasing, so the extremum is a maximum.

ACKNOWLEDGMENT

The authors wish to thank W. Wirnitzer, D. Brückner and S. Warzügel from MEDAV GmbH for providing the MIMO channel sounder and for the assistance during the measurements.

REFERENCES

[1] M. Medard, “The effect upon channel capacity in wireless communications of perfect and imperfect knowledge of the channel,” IEEE Trans. Inf. Theory, vol. 46, no. 3, pp. 933–946, May 2000.
[2] A. Lapidoth and S. Shamai, “Fading channels: How perfect need ‘Perfect Side Information’ be?,” IEEE Trans. Inf. Theory, vol. 48, no. 3, pp. 1118–1134, May 2002.
[3] T. L. Marzetta, “BLAST training: Estimating channel characteristics for high capacity space–time wireless,” in Proc. 37th Annual Allerton Conf. Communications, Control, and Computing, Monticello, IL, 1999, pp. 958–966.
[4] B. Hassibi and B. M. Hochwald, “How much training is needed in multiple-antenna wireless links?,” IEEE Trans. Inf. Theory, vol. 49, no. 4, pp. 951–963, Apr. 2003.
[5] Q. Sun, D. C. Cox, H. C. Huang, and A. Lozano, “Estimation of continuous flat fading MIMO channels,” IEEE Trans. Wireless Commun., vol. 1, no. 4, pp. 549–553, Oct. 2002.
[6] J. Baltersee, G. Fock, and H. Meyr, “Achievable rate of MIMO channels with data-aided channel estimation and perfect interleaving,” IEEE J. Sel. Areas Commun., vol. 19, no. 12, pp. 2358–2368, Dec. 2001.
[7] L. Zheng and D. Tse, “Communication on the Grassmann manifold: A geometric approach to the noncoherent multiple-antenna channel,” IEEE Trans. Inf. Theory, vol. 48, no. 2, pp. 359–383, Feb. 2002.
[8] B. Sklar, “Rayleigh fading channels in mobile digital communication systems, Part 1: Characterization,” IEEE Commun. Mag., vol. 35, no. 9, pp. 136–146, Sep. 1997.
[9] W. Wirnitzer, D. Brückner, R. S. Thomä, G. Sommerkorn, and D. Hampicke, “Broadband vector channel sounder for MIMO channel measurements,” in Proc. IEE Seminar on MIMO Communication Systems, London, U.K., 2001, pp. 17/1–17/4.
[10] V. Jungnickel, V. Pohl, H. Nguyen, U. Krüger, T. Haustein, and C. von Helmolt, “High capacity antennas for MIMO radio systems,” in Proc. 5th Wireless Personal Multimedia Communications (WPMC), Honolulu, HI, 2002, pp. 407–411.
[11] G. D. Golden, G. J. Foschini, R. A. Valenzuela, and P. W. Wolniansky, “Detection algorithm and initial laboratory results using V-BLAST space–time communication architecture,” Electron. Lett., vol. 35, no. 1, pp. 14–16, Jan. 1999.
[12] V. Pohl, P. H. Nguyen, V. Jungnickel, and C. von Helmolt, “How often channel estimation is needed in MIMO systems,” in Proc. IEEE Global Communications (GLOBECOM), San Francisco, CA, 2003, pp. 814–818.
[13] J. G. Proakis, Digital Communications. New York: McGraw-Hill, 1983.
[14] E. Telatar, “Capacity of multi-antenna Gaussian channels,” Eur. Trans. Telecommun., vol. 10, no. 6, pp. 585–595, Nov. 1999.
[15] S. O. Rice, “Mathematical analysis of random noise,” Bell Syst. Tech. J., vol. 23, no. 3, pp. 282–332, Jul. 1944.
[16] S. O. Rice, “Mathematical analysis of random noise,” Bell Syst. Tech. J., vol. 24, no. 1, pp. 46–156, Jan. 1945.


Volker Pohl was born in Dresden, Germany, in 1972. He received the Dipl.Ing. degree in electrical engineering from the Technische Universität Berlin, Berlin, Germany, in 2000. He became a Certified Technician for Electromechanics with Mertik GmbH, Quedlinburg, Germany in 1992. In 1998, he joined the Heinrich-Hertz-Institut für Nachrichtentechnik (HHI), Berlin, Germany, as a Student Research Associate. Initially, he was involved in the development of electroluminescence color displays. Then he transferred to the broad-band mobile communications networks department, where his research was concerned with the development of wireless infrared systems, in particular infrared channel modeling. At present, he is a Research Associate at HHI. While working towards his Ph.D., he was concerned with the modeling and measurement of MIMO radio channels and equalization.

Phuc Hau Nguyen was born in Ben Tre, Vietnam, in 1972. He received the Dipl. Ing. degree in electrical engineering from the Technische Universität Berlin, Berlin, Germany, in 2004. He was a Student Employee in 2000 at the Research Institute for Open Communication Systems GMD Fokus, Berlin, Germany. In 2001, he joined the Broadband Mobile Communications Networks Department of the Heinrich-Hertz-Institute (HHI) for Telecommunication, Berlin, Germany as a Student Research Associate, where he developed a link-level simulation tool for MIMO radio channels and was involved in research on MIMO channel modeling. At present, he is with the Algorithms & Firmware Department, Philips Semiconductors GmbH, Nürnberg, Germany, where he is concerned with the development of UMTS Systems at DSP platforms.

Volker Jungnickel (M’00) was born in Großenhain, Germany, in 1964. He received the Dipl.Phys. and Dr.rer.nat. degrees in physics from Humboldt University, Berlin, Germany, in 1992 and 1995, respectively. In 1984, he became a Certified Optician with Carl Zeiss, Jena, Germany. He has been concerned with research on photoluminescence properties of quantum dots and the mechanism of electron–photon coupling in strongly confining zero-dimensional semiconductors. In 1995, he joined the ILS GmbH, Stansdorf, Germany, working in the field of medical laser technology. In 1997, he joined the Broadband Mobile Communication Networks Department, Heinrich-Hertz-Institut für Nachrichtentechnik, Berlin, Germany, where he developed a high-speed wireless system for indoor communication based on infrared. At present, his research is focused on multiple-input multiple-output radio systems for high-speed wireless communication. He has authored and coauthored about 40 conference and journal papers and holds two patents. Dr. Jungnickel is a member of the German Physical Society.

Clemens von Helmolt was born in Berlin, Germany, in 1952. He received the Dipl.Ing. and Dr. Ing. degrees in electrical engineering from the Technische Universität Berlin, Berlin, Germany, in 1979 and 1985, respectively. From 1979 to 1984, he was a Research Associate at the Institut für Hochfrequenztechnik, Technische Universität Berlin, where he worked in the field of acoustooptic interaction in optical strip waveguides. In 1984, he joined the Heinrich-Hertz-Institut für Nachrichtentechnik (HHI), Berlin, Germany, where he was engaged in research on LiNbO3 devices for coherent receivers until the end of 1987. From 1988 to 1998, he worked in the fields of optical frequency stabilization of optical heterodyne and WDM broadband communication systems, where he coordinated the HHI research activities related to the European projects “RACE-1010,” “RACE-2065,” and “ACT-084.” From 1996 to 2000, he was responsible for a national research project related to broad-band mobile indoor communication based on infrared. Since November 2000, he has been managing research projects related to MIMO techniques and systems for RF multielement antenna mobile communication. He has authored and coauthored more than 40 journal publications and conference presentations each, has made several contributions to books, and holds several European and US patents.


Continuous Flat-Fading MIMO Channels: Achievable Rate and Optimal Length of the Training and Data Phases Volker Pohl, Phuc Hau Nguyen, Volker Jungnickel, Member, IEEE, and Clemens von Helmolt

Abstract—In this paper, we propose a simple framework for evaluating the required repetition rate of channel estimation for multiple-input multiple-output (MIMO) systems in continuous slowly fading radio channels. An analytical formula for the interference due to the temporal variation of the channel coefficients is given and verified by link level simulations based on synthetic and measured impulse responses. The proposed interference model makes it possible to optimize the lengths of the training and data phases under different performance criteria. As an example, we investigate the bit-error rate performance of MIMO systems with zero-forcing detection and determine the time interval, after which the channel has to be estimated again in order to keep the error probability below a desired threshold. In the second part, the lengths of the training and data phases is optimized based on the information-theoretical capacity. It is shown that these lengths depend not only on the Doppler spread of the channel, but also on the antenna configuration and noise power. Index Terms—Continuous fading channels, estimation parameters, information rates, multiple-input multiple-output (MIMO) systems, wireless communications.

Manuscript received September 4, 2003; revised February 3, 2004; accepted June 1, 2004. The editor coordinating the review of this paper and approving it for publication is D. Gesbert. This work was supported in part by the German Federal Ministry of Education and Research (BMBF) in the HyEff project under Grant 01 BU 150. V. Pohl, V. Jungnickel, and C. von Helmolt are with the Fraunhofer Institute for Telecommunications–Heinrich-Hertz-Institut, 10587 Berlin, Germany (e-mail: [email protected]). P. H. Nguyen is with Philips Semiconductors GmbH, 90443 Nürnberg, Germany (e-mail: phuc_hau.nguyen@philips.com). Digital Object Identifier 10.1109/TWC.2005.850325

I. INTRODUCTION

TO RETRIEVE the transmitted signals in radio links with coherent detection, the channel coefficients have to be known. Often, they are obtained by separate channel estimation. Afterwards, a large data block is transmitted and reconstructed at the receiver based on the channel estimate. Slowly fading channels are often considered as time invariant over a coherence interval T (block fading channels). But in reality, the channel coefficients will change continuously after the initial estimation (continuous fading channels). Thus, the data are reconstructed based on invalid channel coefficients, resulting in a continuous increase of errors in the reconstructed data stream. Therefore, the channel estimation has to be repeated periodically. Estimating the channel too often will decrease the spectral efficiency and too little training, on the other hand, limits the achievable data rate by increasing the number of errors at the end of every

data block. Furthermore, the quality of the channel estimate depends on the length of the training sequences and affects the achievable data rate as well. The main objective of this paper is to optimize the lengths of the training and data blocks in order to obtain a maximal throughput. The time variation entails that even if the channel is perfectly known at the beginning of the transmission phase, it will become more and more unknown as time proceeds. Moreover, the estimation error implies that there is always some uncertainty about the channel state. The general effect of imperfect channel knowledge on the capacity was studied in [1] and [2]. Analysis of the optimal training length for block fading channels was done in [3] and [4], taking into account the estimation error. Continuous fading channels were considered in [5] and [6], allowing for the time variation during the data transmission as well. In [5], the training and data phases were optimized to obtain a maximal throughput for a given bit error rate (BER) in a Vertical Bell Laboratories Layered Space-Time (V-BLAST) system. This was essentially done by means of computer simulations. The achievable information-theoretical capacity of multiple-input multiple-output (MIMO) systems with perfect interleaver was considered in [6]. There, the data rate was optimized with respect to the optimal sampling period of the channel via Monte Carlo simulations. Noncoherent MIMO communication systems in fast-fading channels were investigated in [7]. Our approach is similar to [4], where the estimation error was related to a loss in the SNR. In this paper, we extend this effective SNR by a term reflecting the continuously increasing difference between the channel estimate and the actual channel coefficients. This model for the variance of the channel uncertainty allows us to use the methods of [1], [2] and [4] to assess the impact on the capacity. 
Compared to the common block fading-channel model [3], [4], the continuous fading approach optimizes the length of the whole coherence interval T as well. The paper is organized as follows. Starting from a model of the time-varying radio channel, we derive, in Section II, a formula for the signal-to-interference-and-noise ratio (SINR) as a function of elapsed time since the last channel estimation. Our simulation tool and the channel measurements are briefly described in Section III. The derived SINR model is used in Section IV to investigate the time dependence of the BER of MIMO systems with zero-forcing (ZF) detection. The theoretical results, based on the proposed effective SINR, are compared with link-level simulations based on measured impulse

1536-1276/$20.00 © 2005 IEEE


responses. In Section V, we use the effective SINR in conjunction with the information-theoretical capacity to optimize the lengths of the training and data phases, in order to obtain a maximal information rate. It is investigated how these lengths and the achievable rate depend on the Doppler spread, SNR, number of antennas, and channel condition.

II. CHANNEL MODEL AND INTERFERENCE

At the beginning, we derive a formula for the average signal-to-interference ratio (SIR) due to the change in the channel coefficients since the last channel estimation. In the second part of this section, this interference is incorporated into the flat-fading MIMO channel model.

A. SIR Due to Changing Channel Coefficients

The time-varying baseband impulse response between two antennas is modeled as

$$ h(t, \tau) = \sum_{i=1}^{S} \alpha_i \exp(j 2\pi f_i t)\, \delta(\tau - \tau_i) \tag{1} $$

where αi, τi and fi are the complex amplitude, the delay time, and the Doppler frequency of the ith multipath component, respectively, and δ(t) denotes the Kronecker-delta function which is nonzero only at t = 0. We assume that αi are independent complex random variables with zero mean and a variance of σi²(τi) dependent on the delay τi. The Doppler frequencies fi are modeled as independent random numbers uniformly distributed on the interval [−fD, fD], where fD denotes the maximum Doppler frequency (the Doppler spread of the channel). Such a flat Doppler spectrum is regarded as a good model for indoor channels (see, e.g., [8]). We consider only flat and slow-fading channels, i.e., we assume that all multipath components arrive at the receiver within the symbol duration TS and that TS ≪ 1/fD. The flat fading channel coefficient is thus

$$ h(t) = \sum_{i=1}^{S} \alpha_i \exp(j 2\pi f_i t) \tag{2} $$

which is a zero mean random variable with the variance

$$ \sigma_h^2 = E\left[ |h|^2 \right] = \sum_i \sigma_i^2. \tag{3} $$

Assume now that the channel is perfectly estimated at t = 0. (The inevitable estimation error is taken into account later, in Section II-B, as an additive noiselike term.) The difference between the actual h(t) and the measured h(t = 0) is then

$$ e(t) = h(t) - h(0) = \sum_i \alpha_i \left[ \exp(j 2\pi f_i t) - 1 \right]. \tag{4} $$

The error e(t) can be written as e(t) = Σ αi zi, with the random numbers zi = [exp(j2πfi t) − 1]. All zi are independent and identically distributed. The corresponding density function can be determined from the given distribution of the Doppler frequencies fi. In the following, only the expectation of |z|² is needed, which is found to be

$$ E\left[ |z|^2 \right](t) = 2 \left[ 1 - \mathrm{sinc}(2\pi f_D t) \right], \quad \text{for } f_D t \le \frac{1}{2} \tag{5} $$

with the function sinc defined by sinc(x) := sin(x)/x. For small times t, (5) can be approximated by

$$ E\left[ |z|^2 \right](t) \approx \frac{4\pi^2}{3} (f_D t)^2, \quad \text{for } f_D t \ll \frac{1}{2\pi}. \tag{6} $$
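The expectation (5) and its small-time approximation (6) are easy to check numerically. The following Python sketch is only an illustration: it draws Doppler frequencies uniformly from [−fD, fD] as assumed in the channel model, averages |z|² over the draws, and compares the result with both formulas. The function names and sample sizes are arbitrary choices.

```python
import cmath
import math
import random


def mean_sq_exact(fd_t):
    """E[|z|^2] from (5), with sinc(x) = sin(x)/x and x = 2*pi*fD*t."""
    x = 2.0 * math.pi * fd_t
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return 2.0 * (1.0 - sinc)


def mean_sq_approx(fd_t):
    """Quadratic small-time approximation (6)."""
    return (4.0 * math.pi ** 2 / 3.0) * fd_t ** 2


def mean_sq_monte_carlo(fd_t, samples=100000, seed=1):
    """Average |exp(j*2*pi*f*t) - 1|^2 with f*t uniform on [-fD*t, fD*t]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        f_t = rng.uniform(-fd_t, fd_t)
        total += abs(cmath.exp(2j * math.pi * f_t) - 1.0) ** 2
    return total / samples
```

For example, at fD t = 0.1 the simulated average should agree with `mean_sq_exact(0.1)` up to sampling noise, while `mean_sq_approx(0.1)` overestimates it by roughly 2%; at fD t = 0.01 the two formulas are nearly indistinguishable.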

The received signal $y$ at time $t$ can now be written as

$$y = hx + e(t)x \tag{7}$$

in which $x$ is the transmitted symbol and $h = h(0)$ is the measured channel coefficient. The average power $S$ of the signal component $h \cdot x$ at the receiver is then

$$S = E\left[|h|^2\right] \cdot E\left[|x|^2\right] = \sigma_h^2 P_x \tag{8}$$

where $P_x = E[|x|^2]$ is the average transmit power. For the average power $I$ of the interference $e(t)x$, we obtain

$$I = E\left[|e|^2\right] \cdot P_x = E\left[\sum_i |\alpha_i|^2 |z_i|^2\right] \cdot P_x = \sigma_h^2\, E\left[|z|^2\right] \cdot P_x \tag{9}$$

using the fact that $\alpha_i$ is zero mean, that $\alpha_i$ and $z_i$ are independent, and that $E[|z_i|^2] = E[|z|^2]$ for all $i$. The average SIR as a function of the Doppler spread $f_D$ and time $t$ is then

$$\mathrm{SIR}(f_D t) = \left(E\left[|z|^2\right]\right)^{-1} = \frac{1}{2}\left[1 - \mathrm{sinc}(2\pi f_D t)\right]^{-1} \approx \frac{3}{4\pi^2}\,\frac{1}{(f_D t)^2}. \tag{10}$$
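The SIR expression (10) can be illustrated with a small Monte Carlo experiment on the multipath model (2) and (4). The sketch below is a hedged example, not the authors' simulator: it assumes equal-variance complex Gaussian path amplitudes, and the number of paths, the number of trials, and all function names are hypothetical choices.

```python
import cmath
import math
import random


def sir_exact(fd_t):
    """SIR from (10) with the exact expectation (5); requires fd_t > 0."""
    x = 2.0 * math.pi * fd_t
    return 0.5 / (1.0 - math.sin(x) / x)


def sir_approx(fd_t):
    """Small-time form of (10): 3 / (4 pi^2 (fD t)^2)."""
    return 3.0 / (4.0 * math.pi ** 2 * fd_t ** 2)


def sir_simulated(fd_t, n_paths=20, n_trials=20_000, seed=7):
    """Estimate E|h(0)|^2 / E|e(t)|^2 for the sum-of-paths model (2), (4)."""
    rng = random.Random(seed)
    signal_power = 0.0
    interference_power = 0.0
    for _ in range(n_trials):
        h0 = 0j
        e = 0j
        for _ in range(n_paths):
            # Assumption: equal-variance complex Gaussian path amplitudes.
            alpha = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
            phase = 2.0 * math.pi * fd_t * rng.uniform(-1.0, 1.0)
            h0 += alpha
            e += alpha * (cmath.exp(1j * phase) - 1.0)
        signal_power += abs(h0) ** 2
        interference_power += abs(e) ** 2
    return signal_power / interference_power


def to_db(ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(ratio)
```

For example, at $f_D t = 0.01$ the approximation in (10) gives an SIR of about 760, i.e., roughly 28.8 dB, regardless of the transmit power.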

Thus, the interference power due to the outdated channel state information grows approximately quadratically with time (compare to [5], where a similar relation for the average SIR over the whole coherence block was found). Note that (10) does not depend on the signal power, so the SIR cannot be improved by simply increasing the transmit power.

The channel coefficient $h$, as well as the interference $e(t)$, depends on the multipath amplitudes $\alpha_i$. Nevertheless, both components can be regarded as independent. If we collect all $\alpha_i$ in a vector $\boldsymbol{\alpha} = [\alpha_1, \alpha_2, \ldots, \alpha_S]^T \in \mathbb{C}^S$, and all complex conjugate $z_i$ in a vector $\mathbf{z} = [z_1^*, z_2^*, \ldots, z_S^*]^T \in \mathbb{C}^S$, then $h$ and $e$ can be written as scalar products in $\mathbb{C}^S$: $h = \langle \boldsymbol{\alpha}, \boldsymbol{\iota} \rangle$ and $e = \langle \boldsymbol{\alpha}, \mathbf{z} \rangle$, where $\boldsymbol{\iota} = [1, 1, \ldots, 1]^T$ denotes a vector whose components are all one. So, for a given $\boldsymbol{\alpha}$ the random number $e$ is apparently independent of $h$. Thus, we can assume that $h$ is the fading channel coefficient and $e(t)x$ is an independent random interference with a certain distribution. This distribution can be determined from the given channel model. It is not Gaussian, in general. Therefore, the exact calculation of the BER and capacity may become very complicated. However, it is well known that the worst distribution for the capacity that the independent, additive interference can have is Gaussian [1], [4]. Therefore, we model this interference as additive Gaussian with the variance (9). In doing so, at least a worst case assessment of the real performance is obtained. Moreover, in MIMO channels (see the next section), the whole interference at each receive antenna is the sum of independent contributions with the same distribution, so by the central limit theorem it will tend towards a Gaussian as the number of transmit antennas increases.

B. MIMO-Channel Model

Consider a multiantenna system with $N_{Rx}$ and $N_{Tx}$ antennas at the receiver (Rx) and transmitter (Tx), respectively (denoted as an $N_{Rx} \times N_{Tx}$ system). We assume the following periodic transmission scheme. During a training phase, a different training sequence with $L_\tau \ge N_{Tx}$ symbols is transmitted from each Tx antenna to obtain an estimate of the flat fading-channel coefficients. The length of this training phase is $T_\tau = L_\tau \cdot T_S$. After the training, the data-transmission phase of length $T_d$ follows, during which $L_d = T_d/T_S$ data symbols are transmitted over each Tx antenna. These data are reconstructed at the Rx based on the estimated channel coefficients. The coherence interval¹ of the system is $T = T_\tau + T_d$. Due to the periodically incorporated training symbols, the spectral efficiency is reduced by the factor $L_d/(L_\tau + L_d)$. The transmission over the flat fading MIMO channel is modeled by

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{E}(t)\mathbf{x} + \mathbf{V}\mathbf{x} + \mathbf{n} \tag{11}$$

in which the column vectors $\mathbf{x}$ and $\mathbf{y}$ contain the $N_{Tx}$ transmitted and $N_{Rx}$ received complex symbols, respectively. The $N_{Rx} \times N_{Tx}$ matrix $\mathbf{H}$ contains the channel coefficients $h_{j,i}(t = 0)$ according to (2), estimated at $t = 0$. The additive white Gaussian noise (AWGN) is represented by the vector $\mathbf{n}$, which contains $N_{Rx}$ independent zero-mean complex Gaussian random variables with variance $\sigma_n^2$. The $N_{Rx} \times N_{Tx}$ matrix $\mathbf{V}$ describes the estimation errors. We assume that a maximum-likelihood (ML) estimator with orthogonal sequences is used. Then, the $N_{Rx} \cdot N_{Tx}$ ML estimates are independent and unbiased with variance [3]

$$\sigma_v^2 = \frac{N_{Tx}}{\rho_\tau L_\tau}\,\sigma_h^2 \tag{12}$$

in which ρτ is the average SNR at one Rx antenna during the training phase. The worst effect that the estimation error can have is to behave like AWGN [1]. Therefore, we model the entries vj,i of V as independent AWGN with variance (12). In so doing, a lower bound of the real performance is obtained. The interference due to the time-varying channel coefficients is taken into account by the matrix E(t), with independent entries ej,i (t) according to (4).

The signal at one Rx antenna $j = 1, \ldots, N_{Rx}$ can thus be written as

$$y_j = \sum_{i=1}^{N_{Tx}} \left[h_{j,i}(t = 0)\,x_i + e_{j,i}(t)\,x_i + v_{j,i}\,x_i\right] + n_j. \tag{13}$$

Since all $h_{j,i}(t)$, $e_{j,i}(t)$, and $v_{j,i}$ are independent, we obtain for the average signal and interference power at one Rx antenna $S = N_{Tx} \cdot E[|h|^2] \cdot P_x$ and $I = N_{Tx} \cdot E[|e|^2] \cdot P_x + N_{Tx} \cdot \sigma_v^2 \cdot P_x$, respectively. Neglecting the interference, the average SNR at one Rx antenna during the data transmission is

$$\rho_d = \frac{N_{Tx}\,\sigma_h^2\,P_x}{\sigma_n^2}. \tag{14}$$

Therewith, the average SINR at one Rx antenna becomes

$$\rho_{\mathrm{eff}} = \frac{S}{I + \sigma_n^2} = \left[\frac{1}{\rho_d} + \frac{N_{Tx}}{\rho_\tau L_\tau} + \frac{4\pi^2}{3}(f_D t)^2\right]^{-1} \tag{15}$$

using (12) and the approximation (6). Neglecting the first two terms in (15), which are due to the noise and the channel estimation errors, gives the SIR due to the time-varying channel alone. This SIR is independent of the antenna configuration. In particular, it is the same as for the single-input single-output (SISO) channel (10). In the derivation of (12), it is assumed that the channel is invariant during the channel training. Clearly, this is not true in real systems. The corresponding error may also be modeled by an equivalent SIR, which would yield one more term inside the brackets of (15), making the following calculation much more complex. Nevertheless, using the above model for the interference due to the time-varying channel (4), it can be shown (see Appendix I) that this additional error can be neglected as long as

$$f_D T_S < \frac{1}{2\pi}\sqrt{\frac{3}{\rho_\tau L_\tau^3}}. \tag{16}$$
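To make the roles of the three terms in (15) concrete, the following Python sketch evaluates the effective SINR and the bound in (16). It is a minimal illustration under stated assumptions: condition (16) is read here with a square root over $3/(\rho_\tau L_\tau^3)$, the SNR values are linear (not in dB), and the function names and example parameters are hypothetical.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from decibels to linear scale."""
    return 10.0 ** (value_db / 10.0)


def effective_sinr(fd_t, rho_d, rho_tau, l_tau, n_tx):
    """Effective SINR (15), using the quadratic approximation (6)."""
    noise_term = 1.0 / rho_d
    estimation_term = n_tx / (rho_tau * l_tau)
    doppler_term = 4.0 * math.pi ** 2 / 3.0 * fd_t ** 2
    return 1.0 / (noise_term + estimation_term + doppler_term)


def max_fd_ts_for_static_training(rho_tau, l_tau):
    """Right-hand side of condition (16), read with a square root."""
    return math.sqrt(3.0 / (rho_tau * l_tau ** 3)) / (2.0 * math.pi)


if __name__ == "__main__":
    snr = db_to_linear(20.0)  # assumption: rho_tau = rho_d = 20 dB
    for fd_t in (0.0, 0.01, 0.05):
        rho = effective_sinr(fd_t, snr, snr, l_tau=128, n_tx=6)
        print(fd_t, 10.0 * math.log10(rho))
    print(max_fd_ts_for_static_training(snr, 128))
```

With these example values the bound on $f_D T_S$ is about $1.9 \cdot 10^{-5}$, which shows how quickly a long training block tightens the requirement on the channel variation.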

If this condition is fulfilled, the estimation error due to the temporal variation of the channel is smaller than the estimation error due to the noise. Clearly, this condition also depends on the SNR during the training $\rho_\tau$ and on the training length $L_\tau$. In Section V, we will determine the optimal training length as a function of the SNR and the Doppler spread. It was checked that all results satisfy (16). Therefore, the assumption that this error can be neglected is adequate for our purposes. Before the effective SINR (15) is used to predict the BER in Section IV and to determine the optimal lengths of the training and data phases in Section V, we briefly describe our simulation tool.

¹ The coherence interval of the system must not be confused with the coherence time of the channel $\tau_c = 1/(2\pi f_D)$.

III. LINK-LEVEL SIMULATION

Data transmission over the MIMO channel was simulated using impulse responses obtained from an MIMO channel measurement campaign, as well as using time-varying impulse responses according to the model given in Section II.

A. Measurements

The measurements were done using a Medav RUSK ATM channel sounder adapted to MIMO [9]. It measures the radio channel between up to 16 Tx and 16 Rx antennas in a total time of about 1 ms. The channel sounder records the frequency response of every channel in the range of 5.14–5.26 GHz. From these frequency responses, we determined the impulse responses by inverse Fourier transform. The bandwidth of 120 MHz results in a time resolution of 8.3 ns. The time between two successive MIMO measurement snapshots was 10 ms. We used so-called “high-capacity antennas.” These antennas are characterized by good decorrelation between single antennas. For a detailed description of the measurement set-up and the antenna design, see [10]. The measurements were taken in a fully equipped 5 × 7 × 3 m³ laboratory. To obtain time-varying impulse responses, the Rx antenna array was held at a fixed position while the Tx array was moved at a velocity of about 0.3 m/s.

B. Rayleigh Channel Model

The time-varying impulse response between each pair of Tx and Rx antennas was modeled independently according to (1). An exponential power delay profile with a common decay time $\tau$ for all channels was assumed, i.e., the variances $\sigma_i^2(\tau_i)$ of the amplitudes $\alpha_i$ of the multipath components were chosen as $\sigma_i^2(\tau_i) \sim \exp(-\tau_i/\tau)$. In all simulations $\tau = 25$ ns was used, which is approximately the delay spread found in our measurements. We used the same time resolution for the model impulse responses as given by the measurements: 190 multipath components with equally spaced delays $\tau_i$, with $\Delta\tau = \tau_{i+1} - \tau_i = 8.3$ ns. Simulation results using this model were always averaged over at least 500 random channels.

C. Simulations

The symbol length $T_S$ in the simulated systems was chosen much larger than $\tau$ such that intersymbol interference caused no additional errors. At the Tx, the data were modulated using multilevel quadrature amplitude modulation (M-QAM). No pulse shaping was used. At the Rx, noise was added and the signals were sampled and reconstructed with an appropriate algorithm (e.g., ZF or V-BLAST [11]). At the end, the data were sent through an M-QAM decoder. No receive filter was employed, but it was checked that such an additional filter has a negligible influence on the results. The channel matrix was determined using training sequences of 128 symbols. Every inphase and quadrature component of these sequences consisted of a different Gold sequence. The channel coefficients were obtained by correlating the received signals with the known training sequences. The channel was estimated in intervals $T_M$ smaller than $1/f_D$. At the time instances in between, the data were reconstructed using the preceding channel estimation. At every simulated time instance, the actual channel was frozen

Fig. 1. Increasing BER after the channel estimation. Markers: simulation results using the channel model. Lines: theoretical expectation using the effective SINR (15).

and $10^4$ symbols were transmitted over each Tx antenna to get the instantaneous bit-error probability.

IV. BER ANALYSIS

If the error probability as a function of the SNR is known for a given communication system, the SINR function (15) can be inserted, and so the bit-error probability as a function of the elapsed time $t$ after the channel estimation is obtained. Fig. 1 shows BER simulations based on the channel model with $f_D = 10$ Hz for an 8 × 6 MIMO system with 4-QAM modulation. The simulations were performed with ZF and V-BLAST detection and for different values of the SNR. The lines show theoretical graphs: For BLAST, the effective SINR (15) was inserted into a BER(SNR) function obtained by a normal BER simulation, whereas for ZF, the known analytical result from [13] was used. We observe that the temporal growth of the BER due to the time-varying channel can be predicted very well by the derived effective SINR. Similarly good agreement was also found in other simulations with different Doppler frequencies, modulations, and antenna configurations [12]. At small times $t$, the BER is determined mainly by the noise. As time proceeds, the interference due to the temporal change of the channel coefficients slowly increases the BER. If the interference becomes larger than the noise, it dominates the BER performance. The channel estimation has to be repeated if the BER reaches a certain threshold $\mathrm{BER}_0$. So, it becomes clear that detection schemes which need a lower SNR to obtain a certain BER have to estimate the channel less frequently than schemes which need a higher SNR. For instance, if we demand that the average BER should remain less than $10^{-3}$ in Fig. 1, the ZF system with an SNR of 20 dB has to estimate the channel in time intervals of $f_D t \approx 0.025$, whereas the corresponding interval for the V-BLAST system is $f_D t \approx 0.058$.

In the following, we investigate MIMO systems with M-QAM modulation and ZF detection using this method. Since we are mainly interested in the influence of the changing channel coefficients on the BER performance, we neglect the noise and the estimation error in (15) and use only the SIR (10) in the following analysis. We ask for the necessary time interval $t_0$ between the channel estimations such that the BER due to the changing channel coefficients remains less than a given threshold $\mathrm{BER}_0$ of $10^{-5}$. We ascertain how $t_0$ depends on the diversity and the modulation level used. For MIMO systems with ZF detection operating in Rayleigh-fading channels, the BER as a function of the average SNR is known analytically [13] (at least for BPSK and QPSK modulation). Inserting (10) in this formula, we obtain the BER as a function of time

$$\mathrm{BER} = \left(\frac{1-\mu}{2}\right)^{L} \sum_{k=0}^{L-1} \binom{L-1+k}{k} \left(\frac{1+\mu}{2}\right)^{k} \tag{17}$$

Fig. 2. Influence of the modulation on the time behavior of the BER. Markers: simulation results using the channel model with fD = 5 Hz. Lines: theoretical curves (17).

with

$$\mu = \left\{1 + M N_{Tx}\left[1 - \mathrm{sinc}(2\pi f_D t)\right]\right\}^{-1/2} \tag{17a}$$

where $L = N_{Rx} - N_{Tx} + 1$ is the diversity order of the system and $M$ is a constant factor dependent on the modulation scheme, reflecting the fact that higher-order modulations need a better SNR for the same error probability. A good rule of thumb is $M = 2^k$, where $k$ is the number of bits per symbol. For BPSK (2-QAM) and 4-QAM, this rule is exact, i.e., $M_{\mathrm{BPSK}} = 2$ and $M_{\mathrm{4\text{-}QAM}} = 4$, respectively. For $f_D t \ll 1/(2\pi)$, (17) can be approximated by

$$\mathrm{BER}(f_D t) \approx \binom{2L-1}{L}\left(\frac{\pi^2}{6}\, M N_{Tx}\right)^{L} (f_D t)^{2L} \tag{18}$$

using (6) and the approximation for the BER given in [13]. So, it appears that the error rate initially increases with the $2L$th power of time after the channel estimation. Equation (18) shows how the BER depends, in principle, on the Doppler spread $f_D$, on the modulation level $M$, and on the diversity order $L$.

A. Dependency on Doppler Spread

According to (17) and (18), the Doppler spread $f_D$ scales the time axis $t$. So, all graphs can be plotted over the normalized time $f_D t$. Fig. 1 already showed that the theoretical formula (17) fits the results from the simulations very well over the whole time range.

B. Dependency on Modulation

Fig. 2 shows simulation results (based on the channel model with $f_D = 5$ Hz) for an 8 × 6 MIMO system with different modulations and the corresponding theoretical curves (17). Again, the simulations agree well with the theoretical graphs. We see that all curves have the same slope independent of the modulation. This slope is determined by the exponent $2L$ of the time $t$ in (18). The graphs for higher modulation schemes are shifted towards higher BER values due to the factor $M^L$ in (18). Consequently, the time instance $t_0$ when $\mathrm{BER}(t)$ reaches the threshold $\mathrm{BER}_0$ depends on $M$. From (18), we read: $f_D t_0 \sim 1/\sqrt{M}$. Thus, from the time $t_0$ for one modulation $M_1$, the corresponding time $t_0$ for any other modulation $M_2$ can be determined by

$$t_0\big|_{M_2} = \sqrt{\frac{M_1}{M_2}} \cdot t_0\big|_{M_1}. \tag{19}$$

From Fig. 2, for instance, we obtain $f_D t_0 \approx 0.023$ for BPSK modulation ($M = 2$) and for $\mathrm{BER}_0 = 10^{-5}$. But the curve for 64-QAM ($M = 64$) reaches $\mathrm{BER}_0$ already at $f_D t_0 \approx \sqrt{2/64} \cdot 0.023 \approx 0.004$. Generally, the channel has to be estimated more frequently if a higher modulation scheme is used. In Fig. 3, results from link level simulations based on measured time-varying impulse responses are plotted. Simulations were performed over 1 s total time, and channel estimation was done every 100 ms. The resulting ten $\mathrm{BER}(t)$ curves were averaged. The inset of Fig. 3 shows the average Doppler spectrum of the measurement data during this time. From this spectrum, we read the maximum Doppler frequency $f_D$ of about 5 Hz. For comparison, theoretical curves with $f_D = 5$ Hz are also plotted in Fig. 3. We see that the simulated curves agree very well with the theoretical expectation, even though the Doppler frequencies of the measured channel are not uniformly distributed, as assumed in the theoretical derivation, and although only ten channels are averaged.

Fig. 3. Dependency on modulation: simulations with measured impulse responses. The inset shows the Doppler power spectrum of the measurement data.
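The BER expressions (17) and (18) and the scaling rule (19) are straightforward to evaluate numerically. The following Python sketch is an illustration under the interference-limited assumption of this section (noise and estimation error neglected); the function names and the bisection bounds are hypothetical choices for this example.

```python
import math


def sinc(x):
    """sin(x)/x with the limit 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def ber_zf(fd_t, n_rx, n_tx, m):
    """BER (17) with mu from (17a); interference-limited, no noise."""
    order = n_rx - n_tx + 1  # diversity order L
    mu = (1.0 + m * n_tx * (1.0 - sinc(2.0 * math.pi * fd_t))) ** -0.5
    series = sum(math.comb(order - 1 + k, k) * ((1.0 + mu) / 2.0) ** k
                 for k in range(order))
    return ((1.0 - mu) / 2.0) ** order * series


def ber_zf_approx(fd_t, n_rx, n_tx, m):
    """Small-time approximation (18)."""
    order = n_rx - n_tx + 1
    return (math.comb(2 * order - 1, order)
            * (math.pi ** 2 / 6.0 * m * n_tx) ** order
            * fd_t ** (2 * order))


def normalized_t0(ber0, n_rx, n_tx, m, lo=1e-6, hi=0.5):
    """Largest fD*t with BER (17) below ber0, by bisection on a log scale."""
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        if ber_zf(mid, n_rx, n_tx, m) < ber0:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    t0_bpsk = normalized_t0(1e-5, 8, 6, 2)
    t0_64qam = normalized_t0(1e-5, 8, 6, 64)
    print(t0_bpsk, t0_64qam, math.sqrt(2.0 / 64.0) * t0_bpsk)
```

For the 8 × 6 system with BPSK, evaluating (17) at $f_D t = 0.023$ gives a BER close to $10^{-5}$, in line with the value read from Fig. 2, and the 64-QAM interval obtained by bisection agrees closely with the prediction of (19).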

Fig. 4. Dependency on antenna configuration: link-level simulations with measured impulse responses (markers), theoretical curves (17) (solid lines), and approximations (18) (dashed lines).

C. Dependency on Antenna Configuration

It is well known that spatial diversity improves the performance of radio systems. Thus, we expect that the time interval $t_0$ is larger for systems with a higher diversity order $L$. Our expectation is confirmed by Fig. 4, in which the BER curves for different antenna configurations are shown. The markers indicate results obtained by link level simulation with the measured impulse responses already used in Fig. 3. The solid lines represent (17) with $f_D = 5$ Hz and the dashed lines are the corresponding approximations (18). The slopes of the curves in Fig. 4 are determined by the exponent $2L$ of the time $t$ in (18): The BER increases by $2L$ decades if the time increases by one decade. Nevertheless, the BER is smaller for higher values of $L$, in general, since $f_D t \ll 1$. From (18), it is clear that the BER depends on the number of Tx antennas as well. For a fixed diversity order $L$, the BER increases proportionally to the $L$th power of $N_{Tx}$. So, the necessary time interval scales as $f_D t_0 \sim 1/\sqrt{N_{Tx}}$. The larger $N_{Tx}$, the more frequently the channel has to be estimated. All BER simulation results show a very good agreement with the theoretical curves based on the effective SINR (15). Hence, the proposed model for the interference due to the changing channel coefficients holds very well. Slight discrepancies are only found at small $t$ (see Fig. 2), possibly due to the non-Gaussian distribution of the interference $e(t)$. Since $f_D t$ is limited to a finite region, the error $e(t)$ is also limited to a small range [see (4)]. For example, if this range is smaller than half of the distance between two QAM symbols, no error will appear. This could explain why the BER found in the simulations without noise at small $t$ is slightly less than predicted by the model with the Gaussian distributed interference.

The above analysis of the BER gives a good estimate of the time interval at which the channel has to be measured again to keep the BER below a certain threshold. However, we have neglected the influence of the noise and the channel estimation errors. The time dependent BER curves start at a certain level $\mathrm{BER}(t = 0)$, which is determined by the SNR and the estimation errors. Using a longer estimation phase will reduce the starting level $\mathrm{BER}(t = 0)$, and so it takes a bit longer to reach the threshold $\mathrm{BER}_0$ (cf. Fig. 1). But if the corresponding gain in data transmission time is compensated by the longer training phase, the overall throughput may not be increased. Probably, there is an optimal length for the training block. Of course, the influence of the training length could be included in the BER analysis, but then all results would again be valid only for a special detection scheme (like ZF above). Therefore, we will investigate the optimal training and data lengths based on the information-theoretical capacity in Section V.

V. CAPACITY ANALYSIS

We insert the effective SINR $\rho_{\mathrm{eff}}(t)$ (15) into the capacity formula for MIMO systems with channel state information (CSI) at the Rx and with additive white Gaussian noise [14]

$$C(L_\tau, t) = \sum_i \log_2\left(1 + \frac{\rho_{\mathrm{eff}}(L_\tau, t)}{N_{Tx}}\,\lambda_i\right) \tag{20}$$

where $\lambda_i$ are the squares of the singular values of the channel matrix $\mathbf{H}$. Due to $\rho_{\mathrm{eff}}(t)$, the capacity depends on the time $t$ after the last channel estimation, on the number of training symbols $L_\tau$, on the Doppler spread $f_D$, on the SNR during the estimation $\rho_\tau$, and on the SNR during the data transmission $\rho_d$. In general, the powers used during the training and data phases have to be optimized as well. The optimization of $\rho_\tau$ and $\rho_d$ for block fading channels was done in [4]. It was found that it is better to spend as little time as possible on training but to use more power during the training. A similar analysis could be applied here as well, but since the length of the coherence interval is not known at the beginning and since the effective SINR depends on the time as well, this analysis becomes slightly more involved and would go beyond the scope of this paper. Moreover, in practical systems [11], it may be more complicated to change the power between the training and data phases, since this places higher requirements on the linearity of the components. Therefore, we assume throughout this paper that $\rho_\tau = \rho_d$. Due to the increasing difference between the actual channel coefficients and the CSI obtained by the previous estimation, the capacity (20) decreases monotonically with time. The total amount of information which can be transmitted during the data transmission phase is

$$D(L_\tau, T_d) = \int_0^{T_d} C(L_\tau, t)\,dt. \tag{21}$$

But due to the channel estimation, spectral efficiency is lost such that the average information rate is only

$$R(L_\tau, T_d) = \frac{D(L_\tau, T_d)}{T_\tau + T_d}. \tag{22}$$
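The chain from (20) through (21) to (22) can be sketched numerically as follows. This is a minimal illustration, not the authors' code: time is measured in symbol periods (so that $f_D t = f_D T_S \cdot n$), the defaults describe a SISO channel with $\lambda_1 = 1$, the assumption $\rho_\tau = \rho_d$ is taken from the text, and the function names and the trapezoidal step count are hypothetical.

```python
import math


def capacity(n_sym, l_tau, fd_ts, snr, eigenvalues=(1.0,), n_tx=1):
    """Capacity (20) in bit/s/Hz, n_sym symbol periods after training.

    Uses rho_eff from (15) with approximation (6) and rho_tau = rho_d = snr.
    """
    inverse_sinr = (1.0 / snr + n_tx / (snr * l_tau)
                    + 4.0 * math.pi ** 2 / 3.0 * (fd_ts * n_sym) ** 2)
    rho_eff = 1.0 / inverse_sinr
    return sum(math.log2(1.0 + rho_eff * lam / n_tx) for lam in eigenvalues)


def average_rate(l_tau, l_d, fd_ts, snr, steps=2000):
    """Rate (22): trapezoidal estimate of D (21) divided by L_tau + L_d."""
    step = l_d / steps
    total = 0.5 * (capacity(0.0, l_tau, fd_ts, snr)
                   + capacity(l_d, l_tau, fd_ts, snr))
    for i in range(1, steps):
        total += capacity(i * step, l_tau, fd_ts, snr)
    information = total * step  # D, with time counted in symbol periods
    return information / (l_tau + l_d)


if __name__ == "__main__":
    for l_d in (100, 300, 1000, 3000):
        print(l_d, average_rate(10, l_d, 1e-4, 100.0))
```

In a time-invariant channel ($f_D T_S = 0$) the rate reduces to the initial capacity scaled by $L_d/(L_\tau + L_d)$, whereas for $f_D T_S > 0$ the rate first grows and then falls as $L_d$ increases.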

In the following, we determine the optimal lengths of the training and data phases such that $R$ is maximized. Optimization of the training length for block fading channels can be found in [3] and [4]. In such channels, it is assumed that the length of the coherence interval $T$ is given such that the question is only how much of $T$ should be spent on training. In continuous fading channels, we do not have this restriction of a fixed coherence interval, i.e., the training and data blocks can have any desired length, at least in principle. As a result of this, we will see that the optimal coherence interval $T$ also depends on the SNR and on the antenna configuration.

A. Optimization Over $T_d$

First, we assume that the number of training symbols $L_\tau$ is given. To find the optimal data length $T_d$, the first derivative of $R$ (22) with respect to $T_d$ is set to zero. This yields a quite intuitive equation for the optimal $T_d(L_\tau)$

$$(T_\tau + T_d) \cdot C(L_\tau, T_d) = D(L_\tau, T_d). \tag{23}$$

It is easily shown (see Appendix II) that the so-obtained $T_d$ is unique and that it really maximizes $R$. The maximal rate is obtained by inserting (23) into (22)

$$R_{\max}(L_\tau) = C\left(L_\tau, T_d(L_\tau)\right) \tag{24}$$

which is still a function of the training length $L_\tau$. It says that at the end of the optimal data phase, the instantaneous capacity is equal to the rate of the whole coherence interval. So the loss of spectral efficiency due to the training phase is compensated by the higher capacity at the beginning of the data block. To reach this $R_{\max}$, the data rate would have to be changed continuously during the data transmission phase. Writing the equation for the optimal $T_d$ (23) in detail gives

$$\frac{1}{2}\, k \cdot f_D T_S \cdot L_\tau \sum_{i=1}^{N_\lambda} \ln\left(1 + \frac{\rho_{\mathrm{eff}}(L_\tau, T_d)}{N_{Tx}}\,\lambda_i\right) = \sum_{i=1}^{N_\lambda} a_i \arctan\left(k\,\frac{f_D T_d}{a_i}\right) - N_\lambda\, b \arctan\left(k\,\frac{f_D T_d}{b}\right) \tag{25}$$

with

$$a_i = \sqrt{\frac{1}{\rho_d} + \frac{N_{Tx}}{\rho_\tau L_\tau} + \frac{\lambda_i}{N_{Tx}}} \quad \text{and} \quad b = \sqrt{\frac{1}{\rho_d} + \frac{N_{Tx}}{\rho_\tau L_\tau}} \tag{25a}$$

and $k = 2\pi/\sqrt{3}$. $N_\lambda$ denotes the number of singular values, i.e., the rank of the channel. Equation (25) can be solved only numerically. The solution depends on the SNR, the normalized Doppler spread $f_D T_S$, and the singular values $\sqrt{\lambda_i}$ of the channel. Fig. 5(a) shows some solutions of (25) for an SISO channel ($\lambda_1 = 1$), where $T_d$ was converted into “number of symbols” by $L_d = T_d/T_S$. In Fig. 5(b), the corresponding maximal rate (24) is plotted over the training length $L_\tau$. Clearly, the optimal number of data symbols increases with the number of training symbols to maintain a high spectral efficiency. Also, it is intuitively clear that in slower fading channels, the data phase can be longer than in fast-fading channels such that the graphs for different Doppler spreads are merely shifted vertically. At small times $t$, $\rho_{\mathrm{eff}}(t)$ is determined mainly by the SNR $\rho_d$. As time proceeds, $\rho_{\mathrm{eff}}(t)$ is more and more dominated by the interference due to the time-varying channel, which yields a decrease in capacity. Clearly, at high SNR this interference dominates $\rho_{\mathrm{eff}}(t)$ earlier than at lower SNR. Therefore, the optimal data phase is shorter at higher SNR. Fig. 5(b) shows that the achievable rate is still a function of the training length $L_\tau$. Apparently, there is an optimal training length at which the rate becomes maximal. It depends on the SNR and on the normalized Doppler spread.

Fig. 5. Optimal number of (a) data symbols and (b) achievable rate for a continuous fading SISO channel versus the number of training symbols.
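A numerical solution of (23) for the SISO case ($\lambda_1 = 1$, $N_{Tx} = 1$) can be sketched as below. The sketch is an illustration under stated assumptions rather than the procedure actually used for Fig. 5: it reads $a_i$ and $b$ in (25a) as square roots, uses the closed-form integral of (20) that leads to (25), takes $\rho_\tau = \rho_d$, requires $f_D T_S > 0$, and solves by simple bisection. All function names and the scan limit `max_l_tau` are hypothetical.

```python
import math

K = 2.0 * math.pi / math.sqrt(3.0)  # constant k from (25a)


def _a_b(l_tau, snr):
    """a_1 and b from (25a) for SISO (lambda_1 = 1, N_Tx = 1)."""
    b_sq = 1.0 / snr + 1.0 / (snr * l_tau)
    return math.sqrt(b_sq + 1.0), math.sqrt(b_sq)


def capacity(l_d, l_tau, fd_ts, snr):
    """SISO capacity (20), l_d symbol periods after training."""
    a, b = _a_b(l_tau, snr)
    x_sq = (K * fd_ts * l_d) ** 2
    return math.log2((a * a + x_sq) / (b * b + x_sq))


def information(l_d, l_tau, fd_ts, snr):
    """Closed-form integral (21) of the SISO capacity, time in symbols."""
    a, b = _a_b(l_tau, snr)
    c = K * fd_ts
    x = c * l_d
    nats = (l_d * math.log((a * a + x * x) / (b * b + x * x))
            + 2.0 / c * (a * math.atan(x / a) - b * math.atan(x / b)))
    return nats / math.log(2.0)


def optimal_data_length(l_tau, fd_ts, snr):
    """Solve (23), (L_tau + L_d) C = D, for L_d by bisection."""
    def gap(l_d):
        return ((l_tau + l_d) * capacity(l_d, l_tau, fd_ts, snr)
                - information(l_d, l_tau, fd_ts, snr))
    lo, hi = 0.0, 1.0
    while gap(hi) > 0.0:  # gap decreases monotonically in l_d
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def max_rate(l_tau, fd_ts, snr):
    """Optimal L_d and the resulting rate (22), which equals (24)."""
    l_d = optimal_data_length(l_tau, fd_ts, snr)
    return l_d, information(l_d, l_tau, fd_ts, snr) / (l_tau + l_d)


def best_training_length(fd_ts, snr, max_l_tau=64):
    """Scan integer L_tau and keep the one with the largest R_max (24)."""
    best = None
    for l_tau in range(1, max_l_tau + 1):
        l_d, rate = max_rate(l_tau, fd_ts, snr)
        if best is None or rate > best[2]:
            best = (l_tau, l_d, rate)
    return best
```

Bisection is sufficient here because the left-hand side minus the right-hand side of (23) decreases monotonically in $T_d$, which is the same property that makes the solution unique. Scanning `max_rate` over integer training lengths, as in `best_training_length`, is one simple way to locate the training length that maximizes (24).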

Fig. 6. (a) Optimal number of training and data symbols and (b) the maximal achievable rate as function of the SNR for different Doppler spreads.

B. Optimization Over $L_\tau$

In order to optimize the maximal information rate with respect to the training length $L_\tau$, (24) must be differentiated with respect to $L_\tau$ and set to zero. Since the optimal $T_d(L_\tau)$ is itself given only by the implicit function (23), this yields complicated implicit expressions for the optimal $L_\tau$. Therefore, we calculated the optimal length $L_d$ and the maximal rate $R_{\max}(L_\tau)$ as functions of $L_\tau$ as in Section V-A. From those results, the optimal training length $L_\tau$ which maximizes $R_{\max}(L_\tau)$, the corresponding optimal data length $L_d$, and the maximal rate $R_{\max}$ were extracted. First, we again consider only SISO channels. The top graph of Fig. 6 shows the optimal block lengths $L_d$ and $L_\tau$ as a function of the SNR. In the bottom diagram, the corresponding maximal achievable rates are shown. As explained above, the optimal number of training and data symbols decreases as the SNR increases. Both lengths are reduced almost equally as the SNR grows, such that the ratio $L_d/(L_\tau + L_d)$ is nearly independent of the SNR. Note that at higher SNR values, the optimal training length becomes one, especially in fast fading channels. The training length cannot be reduced further, whereas the optimal data length continues to decrease as the SNR increases (note that in MIMO systems the minimal number of training symbols would be $N_{Tx}$). Therefore, the spectral efficiency becomes worse at high SNR in fast-fading channels. The impact of this behavior on the achievable rate can be studied in the bottom graph of Fig. 6: In a time-invariant channel, the rate increases linearly with the logarithm of the SNR. If the channel is varying in time, the slope decreases due to the loss in spectral efficiency caused by the periodic channel estimation. At some SNR value, the achievable rate begins to saturate due to the further reduced spectral efficiency because the training length has reached its minimal value. Therefore, in time-varying channels, there is an upper bound on the achievable rate even if the SNR approaches infinity. The graphs indicate that for fast-fading channels ($f_D T_S \gtrsim 10^{-2}$), other transmission schemes, which account for the temporal variations during the data transmission, are necessary to reach a satisfactory capacity. The upper graph of Fig. 7 shows how the optimal block lengths depend on the normalized Doppler spread $f_D T_S$ of the channel. As $f_D T_S$ increases, the optimal training and

Fig. 7. (a) Optimal number of training and data symbols and (b) maximal achievable rate for a SISO channel versus the normalized Doppler spread for different SNRs.

data lengths are reduced.² But the number of data symbols $L_d$ decreases faster than the number of training symbols $L_\tau$ such that the ratio $L_d/(L_\tau + L_d)$ becomes worse as $f_D T_S$ increases. Therefore, the maximal rate, which can be transmitted over a time-varying channel, becomes smaller as the channel fades faster. This is illustrated in Fig. 7(b). Again, we observe that at high Doppler spreads, the optimal training length becomes one, resulting in an even faster decrease in spectral efficiency as $f_D T_S$ is increased further. Consequently, the achievable rate falls faster beyond this point (cf. Fig. 7). Nevertheless, the relative loss of rate at low Doppler frequencies is almost independent of the SNR: e.g., the rate at SNR = 40 dB drops to 90% of its maximal value at about $f_D T_S \approx 1 \cdot 10^{-4}$, whereas the rate at SNR = 0 dB reaches 90% of its maximum at $f_D T_S \approx 3 \cdot 10^{-4}$.

Note that the results here are somewhat different from the findings for block fading channels with fixed coherence interval $T = T_\tau + T_d$ given in [4]. There, it was found that at sufficiently high SNR, the optimal length of the training phase becomes minimal ($L_\tau = N_{Tx}$) but as the SNR decreases, an increasing fraction of $T$ has to be used for the training. If finally the SNR becomes sufficiently low, it will be optimal to spend half of $T$ ($T_\tau = T/2$) for training such that 50% of the capacity is lost compared to the case where the Rx knows the channel. Our results show that the optimal coherence interval $T = T_\tau + T_d$ also depends on the SNR. Both the optimal $T_\tau$ and $T_d$ increase as the SNR decreases. Therefore, the coherence interval $T$ increases as the SNR decreases, whereas the ratio between the training and data phases remains nearly independent of the SNR (Fig. 6). This ratio is mainly determined by the Doppler spread of the channel (Fig. 7). Therefore, we have (unlike in block fading channels) essentially no loss in the spectral efficiency as the SNR becomes low. For example, we consider the results in Fig. 6 for $f_D T_S = 1 \cdot 10^{-5}$: At SNR = 20 dB, the number of symbols in one coherence block is $L = L_\tau + L_d = 949$, whereas at SNR = 0 dB, the coherence interval is roughly five times longer ($L = 4567$). The fraction $L_d/L$, on the other hand, is almost independent of the SNR (98.5% at SNR = 20 dB and 98.7% at 0 dB).

² Interestingly, even though the optimal $L_\tau$ and $L_d$ decrease, the block lengths $T_d = L_d T_S$ and $T_\tau = L_\tau T_S$ increase as $f_D T_S$ rises.

C. MIMO Channels

Up to now, we considered only SISO channels and investigated the influence of the SNR and the Doppler spread on the achievable rate and the optimal length of the training and data blocks. For MIMO channels, the dependence on the SNR and the Doppler spread is the same, in principle. Therefore, we concentrate in this section on the influence of the number of antennas and of the condition of the channel. The capacity (20) is written as the sum of the capacities of several SISO subchannels, each of which is characterized by a singular value $\sqrt{\lambda_i}$ of the MIMO channel. Thus, the length of the optimal training and data blocks may depend on the singular values of the actual channel. To investigate the influence of different channels, we use the following common Ricean model [15], [16] for the flat-fading MIMO channel matrix $\mathbf{H}$

$$\mathbf{H} = \sqrt{\frac{1}{K+1}}\,\mathbf{H}_{\mathrm{Ray}} + \sqrt{\frac{K}{K+1}}\,\mathbf{H}_{\mathrm{spec}} \tag{26}$$

where $\mathbf{H}_{\mathrm{Ray}}$ represents the Rayleigh (scattered) component. Its entries are independent complex Gaussian random numbers with zero mean and unit variance. The matrix $\mathbf{H}_{\mathrm{spec}}$ represents a specular (deterministic) component. All its entries are equal to one. All entries of $\mathbf{H}$ have unit variance. The Ricean factor $K$ (or $K$-factor) is defined as the ratio of deterministic-to-scattered power. $K = 0$ is a purely Rayleigh fading channel, whereas $K \to \infty$ models a line-of-sight channel. We used this model to generate, for a given $K$, a large number ($10^5$) of random channel matrices $\mathbf{H}$. For every single $\mathbf{H}$, we determined the squares of the singular values $\lambda_i$. The means of the $\lambda_i$ were then used in (25) as a representation of an MIMO channel with the given $K$-factor. The following results are obtained for “the average channel,” but they are not average results for Ricean channels. Initially, we consider MIMO channels with the same number of antennas at the Tx and Rx ($N_{Rx} = N_{Tx} = N$). Fig. 8 shows the optimal number of training and data symbols (top) and the maximal achievable rate (bottom) versus the number of antennas $N$ for an SNR of 10 dB. We observe that the block lengths $L_\tau$ and $L_d$ increase slightly with increasing $N$. Nevertheless, $L_\tau$ increases a little bit faster than $L_d$ resulting in a decreasing

Fig. 8. N × N MIMO channel with different K-factors. (a) Optimal number of training and data symbols. Graphs for different K-factors (K = 0, 5, 50) lie almost on top of each other. (b) Maximal achievable rate.

spectral efficiency. It is also observed that the optimal lengths are nearly independent of the K-factor. The maximal rate scales nearly linearly with the number of antennas N but the slope becomes smaller as the Doppler spread increases as a result of the reduced spectral efficiency at higher N . The principal behavior is the same for different K-factors apart from the well known fact that the capacity decreases with increasing K-factor. Now, we investigate the influence of receive diversity due to additional Rx antennas. Consider an MIMO system with four antennas at the Tx (NTx = 4). Using more Rx antennas (NRx > NTx ) increases the receive signal power and the singular values of the channel. By (20), this enhances the capacity of the channel as well. But what influence has the surplus of Rx antennas on the optimal block lengths and the achievable rate? The upper graph of Fig. 9 shows that the optimal block lengths are nearly independent of the number of Rx antennas. Therefore, the reduced spectral efficiency, due to the necessary channel training, is also independent on NRx . This results in a constantly smaller data rate (compared with a known channel) dependent on the Doppler spread of the channel as illustrated by Fig. 9(b). Again, the K-factor has almost no impact on the principal results. VI. S UMMARY We have investigated continuous flat-fading MIMO channels with training based channel estimation. The continuously


increasing difference between the actual channel coefficients and the previous channel estimate can be translated into an effective loss in signal-to-interference ratio. We found that this interference power increases almost quadratically with time and with the Doppler spread of the channel. Based on this interference model, the continuously increasing bit-error probability of MIMO systems after the channel estimation can be predicted very well. This was verified by link level simulations using impulse responses according to a well-established channel model as well as with impulse responses obtained by a MIMO channel-sounding campaign. The time interval after which the channel has to be estimated again in order to keep the BER below a desired threshold strongly depends on the actual system configuration such as modulation scheme, detection algorithm, and number of antennas.

In the second part, we used the proposed interference model for the analytical optimization of the achievable information rate by adjusting the lengths of the training and data phases. These optimal lengths depend not only on the Doppler spread of the channel but also on the noise power and on the number of transmit antennas in the MIMO system. The number of receive antennas and the actual channel condition, on the other hand, have only a negligible influence on the optimal block lengths. It was shown that the optimal length of the coherence interval increases as the SNR decreases whereas, unlike in block fading channels, the ratio between the lengths of the training and data phases is almost independent of the SNR.

Altogether, the optimal block lengths depend on the performance measure used (e.g., bit-error probability, capacity). However, the simple time-dependent SINR model proposed in this paper has successfully been applied under different performance measures, and so it may be a useful tool for the optimization of radio systems in continuous fading channels.

Fig. 9. MIMO channel with four Tx antennas. (a) Optimal number of training and data symbols. The graphs for different $K$-factors ($K = 0, 5, 50$) lie almost on top of each other. (b) Maximal achievable rate.

APPENDIX I

In this appendix, we give an upper bound for the estimation error due to the time-varying channel and compare it with the estimation error due to the noise. We assume that the receive signal $y_j$ at the $j$th Rx antenna during the training phase can be written as in (13) without the term $v_{j,i} \cdot x_i$, which already models the estimation error. We assume further that orthogonal training sequences $s_i = [s_i(0), s_i(1), \ldots, s_i(L_\tau - 1)]^T$ of length $L_\tau$ and with $s_i^H s_i = L_\tau$ are transmitted at the same time over the Tx antennas $i = 1, \ldots, N_{\mathrm{Tx}}$. An estimate of the channel coefficient $h_{j,k}$ is obtained by an ML estimator from the $L_\tau$ consecutively received symbols $y_j(m)$; $m = 0, \ldots, L_\tau - 1$ by

$$\hat{h}_{j,k} = \frac{1}{L_\tau} \sum_{m=0}^{L_\tau - 1} s_k^*(m)\, y_j(m) = h_{j,k} + \Delta h_{\mathrm{Doppler}} + \Delta h_{\mathrm{Noise}}. \qquad (27)$$
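As an illustration of the correlation estimator in (27), the following minimal Python sketch builds orthogonal training sequences and recovers a static, noiseless channel exactly. The DFT-based sequences, the function names, and the dimensions are assumptions made for this example only; the text merely requires orthogonal sequences with $s_i^H s_i = L_\tau$.

```python
import numpy as np

def dft_training(n_tx, l_tau):
    """Return an (n_tx x l_tau) array whose rows are orthogonal sequences.

    DFT rows are an assumed choice; each row satisfies s_i^H s_i = l_tau.
    """
    if l_tau < n_tx:
        raise ValueError("need l_tau >= n_tx for orthogonal sequences")
    m = np.arange(l_tau)
    i = np.arange(n_tx)[:, None]
    return np.exp(2j * np.pi * i * m / l_tau)

def estimate_channel(y, s):
    """Correlate the received block y (n_rx x l_tau) with every sequence.

    Entry [j, k] equals (1 / l_tau) * sum_m conj(s_k(m)) * y_j(m), as in (27).
    """
    l_tau = s.shape[1]
    return y @ s.conj().T / l_tau

rng = np.random.default_rng(0)
n_rx, n_tx, l_tau = 3, 4, 8  # hypothetical dimensions

# Static Rayleigh channel with unit-variance entries (no Doppler, no noise).
h = (rng.standard_normal((n_rx, n_tx))
     + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)

s = dft_training(n_tx, l_tau)
y = h @ s                      # y_j(m) = sum_i h_{j,i} s_i(m)
h_hat = estimate_channel(y, s)

print("max estimation error:", np.max(np.abs(h_hat - h)))
```

With a time-invariant channel and no noise, both error terms in (27) vanish, so the estimate matches the true coefficients up to rounding. Adding time variation or noise to `y` would make the two error terms visible.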

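For the estimation errors $\Delta h_{\mathrm{Doppler}}$ and $\Delta h_{\mathrm{Noise}}$ due to the time variation of the channel and due to the thermal noise, respectively, we obtain

$$\Delta h_{\mathrm{Noise}} = \frac{1}{L_\tau} \sum_{m=0}^{L_\tau - 1} s_k^*(m)\, n_j(m), \qquad \Delta h_{\mathrm{Doppler}} = \frac{1}{L_\tau} \sum_{i=1}^{N_{\mathrm{Tx}}} \langle s_k, s_i \rangle_{j,i} \qquad (28)$$

with

$$\langle s_k, s_i \rangle_{j,i} = \sum_{m=0}^{L_\tau - 1} s_k^*(m)\, e_{j,i}(m \cdot T_S)\, s_i(m) \qquad (29)$$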

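using the channel model and the notations of Section II. Note that we omit the subscripts $j,k$ in the notation of $\Delta h_{\mathrm{Doppler}}$ and $\Delta h_{\mathrm{Noise}}$ and that $\langle \cdot, \cdot \rangle_{j,i}$ is only a shorthand notation and does not denote a scalar product. Both errors are independent, and we can calculate the average powers separately. For $\Delta h_{\mathrm{Noise}}$, we easily obtain

$$\sigma^2_{\Delta \mathrm{Noise}} = E\left[|\Delta h_{\mathrm{Noise}}|^2\right] = \frac{1}{L_\tau}\, \sigma_n^2 \overset{(12)}{=} \frac{N_{\mathrm{Tx}}}{\rho_\tau L_\tau}\, \sigma_h^2. \qquad (30)$$

The calculation for $\Delta h_{\mathrm{Doppler}}$ is slightly more intricate:

$$\sigma^2_{\Delta \mathrm{Doppler}} = E\left[|\Delta h_{\mathrm{Doppler}}|^2\right] = \frac{1}{L_\tau^2} \sum_{i,l=1}^{N_{\mathrm{Tx}}} E\left[\langle s_k, s_i \rangle_{j,i}\, \langle s_k, s_l \rangle^*_{j,l}\right]. \qquad (31)$$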
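The first equality in (30), namely that the noise-induced error power equals $\sigma_n^2 / L_\tau$, can be checked with a short Monte Carlo experiment. The sketch below is only an illustration and not part of the original simulations; the unit-modulus training sequence, the parameter values, and the function name are assumptions.

```python
import numpy as np

def noise_error_power(l_tau, sigma_n2, trials, rng):
    """Estimate E[|dh_noise|^2] for one Rx antenna and one training sequence.

    dh_noise = (1 / l_tau) * sum_m conj(s_k(m)) * n_j(m), see (28).
    """
    m = np.arange(l_tau)
    s_k = np.exp(2j * np.pi * m / l_tau)  # assumed sequence, s_k^H s_k = l_tau
    # Circular complex Gaussian noise samples with variance sigma_n2.
    n = np.sqrt(sigma_n2 / 2) * (rng.standard_normal((trials, l_tau))
                                 + 1j * rng.standard_normal((trials, l_tau)))
    dh = n @ s_k.conj() / l_tau
    return np.mean(np.abs(dh) ** 2)

rng = np.random.default_rng(1)
l_tau, sigma_n2 = 8, 0.5  # hypothetical values
est = noise_error_power(l_tau, sigma_n2, trials=100_000, rng=rng)
theory = sigma_n2 / l_tau

print(f"simulated: {est:.5f}  theory sigma_n^2 / L_tau: {theory:.5f}")
```

Doubling `l_tau` in this sketch halves the error power, which is the averaging gain expressed by the factor $1/L_\tau$ in (30).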


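The expectations $E[\cdot]$ are always taken over several channel realizations. Because of the statistical independence of the different impulse responses and since $E[e_{j,i}(t)] = 0$ for all $t$, we have $E[\langle s_k, s_i \rangle_{j,i}\, \langle s_k, s_l \rangle^*_{j,l}] = 0$ for $i \neq l$. In the case $i = l$, we obtain

$$E\left[\langle s_k, s_i \rangle_{j,i}\, \langle s_k, s_i \rangle^*_{j,i}\right] = \sum_{m,n=0}^{L_\tau - 1} s_k^*(m)\, s_k(n) \cdot E\left[e_{j,i}(m T_S)\, e^*_{j,i}(n T_S)\right] \cdot s_i(m)\, s_i^*(n) \le E\left[|e_{j,i}(L_\tau T_S)|^2\right] \sum_{m,n=0}^{L_\tau - 1} s_k^*(m)\, s_k(n)\, s_i(m)\, s_i^*(n)$$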


Suppose that (23) had two solutions $T_{d1}$ and $T_{d2}$ with $T_{d2} > T_{d1}$. Subtracting both (23) from one another gives

$$T_\tau \left[C(T_{d2}) - C(T_{d1})\right] = \int_{T_{d1}}^{T_{d2}} C(t)\, dt + T_{d1} C(T_{d1}) - T_{d2} C(T_{d2}) = \left[T_{d2} - T_{d1}\right] C(T_M) + T_{d1} C(T_{d1}) - T_{d2} C(T_{d2}) \qquad (34)$$

where the second equality was obtained by the mean value theorem with $T_{d1} \le T_M \le T_{d2}$. The left-hand side of this equation is strictly negative (because $C$ is strictly monotonically decreasing in $t$), whereas the right-hand side is nonnegative. This can be seen by replacing $C(T_{d1})$ by the smaller value $C(T_{d2})$:

$$\left[T_{d2} - T_{d1}\right] C(T_M) + T_{d1} C(T_{d1}) - T_{d2} C(T_{d2}) \ge \left[T_{d2} - T_{d1}\right] \cdot \left[C(T_M) - C(T_{d2})\right] \ge 0 \qquad (35)$$

where the last inequality holds because $C$ is monotonically decreasing. Therefore, (34) is a contradiction for $T_{d2} > T_{d1}$. This proves that (23) has no more than one solution. The second derivative of the rate $R$ in (22) with respect to $T_d$, evaluated at the local extremum (23), can be expressed through the derivative $\partial C / \partial T_d$.
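The uniqueness argument can be illustrated numerically. The following sketch assumes that the rate has the form $R(T_d) = \frac{1}{T_\tau + T_d} \int_0^{T_d} C(t)\, dt$, so that the stationarity condition reads $g(T_d) = \int_0^{T_d} C(t)\, dt - (T_\tau + T_d)\, C(T_d) = 0$. This form is not restated in this section; it is an assumption chosen because subtracting two instances of it reproduces (34). The capacity curve `capacity` is a toy, strictly decreasing function and not the paper's $C(t)$, and all parameter values are hypothetical.

```python
import numpy as np

def capacity(t, snr=10.0, a=4.0):
    """Toy strictly decreasing capacity in bit/s/Hz (assumed, not the paper's)."""
    return np.log2(1.0 + snr / (1.0 + a * snr * t ** 2))

def rate_and_condition(t_tau, t_max=5.0, n=200_001):
    """Return the grid t, the assumed rate R(t), and the condition g(t)."""
    t = np.linspace(0.0, t_max, n)
    c = capacity(t)
    dt = t[1] - t[0]
    # Cumulative trapezoidal integral of C from 0 to each grid point.
    integral = np.concatenate(([0.0], np.cumsum(0.5 * (c[1:] + c[:-1]) * dt)))
    rate = integral / (t_tau + t)
    g = integral - (t_tau + t) * c
    return t, rate, g

t, rate, g = rate_and_condition(t_tau=0.1)

# Count how often g changes from negative to nonnegative (or back).
sign_changes = int(np.count_nonzero(np.diff((g >= 0).astype(int))))
t_root = t[np.argmax(g >= 0)]   # first grid point with g >= 0
t_opt = t[np.argmax(rate)]      # grid point of maximal rate

print("sign changes of g:", sign_changes)
print("root of g:", t_root, " argmax of R:", t_opt)
```

Under these assumptions, $g$ starts negative, increases strictly because $C$ decreases, and crosses zero once, so the single stationary point coincides with the maximum of the rate on the grid.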

ACKNOWLEDGMENT

The authors wish to thank W. Wirnitzer, D. Brückner and S. Warzügel from MEDAV GmbH for providing the MIMO channel sounder and for the assistance during the measurements.


Volker Pohl was born in Dresden, Germany, in 1972. He received the Dipl.Ing. degree in electrical engineering from the Technische Universität Berlin, Berlin, Germany, in 2000. He became a Certified Technician for Electromechanics with Mertik GmbH, Quedlinburg, Germany, in 1992. In 1998, he joined the Heinrich-Hertz-Institut für Nachrichtentechnik (HHI), Berlin, Germany, as a Student Research Associate. Initially, he was involved in the development of electroluminescence color displays. Then he transferred to the broad-band mobile communications networks department, where his research was concerned with the development of wireless infrared systems, in particular infrared channel modeling. At present, he is a Research Associate at HHI. While working towards his Ph.D., he was concerned with the modeling and measurement of MIMO radio channels and equalization.

Phuc Hau Nguyen was born in Ben Tre, Vietnam, in 1972. He received the Dipl. Ing. degree in electrical engineering from the Technische Universität Berlin, Berlin, Germany, in 2004. He was a Student Employee in 2000 at the Research Institute for Open Communication Systems GMD Fokus, Berlin, Germany. In 2001, he joined the Broadband Mobile Communications Networks Department of the Heinrich-Hertz-Institute (HHI) for Telecommunication, Berlin, Germany as a Student Research Associate, where he developed a link-level simulation tool for MIMO radio channels and was involved in research on MIMO channel modeling. At present, he is with the Algorithms & Firmware Department, Philips Semiconductors GmbH, Nürnberg, Germany, where he is concerned with the development of UMTS Systems at DSP platforms.

Volker Jungnickel (M’00) was born in Großenhain, Germany, in 1964. He received the Dipl.Phys. and Dr.rer.nat. degrees in physics from Humboldt University, Berlin, Germany, in 1992 and 1995, respectively. In 1984, he became a Certified Optician with Carl Zeiss, Jena, Germany. He has been concerned with research on photoluminescence properties of quantum dots and the mechanism of electron–photon coupling in strongly confining zero-dimensional semiconductors. In 1995, he joined ILS GmbH, Stansdorf, Germany, working in the field of medical laser technology. In 1997, he joined the Broadband Mobile Communication Networks Department, Heinrich-Hertz-Institut für Nachrichtentechnik, Berlin, Germany, where he developed a high-speed wireless system for indoor communication based on infrared. At present, his research is focused on multiple-input multiple-output radio systems for high-speed wireless communication. He has authored and coauthored about 40 conference and journal papers and holds two patents. Dr. Jungnickel is a member of the German Physical Society.

Clemens von Helmolt was born in Berlin, Germany, in 1952. He received the Dipl.Ing. and Dr. Ing. degrees in electrical engineering from the Technische Universität Berlin, Berlin, Germany, in 1979 and 1985, respectively. From 1979 to 1984, he was a Research Associate at the Institut für Hochfrequenztechnik, Technische Universität Berlin, where he worked in the field of acoustooptic interaction in optical strip waveguides. In 1984, he joined the Heinrich-Hertz-Institut für Nachrichtentechnik (HHI), Berlin, Germany, where he was engaged in research on LiNbO3 devices for coherent receivers until the end of 1987. From 1988 to 1998, he worked in the fields of optical frequency stabilization of optical heterodyne and WDM broadband communication systems, where he coordinated the HHI research activities related to the European projects “RACE-1010,” “RACE-2065,” and “ACT-084.” From 1996 to 2000, he was responsible for a national research project related to broad-band mobile indoor communication based on infrared. Since November 2000, he has been managing research projects related to MIMO techniques and systems for RF multielement antenna mobile communication. He has authored and coauthored more than 40 journal publications and conference presentations each, has made several contributions to books, and holds several European and US patents.