Bayesian EM-Based Demodulators for Frequency-Selective Fading Channels
Mauri Nissilä and Subbarayan Pasupathy
VTT Electronics, Kaitoväylä 1, P.O. Box 1100, FIN-90571 Oulu, Finland
Department of Electrical and Computer Engineering, University of Toronto, 10 King's College Road, Toronto, Ontario, Canada M5S 3G4
Abstract: In this paper, the problem of adaptive MAP symbol detection in uncoded transmission and the problem of adaptive APP demodulation in coded transmission of data symbols over the frequency-selective Rayleigh fading channel are explored within the framework of the Bayesian expectation-maximization (BEM) algorithm. In particular, two novel versions of the BEM-based detection and demodulation algorithms are derived. In contrast to earlier developments of BEM algorithms, the formulations derived in this paper lead to computationally efficient algorithms which avoid matrix inversions by using sequential processing over the time and trellis branch indexes. In addition, it is shown how the recursive versions of the BEM algorithms can be combined with the well-known forward-backward processing soft-input soft-output (SISO) algorithms, resulting in adaptive SISOs with soft decision directed (SDD) channel estimators. An application of the proposed algorithms to iterative "turbo-processing" receivers illustrates how these SDD channel estimators can efficiently exploit the extrinsic information obtained from the SISO decoder in order to enhance their estimation accuracy.
The research was financially supported by the Nokia Foundation.

I. INTRODUCTION

As is well known, the Maximum Likelihood Sequence Detector (MLSD) and the Maximum A Posteriori Symbol Detector (MAPSD) represent the optimal detection strategies for uncoded data sequences and uncoded data symbols, respectively, under the assumption that the parameters of the communication channel are perfectly known [1]. The optimal detector for coded information bits transmitted over a known frequency-selective channel can be obtained by applying the ML or MAP optimization criterion to the combined trellis of the channel and the decoder. A close-to-optimal but computationally much simpler detector for coded bits can be built by combining the demodulator/equalizer and the decoder blocks iteratively using the Turbo principle [2]. In practice, however, the channel is unknown and has to be estimated from the received signal by using pilot symbols and/or by performing estimation and detection jointly. One approach for combining detection and estimation is to perform joint maximum likelihood (ML) detection and estimation. Related methods are obtained by replacing the likelihood function in the derivation of the joint detector and estimator with the a posteriori density function in order to take into account the available a priori knowledge about the channel
process and/or data symbols. In the Gaussian fading channel, the application of the MAP criterion for the per-sequence channel estimation leads to the optimal sequence or symbol detector in the sense that they maximize the likelihood function which is averaged over the unknown channel process [3], [4]. An alternative approach is to estimate the channel parameters in a non-data-aided (NDA) or blind manner and then, with the aid of these estimated parameters, to perform detection using MLSD or MAPSD processing. Of the many different blind estimation methods reported in the literature, we will study in this paper probabilistic methods exclusively, i.e., methods which use likelihood processing. A probabilistic approach was used in [5], [6], where linear MMSE estimates of the frequency-flat and frequency-selective Rayleigh fading channels were obtained jointly with MAP symbol decisions via the Bayesian EM (BEM) algorithm. The properties of Martingale difference sequences were used in [7] to derive nonlinear Kalman-like recursive channel estimators which were able to exploit the probabilistic information about the symbols. In effect, a common basis for these probabilistic algorithms is that the channel parameters are estimated by using the probabilistic or soft information about the data symbols instead of hard decisions.

In Section IV, a blind Kalman smoother/estimator, which is conceptually similar to the blind Kalman smoother/estimator in [6], is derived in a novel way. This novel derivation allows a considerable reduction in implementation complexity and, even more importantly, a recursive version of the blind Kalman estimator can easily be obtained from it. A further reduction in implementation complexity is obtained by an EM-based decomposition of the received signal into independent multipath components. In Section II, the system and channel models are presented, while in Section III the detection and estimation criteria are briefly summarized.
The numerical results are given in Section V and conclusions in Section VI.

II. SYSTEM AND CHANNEL MODEL
We consider the transmission of uncoded and coded QPSK modulated symbols over a frequency-selective time-varying channel. In order to simplify the notation and to make the numerical computations more efficient, we use a simple receiver front-end processor, which consists of the whitened matched filter followed by a symbol-spaced sampler with ideal sampling instants. However, an extension of the derived algorithms to support a more accurate oversampled system model is straightforward. The received signal samples can now be expressed as

$$ r_k = \mathbf{s}_k^{T}\mathbf{h}_k + n_k \tag{1} $$

where $\mathbf{h}_k = (h_{k,1}, \ldots, h_{k,L+1})^{T}$ is the channel impulse response (CIR) at time $k$, $L$ is the length of the channel memory and $n_k$ is a zero-mean white Gaussian noise sample with power spectral density $N_0$. In the terminology of trellis processing, $\mathbf{s}_k$ is called a branch (or transition) at time $k$ and it is defined as $\mathbf{s}_k = (a_k, a_{k-1}, \ldots, a_{k-L})^{T} = (a_k, \sigma_k)$, where $\sigma_k = (a_{k-1}, \ldots, a_{k-L})$ is called a state of the trellis at time $k$. The trellis has $M^{L}$ states and $M^{L+1}$ branches between the states, where $M$ is the number of constellation points of the modulator. The CIR is modelled as a Rayleigh fading process. Because this process is strictly bandlimited to some maximum Doppler frequency, its autocorrelation function has infinite length. In practice, however, the stationary Rayleigh fading process can, according to Wold's decomposition theorem, be approximated by an autoregressive (AR) model of sufficiently high order. Therefore, the memory of the channel in the time axis can be limited to some finite $P$ and the process can be modelled as a $P$th-order discrete-time Gauss-Markov process. In this paper we will assume that $P = 1$ in order to keep the notation and computer simulations simple. The resulting linear dynamic system can be described with the state-space model
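$$ \mathbf{h}_k = \mathbf{F}\,\mathbf{h}_{k-1} + \mathbf{G}\,\mathbf{w}_k \tag{2} $$
$$ r_k = \mathbf{s}_k^{T}\mathbf{h}_k + n_k \tag{3} $$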
where $\mathbf{w}_k$ is a sample of a zero-mean complex white Gaussian vector noise process whose $L+1$ elements are independent of the receiver noise $n_k$. The deterministic $(L+1)\times(L+1)$ matrices $\mathbf{F}$ and $\mathbf{G}$ are the model matrices.

III. DETECTION AND ESTIMATION CRITERIA

In Rayleigh fading channels, the optimal ML sequence detector and MAP symbol detector implicitly contain the per-sequence MMSE channel estimators [3], [4]. The detection criteria for these optimal detectors can be expressed as
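$$ \hat{\mathbf{a}}_{\mathrm{ML}} = \arg\max_{\mathbf{a}}\, p(\mathbf{r}\mid\mathbf{a}) = \arg\max_{\mathbf{a}}\, E_{\boldsymbol\theta}\{p(\mathbf{r}\mid\mathbf{a},\boldsymbol\theta)\} \tag{4} $$
$$ \hat{a}_{k,\mathrm{MAP}} = \arg\max_{a_k} \sum_{\mathbf{a}:a_k} E_{\boldsymbol\theta}\{p(\mathbf{r},\mathbf{a}\mid\boldsymbol\theta)\}, \qquad k = 1, \ldots, N \tag{5} $$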
where $\mathbf{a}:a_k$ denotes all symbol sequences consistent with $a_k$. The parameter vector $\boldsymbol\theta$ comprises the sequence of unknown Gaussian channel impulse responses, i.e., $\boldsymbol\theta = \{\mathbf{h}_1, \ldots, \mathbf{h}_N\}$. The drawback of the optimal detectors is their high complexity. This problem can be alleviated by pruning the sequence tree, but only at the expense of reduced performance. A suboptimal joint detection and estimation criterion for Rayleigh fading channels can be expressed as
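$$ \hat{\boldsymbol\theta}_{\mathrm{MAP}} = \arg\max_{\boldsymbol\theta}\, p(\boldsymbol\theta\mid\mathbf{r}) \tag{6} $$
$$ \hat{a}_k = \arg\max_{a_k}\, P(a_k\mid\mathbf{r},\hat{\boldsymbol\theta}_{\mathrm{MAP}}), \qquad k = 1, \ldots, N \tag{7} $$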
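As a minimal sketch of how a decision rule of the form (7) is typically evaluated on a trellis, the snippet below marginalizes per-branch APPs into symbol APPs and takes the maximum. It assumes that the branch APPs are already available (for example from a BCJR pass run with a fixed channel estimate); the function names, the branch ordering and the convention that the first entry of a branch is the newest symbol $a_k$ are illustrative assumptions, not notation from the paper.

```python
import itertools
import numpy as np

def symbol_apps_from_branch_apps(branch_apps, num_taps, alphabet_size):
    """Sum the branch APPs over all branches that share the same newest symbol.

    branch_apps: array of shape (N, alphabet_size**num_taps); row k holds the
    APPs of all branches at time k, ordered like
    itertools.product(range(alphabet_size), repeat=num_taps).
    Assumption: the first entry of each branch label is the newest symbol a_k.
    """
    labels = np.array(list(itertools.product(range(alphabet_size), repeat=num_taps)))
    newest = labels[:, 0]
    apps = np.zeros((branch_apps.shape[0], alphabet_size))
    for m in range(alphabet_size):
        apps[:, m] = branch_apps[:, newest == m].sum(axis=1)
    return apps

def map_symbol_decisions(symbol_apps):
    """Pick, for every time index, the symbol index with the largest APP."""
    return np.argmax(symbol_apps, axis=1)

# Usage: two time steps, QPSK (4 symbols), 2-tap channel (16 branches).
branch_apps = np.full((2, 16), 1.0 / 16)   # second row stays uniform
branch_apps[0] = 0.0
branch_apps[0, 5] = 1.0                     # branch label (1, 1): newest symbol is 1
symbol_apps = symbol_apps_from_branch_apps(branch_apps, num_taps=2, alphabet_size=4)
decisions = map_symbol_decisions(symbol_apps)
```

The same marginalization yields the soft symbol information that an outer decoder would consume; only the final `argmax` is specific to hard MAP decisions.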
There does not seem to exist any direct method to calculate (6), but (6) and (7) can be obtained jointly by the iterative Bayesian EM algorithm [6]. In Section IV, we will derive a computationally feasible algorithm to realize it.

Efficient iterative receiver structures for detection of coded information bits can be obtained by using the principle of turbo-processing [2]. In the Turbo receiver, the outer SISO, corresponding to the channel encoder, requires at its input some soft information about the channel symbols. According to [8], a reasonable choice for the soft information of the symbols in the presence of unknown random parameters is $\mathrm{APP}(a_k) \propto \sum_{\mathbf{a}:a_k} E_{\boldsymbol\theta}\{p(\mathbf{r},\mathbf{a}\mid\boldsymbol\theta)\}$. An ingenious method to calculate these symbol APPs was presented in [8], based on which several suboptimal sequence-pruning algorithms were proposed. Some related adaptive SISOs have also been reported in [3], [9]. However, the symbol APPs in random channels can alternatively be defined as $\mathrm{APP}(a_k) = P(a_k\mid\mathbf{r},\hat{\boldsymbol\theta}_{\mathrm{MAP}})$, and they are obtained as a by-product of the BEM algorithm. In the case of Turbo receivers, it is not clear which definition of the symbol APP would produce the minimum bit error rate.

IV. BAYESIAN EM ALGORITHMS FOR RANDOM CHANNELS

The Bayesian EM algorithm aims to find iteratively the maximum of the posterior density function of the parameter to be estimated, i.e., the MAP parameter estimate, in the presence of nuisance parameters. Interestingly, the BEM algorithm applied to the estimation of the frequency-selective Rayleigh fading channel in the presence of unknown data symbols can exactly be realized by iteratively cross coupling the BCJR algorithm with the fixed interval Kalman smoother (KS) operating on an "averaged" state space model where the averaging is performed over the states of the demodulator trellis [6].
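For orientation, the generic Bayesian EM recursion behind this cross coupling can be written in the following standard form. This is a sketch, not the derivation of [6]: the iteration index $p$, the branch APP notation $\gamma^{(p)}(k,i)$, the candidate branch vectors $\mathbf{s}^{(i)}$ and the reading of $N_0$ as the variance of the complex noise sample are notational assumptions made here.

```latex
% E-step: expected complete-data log-posterior given the previous estimate
Q\big(\boldsymbol\theta \mid \hat{\boldsymbol\theta}^{(p-1)}\big)
  = E_{\mathbf{a}\mid\mathbf{r},\hat{\boldsymbol\theta}^{(p-1)}}
    \big\{\log p(\mathbf{r}\mid\mathbf{a},\boldsymbol\theta)\big\}
  + \log p(\boldsymbol\theta)

% M-step: MAP update of the channel parameters
\hat{\boldsymbol\theta}^{(p)}
  = \arg\max_{\boldsymbol\theta}\, Q\big(\boldsymbol\theta \mid \hat{\boldsymbol\theta}^{(p-1)}\big)

% For the observation model r_k = s_k^T h_k + n_k with white Gaussian noise,
% the expectation reduces to branch-APP weighted squared errors:
E_{\mathbf{a}\mid\mathbf{r},\hat{\boldsymbol\theta}^{(p-1)}}
    \big\{\log p(\mathbf{r}\mid\mathbf{a},\boldsymbol\theta)\big\}
  = -\frac{1}{N_0}\sum_{k=1}^{N}\sum_{i}
      \gamma^{(p)}(k,i)\,\big|r_k - \mathbf{s}^{(i)T}\mathbf{h}_k\big|^{2}
    + \text{const},
\qquad
\gamma^{(p)}(k,i) = P\big(\mathbf{s}_k = \mathbf{s}^{(i)} \mid \mathbf{r},\hat{\boldsymbol\theta}^{(p-1)}\big)
```

In this reading, the BCJR algorithm supplies the weights $\gamma^{(p)}(k,i)$ (E-step), while the maximization under the Gaussian channel prior is the task carried out by the Kalman smoother (M-step).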
Computing the model parameters of the "averaged" state space model is, however, very tedious, since it involves forward-backward processing of the whole set of symbol APPs and received signal samples. In addition, a matrix inversion is required at every backward processing step. Furthermore, due to this forward-backward processing requirement, the embedding of the BEM algorithm described in [6] into the existing SISO algorithms does not seem to be possible. In this section, we will propose a novel formulation of the BEM algorithm which avoids the shortcomings due to the "averaged" state space model. In particular, the BCJR algorithm can be iteratively cross coupled with a vector Kalman smoother operating on the time-varying state space model given at the $p$th iteration as (the proof is omitted)
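$$ \mathbf{h}_k = \mathbf{F}\,\mathbf{h}_{k-1} + \mathbf{G}\,\mathbf{w}_k \tag{8} $$
$$ \tilde{\mathbf{r}}_k^{(p)} = \tilde{\mathbf{S}}_k^{(p)}\,\mathbf{h}_k + \tilde{\mathbf{n}}_k \tag{9} $$

where $\tilde{\mathbf{r}}_k^{(p)} = \big(\sqrt{\gamma^{(p)}(k,1)}\,r_k, \ldots, \sqrt{\gamma^{(p)}(k,B)}\,r_k\big)^{T}$ and where $\tilde{\mathbf{S}}_k^{(p)}$ is defined as the $B \times (L+1)$ matrix $\big(\sqrt{\gamma^{(p)}(k,1)}\,\mathbf{s}^{(1)}, \ldots, \sqrt{\gamma^{(p)}(k,B)}\,\mathbf{s}^{(B)}\big)^{T}$, with $\mathbf{s}^{(i)}$ denoting the $i$th of the $B = M^{L+1}$ trellis branches. The $B \times 1$ noise vector $\tilde{\mathbf{n}}_k$ has a covariance matrix of $N_0\mathbf{I}$ where $\mathbf{I}$ is the $B \times B$ identity matrix. The branch APPs $\gamma^{(p)}(k,i)$ are defined as

$$ \gamma^{(p)}(k,i) = P\big(\mathbf{s}_k = \mathbf{s}^{(i)} \mid \mathbf{r}, \hat{\boldsymbol\theta}^{(p-1)}\big) \tag{10} $$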
and they are obtained from the BCJR algorithm. In many practical cases, the computationally demanding Kalman smoother can be replaced with the Kalman filter (KF) without significantly deteriorating the symbol or bit error rate of the associated detector. We will refer to these algorithms as the BEM-KS and BEM-KF algorithms. Importantly, building the state space model (8) and (9) does not require any actual processing at all. On the other hand, the vector Kalman smoother/filter operating on this state space model is usually computationally more demanding than the scalar Kalman smoother/filter operating on the "averaged" state space model. Thanks to the diagonality of the covariance matrix of $\tilde{\mathbf{n}}_k$, the computational burden of the vector Kalman processing can be alleviated significantly by using the sequential processing technique outlined in [10]. By applying the sequential processing to (8) and (9), we obtain a computationally very efficient blind Kalman filter (BKF) which can be described as: For each instant of time, $k = 1, \ldots, N$, compute
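$$ \hat{\mathbf{h}}_k = \mathbf{F}\,\hat{\mathbf{h}}_{k-1}, \qquad \mathbf{P}_k = \mathbf{F}\,\mathbf{P}_{k-1}\mathbf{F}^{H} + \mathbf{G}\mathbf{G}^{H} $$
for each trellis branch $\mathbf{s}^{(i)}$, $i = 1, 2, \ldots, M^{L+1}$
$$ \mathbf{K}_k = \gamma(k,i)\,\mathbf{P}_k\,\mathbf{s}^{(i)*}\big(\gamma(k,i)\,\mathbf{s}^{(i)T}\mathbf{P}_k\,\mathbf{s}^{(i)*} + N_0\big)^{-1} $$
$$ \hat{\mathbf{h}}_k \leftarrow \hat{\mathbf{h}}_k + \mathbf{K}_k\big(r_k - \mathbf{s}^{(i)T}\hat{\mathbf{h}}_k\big) $$
$$ \mathbf{P}_k \leftarrow \mathbf{P}_k - \mathbf{K}_k\,\mathbf{s}^{(i)T}\mathbf{P}_k $$
end.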
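The sequential measurement update can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes unit-covariance process noise (so the prediction adds $\mathbf{G}\mathbf{G}^{H}$), QPSK symbols, and an observation of the form $r_k = \mathbf{s}^{T}\mathbf{h}_k + n_k$; the names `bkf_step` and `all_branches` are hypothetical.

```python
import itertools
import numpy as np

QPSK = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4)))  # unit-energy QPSK

def all_branches(num_taps, alphabet=QPSK):
    """Enumerate every candidate branch vector s^(i) of length num_taps."""
    return np.array(list(itertools.product(alphabet, repeat=num_taps)))

def bkf_step(h_prev, P_prev, r_k, branches, apps, F, G, N0):
    """One time step of a sequentially processed soft-decision Kalman filter."""
    # Time update (prediction) through the first-order channel model.
    h = F @ h_prev
    P = F @ P_prev @ F.conj().T + G @ G.conj().T
    # Measurement update, one trellis branch at a time: only scalar divisions.
    for s, gamma in zip(branches, apps):
        if gamma <= 0.0:
            continue  # a branch with zero APP carries no information
        Ps = P @ s.conj()
        K = gamma * Ps / (gamma * np.real(s @ Ps) + N0)  # Kalman gain vector
        h = h + K * (r_k - s @ h)
        P = P - np.outer(K, s @ P)
    return h, P

# Usage: hard (one-hot) branch APPs on a static, noise-free two-tap channel.
rng = np.random.default_rng(0)
branches = all_branches(2)
h_true = np.array([0.8 + 0.3j, -0.2 + 0.5j])
F, G, N0 = np.eye(2), np.zeros((2, 2)), 1e-3
h_est, P = np.zeros(2, dtype=complex), np.eye(2, dtype=complex)
for _ in range(60):
    i = rng.integers(len(branches))
    apps = np.zeros(len(branches))
    apps[i] = 1.0
    h_est, P = bkf_step(h_est, P, branches[i] @ h_true, branches, apps, F, G, N0)
```

With one-hot APPs the loop collapses to an ordinary decision-directed Kalman update; with soft APPs every branch contributes in proportion to its probability, and no matrix inversion is needed because each branch is processed as a scalar measurement.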
The matrix $\mathbf{P}_k$ denotes the error covariance matrix of the channel estimate and $\mathbf{K}_k$ is the Kalman gain vector. In effect, the blind Kalman filter is updated sequentially over the time and branch indexes.

Like all iterative algorithms, the adaptive BEM-KS/KF algorithms have to be initialized properly. In practice, the initialization is obtained with the aid of pilot symbols; hence the BEM algorithms are not fully blind but rather semi-blind.

Aside from the iterative formulation, the BKF estimators can also be easily embedded into the fixed interval (FI) or fixed lag (FL) SISO algorithms which have been developed for known channels. In effect, only the branch APPs $\gamma(k,i)$ in the BKF estimator have to be replaced by the branch APPs $\bar\gamma(k,i)$ defined as $\bar\gamma(k,i) \triangleq P\big(\mathbf{s}_k = \mathbf{s}^{(i)} \mid \mathbf{r}_1^{k}, \hat{\mathbf{h}}_1^{k-1}\big)$, thus resulting in what we will refer to as a soft decision directed KF (SDD-KF) estimator. The forward processing of the adaptive BCJR algorithm (or A-SISO algorithm) with the embedded SDD-KF estimator (referred to here as the APP-SDD-KF algorithm) can be done recursively while the backward processing is done by using the stored transition metrics. Finally, the symbol APPs are obtained as a product of the forward and the backward product sum (PS) terms.

A reduced complexity PSP Kalman filter bank for estimating fast frequency-selective channels was proposed in [11]. In contrast to the PSP-based Kalman processing, we will now propose a novel reduced complexity blind Kalman smoother/filter (RC-BKS/RC-BKF) bank which can exploit the soft statistics about the data symbols in the same way as the previously proposed blind channel estimators. The reduced complexity Kalman processing relies on the assumption of the wide sense stationary uncorrelated scattering (WSSUS) channel model, which inherently implies that the cross correlation terms of the matrices $\mathbf{F}$ and $\mathbf{G}$ are zero and, therefore, the channel process in (2) can be decomposed into $L+1$ independently fading channel taps. The complexity reduction in the blind Kalman processing is based on the idea of decomposing the received signal samples into independent multipath components which can then be used to estimate the fading channel taps separately. The estimated values of the unobserved multipath components are called pseudo observations in [11]. Specifically, we apply an EM-based decomposition method similar to the one in [12] except that now the unknown data symbols are also added to the "missing" data set. Under the assumption of PSK modulation and first order AR channel taps, we obtain the decomposed state space model described at the $p$th iteration for each multipath component $l = 1, \ldots, L+1$ as follows (the proof is omitted):
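$$ h_{k,l} = \alpha_l\,h_{k-1,l} + g_l\,w_{k,l} \tag{11} $$
$$ \sum_{i=1}^{M^{L+1}} \gamma^{(p)}(k,i)\,s_l^{(i)*}\,\hat{x}_{k,l}^{(p-1)}(i) = h_{k,l} + v_{k,l} \tag{12} $$

where $s_l^{(i)}$ is the $l$th element of the $i$th trellis branch $\mathbf{s}^{(i)}$ and $\hat{x}_{k,l}^{(p-1)}(i)$ denotes the per branch pseudo observation at the $(p-1)$th iteration defined as $\hat{x}_{k,l}^{(p-1)}(i) = s_l^{(i)}\,\hat{h}_{k,l}^{(p-1)} + \beta\big(r_k - \mathbf{s}^{(i)T}\hat{\mathbf{h}}_k^{(p-1)}\big)$ and $\beta = 1/(L+1)$. The variance of the noise components $v_{k,l}$ is $\sigma_v^2 = \beta N_0$. The updated channel estimates $\hat{h}_{k,l}^{(p)}$ are obtained by applying the Kalman smoothing/filtering operations to the state space model (11) and (12) and the branch APPs $\gamma^{(p)}(k,i)$ are subsequently updated by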
using the BCJR algorithm. We will refer to the resulting iterative SISO algorithm as the BEM-RC-KS/KF algorithm. The RC-BKF estimator can also be easily embedded into the FI and FL SISO algorithms if the pseudo observations are defined in terms of the predicted channel estimates, i.e.,

$$ \hat{x}_{k,l}(i) = s_l^{(i)}\,\hat{h}_{k|k-1,l} + \beta\big(r_k - \mathbf{s}^{(i)T}\hat{\mathbf{h}}_{k|k-1}\big) \tag{13} $$
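The decomposition in (13) can be illustrated with a short NumPy sketch. It assumes that the pseudo observation of each tap is the predicted contribution of that tap plus a share $\beta$ of the per-branch prediction error, which is the form used in EM-based decompositions of superimposed signals such as [12]; the function name `pseudo_observations` and the test values are hypothetical.

```python
import itertools
import numpy as np

QPSK = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4)))  # unit-modulus symbols

def pseudo_observations(h_pred, r_k, branches, beta=None):
    """Per-branch pseudo observations of every channel tap.

    Returns X with X[i, l] = s_l^(i) * h_pred[l] + beta * (r_k - s^(i)T h_pred).
    """
    h_pred = np.asarray(h_pred, dtype=complex)
    if beta is None:
        beta = 1.0 / h_pred.size  # equal split of the residual over the taps
    residual = r_k - branches @ h_pred  # prediction error per branch, shape (B,)
    return branches * h_pred + beta * residual[:, None]

# Usage: three-tap channel, all 4**3 = 64 QPSK branches.
branches = np.array(list(itertools.product(QPSK, repeat=3)))
h_pred = np.array([0.4 + 0.1j, 0.8 - 0.2j, 0.3 + 0.3j])
r_k = 0.5 - 1.2j
X = pseudo_observations(h_pred, r_k, branches)
residual = r_k - branches @ h_pred
```

Two properties are worth noting. With $\beta = 1/(L+1)$ the pseudo observations of each branch add up to the received sample, so nothing of $r_k$ is lost in the decomposition. For unit-modulus (PSK) symbols, derotating a pseudo observation and subtracting the predicted tap leaves $\beta\,s_l^{(i)*}$ times the prediction error, which is the correction term of an LMS-type update.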
Interestingly, the ad-hoc pseudo observations in [11] are obtained from (13) by setting $\beta = 1$. By inserting (13) into (12) and applying the standard Kalman operations to the resulting state space model, we obtain a reduced complexity blind Kalman filter bank described as follows: For each instant of time, $k = 1, \ldots, N$, compute
for $l = 1, \ldots, L+1$
$$ \hat{h}_{k|k-1,l} = \alpha_l\,\hat{h}_{k-1,l} $$
$$ p_{k|k-1,l} = |\alpha_l|^{2}\,p_{k-1,l} + |g_l|^{2} $$
$$ K_{k,l} = p_{k|k-1,l}\big(p_{k|k-1,l} + \sigma_v^{2}\big)^{-1} $$
$$ \hat{h}_{k,l} = \hat{h}_{k|k-1,l} + K_{k,l}\,\beta \sum_{i=1}^{M^{L+1}} \bar\gamma(k,i)\,s_l^{(i)*}\big(r_k - \mathbf{s}^{(i)T}\hat{\mathbf{h}}_{k|k-1}\big) $$
$$ p_{k,l} = (1 - K_{k,l})\,p_{k|k-1,l} $$
end.

In effect, this algorithm, referred to here as the APP-SDD-RCKF algorithm, closely resembles a normalized version of the APP-SDD-LMS estimator which incorporates an adaptive step size parameter $\mu_{k,l} = K_{k,l}\,\beta$. It was noticed during the writing of this paper that a decoupled channel estimator bank based on blind EM decomposition of the received samples into multipath components was also independently proposed in [13]. Despite the conceptually similar approach, our reduced complexity Kalman estimator bank differs structurally from the one in [13] in two important respects. First, in [13], the EM decomposition technique was used to derive a decoupled ML estimator bank without any channel tracking capabilities. Second, the definition of the pseudo observations in terms of the predicted channel estimates enabled us to derive a bank of computationally efficient blind LMS estimators whose step size parameters are adjusted automatically according to the dynamics of the corresponding channel taps.

V. NUMERICAL RESULTS

The performance of the proposed adaptive SISO algorithms was evaluated by computer simulations. Specifically, we investigated the performance of uncoded and coded transmission of QPSK symbols over the frequency-selective Rayleigh fading channel with Jakes' power spectrum and with normalized Doppler spread $f_d T = 0.001$ and $f_d T = 0.01$. The channel had three independently fading taps whose standard deviations were set at $\sigma_{\mathrm{tap1}} = 0.407$, $\sigma_{\mathrm{tap2}} = 0.815$ and $\sigma_{\mathrm{tap3}} = 0.407$. The transmitted symbols were organized into fixed size bursts with $L_{\mathrm{symb}}$ information symbols preceded by $L_{\mathrm{pre}}$ known preamble symbols and followed by $L_{\mathrm{tail}}$ known tail
symbols. Thus the starting and terminating states of the equalizer trellis were known. In addition, the known preamble symbols were used in obtaining the initial CIR estimate. In the case of coded transmission, a 64-state convolutional encoder (CE) with rate 1/2 was used to encode the information bits. After the QPSK mapper, the symbols were interleaved using an $L_{\mathrm{symb}} \times L_{\mathrm{symb}}$ block interleaver (BI) in conjunction with the assumption of a burst-to-burst independent channel. At the receiver side, the adaptive SISO demodulator and the SISO decoder were iteratively connected through the block interleaver and the block deinterleaver (BD).

The bit error rate (BER) curves as a function of $E_b/N_0$ ($E_b$ is the averaged signal energy per bit) for the uncoded transmission system with various adaptive demodulators are presented in Fig. 1. The fading rate was $f_d T = 0.001$ and the burst parameters were as follows: $L_{\mathrm{pre}} = 10$, $L_{\mathrm{symb}} = [\ldots]$ and $L_{\mathrm{tail}} = [\ldots]$. The simulated performance is also shown for the demodulator which acquired the CIR estimate by using only the preamble pilot symbols at the beginning of each burst (referred to as the APP-PSAE algorithm). The iterative BEM-based equalizers gave only a moderate performance gain compared to the APP-PSAE equalizer. An interesting observation from this figure is, however, that the BER curve of the BEM-KS equalizer practically coincides with the BER curve of the PSP-KF equalizer. In any case, the conclusion drawn from this figure is that the iterative BEM-based equalizers or demodulators exploiting the soft statistics about the data symbols do not pay back the increased computational effort, at least not in uncoded transmission systems.

Fig. 1. BER versus $E_b/N_0$ for uncoded system when iterative A-SISO demodulators are used. (Axes: BER versus Eb/N0 [dB]; curves: Perfect CSI, APP-PSAE, BEM-KF, BEM-RCKF, BEM-KS, BEM-RCKS, PSP-KF.)

In contrast to the uncoded systems, the performance of the coded transmission systems seems to be enhanced when soft decision-directed channel estimators are used instead of hard decision-directed CIR estimators. As seen from Fig.
2, the Turbo receiver with the APP-SDD-KF equalizer achieved slightly better performance than the Turbo receivers with the APP-PSP-KF and APP-SDD-RCKF equalizers. The APP-PSP-KF is a forward-only version of the PSP-based A-SISO presented in [8]. Interestingly, the APP-SDD-RCKF equalizer achieved the same performance as the APP-PSP-KF equalizer while its computational complexity is significantly smaller. Even at the relatively small fading rate $f_d T = 0.001$, all these adaptive Turbo receivers exhibited a significant performance gain compared to the Turbo receiver with the PSAE demodulator.

Fig. 2. Comparison of various A-SISO algorithms in coded system with fading rate $f_d T = 0.001$ and $L_{\mathrm{symb}} = [\ldots]$. (Axes: BER versus Eb/N0 [dB]; curves: Perfect CSI, APP-PSAE, APP-SDD-KF, APP-SDD-RCKF, APP-PSP-KF; 1st and 4th iteration.)

The performance of the adaptive Turbo receivers was also simulated with the following system parameters: $f_d T = 0.01$, $L_{\mathrm{pre}} = [\ldots]$, $L_{\mathrm{symb}} = [\ldots]$ and $L_{\mathrm{tail}} = [\ldots]$, and the results are presented in Fig. 3. Turbo receivers with the soft decision directed KF and RCKF estimators exhibited, after the first iteration, an error floor at an intolerably high BER value. However, these SDD estimators were able to exploit very efficiently the a priori information about the data symbols obtained from the SISO decoder. As seen from Fig. 3, adaptive Turbo receivers employing the SDD channel estimators achieved a remarkable performance gain when the number of iterations was increased. In particular, an excellent channel tracking capability of the soft decision directed Kalman filter was clearly demonstrated in this frequency-selective fast fading channel. In addition, the APP-SDD-RCKF demodulator achieved practically the same performance as the APP-PSP-KF demodulator although its computational complexity is considerably smaller.

Fig. 3. Comparison of various A-SISO algorithms in coded systems with fading rate $f_d T = 0.01$ and $L_{\mathrm{symb}} = [\ldots]$. (Axes: BER versus Eb/N0 [dB]; curves: Perfect CSI, APP-PSAE, APP-SDD-KF, APP-SDD-RCKF, APP-PSP-KF; 1st and 4th iteration.)

VI. CONCLUSION

We introduced two versions of the BEM-based MAP detection and APP demodulation algorithms and simulated their performance in uncoded and coded QPSK transmission systems over the frequency-selective Rayleigh fading channel. In particular, soft decision directed KF and RCKF channel estimators were derived and their embedding into the well-known forward-backward processing SISO algorithms was illustrated. The simulation results showed that an application of the adaptive APP-SDD algorithms to iterative "turbo-processing" receivers can provide a significant performance gain compared to the APP algorithm with pilot symbol based CIR estimation.
REFERENCES

[1] J. G. Proakis, Digital communications. New York: McGraw-Hill, second ed., 1989.
[2] C. Douillard, M. Jézéquel, C. Berrou, A. Picard, P. Didier, and A. Glavieux, "Iterative correction of intersymbol interference: Turbo-equalization," Europ. Trans. Telecommun. (ETT), vol. 6, pp. 507–511, Sept.–Oct. 1995.
[3] M. J. Gertsman and J. H. Lodge, "Symbol-by-symbol MAP demodulation of CPM and PSK signals on Rayleigh flat-fading channels," IEEE Trans. Commun., vol. 45, pp. 788–799, July 1997.
[4] X. Yu and S. Pasupathy, "Innovations-based MLSE for Rayleigh fading channels," IEEE Trans. Commun., vol. 43, pp. 1534–1544, Feb./Mar./Apr. 1995.
[5] E. Chiavaccini and G. M. Vitetta, "MAP symbol estimation on frequency-flat Rayleigh fading channels via a Bayesian EM algorithm," IEEE Trans. Commun., vol. 49, pp. 1869–1872, Nov. 2001.
[6] A. Logothetis and V. Krishnamurthy, "Expectation maximization algorithms for MAP estimation of jump Markov linear systems," IEEE Trans. Signal Proc., vol. 47, pp. 2139–2156, Aug. 1999.
[7] E. Baccarelli and R. Cusani, "Combined channel estimation and data detection using soft statistics for frequency-selective fast-fading digital links," IEEE Trans. Commun., vol. 46, pp. 424–427, Apr. 1998.
[8] A. Anastasopoulos and K. M. Chugg, "Adaptive soft-input soft-output algorithms for iterative detection with parametric uncertainty," IEEE Trans. Commun., vol. 48, pp. 1638–1649, Oct. 2000.
[9] L. M. Davis, I. B. Collings, and P. Hoeher, "Joint MAP equalization and channel estimation for frequency-selective and frequency-flat fast-fading channels," IEEE Trans. Commun., vol. 49, pp. 2106–2114, Dec. 2001.
[10] B. D. O. Anderson and J. B. Moore, Optimal filtering. Englewood Cliffs, NJ: Prentice Hall, 1979.
[11] M. E. Rollins and S. J. Simmons, "Simplified per-survivor Kalman processing in fast frequency-selective channels," IEEE Trans. Commun., vol. 45, pp. 544–553, May 1997.
[12] M. Feder and E. Weinstein, "Parameter estimation of superimposed signals using the EM algorithm," IEEE Trans. Signal Proc., vol. 36, pp. 477–489, Apr. 1988.
[13] A. O. Berthet, B. S. Ünal, and R. Visoz, "Iterative decoding of convolutionally encoded signals over multipath Rayleigh fading channels," IEEE J. Select. Areas Commun., vol. 19, pp. 1729–1743, Sept. 2001.