Kernel-Based Nonlinear Beamforming Construction Using ... - CiteSeerX

5 downloads 0 Views 117KB Size Report
Abstract— This letter shows that the wireless communication system capacity is greatly enhanced by ... classification, Fisher ratio for class separability measure.
IEEE SIGNAL PROCESSING LETTERS, VOL. , NO. , 2003

1

Kernel-Based Nonlinear Beamforming Construction Using Orthogonal Forward Selection with Fisher Ratio Class Separability Measure S. Chen, L. Hanzo and A. Wolfgang Abstract— This letter shows that the wireless communication system capacity is greatly enhanced by employing nonlinear beamforming and the optimal Bayesian beamformer outperforms the standard linear beamformer significantly in terms of a reduced bit error rate, at a cost of increased complexity. Block-data adaptive implementation of the Bayesian beamformer is realized based on an orthogonal forward selection procedure with Fisher ratio for class separability measure. Keywords—Nonlinear beamforming, orthogonal least squares, Bayesian classification, Fisher ratio for class separability measure

I. I NTRODUCTION Spatial processing with adaptive antenna array has shown real promise for substantial capacity enhancement in mobile communication [1]–[5]. Adaptive beamforming can separate signals transmitted on the same carrier frequency, provided that they are separated in the spatial domain. The beamforming processing is classically done by forming a linear combination of the signals received from the different elements of an antenna array. We refer to this classical beamforming as linear beamforming. Recent work [6] has investigated a linear beamforming technique based directly on minimizing the system bit error rate (BER) and developed an adaptive algorithm for realizing the linear minimum BER (LMBER) beamforming. The results in [6] have demonstrated that the LMBER beamforming provides considerable performance gains in terms of a reduced BER over the usual linear minimum mean square error (LMMSE) beamforming. The spatial separation in angles of arrival between the desired signal and the closest interfering signal determines the system performance and hence capacity. When this separation is below certain threshold, linear beamforming ultimately fails because the system becomes linear inseparable, a situation that is similar to the single-user channel equalization [7],[8]. For the sake of notational simplicity and for highlighting the basic concepts, we assume that the modulation scheme is binary phase shift keying (BPSK), the channel is non-dispersive with additive white Gaussian noise, and narrow-band beamforming is considered. We derive the optimal solution for nonlinear beamforming, which we refer to as the Bayesian beamforming solution. Blockdata kernel-based adaptive beamformer is proposed to realize the optimal Bayesian beamformer solution using an orthogonal forward selection (OFS) procedure with Fisher ratio for class separability measure [9]. The proposed nonlinear beamformer construction algorithm is compared with the state-of-art sparse kernel modeling based on the relevance vector machine (RVM) for classification [10],[8]. Manuscript received May 1, 2003; revised July 31, 2003. The authors are with School of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, U.K.

II. S YSTEM M ODEL users (sources), It is assumed that the system consists of and each user transmits a BPSK signal on the same carrier frequency . The baseband signal of user is given by

 

     "!$#&%$'()%+* *  (1) where the complex-valued  is the channel coefficient for user

multiplying by the transmitted signal amplitude of user (therefore ,  , - denotes user received signal power) and .  is the  -th bit of user . Without the loss of generality, source 1 is

assumed to be the desired user and the rest of the sources are interfering users. The linear antenna array is considered which consists of uniformly spaced elements, and signals at the element antenna array are given by

/

/

4 0213  5 ;:=?A@ )B 13DCE FHGJIK13 M02L 1GJIK13 (2) 7698 for %N*POQ* / , where B 1 RC  is the relative time delay at element O for source , C is the direction of arrival for source , and IS1 is white Gaussian noise with zero mean and TVU a, I complex-valued 1  , -WXZYK[- . The desired signal to noise ratio is defined as SNR \,  8 , -E]Z$YK[ - , and the desired signal to interferer ratio is given by SIR ^,  8 , -_],  , - for `a .bcb.b . In vector form, the array input can be written as

d)  U 0 8 ebcb.b0gfh Wjik d)L  Glm onNp  Gqlr (3) TsU where lrlQt& Wuv$YK[> - w f with w f denoting the /Xxq/ identity matrix, the system matrix is defined by

n  U  8z8  - z - bcb.b  4 z 4 W  y (4) U with the steering vector for source being zE  :=S7@ )B 8ZRC  :U = 7@ )B - RC {bcb.b:= A@ )B fRC  W i , and the bit vector p    8 9 - {b.bcb 4  W i . Traditionally, a linear beamformer is used, whose output is given by |  t d) o} (5) where } is the complex-valued beamformer weight vector. The decision for the transmitted bit  8  is made according to |Z€ ‚ .~ 8$ M G&„ %Z%Z |Z€  ƒ (6) *ƒ‚ |…€ U |  W denotes the real part of |  . The claswhere  ‡† sical LMMSE beamforming solution is given by }sˆ‰ˆ‰Š‹Œ

2

IEEE SIGNAL PROCESSING LETTERS, VOL. , NO. , 2003

8   n+n tƒG $YK[- w f 8 , with 8

n

being the first column of . Recently we have developed the LMBER beamforming solution [6]. For the linear beamformer to work adequately, the system must be linearly separable in the noise-free case. When the minimum spatial separation in angles of arrival between the desired user and interfering users is below certain threshold, the system inevitably becomes linearly inseparable. In such a situation, the linear bermformer exhibits a high irreducible BER floor, and a nonlinear processing has to be adopted. III. BAYESIAN B EAMFORMING S OLUTION

d)

Given the observation vector , the optimal solution to the beamforming problem is the maximum a posteriori probability solution, which we derive as follows. Denote the     . Further denote possible sequences of as , the first element of , corresponding to the desired user, as

 . Obviously only takes values from the signal state set

p  p %u* e*

d)L  p defined as:   ! d L & nNp  % * *  8

v 4

-

where the kernel variance G is related to the noise variance . The RVM method [10],[8] can be applied to construct a sparse beamformer of IHKJML terms from (12). A drawback of the RVM method is its high computational complexity. The algorithm contains two loops, with the inner loop for updating the kernel weights and the outer loop for the associated hyperparameters. Both loops involves expensive nonlinear optimization. Furthermore, the RVM method starts with the full model set and removes those kernel terms that have large values in their associated hyperparameters. Because the Hessian matrix associated with the full model set is typically ill-conditioned and may even be non invertible, the RVM method is inherently ill-conditioned and its iterative procedure generally converges with slow rate and may suffer from numerical instability. An alternative way of constructing a sparse kernel model from the full model (12) is the OFS procedure based on Fisher ratio class separability measure [9], which is computationally attractive and numerically robust. Define the modeling residual as N

O . Then the kernel model (12) over the training data set can be collected together as

YK[-

    „ |    8  „ | Rd)  _ E 8      can be divided into two subsets conditioned on : P (14) RQTS GVU   ! d L    c%+* &*     8   #&%Z' (7) P U U   _ % {  c b . b b   .   8   _ % {  c b . b H b .   8  Wi where    !]Z . The posterior probabilities or decision vari- where the target vector  U W O b.b.b W O  W i  8 # the regression matrix QX W with ables for  8   #&% given d) are W U U 3  _ % K  . b . b b   R  ) d   _ %     ) d  Kb.b.b C Rd)  d)  W i  g C 2 C C #%$'& (   Wi  ) d      d „ 5 )         

L "    :+= * -„ , U 8Qbcb.b # (15) 6 Q8  SY [-  f $Y [- , . (8) for %q* *  , the kernel weight vector S  B Wi , U  U N 3%E{b.bcb N    W i . Let anB orthogonal and the residual vector 8 ( decomposition of the regression matrix Q be QX!XZY , where where   #/$0& are a priori probabilities of d L  and , d , -  dSt d . The optimal decision is given by %`_h8  - b.bcb _8  # acbbb ‚ % . . . ... b  " 32 5 46"     c~ 8   G&„ %Z%Z1 (16) Y \  ] [ (9)  otherwise 7 ]] .. . . . . . . _ # 8 # d  . % ]^ ‚ bcb.b ‚ Let we redefine a single decision variable as # & and 9| 8 f 8  8 f 8  bcb.bgf 8  # a bb   5 :==< „ , d) „ d L , (10) f  8 f  - bcb.g b f - # b ZY [- >

69;8 : U e 8 e bcb.b e # X  W{ []] ... ... ... ... d (17) f where

 sgn   8  ]    SY [ -   . Then the optimal deci]^ f #  h 8 f #  - bcb.i b f # # : sion (9) is equivalent to e e 9| 8 with orthogonal columns that satisfy i  ‚ if

 j . The ‚ .~ 8   G&„ %…%… | 8 ??4P (11) kernel model (14) can alternatively be expressed as @P‚ 7 P G U  k Xl m (18) IV. B LOCK - DATA K ERNEL -BASED N ONLINEAR U n 89bcb.b n # where l  W i satisfies the triangular system YoSPk  l. B EAMFORMER C ONSTRUCTION # A I  K H p J L A sparse -term model can be selected by incrementally Given a block of  training data !cd)_8H' 698 , consider maximizing a class separability measure in an OFS procedure the nonlinear beamformer of the form [9]. Define the two class sets q  !cd)  O   #&%$' ,  # | Rd  5 D1 C21 Rd  and let the numbers of points in q  be   , respectively, with G     1 6Q8 B (12)    . The means and variances of training samples e 2 belonging to classes q in the direction of basis 1 are given by  # % where 1 are the real-valued weights and C 1Dd   C Rdhd)DO are 5  O   „ %Es f$  1 1  B chosen kernel basis functions. In our application, C FE;E$ can be 2 8r  2 A6Q chosen as the Gaussian kernel function of the form # % 5    „ %EKKf$ 1 „ 1D 1 - (19) C Rdhd)DO  := <  Y  „ , d „  G d)DO , - >  (13) 2  8r O

2  2 A6Q '

  . The state set 

,

CHEN ET AL.

3

% 5 #    G%_ C 1  8 

  1    #      7 698 r O

% 5  1   O   G%_ f$  1 #   7698 r % 5

 O   G%_  C  1  8  „   1     1   # Y  %   7698 r Y -  1    5 7698 r  O   G%_KDf$  1 „   1D -  (20) and



    1 „   1   1  respectively, where R0{  % for 0  ‚ and D0g  ‚ for 0  j ‚ . 2  1 - G 1 -7 r r Fisher ratio, defined as the ratio of the interclass difference and e 1 cY     _Y     the intraclass spread, in the direction of is given by [11]: 2 Let the index set  be:   !_OQ* *  and passes Test ' .  1 R 2  1 „   1D   1    

(21)  Y - 1 G Y - 1 7 Step 2. Find: 1   " ! -

desired source 1

interferer source 3 o

interferer source 4

45

o

30

interferer source 2

θ θ < 30 o

λ/2 Fig. 1. Locations of the desired and interfering sources with respect to the twospacing, where is the wavelength. element linear antenna array having

/*021

/

4

IEEE SIGNAL PROCESSING LETTERS, VOL. , NO. , 2003

0

0

-1 log10(Bit Error Rate)

log10(Bit Error Rate)

-1

RVM OFS Bayesian

-2 -3 -4 LMMSE LMBER Bayesian

-5

-2

-3

-4

-6

-5 0

2

4

6

8

10

12

14

16

0

2

4

SNR (dB)

(a) 0

-1

10

RVM OFS Bayesian

-1 log10(Bit Error Rate)

log10(Bit Error Rate)

8

(a)

0

-2 -3 -4 LMMSE LMBER Bayesian

-5

-2

-3

-4

-6

-5 0



6

SNR (dB)

2

4

6

8

10

12

14

16

0

4

6

8

10

SNR (dB)

(b)

(b)

 

Fig. 2. Comparison of the bit error rates of three theoretical beamformers: (a) and (b) .

ZY [-

-

%_‚ Y [-

values for G were found to be in the range of to , depending on the SNR. The numbers of RBF centers or kernel terms identified by the two algorithms over the given SNR val# to 20 with the typical ues were similar, ranging from  HKJML 

value of IHKJML . The BERs of the RVM and OFS beamformers are compared in Fig. 3. It can be seen that both kernelbased beamformers have similarly good performance with similar model sparsity. The OFS algorithm based on Fisher ratio however has considerably computational and numerical advantages during the construction process.

 %

2

SNR (dB)

 %

VI. C ONCLUSIONS The optimal nonlinear beamforming assisted receiver has been derived and it has been shown that this optimal Bayesian beamformer outperforms the linear beamformer significantly in terms of a reduced bit error rate. This demonstrates the potential of system capacity enhancement by employing nonlinear beamforming. Block-data kernel-based adaptive implementation of the optimal Bayesian beamformer is investigated using the OFS algorithm based on Fisher ratio for class separability measure. Empirical results have demonstrated that this construction algorithm has excellent performance similar to that of the RVM algorithm, but it is computationally much simpler and numerically much more robust.

12

14

16

Fig. 3. Performance comparison of the Bayesian beamformer with the RBF beamformers constructed by the RVM algorithm and the OFS with Fisher ratio, respectively: (a) and (b) .

 

  

R EFERENCES [1]

J.H. Winters, J. Salz and R.D. Gitlin, “The impact of antenna diversity on the capacity of wireless communication systems,” IEEE Trans. Communications, Vol.42, No.2, pp.1740–1751, 1994. [2] J. Litva and T. K.Y. Lo, Digital Beamforming in Wireless Communications. London: Artech House, 1996. [3] L. C. Godara, “Applications of antenna arrays to mobile communications, Part I: Performance improvement, feasibility, and system considerations,” Proc. IEEE, Vol.85, No.7, pp.1031–1060, 1997. [4] J.H. Winters, “Smart antennas for wireless systems,” IEEE Personal Communications, Vol.5, No.1, pp.23–27, 1998. [5] J.S. Blogh and L. Hanzo, Third Generation Systems and Intelligent Wireless Networking – Smart Antenna and Adaptive Modulation. Chichester: John Wiley, 2002. [6] S. Chen, L. Hanzo and N.N. Ahmad, “Adaptive minimum bit error rate beamforming assisted receiver for wireless communications,” in Proc. ICASSP 2003 (Hong Kong, China), April 6-10, 2003, pp.640–643. [7] S. Chen, B. Mulgrew and P.M. Grant, “A clustering technique for digital communications channel equalisation using radial basis function networks,” IEEE Trans. Neural Networks, Vol.4, No.4, pp.570–579, 1993. [8] S. Chen, S.R. Gunn and C.J. Harris, “The relevance vector machine technique for channel equalization application,” IEEE Trans. Neural Networks, Vol.12, No.6, pp.1529–1532, 2001. [9] K.Z. Mao, “RBF neural network center selection based on Fisher ratio class separability measure,” IEEE Trans. Neural Networks, Vol.13, No.5, pp.1211–1217, 2002. [10] M.E. Tipping, “Sparse Bayesian learning and the relevance vector machine,” J. Machine Learning Research, Vol.1, pp.211–244, 2001. [11] R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973. [12] S. Chen, S.A. Billings and W. Luo, “Orthogonal least squares methods and their application to non-linear system identification,” Int. J. Control, Vol.50, No.5, pp.1873–1896, 1989.

Suggest Documents