Subspace Tracking for Mobile Communications

Rolf Weber¹

Institute for Network Theory and Circuit Design, Munich University of Technology, D-80290 Munich, Germany

Abstract

In recent years, space-division multiple access (SDMA) has been suggested as a way to increase the capacity of a cellular mobile communication system and to simultaneously combat the losses due to multipath and interference. This is achieved by using the spatial diversity introduced by an antenna array; SDMA can therefore easily be implemented as an additional component of existing systems that use time- and/or frequency-division multiple access. An important characteristic of an SDMA system is the direction of arrival of the different wavefronts impinging on the antenna array, which can be obtained from well-known high-resolution methods that exploit knowledge of the signal or noise subspace. As the scenario is time-varying, the subspaces also change with time and therefore have to be tracked in order to avoid recomputing them from scratch at each time step. Subspace tracking has been the topic of many research papers in recent years, but all of them deal only with snapshot vectors. On the other hand, many existing mobile communication systems have a TDMA component, so that a whole burst of data is transmitted during one periodically recurring time slot. Unfortunately, the existing algorithms for subspace tracking do not work with burst-wise data. We therefore introduce three algorithms that are based on known tracking algorithms for snapshot vectors and extend them to work with bursty data. Another important problem is the estimation of the current number of wavefronts impinging on the antenna array, because only exact knowledge of that number ensures correct estimation of the directions of arrival. In a mobile communication system, however, this number may change rapidly because of the varying number of reflections of the signal. We therefore propose an efficient algorithm for detecting changes in the number of signals that again works with burst-wise data. Simulation results demonstrate the good performance of all proposed algorithms in a time-varying mobile communication scenario.

¹ Email: [email protected]
1 Introduction

Recently, space-division multiple access (SDMA) has received much attention as a method of reducing losses due to multipath and interference and thus increasing the capacity of a cellular mobile communication system. This is achieved by using the spatial diversity introduced by an antenna array in addition to time- and/or frequency-division multiple access (TDMA/FDMA). For example, different users transmit in the same frequency band and in the same time slot and are separated only by the different directions of arrival (DOA's) of their signals at the base station. The prerequisite, however, is that these users are spatially well separated at the base station. Therefore, exact knowledge of the DOA's of the mobile users is crucial to an SDMA system; obtaining it, unfortunately, is a heavy computational burden, and thus the need for efficient computational methods arises.

We now introduce the data model used. Assume that there are $r$ narrow-band signals with the same center frequency $f_0$, characterized by their complex envelopes [Pro95] $s_i(t)$, $1 \le i \le r$, impinging on an antenna array of $n \ge r$ identical sensors under the directions of arrival $\theta_i$. The narrow-band assumption implies [RK89]

$$s_i(t - \tau) \approx s_i(t)\, e^{-j 2\pi f_0 \tau}, \qquad 1 \le i \le r,$$

for all possible propagation delays $\tau$. Let $x(k) \in \mathbb{C}^n$ be the data vector observed at the $k$-th snapshot. The previously stated assumptions then lead to the model for the sensor array narrow-band signal processing problem

$$x(k) = \sum_{i=1}^{r} a(\theta_i)\, s_i(k) + n(k) = A(\Theta)\, s(k) + n(k) \qquad (1)$$

with the array steering matrix $A(\Theta) = [a(\theta_1) \cdots a(\theta_r)] \in \mathbb{C}^{n \times r}$, depending on the vector $\Theta$ of the directions of arrival, and the complex-valued measurement noise vector $n(k) \in \mathbb{C}^n$, which is assumed to be spatially white with equal variance $\sigma^2$ and uncorrelated with the signal vector $s(k)$. This yields the following expression for the correlation matrix:

$$C = E\{x(k)\, x^H(k)\} = A(\Theta)\, C_S\, A^H(\Theta) + \sigma^2 I \qquad (2)$$

where $E\{\cdot\}$ denotes expectation, $(\cdot)^H$ denotes Hermitian (conjugate) transposition and $I$ is the identity matrix. If we assume that the signals are not completely correlated, it can be shown that the signal correlation matrix $C_S = E\{s(k)\, s^H(k)\} \in \mathbb{C}^{r \times r}$ has full rank $r$ [Pil89] and is thus positive definite. As the array steering matrix is assumed to have full column rank, it follows that $A(\Theta)\, C_S\, A^H(\Theta) \in \mathbb{C}^{n \times n}$ has exactly $r$ positive eigenvalues [Str88]. Therefore, the eigendecomposition of $C$ is

$$C = U \Lambda U^H$$

with the diagonal matrix $\Lambda = \mathrm{diag}(\lambda_1 \cdots \lambda_n)$ and the unitary matrix $U = [u_1 \cdots u_n]$. The $n$ eigenvalues $\lambda_i$ can be ordered as follows:

$$\lambda_1 \ge \cdots \ge \lambda_r > \lambda_{r+1} = \cdots = \lambda_n = \sigma^2.$$

The $r$ largest eigenvalues are termed the signal eigenvalues and the remaining $(n - r)$ ones are the noise eigenvalues. Accordingly,

$$U_S = [u_1 \cdots u_r] \qquad \text{and} \qquad U_N = [u_{r+1} \cdots u_n]$$

span the signal and the noise subspace, respectively. This terminology becomes obvious when we set $\sigma^2 = 0$ in equation (2), resulting in the eigendecomposition

$$C = A(\Theta)\, C_S\, A^H(\Theta) = U_S \Lambda_S U_S^H$$

and therefore

$$\mathcal{R}(A(\Theta)) = \mathcal{R}(U_S),$$

where $\mathcal{R}(\cdot)$ is the space spanned by the columns of the matrix in brackets. The noise subspace is then the orthogonal complement of the signal subspace. A second approach to calculating the two subspaces uses a data matrix $X = [x(1) \cdots x(m)]$, consisting of $m$ snapshots $x(k)$, instead of the correlation matrix $C$. Computing the singular value decomposition of this matrix again provides the two subspaces, and this approach may be numerically more stable [Got95].

This splitting of a larger space into two smaller subspaces is crucial to the subspace-based high-resolution methods for direction-of-arrival finding such as MUSIC [Sch82], ESPRIT [RK89] and Weighted Subspace Fitting [VOK91]. On the other hand, it is a very costly numerical task, especially if the subspaces change with time and therefore have to be computed over and over again in order to use the mentioned high-resolution methods to obtain the DOA's. This shows the necessity of obtaining and tracking the subspaces efficiently, which is the basic topic of this work.

Usually the problem of tracking subspaces is stated for the model given in (1). But in many mobile communication systems, e.g. the GSM system, there is no continuous data flow; instead, the mobile user periodically transmits a whole data burst during one time slot. In the GSM system there are eight periodically recurring time slots of 576.9 µs building a TDMA frame of 4.616 ms. In each time slot a burst of $m = 156$ symbols is transmitted². We therefore assume that $r$ data bursts, consisting of $m$ symbols each, arrive simultaneously at the antenna array, resulting in the matrix model:

$$X(p) = A(\Theta)\, S(p) + N(p) \qquad (3)$$

with the $(n \times m)$ data matrix $X$, the $(r \times m)$ signal matrix $S$ and the $(n \times m)$ noise matrix $N$, all complex. To distinguish the matrix model in (3) from the vector model (1), a different time parameter $p$ is introduced that corresponds to the TDMA frame. Because of the different time indices $p$ and $k$, the known subspace tracking algorithms for snapshot vectors cannot be applied sequentially to data bursts, which emphasizes the need for tracking algorithms that are able to work with burst-wise data. As the time between consecutive bursts of one user is usually small, e.g. 4.616 ms in the GSM system, the directions of arrival of the mobile users do not change significantly during that time, so that an update of the subspace of interest from burst to burst is sufficient. In the following we derive efficient algorithms for the subspace tracking problem, taking the burst-wise data flow into account.
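As an illustration of the burst model (3), the following minimal NumPy sketch (not part of the original report; the dimensions are hypothetical and a random matrix stands in for the steering matrix $A(\Theta)$) generates one burst and recovers an orthonormal basis of the signal subspace from the SVD of the data matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, m = 9, 2, 156                 # sensors, signals, burst length

A = rng.standard_normal((n, r)) + 1j * rng.standard_normal((n, r))  # stand-in for A(Theta)
S = (rng.standard_normal((r, m)) + 1j * rng.standard_normal((r, m))) / np.sqrt(2)
N = 0.1 * (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m)))

X = A @ S + N                       # one data burst, model (3)

# The r dominant left singular vectors estimate the signal subspace.
U, _, _ = np.linalg.svd(X)
U_S = U[:, :r]

# Check: U_S spans (approximately) the column space of A.
residual = np.linalg.norm(U_S @ U_S.conj().T @ A - A) / np.linalg.norm(A)
print(residual)                     # small for high SNR
```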

² Actually, 156.25 symbols per burst are transmitted.


2 Invariant Subspace Updating (ISU)

2.1 The Algebraic Riccati Equation

A subspace is called invariant under a linear transformation if the image of that subspace under the transformation is contained within the original subspace [GLR86]. An example of an invariant subspace is the space spanned by a set of eigenvectors of a matrix. Let $C \in \mathbb{C}^{n \times n}$ and let $X_1$ be an orthonormal basis for an invariant subspace of $C$. Then

$$\mathcal{R}(C X_1) \subseteq \mathcal{R}(X_1), \qquad (4)$$

where $\mathcal{R}(C)$ is the range of the matrix $C$. Now let the unitary matrix $[X_1\; X_2]$ be a basis for the space spanned by the columns of $C$ and consider

$$[X_1\; X_2]^H\, C\, [X_1\; X_2] = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}. \qquad (5)$$

Since $C_{21} = X_2^H C X_1$ and $X_2^H X_1 = 0$, it follows from equation (4) that $C_{21} = 0$. Now consider a matrix $C'$, close to $C$, and its orthonormal basis $[X_1'\; X_2']$, where $X_1'$ is a basis for the invariant subspace that corresponds to $X_1$. Define a unitary matrix $U$ that relates $[X_1'\; X_2']$ to $[X_1\; X_2]$ by

$$[X_1'\; X_2'] = [X_1\; X_2]\, U. \qquad (6)$$

Following the suggestion in [Ste73], $U$ can be factored in the form

$$U = \begin{bmatrix} I & -P^H \\ P & I \end{bmatrix} \begin{bmatrix} (I + P^H P)^{-1/2} & 0 \\ 0 & (I + P P^H)^{-1/2} \end{bmatrix} \qquad (7)$$

for $P$ chosen appropriately.

for P chosen appropriately. The terms on the main diagonal of the second factor are the inverses of the Hermitian positive de nite square roots of the positive de nite matrices (I + PH P) and (I + PPH ). As the square roots are supposed to be Hermitian they can be made unique [SK94]. Now, build the partitioned matrix equivalent to the one in (5) from the new perturbed matrix C0 and the bases of the original invariant subspace and its orthogonal complement, X1 and X2, respectively: "

#

C011 C012 : [X1 X2]H C0 [X1 X2 ] = C021 C022

(8)


In order for $X_1'$ to be a basis for an invariant subspace of $C'$,

$$X_2'^H\, C'\, X_1' = 0 \qquad (9)$$

must be satisfied. From equations (6) and (7),

$$X_1' = (X_1 + X_2 P)\, (I + P^H P)^{-1/2} \qquad (10)$$

and

$$X_2' = (-X_1 P^H + X_2)\, (I + P P^H)^{-1/2} \qquad (11)$$

follow, and substituting these two equations into (9) yields

$$\begin{aligned} 0 &= X_2'^H\, C'\, X_1' \\ &= \left[(-X_1 P^H + X_2)(I + P P^H)^{-1/2}\right]^H C'\, (X_1 + X_2 P)\, (I + P^H P)^{-1/2} \\ &= (-X_1 P^H + X_2)^H\, C'\, (X_1 + X_2 P) \\ &= -P\, X_1^H C' X_1 - P\, X_1^H C' X_2\, P + X_2^H C' X_1 + X_2^H C' X_2\, P, \end{aligned} \qquad (12)$$

where from the second to the third line the square-root factors could be cancelled because they are positive definite. Equation (12) simplifies to

$$P\, C_{11}' - C_{22}'\, P = C_{21}' - P\, C_{12}'\, P. \qquad (13)$$

This equation is known as the algebraic Riccati equation [MV96]. Any $P$ that satisfies this equation can be used to obtain $X_1'$ and $X_2'$ in terms of $X_1$ and $X_2$ via equations (10) and (11). In the preceding derivation it would be sufficient for $X_1$ to be an arbitrary basis of an invariant subspace and $X_2$ to be any basis of an orthogonal complement. But for $X_1'$ and $X_2'$ to be orthogonal again, we must have

$$0 = X_2'^H X_1' = -P\, X_1^H X_1 + X_2^H X_2\, P,$$

which is satisfied if $X_1$ and $X_2$ consist of orthonormal columns.


2.2 Solution of the Algebraic Riccati Equation

In [IK66] the splitting method is suggested for the solution of a linear system of equations $\mathcal{A}p = b$. The linear operator $\mathcal{A}$ is split as $\mathcal{A} = \mathcal{M} - \mathcal{N}$ and the solution $p$ is computed iteratively by

$$\mathcal{M} p_{i+1} = \mathcal{N} p_i + b, \qquad i \ge 1,$$

with $p_1 = 0$. MacInnes and Vaccaro [Mac95, MV96] suggest a similar splitting of a general linear operator in order to solve the algebraic Riccati equation (13) efficiently. To this end, the operators

$$F(P) := P\, C_{11}' - C_{22}'\, P, \qquad \Phi(P) := P\, C_{12}'\, P, \qquad g := C_{21}' \qquad (14)$$

are introduced, so that (13) can be written as

$$F(P) = g - \Phi(P). \qquad (15)$$

Now the linear operator $F(P)$ is split into

$$F(P) = (\mathcal{M} - \mathcal{N})(P)$$

with

$$\mathcal{M}(P) := P\, C_{11}' \qquad \text{and} \qquad \mathcal{N}(P) := C_{22}'\, P.$$

This yields the iterative scheme

$$P_{i+1}\, C_{11}' = C_{22}'\, P_i + C_{21}' - P_i\, C_{12}'\, P_i \qquad (16)$$

for the calculation of the matrix $P$. There are two reasons for choosing the above splitting:

- $C_{11}'$ will usually be a small square matrix, ensuring an efficient solution of (16).
- In the following application, $C_{22}'$ will correspond to the noise subspace and is therefore expected to have small norm, so that $F(P) \approx P\, C_{11}'$.
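To make the splitting concrete, here is a minimal NumPy sketch of the iteration (16); it assumes the four blocks of the partitioned matrix (8) are available as arrays and is an illustration added here, not the implementation of [MV96]:

```python
import numpy as np

def riccati_splitting(C11, C12, C21, C22, iters=3):
    """Solve P C11 - C22 P = C21 - P C12 P (eq. 13) by the iteration (16)."""
    P = np.zeros((C21.shape[0], C11.shape[0]), dtype=complex)
    for _ in range(iters):
        rhs = C22 @ P + C21 - P @ C12 @ P   # right-hand side of (16)
        # P_{i+1} C11 = rhs  <=>  C11^T P_{i+1}^T = rhs^T
        P = np.linalg.solve(C11.T, rhs.T).T
    return P
```

Each step only requires the solution of a small $r \times r$ system, which is the first of the two reasons given above.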


It remains to establish under which conditions the iteration (16) converges. To answer this question, the following theorem is used:

Theorem 2.1 ([Mac95]) Let $T$ be a linear operator on a Banach space $B$ with a bounded inverse on $B$ and consider the equation

$$T(P_{i+1}) = g - \Phi(P_i). \qquad (17)$$

Let $\Phi(P)$ satisfy

$$\|\Phi(P)\| \le \alpha \|P\| + \beta \|P\|^2, \qquad \alpha, \beta \in \mathbb{R}^+,$$

and

$$\|\Phi(P) - \Phi(Q)\| \le \left[\alpha + 2\beta \max\{\|P\|, \|Q\|\}\right] \|P - Q\|.$$

Let $\sigma = \|T^{-1}\|^{-1}$ and $\gamma = \|g\|$. Then if

$$\sigma - \alpha > 2\sqrt{\beta\gamma}$$

and $P_0 = 0$, the sequence $P_1, P_2, P_3, \ldots$ generated by (17) converges to the unique solution $P$ of (17) that satisfies

$$\|P\| \le \frac{\gamma}{\sigma}(1 + \delta) < \frac{2\gamma}{\sigma - \alpha},$$

where $\delta$ is the smallest root of the equation

$$\delta = \frac{\alpha}{\sigma}(1 + \delta) + \frac{\beta\gamma}{\sigma^2}(1 + \delta)^2.$$

Moreover,

$$\|P - P_i\| \le \frac{\|P_{i+1} - P_i\|}{1 - \theta} \le \frac{\theta^{i-k}}{1 - \theta}\, \|P_{k+1} - P_k\|, \qquad k \le i, \qquad (18)$$

where

$$\theta = \frac{\alpha + \sqrt{4\beta\gamma}}{\sigma} < 1.$$

Proof: See [Mac95]. □

Consider equation (16) and set

$$T(P) := P\, C_{11}' \qquad \text{and} \qquad \Phi(P) := P\, C_{12}'\, P - C_{22}'\, P.$$

It is then straightforward to show [MV96] that

$$\|\Phi(P)\| \le \|C_{22}'\|\, \|P\| + \|C_{12}'\|\, \|P\|^2$$

and

$$\|\Phi(P) - \Phi(Q)\| \le \left(\|C_{22}'\| + 2\|C_{12}'\| \max\{\|P\|, \|Q\|\}\right) \|P - Q\|,$$

so that the conditions on $\|\Phi(P)\|$ in the theorem above are satisfied with $\alpha = \|C_{22}'\|$ and $\beta = \|C_{12}'\|$. Assuming there exists a solution $P$ to equation (17) and the iteration is started with $P_0 = 0$, then $P_1 = C_{21}'\, C_{11}'^{-1}$ and it follows from equation (18) that

$$\|P\| \le \frac{\|C_{21}'\, C_{11}'^{-1}\|}{1 - \theta};$$

thus the solution is bounded.

2.3 Application to DOA Estimation

In this section we demonstrate how the theory presented above can be used to solve the DOA tracking problem. The expression

$$C(p) = \sum_{t=1}^{p} \beta^{p-t}\, X(t)\, X^H(t) = \beta\, C(p-1) + X(p)\, X^H(p)$$

with the forgetting factor $0 < \beta \le 1$ is a possible estimate of the correlation matrix at time $p$.
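In code, this estimate is a one-line update per burst; a small sketch (the function name is ours):

```python
import numpy as np

def update_corr(C_prev, X, beta=0.97):
    """Exponentially weighted estimate C(p) = beta * C(p-1) + X(p) X(p)^H."""
    return beta * C_prev + X @ X.conj().T
```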


Let $C$ in equation (5) be $C(p-1)$, and let $X_1 \in \mathbb{C}^{n \times r}$ be an orthonormal eigenbasis for the signal subspace and $X_2 \in \mathbb{C}^{n \times (n-r)}$ the corresponding orthonormal eigenbasis for the noise subspace at time instant $p-1$. $C'$ is then $C(p)$, and we are looking for the new basis of the subspace of interest at time $p$. In order to update the bases, the algebraic Riccati equation (13) has to be solved. As a first step, $C' = C(p)$ is partitioned as stated in equation (8) using the known matrices $X_1$ and $X_2$. Then the matrix $P$ can be computed iteratively from equation (16) with the initialization $P_0 = 0$ and $P_1 = C_{21}'\, C_{11}'^{-1}$. In [Mac95] it is shown that three iteration steps are usually enough. Knowing $P$, the matrices $X_1'$ and $X_2'$ can be obtained from equations (10) and (11), respectively. These two matrices, however, will in general not have orthonormal columns, and therefore a QR factorization of $X_1'$ is performed, yielding the new orthonormal bases³ $X_1$ and $X_2$ at time $p$ [GL89].

³ $X_1$ consists of the first $r$ columns of $Q \in \mathbb{C}^{n \times n}$ and $X_2$ of the remaining $n - r$ ones.
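Assembling the steps of this section, one ISU update might look as follows; this is a sketch under the stated assumptions (in particular, the square-root factors of (10) are omitted because they only rescale columns, which the final QR step undoes anyway), not the authors' original implementation:

```python
import numpy as np

def isu_update(C_p, X1, X2, iters=3):
    """One ISU step (Sec. 2.3): update the signal basis X1 and noise basis X2
    for the new correlation estimate C_p = C(p)."""
    # Partition C(p) as in eq. (8) using the old bases.
    C11 = X1.conj().T @ C_p @ X1
    C12 = X1.conj().T @ C_p @ X2
    C21 = X2.conj().T @ C_p @ X1
    C22 = X2.conj().T @ C_p @ X2
    # Riccati iteration (16), started from P0 = 0 (so P1 = C21 C11^{-1}).
    P = np.zeros_like(C21)
    for _ in range(iters):
        rhs = C22 @ P + C21 - P @ C12 @ P
        P = np.linalg.solve(C11.T, rhs.T).T
    # Eq. (10) up to column scaling; the complete QR re-orthonormalizes and
    # delivers the new noise basis as the trailing n - r columns of Q.
    Q, _ = np.linalg.qr(X1 + X2 @ P, mode='complete')
    r = X1.shape[1]
    return Q[:, :r], Q[:, r:]
```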


3 Projection Approximation Subspace Tracking

In [Yan95a] Yang introduced a scalar cost function $J(W)$ of a matrix $W \in \mathbb{C}^{n \times r}$ ($r < n$) and proved that finding the global minimum of this unconstrained cost function also yields the signal subspace. In a further work [Yan95b], Yang extended these ideas to both rank and subspace tracking. Unfortunately, both the derivation of the theoretical results and the resulting computationally efficient tracking algorithms depend heavily on the rank-one update of the sample correlation matrix and are therefore not directly applicable in connection with a matrix update. However, the known cost function can be generalized, and it will be shown that optimization of the modified cost function again provides the signal subspace. Subsequently, we introduce efficient algorithms for the optimization of the modified cost function.

3.1 Subspace Identification by Optimization

The Frobenius norm of an $n \times m$ matrix $M = [m_1\, m_2 \cdots m_m]$, $m_j \in \mathbb{C}^n$, $1 \le j \le m$, is defined as follows [GL89]:

$$\|M\|_F^2 := \sum_{i=1}^{n} \sum_{j=1}^{m} |m_{ij}|^2 = \sum_{j=1}^{m} \|m_j\|_2^2.$$

Consider now the cost function

$$J'(W) = E\left\{\|X - W W^H X\|_F^2\right\} \qquad (19)$$

with the matrix argument $W \in \mathbb{C}^{n \times r}$ and the data matrix $X = [x_1\, x_2 \cdots x_m]$, $x_j \in \mathbb{C}^n$, $1 \le j \le m$. Without loss of generality, we assume $W$ to have full rank $r$. Equation (19) can be written in the form

$$J'(W) = E\left\{\sum_{j=1}^{m} \|x_j - W W^H x_j\|_2^2\right\} = \sum_{j=1}^{m} E\left\{\|x_j - W W^H x_j\|_2^2\right\}.$$

If we assume a stationary signal $x_j$, which is implicitly assumed in almost all articles covering subspace techniques, the expectation is independent of the discrete time $j$ and we get

$$J'(W) = m \cdot E\left\{\|x - W W^H x\|_2^2\right\} = m \cdot J(W), \qquad (20)$$


where

$$J(W) = E\left\{\|x - W W^H x\|_2^2\right\} = \mathrm{tr}(C) - 2\, \mathrm{tr}(W^H C W) + \mathrm{tr}(W^H C W\, W^H W) \qquad (21)$$

is the original cost function introduced by Yang [Yan95a]. The operation $\mathrm{tr}(\cdot)$ denotes the trace of a square matrix. As the only difference between the new and the old cost function is a constant factor, the burst length $m$, the two theorems concerned with the stationary points of $J(W)$ that were stated and proved in [Yan95a] remain valid. For the sake of completeness they are repeated here, together with the (detailed) proof of Theorem 3.1.

Theorem 3.1 ([Yan95a]) $W$ is a stationary point of $J(W)$ if and only if $W = U_r Q$, where $U_r \in \mathbb{C}^{n \times r}$ contains any $r$ distinct eigenvectors of $C$ and $Q \in \mathbb{C}^{r \times r}$ is an arbitrary unitary matrix. At each stationary point, $J(W)$ equals the sum of those eigenvalues whose eigenvectors are not involved in $U_r$.

Proof: For simplicity, the proof is stated for real-valued data; the extension to complex-valued data can be found in the original article by Yang. The following lemma will be needed.

Lemma 3.1 ([Yan95a]) For real symmetric matrices $A$ and $B$, $AB + BA = 0$ implies $B = 0$ if $A$ is positive definite.

Proof: Let $U_B \Lambda_B U_B^T$ be the eigendecomposition of $B$ with the eigenvalues $\lambda_{Bi}$, and denote $U_B^T A U_B$ by $\tilde{A}$. Then $AB + BA = 0$ implies $\tilde{A} \Lambda_B + \Lambda_B \tilde{A} = 0$, which means that the diagonal elements must satisfy $2\, \tilde{a}_{ii}\, \lambda_{Bi} = 0$. Since $A$ is positive definite, $\tilde{a}_{ii} > 0$ for all $i$, and we therefore conclude that $\lambda_{Bi}$ must be zero for all $i$, which ensures $B = 0$. □

Let $U \Lambda U^T$ be the eigendecomposition of the correlation matrix $C$, let $W = [w_1\, w_2 \cdots w_r]$, and let

$$\nabla = [\nabla_1\, \nabla_2 \cdots \nabla_r] = \frac{\partial}{\partial W}$$

be the gradient matrix, where $\nabla_i$ is the gradient operator with respect to the vector $w_i$ for $1 \le i \le r$ [Gra81, Hay91]:

$$\nabla_i = \begin{cases} \left[\dfrac{\partial}{\partial w_{1i}} \cdots \dfrac{\partial}{\partial w_{ni}}\right]^T, & \text{real-valued case}, \\[2ex] \dfrac{1}{2}\left[\dfrac{\partial}{\partial\, \mathrm{Re}\{w_{1i}\}} + j \dfrac{\partial}{\partial\, \mathrm{Im}\{w_{1i}\}} \;\cdots\; \dfrac{\partial}{\partial\, \mathrm{Re}\{w_{ni}\}} + j \dfrac{\partial}{\partial\, \mathrm{Im}\{w_{ni}\}}\right]^T, & \text{complex-valued case}, \end{cases}$$

where $\mathrm{Re}\{\cdot\}$ and $\mathrm{Im}\{\cdot\}$ denote the real and imaginary part of a complex number, respectively, and $j = \sqrt{-1}$ is the imaginary unit. It is straightforward to show that the gradient of the cost function (21) is

$$\frac{1}{2} \nabla_i J = [-2C + C W W^T + W W^T C]\, w_i, \qquad 1 \le i \le r.$$

Therefore, we get for the gradient matrix [Gra81] the expression⁴

$$\frac{1}{2} \nabla J = [-2C + C W W^T + W W^T C]\, W. \qquad (22)$$

'⇐': Let $W = U_r Q$, where $Q$ is orthogonal and $U_r$ contains $r$ eigenvectors of $C$, which are orthonormal because $C$ is symmetric. As $W^T W = I$, equation (22) becomes

$$\frac{1}{2} \nabla J = \left[-2\, C U_r + C U_r + U_r U_r^T C U_r\right] Q. \qquad (23)$$

As $U_r$ contains $r$ eigenvectors of $C$, it follows that $C U_r = U_r \Lambda_r$, with the diagonal matrix $\Lambda_r$ of the corresponding eigenvalues. With $U_r^T U_r = I$, equation (23) becomes

$$[-2\, U_r \Lambda_r + U_r \Lambda_r + U_r \Lambda_r]\, Q = 0,$$

and therefore we have shown that $\nabla J = 0$.

⁴ For complex-valued data the gradient matrix is $\nabla J = [-2C + C W W^H + W W^H C]\, W$ [Yan95a].

'⇒': Assume $\nabla J = 0$. This implies

$$W^T\, \tfrac{1}{2} \nabla J = W^T C W\, (W^T W - I) + (W^T W - I)\, W^T C W = 0.$$

Since both $W^T C W$ and $W^T W$ are symmetric, and $W^T C W$, as a congruence transformation of a positive definite matrix, is again positive definite [Str88], it follows from Lemma 3.1 that $W^T W = I$. With (22) it then follows that

$$[-2\, C W + C W + W W^T C W] = 0 \quad \Rightarrow \quad C W = W W^T C W. \qquad (24)$$

Let $Q^T \Lambda_r Q$ be the eigendecomposition of $W^T C W$. As every symmetric matrix has a set of orthonormal eigenvectors [Str88], $Q^T = Q^{-1}$. Multiplying both sides of equation (24) by $Q^T$ from the right then yields

$$C U_r = U_r \Lambda_r$$

with $U_r = W Q^T$. Since $\Lambda_r$ is a diagonal matrix, the full-rank matrix $U_r$ must contain $r$ distinct eigenvectors of $C$. Assume now, without loss of generality, that the eigenvalues of $C$ are ordered such that $\Lambda_r = \mathrm{diag}(\lambda_1 \cdots \lambda_r)$ is built from the first $r$ eigenvalues and $U_r$ contains the corresponding eigenvectors. As a similarity transformation does not change the eigenvalues of a matrix, and the trace of a matrix is the sum of its eigenvalues, the trace is invariant under a similarity transformation of its argument. Additionally, keeping in mind that $W^T W = I$, we get from (21)

$$J(U_r Q) = \mathrm{tr}(C) - \mathrm{tr}(Q^T \Lambda_r Q) = \mathrm{tr}(\Lambda) - \mathrm{tr}(\Lambda_r) = \sum_{i=r+1}^{n} \lambda_i,$$

which completes the proof. □

Theorem 3.2 ([Yan95a]) All stationary points of $J(W)$ are saddle points except when $U_r$ contains the $r$ dominant eigenvectors of $C$. In this case, $J(W)$ attains the global minimum.


Proof: See [Yan95a]. □

Referring to the two theorems above, the following statements can be made [Yan95a]:

- At the global minimum of $J(W)$, $\mathcal{R}(W)$ is equal to the signal subspace of $C$; as there is no other local minimum, the signal subspace can be found via iterative methods, and these algorithms will always converge.

- There are no constraints on the orthonormality of the columns of $W$. As $C$ is Hermitian, eigenvectors belonging to different eigenvalues are orthogonal to one another [Str88], and thus the solution $W$ of the optimization problem satisfies $W^H W = I$.

- At the global minimum of $J(W)$, $W$ does not necessarily contain an eigenbasis of the signal subspace but consists of an arbitrary orthonormal basis of the signal subspace.

- As $J(W)$ in equation (21) is invariant with respect to a rotation, $W$ is not unique. The outer product $W W^H = U_r U_r^H$, however, is unique.
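Theorem 3.1 is easy to verify numerically. The following sketch (an illustration with an arbitrary Hermitian test matrix, not part of the original text) evaluates (21) at a stationary point $W = U_r Q$ and compares the result with the sum of the excluded eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 6, 2                                  # hypothetical dimensions

G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
C = G @ G.conj().T                           # Hermitian positive definite test matrix

lam, U = np.linalg.eigh(C)                   # eigenvalues in ascending order
Ur = U[:, -r:]                               # r dominant eigenvectors
Q, _ = np.linalg.qr(rng.standard_normal((r, r)) + 1j * rng.standard_normal((r, r)))
W = Ur @ Q                                   # stationary point U_r Q

# J(W) = tr(C) - 2 tr(W^H C W) + tr(W^H C W W^H W), eq. (21)
WCW = W.conj().T @ C @ W
J = np.trace(C) - 2 * np.trace(WCW) + np.trace(WCW @ W.conj().T @ W)
print(np.allclose(J.real, lam[:-r].sum()))   # True: sum of excluded eigenvalues
```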

3.2 Subspace Tracking

In a real application, the aim is to efficiently estimate the signal subspace recursively from the data bursts $X(p)$, $p \in \mathbb{N}_0$. Furthermore, the signals and/or the scenario will be time-varying, so that the subspace has to be tracked adaptively. In the sequel we propose two algorithms that fulfill these needs.

3.2.1 Stochastic Gradient-Based Algorithm

Since (20) describes an unconstrained cost function to be minimized, a steepest-descent algorithm can be used. With equation (22), the gradient of $J'(W)$ is

$$\nabla J'(W) = m\, [-2C + C W W^H + W W^H C]\, W.$$

With this expression for the gradient matrix, we get the recursive relation [Hay91]

$$W(p) = W(p-1) - \mu\, m \left[-2\hat{C}(p) + \hat{C}(p)\, W(p-1)\, W^H(p-1) + W(p-1)\, W^H(p-1)\, \hat{C}(p)\right] W(p-1) \qquad (25)$$


where $\mu > 0$ is the step size, which has to be chosen carefully, and $\hat{C}(p)$ is an appropriate estimate of the correlation matrix $C$ at time $p$. The simplest choice for an estimator of $C$ is [Hay91]

$$\hat{C}(p) = \frac{1}{m}\, X(p)\, X^H(p) = \frac{1}{m} \sum_{j=1}^{m} x_j(p)\, x_j^H(p), \qquad (26)$$

the (burst-wise) instantaneous estimate⁵ of the correlation matrix $C$ at time $p$. This leads to the well-known least-mean-square (LMS) algorithm introduced by Widrow and Hoff [WS85, Hay91]. A further simplification can be made by observing that $W(p)$ converges to a matrix with orthonormal ($\mu \to 0$) or nearly orthonormal ($\mu$ constant but small) columns, at least for stationary signals [Yan95a]. Therefore the approximation $W^H(p-1)\, W(p-1) \approx I$ is justified, yielding

$$W(p) = W(p-1) + \mu\, [X(p) - W(p-1)\, Y(p)]\, Y^H(p), \qquad p \in \mathbb{N}, \qquad (27)$$

where the abbreviation $Y(p) = W^H(p-1)\, X(p)$ is used. The recursion (27) is initialized by $W(0) = 0$ [Hay91].

⁵ (26) is actually the average over one burst of the instantaneous estimates $x_j(p)\, x_j^H(p)$ of the correlation matrix $C$.
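A sketch of the resulting burst-wise tracker follows; the step size is the value used in the simulations of Section 5, while the small random initialization is our illustrative choice:

```python
import numpy as np

def sg_subspace_update(W, X, mu=8e-5):
    """Burst-wise stochastic gradient update, eq. (27)."""
    Y = W.conj().T @ X                  # Y(p) = W^H(p-1) X(p)
    return W + mu * (X - W @ Y) @ Y.conj().T

# Usage sketch: n = 9 sensors, r = 2 signals, bursts X_p of shape (9, 156).
rng = np.random.default_rng(1)
W = 1e-3 * (rng.standard_normal((9, 2)) + 1j * rng.standard_normal((9, 2)))
# for X_p in bursts:
#     W = sg_subspace_update(W, X_p)
```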

3.2.2 QR-RLS-Based Algorithm

We now replace the expectation in equation (19) by the exponentially weighted sum

$$J'(W(p)) = \sum_{t=1}^{p} \beta^{p-t}\, \|X(t) - W(p)\, W^H(p)\, X(t)\|_F^2 \qquad (28)$$

with the forgetting factor $0 < \beta \le 1$, generally taken close to 1 [Hay91]. This factor ensures the tracking capability of the system in a non-stationary environment because past data are gradually forgotten. The cost function (28) is a fourth-order function of the elements of $W(p)$, which makes the use of iterative algorithms necessary [Yan96]. In order to obtain a recursive minimization, we approximate $W^H(p)\, X(t)$ in equation (28), the projection of the columns of $X(t)$ onto the columns of $W(p)$, by the expression $Y(t) = W^H(t-1)\, X(t)$, $1 \le t \le p$, which can be calculated instantaneously at time $p$. This yields


a modified cost function

$$\tilde{J}'(W(p)) = \sum_{t=1}^{p} \beta^{p-t}\, \|X(t) - W(p)\, Y(t)\|_F^2 = \sum_{t=1}^{p} \sum_{j=1}^{m} \beta^{p-t}\, \|x_j(t) - W(p)\, y_j(t)\|_2^2 \qquad (29)$$

that is quadratic in the elements of $W(p)$. For stationary or slowly varying signals, the difference between $W^H(p)\, X(t)$ and $W^H(t-1)\, X(t)$ should be small, especially when $t$ is close to $p$. The difference may be larger for $t \ll p$, but because of the forgetting factor the contribution of past data to the cost function decreases. We therefore assume $\tilde{J}'(W(p))$ in (29) to be a good approximation of $J'(W(p))$, and the $W(p)$ minimizing $\tilde{J}'(W(p))$ to be a good estimate of the signal subspace. Equation (29) can be written in the form

$$\tilde{J}'(W(p)) = \sum_{t=1}^{p} \sum_{j=1}^{m} \beta^{p-t} \left[ x_j^H(t)\, x_j(t) - x_j^H(t)\, W(p)\, y_j(t) - y_j^H(t)\, W^H(p)\, x_j(t) + y_j^H(t)\, W^H(p)\, W(p)\, y_j(t) \right].$$

Now, using the equalities [Gra81, Hay91]

$$\frac{\partial\, x_j^H(t)\, W(p)\, y_j(t)}{\partial W(p)} = 0, \qquad \frac{\partial\, y_j^H(t)\, W^H(p)\, x_j(t)}{\partial W(p)} = x_j(t)\, y_j^H(t), \qquad \frac{\partial\, y_j^H(t)\, W^H(p)\, W(p)\, y_j(t)}{\partial W(p)} = W(p)\, y_j(t)\, y_j^H(t),$$

we get

$$\frac{\partial \tilde{J}'(W(p))}{\partial W(p)} = \sum_{t=1}^{p} \sum_{j=1}^{m} \beta^{p-t} \left[ -x_j(t)\, y_j^H(t) + W(p)\, y_j(t)\, y_j^H(t) \right],$$

yielding the normal equations [Bel87]

$$\sum_{t=1}^{p} \sum_{j=1}^{m} \beta^{p-t} \left[ x_j(t) - W(p)\, y_j(t) \right] y_j^H(t) = 0. \qquad (30)$$


Splitting the term in brackets, we can write

$$W(p) \sum_{t=1}^{p} \beta^{p-t} \sum_{j=1}^{m} y_j(t)\, y_j^H(t) = \sum_{t=1}^{p} \beta^{p-t} \sum_{j=1}^{m} x_j(t)\, y_j^H(t)$$

$$W(p) \sum_{t=1}^{p} \beta^{p-t}\, Y(t)\, Y^H(t) = \sum_{t=1}^{p} \beta^{p-t}\, X(t)\, Y^H(t)$$

$$W(p)\, C_{YY}(p) = C_{XY}(p), \qquad (31)$$

where we used the estimates

$$C_{YY}(p) = \sum_{t=1}^{p} \beta^{p-t}\, Y(t)\, Y^H(t) = \beta\, C_{YY}(p-1) + Y(p)\, Y^H(p) \qquad (32)$$

$$C_{XY}(p) = \sum_{t=1}^{p} \beta^{p-t}\, X(t)\, Y^H(t) = \beta\, C_{XY}(p-1) + X(p)\, Y^H(p) \qquad (33)$$

of the exponentially weighted sample correlation matrix $C_{YY}(p)$ and cross-correlation matrix $C_{XY}(p)$, respectively. Thus the cost function (29) is minimized by

$$W(p) = C_{XY}(p)\, C_{YY}^{-1}(p),$$

where it is assumed that the inverse exists. However, directly calculating the required inverse of $C_{YY}(p)$ is usually numerically unstable and is therefore replaced by a more robust method. In the vector case presented in [Yan95a], the update in equation (32) is of rank one, and therefore the matrix inversion lemma [Bel87] can be used to update the inverse of $C_{YY}(p)$ from one time step to the next without performing an actual matrix inversion, leading to the numerically efficient PAST algorithm. In our case, however, the update in equation (32) is not of rank one, and although applying the matrix inversion lemma would still be possible, an actual matrix inversion cannot be avoided. We therefore propose another approach to updating the needed inverse, based on a QR-RLS approach [Bel87, Hay91, YB92]. Assume that $C_{YY}(p)$ is positive definite. Then the Cholesky factorization of $C_{YY}(p)$ is

$$C_{YY}(p) = R^H(p)\, R(p),$$


where $R(p)$ is the unique upper triangular Cholesky factor with positive diagonal elements [HJ96]. Using this factorization in equation (31) yields

$$\begin{aligned} W(p)\, R^H(p)\, R(p) &= C_{XY}(p) \\ R^H(p)\, R(p)\, W^H(p) &= C_{XY}^H(p) \\ R(p)\, W^H(p) &= R^{-H}(p)\, C_{XY}^H(p) =: \Gamma(p). \end{aligned} \qquad (34)$$

Let us assume that $R(p-1) \in \mathbb{C}^{r \times r}$ and $\Gamma(p-1) \in \mathbb{C}^{r \times n}$ are known from the previous time instant and that at time $p$ new data $X(p)$ and $Y(p)$ become available. We now show how the updated matrices $R(p)$ and $\Gamma(p)$ can be computed efficiently from the known ones. We start with the $(r+m) \times (r+n)$ pre-array

$$\begin{bmatrix} \sqrt{\beta}\, R(p-1) & \sqrt{\beta}\, \Gamma(p-1) \\ Y^H(p) & X^H(p) \end{bmatrix}.$$

This pre-array is now multiplied by a unitary matrix $Q(p) \in \mathbb{C}^{(r+m) \times (r+m)}$ such that the post-array

$$\begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} = Q(p) \begin{bmatrix} \sqrt{\beta}\, R(p-1) & \sqrt{\beta}\, \Gamma(p-1) \\ Y^H(p) & X^H(p) \end{bmatrix}, \qquad p \in \mathbb{N}, \qquad (35)$$

has a block zero in the marked position. This can be achieved by $mr$ Givens rotations $Q_i(p)$, yielding $Q(p) = Q_{mr}(p)\, Q_{mr-1}(p) \cdots Q_1(p)$ [Hay91, Got95]. To create the block zero, the order of the single rotations has to be chosen carefully to avoid overwriting previously created zero entries. Now we have to identify the entries $A_{11}$, $A_{12}$ and $A_{22}$ in the post-array, which can be done by observing that for given matrices $A_1, A_2, B_1, B_2 \in \mathbb{C}^{n \times m}$, $n \ge m$, the relation $B_1^H B_2 = A_1^H A_2$ is valid for $[B_1\; B_2] = Q\, [A_1\; A_2]$, where $Q \in \mathbb{C}^{n \times n}$ is an arbitrary unitary matrix. Setting

$$B_1 = B_2 = \begin{bmatrix} A_{11} \\ 0 \end{bmatrix} \qquad \text{and} \qquad A_1 = A_2 = \begin{bmatrix} \sqrt{\beta}\, R(p-1) \\ Y^H(p) \end{bmatrix}$$

we get

$$A_{11}^H A_{11} = \beta\, R^H(p-1)\, R(p-1) + Y(p)\, Y^H(p).$$


Comparing this result with equation (32), we identify $A_{11}$ as the Cholesky factor of $C_{YY}(p)$:

$$A_{11} = R(p).$$

Now, setting

$$B_1 = \begin{bmatrix} A_{11} \\ 0 \end{bmatrix} = \begin{bmatrix} R(p) \\ 0 \end{bmatrix}, \qquad B_2 = \begin{bmatrix} A_{12} \\ A_{22} \end{bmatrix}, \qquad A_1 = \begin{bmatrix} \sqrt{\beta}\, R(p-1) \\ Y^H(p) \end{bmatrix}, \qquad A_2 = \begin{bmatrix} \sqrt{\beta}\, \Gamma(p-1) \\ X^H(p) \end{bmatrix}$$

yields

$$\begin{aligned} R^H(p)\, A_{12} &= \beta\, R^H(p-1)\, \Gamma(p-1) + Y(p)\, X^H(p) \\ &= \beta\, R^H(p-1)\, R^{-H}(p-1)\, C_{XY}^H(p-1) + Y(p)\, X^H(p) \\ &= C_{XY}^H(p). \end{aligned}$$

Comparing this with the definition of $\Gamma(p)$ in equation (34), we get

$$A_{12} = \Gamma(p).$$

The remaining task to obtain $W(p)$ is to solve the system of linear equations (34). Since $R(p)$ is an upper triangular matrix, $W(p)$ can be computed efficiently and in a numerically stable manner by back-substitution [GL89]. The great advantage of the QR-RLS update in equation (35) is that it can be initialized exactly by $R(0) = \sqrt{\delta}\, I$ and $\Gamma(0) = 0$, where $\delta$ is a non-negative constant [YB92]. Furthermore, it is well known that QR-RLS algorithms have good numerical properties and are well suited for parallel implementation on systolic arrays [Hay91, YB92].
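The update (35) and the back-substitution for (34) can be sketched with a standard QR routine standing in for the explicit sequence of Givens rotations; the triangular factor it produces may differ from the Cholesky factor by row phases, which does not affect the back-substitution result. A sketch under these assumptions:

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

def qr_rls_update(R, Gamma, X, W_old, beta):
    """One burst-wise QR-RLS subspace update (Sec. 3.2.2).
    R: (r, r) Cholesky factor of C_YY, Gamma: (r, n), X: (n, m) new burst."""
    Y = W_old.conj().T @ X                        # projection approximation Y(p)
    r = R.shape[0]
    pre = np.block([[np.sqrt(beta) * R, np.sqrt(beta) * Gamma],
                    [Y.conj().T,        X.conj().T]])
    # Triangularizing the first block column zeroes the marked block of (35);
    # applying Q^H to the whole pre-array yields the post-array.
    Q, _ = qr(pre[:, :r])                         # full unitary (r+m) x (r+m)
    post = Q.conj().T @ pre
    R_new, Gamma_new = post[:r, :r], post[:r, r:]
    # Back-substitution for R(p) W^H(p) = Gamma(p), eq. (34).
    W_new = solve_triangular(R_new, Gamma_new).conj().T
    return R_new, Gamma_new, W_new
```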


4 Detecting Changes in the Number of Signals

So far we assumed the number $r$ of signals impinging on the sensor array to be known. In general this will not be the case; rather, $r$ has to be estimated. There are two well-known criteria for doing this [WK85]: the Akaike Information Criterion (AIC) and the Minimum Description Length (MDL). It is also known, however, that the AIC method, for example, tends to overestimate the number of signals. Another problem arises from the fact that the rank of the estimated correlation matrix does not change immediately when an additional signal appears or an existing one vanishes, whenever a sliding rectangular window or exponential weighting is used for the calculation of the correlation matrix. Therefore, MacInnes and Vaccaro [MV96] proposed a method for detecting changes in the number of signals immediately, which is extended to a burst-wise update in the following.

4.1 Increase in the Number of Signals

Assume the vector of the most recent DOA's is $\hat{\Theta}$ and let $\hat{A} = A(\hat{\Theta})$ denote the array steering matrix computed using these DOA's. Let $U_S \in \mathbb{C}^{n \times r}$ be the most recently computed orthonormal basis for the signal subspace⁶. If the number of signals in the new data burst $X \in \mathbb{C}^{n \times m}$ increases from $r$ to $r+1$, the data burst contains a component that is not in the range of the matrix $\hat{A}$. Denote by $\tilde{X} = [\tilde{x}_1 \cdots \tilde{x}_m]$ the modified data matrix whose columns have Euclidean norm one; $\tilde{X}$ can be obtained from $X$ by multiplying from the right with a diagonal matrix whose diagonal elements are the reciprocals of the Euclidean norms of the respective columns of $X$. Then the orthogonal projections of the $m$ columns of $\tilde{X}$ onto the subspace spanned by the $r$ columns of $\hat{A}$ are given by [Str88]

$$X' := [x_1' \cdots x_m'] = \hat{A}\, (\hat{A}^H \hat{A})^{-1}\, \hat{A}^H\, \tilde{X} \in \mathbb{C}^{n \times m}.$$

The norm $\|x_j'\|$, $1 \le j \le m$, will be one if $\tilde{x}_j$ lies in the range of $\hat{A}$ and less than one if it does not; therefore there will be a spike in at least one of the plots of $1 - \|x_j'\|$, $1 \le j \le m$, versus the iteration number. Let

$$j_0 = \arg\max_{1 \le j \le m} \left(1 - \|x_j'\|\right).$$

Then we set $U_S = \mathrm{orth}([U_S\; x_{j_0}])$, where $\mathrm{orth}([U_S\; x_{j_0}])$ means that the columns of $[U_S\; x_{j_0}]$ are orthonormalized, which can be done by Gram-Schmidt orthogonalization.

⁶ $U_S$ is any orthonormal basis for the signal subspace and not necessarily an eigenbasis.
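A compact sketch of this test (the helper names are ours; the threshold is the value used in the simulations of Section 5):

```python
import numpy as np

def detect_increase(A_hat, X, threshold=0.15):
    """Sec. 4.1: project the unit-norm burst columns onto R(A_hat) and
    look for a column with a significant component outside that range."""
    Xt = X / np.linalg.norm(X, axis=0)            # columns scaled to norm one
    P = A_hat @ np.linalg.solve(A_hat.conj().T @ A_hat, A_hat.conj().T)
    d = 1.0 - np.linalg.norm(P @ Xt, axis=0)      # 1 - ||x'_j||
    j0 = int(np.argmax(d))
    return d[j0] > threshold, j0

def orth_append(US, x):
    """Gram-Schmidt step: append the detected column to the signal basis."""
    v = x - US @ (US.conj().T @ x)
    return np.hstack([US, (v / np.linalg.norm(v))[:, None]])
```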


4.2 Decrease in the Number of Signals

Assume the vector of the most recent DOA's is $\hat{\Theta}$ and let $\hat{A} = A(\hat{\Theta})$ denote the array steering matrix computed using these DOA's. Let $U_S \in \mathbb{C}^{n \times r}$ be the most recently computed orthonormal basis for the signal subspace. If the number of signals decreases from $r$ to $r-1$, the new data burst $X \in \mathbb{C}^{n \times m}$ will be in the range of only $r-1$ of the $r$ estimated steering vectors. Assuming that the $i$-th signal source disappeared, $X$ will be in the range of

$$\hat{A}_i := [\hat{a}_1 \cdots \hat{a}_{i-1}\; \hat{a}_{i+1} \cdots \hat{a}_r].$$

Denote by $\tilde{X} = [\tilde{x}_1 \cdots \tilde{x}_m]$ again the modified data matrix whose columns have Euclidean norm one. Then, as $\tilde{X}$ lies in the range of $\hat{A}_i$, the projection of each vector $\tilde{x}_j$, $1 \le j \le m$, onto the columns of $\hat{A}_i$ will have a norm of approximately one:

$$\|\hat{A}_i\, (\hat{A}_i^H \hat{A}_i)^{-1}\, \hat{A}_i^H\, \tilde{x}_j\| \approx 1 \qquad \forall j: 1 \le j \le m. \qquad (36)$$

Therefore, a decrease in the number of signals is detected when, for some $1 \le i \le r$, the norm on the left-hand side of equation (36) exceeds a threshold close to one for all $1 \le j \le m$; we then set $U_S = \mathrm{orth}(\hat{A}_i)$, where $\mathrm{orth}(\hat{A}_i)$ means that the columns of $\hat{A}_i$ are orthonormalized. In order to ensure that the correlation matrix $C$ contains only components in the range of the remaining steering vectors, the modified correlation matrix

$$C = \hat{A}_i\, \hat{A}_i^H$$

is suggested [MV96].
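A matching sketch for the decrease test (again with our helper names; it simply tries every candidate index $i$):

```python
import numpy as np

def detect_decrease(A_hat, X, threshold=0.98):
    """Sec. 4.2: if the burst lies in the range of all but one estimated
    steering vector, return the index of the vanished signal."""
    Xt = X / np.linalg.norm(X, axis=0)
    for i in range(A_hat.shape[1]):
        Ai = np.delete(A_hat, i, axis=1)          # drop the i-th steering vector
        P = Ai @ np.linalg.solve(Ai.conj().T @ Ai, Ai.conj().T)
        if np.all(np.linalg.norm(P @ Xt, axis=0) > threshold):
            return i                              # signal i has disappeared
    return None
```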


5 Simulations

In the following we present some simulation results to demonstrate the applicability and the performance of the proposed algorithms. In all cases, a base station equipped with a uniform linear array of 9 identical sensors spaced half a wavelength apart, $d/\lambda = 0.5$, is used, and the TLS-ESPRIT algorithm [RK89] is applied to calculate the directions of arrival. For the uniform linear array the array steering vector is given by

$$a(\theta_i) = \begin{bmatrix} 1 \\ e^{j\, (2\pi f_i)} \\ e^{j\, 2\, (2\pi f_i)} \\ \vdots \\ e^{j\, 8\, (2\pi f_i)} \end{bmatrix},$$

where the spatial frequency is

$$f_i = \frac{d}{\lambda} \sin(\theta_i).$$
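For reference, the steering vector in code (a small sketch matching this setup):

```python
import numpy as np

def steering_vector(theta, n=9, d_over_lambda=0.5):
    """ULA steering vector with spatial frequency f = (d/lambda) sin(theta)."""
    f = d_over_lambda * np.sin(theta)
    return np.exp(1j * 2 * np.pi * f * np.arange(n))

# Example: spatial frequency f1 = -0.2 corresponds to theta = arcsin(-0.4).
a1 = steering_vector(np.arcsin(-0.4))
```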

In the first experiment, two signals positioned in the same frequency range and time slot impinge on the antenna array and are thus separated only spatially. We consider a GSM-like system as described in the introduction and simulate the transmission of 6500 bursts of 156 symbols each; the 6500 bursts correspond to a duration of approximately 30 s. The number of users is assumed to be known to the system. Unlike the GSM modulation scheme, the symbols are complex Gaussian random variables, uncorrelated with each other. The first signal has a fixed spatial frequency of $f_1 = -0.2$ and a signal-to-noise ratio of 0 dB. The spatial frequency $f_2$ of the second signal increases linearly from 0.2 to 0.3 within the 6500 bursts; its signal-to-noise ratio is chosen to be 5 dB. The forgetting factor $\beta$ is set to 0.97. The simulation results are shown in Figure 1. These results demonstrate that all presented algorithms are capable of reliably tracking the signal subspace in a time-varying environment. The numerically most efficient algorithm is the stochastic gradient-based algorithm, but, as Figure 1(d) shows, with almost 35 bursts it also has the slowest convergence. The reason for this is that we do not have any prior knowledge with which to initialize the algorithm and therefore chose $W(0)$ to be the zero matrix. The QR-RLS algorithm, on the other hand, is computationally complex but numerically highly stable because of the use of Cholesky factors and the solution of a system of linear equations by back-substitution. Furthermore, fast parallel implementations on a systolic array are possible.

[Figure 1: Tracking behaviour of the different algorithms: (a) ISU algorithm, (b) stochastic gradient-based algorithm with $\mu = 8 \cdot 10^{-5}$, (c) QR-RLS algorithm, (d) results of the first 35 bursts for all three algorithms.]

To overcome the problem of slow convergence in the case of the stochastic gradient-based algorithm, we suggest performing one singular value decomposition (SVD) of the first burst and thereby finding an estimate of an orthonormal basis of the signal subspace. Using this estimate of the signal subspace to initialize the stochastic gradient-based algorithm leads to a much better performance, as shown in Figure 2: the deviation of the estimated direction of arrival from the correct one at $f_1 = -0.2$ is negligible from the first burst on if a singular value decomposition is used to initialize the algorithm.

In a second experiment, the detection of appearing and vanishing signals is tested. This time we simulate the transmission of 500 bursts only. The first signal is present all the time at a spatial frequency of $f_1 = -0.2$.

[Figure 2: Performance of the stochastic gradient-based algorithm ($\mu = 8 \cdot 10^{-5}$) for the first 35 bursts with and without an initial SVD.]

The second signal is present from burst 100 to 149, from burst 200 to burst 399 and from burst 420 to 500 at a constant spatial frequency of $f_2 = 0.1$. The third signal is only present from burst 100 to burst 399, and its spatial frequency $f_3$ increases from 0.2 to 0.3 during that time. The signal-to-noise ratio of all signals is 20 dB. The experiment therefore covers the cases of a single decrease and increase in the number of signals as well as a simultaneous decrease and increase by two. The forgetting factor is again chosen to be 0.97 and we use the ISU algorithm to perform the subspace tracking. It should be noted, however, that the choice of the ISU algorithm is arbitrary; the tests for an increase and a decrease in the number of signals given in Section 4 are generally applicable and, in particular, can be used with all three subspace tracking algorithms proposed in this work. The thresholds for increase and decrease are chosen to be 0.15 and 0.98, respectively. As only a few wrong detections occur, the result of the simulation depicted in Figure 3 demonstrates the good capability of the proposed algorithms for detecting changes in the number of signals even if the time between changes is relatively short. Furthermore, it can be seen that a simultaneous decrease or increase by more than one signal is detected in consecutive processing steps, as expected from the description of the algorithms in Section 4.

[Figure 3: Detection of changes in the number of signals in conjunction with the ISU algorithm.]

6 Conclusions

In this report we justified the need for subspace tracking algorithms in conjunction with an SDMA system. As the transmission schemes used usually include a TDMA component, so that a user transmits a burst of data during his assigned time slot, a subspace tracking algorithm must be able to deal with burst-wise data and not only with snapshot vectors. We therefore introduced three subspace tracking algorithms that can handle burst-wise data and that are based on known results for the snapshot vector case. The tracking capabilities of these three algorithms were verified through simulations. A further problem is the detection of changes in the number of signals. One possible solution that again works with burst-wise signals was suggested, and its performance was tested by simulations. Future work should include a careful investigation of the computational complexity of the proposed algorithms, especially in comparison with the singular value decomposition of the data matrix, as well as simulations with real data. Moreover, the main goal should not be forgotten: an SDMA system needs an update algorithm for the directions of arrival, and the considered subspace tracking algorithms are only the first step in that direction.

Acknowledgements

The author wishes to thank Dr.-Ing. Bin Yang and Dr. Craig MacInnes for valuable suggestions and generous help with questions concerning their work, as well as for providing some of their programs. The author is also grateful to Prof. Dr. techn. Josef A. Nossek and Prof. Dr.-Ing. Jürgen Götze for introducing him to the topic and for discussions on some of the details.


References

[AG93] S. Thomas Alexander, Avinash L. Ghirnikar: A Method for Recursive Least Squares Filtering Based Upon an Inverse QR Decomposition, IEEE Transactions on Signal Processing, Vol. 41, No. 1, pp. 20-30, January 1993

[Bel87] Maurice G. Bellanger: Adaptive Digital Filters and Signal Analysis, Marcel Dekker, Inc., New York, 1987

[BS92] Christian H. Bischof, Gautam M. Shroff: On Updating Signal Subspaces, IEEE Transactions on Signal Processing, Vol. 40, No. 1, pp. 96-105, January 1992

[CM97] Alberto Carini, Enzo Mumolo: Fast square-root RLS adaptive filtering algorithm, Signal Processing, Vol. 57, pp. 233-250, 1997

[CG90] Pierre Comon, Gene H. Golub: Tracking a Few Extreme Singular Values and Vectors in Signal Processing, Proceedings of the IEEE, Vol. 78, No. 8, August 1990

[DeG92] Ronald D. DeGroat: Noniterative Subspace Tracking, IEEE Transactions on Signal Processing, Vol. 40, No. 3, pp. 571-577, March 1992

[FP90] W. Ferzali, John G. Proakis: Adaptive SVD Algorithm For Covariance Matrix Eigenstructure Computation, in Proceedings of the 1990 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 2615-2618, Toronto, Canada, 1990

[Gie97] Christoph H. Gierull: A Fast Subspace Estimation Method For Adaptive Beamforming Based on Covariance Matrix Transformation, AEÜ International Journal of Electronics and Communications, Vol. 51, No. 4, pp. 196-205, July 1997

[Got95] Jürgen Götze: Orthogonale Matrixtransformationen, Oldenbourg Verlag, München, 1995

[GLR86] Israel Gohberg, Peter Lancaster, Leiba Rodman: Invariant Subspaces of Matrices with Applications, John Wiley & Sons, New York, 1986

[GL89] Gene H. Golub, Charles F. Van Loan: Matrix Computations, 2nd edition, Johns Hopkins University Press, Baltimore, 1989

[Gra81] Alexander Graham: Kronecker Products and Matrix Calculus: With Applications, Ellis Horwood Limited, Chichester, 1981

[Gus98] Tony Gustafsson: Instrumental Variable Subspace Tracking Using Projection Approximation, IEEE Transactions on Signal Processing, Vol. 46, No. 3, pp. 669-681, March 1998

[Hay91] Simon Haykin: Adaptive Filter Theory, 2nd edition, Prentice Hall, Englewood Cliffs, 1991

[HJ96] Roger A. Horn, Charles R. Johnson: Matrix Analysis, reprint, Cambridge University Press, New York, 1996

[IK66] Eugene Isaacson, Herbert B. Keller: Analysis of Numerical Methods, John Wiley & Sons, New York, 1966

[KV96] Hamid Krim, Mats Viberg: Two Decades of Array Signal Processing Research: The Parametric Approach, IEEE Signal Processing Magazine, Vol. 13, No. 4, pp. 67-94, July 1996

[LOS94] K. J. Ray Liu, Dianne P. O'Leary, Gilbert W. Stewart, Yuan-Jye J. Wu: URV ESPRIT for Tracking Time-Varying Signals, IEEE Transactions on Signal Processing, Vol. 42, No. 12, pp. 3441-3448, December 1994

[Mac95] Craig S. MacInnes: Analysis of small and large perturbations of matrix subspaces, Ph.D. Thesis, University of Rhode Island, Kingston, 1995

[MV96] Craig S. MacInnes, Richard J. Vaccaro: Tracking directions-of-arrival with invariant subspace updating, Signal Processing, Vol. 50, pp. 137-150, 1996

[MDV92] Marc Moonen, Paul van Dooren, Joos Vandewalle: A Singular Value Decomposition Updating Algorithm For Subspace Tracking, SIAM Journal on Matrix Analysis and Applications, Vol. 13, No. 4, pp. 1015-1038, October 1992

[PP97] Arogyaswami J. Paulraj, Constantinos B. Papadias: Space-Time Processing for Wireless Communications, IEEE Signal Processing Magazine, Vol. 14, No. 6, pp. 49-83, November 1997

[Pil89] S. Unnikrishna Pillai: Array Signal Processing, Springer Verlag, New York, 1989

[Pro95] John G. Proakis: Digital Communications, 3rd edition, McGraw-Hill, New York, 1995

[RK89] Richard Roy, Thomas Kailath: ESPRIT - Estimation of Signal Parameters Via Rotational Invariance Techniques, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 37, No. 7, pp. 984-999, July 1989

[SK94] Ali H. Sayed, Thomas Kailath: A State-Space Approach to Adaptive RLS Filtering, IEEE Signal Processing Magazine, Vol. 11, No. 4, pp. 18-60, July 1994

[Sch82] Ralph O. Schmidt: A Signal Subspace Approach To Multiple Emitter Location And Spectral Estimation, Ph.D. Thesis, Stanford University, Stanford, 1982

[Ste73] Gilbert W. Stewart: Error and Perturbation Bounds for Subspaces Associated with Certain Eigenvalue Problems, SIAM Review, Vol. 15, No. 4, pp. 727-764, October 1973

[Ste90] Gilbert W. Stewart: Stochastic Perturbation Theory, SIAM Review, Vol. 32, No. 4, pp. 579-610, December 1990

[Ste92] Gilbert W. Stewart: An Updating Algorithm for Subspace Tracking, IEEE Transactions on Signal Processing, Vol. 40, No. 6, June 1992

[Str88] Gilbert Strang: Linear Algebra and Its Applications, 3rd edition, Harcourt Brace & Company, Orlando, 1988

[VOK91] Mats Viberg, Björn Ottersten, Thomas Kailath: Detection and Estimation in Sensor Arrays Using Weighted Subspace Fitting, IEEE Transactions on Signal Processing, Vol. 39, No. 11, pp. 2436-2449, November 1991

[WK85] Mati Wax, Thomas Kailath: Detection of Signals by Information Theoretic Criteria, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 33, No. 2, pp. 387-392, April 1985

[WS85] Bernard Widrow, Samuel D. Stearns: Adaptive Signal Processing, Prentice-Hall, Englewood Cliffs, 1985

[WZR90] Kon Max Wong, Qi-Tu Zhang, James P. Reilly, P. C. Yip: On Information Theoretic Criteria for Determining the Number of Signals in High Resolution Array Processing, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 38, No. 11, pp. 1959-1970, November 1990

[YB92] Bin Yang, Johann F. Böhme: Rotation-Based RLS Algorithms: Unified Derivations, Numerical Properties, and Parallel Implementations, IEEE Transactions on Signal Processing, Vol. 40, No. 5, pp. 1151-1166, May 1992

[Yan95a] Bin Yang: Projection Approximation Subspace Tracking, IEEE Transactions on Signal Processing, Vol. 43, No. 1, pp. 95-107, January 1995

[Yan95b] Bin Yang: An Extension of the PASTd Algorithm to both Rank and Subspace Tracking, IEEE Signal Processing Letters, Vol. 2, No. 9, pp. 179-182, September 1995

[Yan96] Bin Yang: Asymptotic convergence analysis of the projection approximation subspace tracking algorithm, Signal Processing, Vol. 50, pp. 123-136, 1996