Square-Root QR Inverse Iteration for Tracking the Minor Subspace

Peter Strobach, Senior Member, IEEE
Abstract—A new algorithm for tracking the $r$ eigenvectors associated with the smallest eigenvalues of an $N \times N$ covariance matrix is introduced. The method is sequential inverse iteration based on a recursive square-root QR factor updating of the covariance matrix with $O(N^2)$ operations per time update. The principal $O(N^2)$ operations count of this new tracker is justified by a significantly better performance compared with the recently introduced fast $O(Nr)$ minor subspace tracker of Douglas et al.

Index Terms—Eigenvalues, inverse iteration, QR-factorization, subspace tracking.
I. INTRODUCTION
MANY signal processing tasks involve the constrained extremization of functions of the trace form $J(W) = \operatorname{tr}(W^T C W)$, where $C$ is a (possibly time-varying) covariance matrix of dimension $N \times N$, and $W$ is a set of $r$ linearly independent vectors of dimension $N$ with $r < N$. The most frequently used constraint in this extremization is the orthonormality of the desired vector set, i.e., $W^T W = I$.

Let $C = U \Lambda U^T$ denote the ordered eigenvalue decomposition (EVD) of $C$ with eigenvalues in $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_N)$, $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_N$, and denote $U = [u_1, u_2, \ldots, u_N]$ as the associated orthonormal set of eigenvectors. It is now well known that a maximization of $J(W)$ is achieved with $W = [u_1, u_2, \ldots, u_r]$, where $\lambda_1 + \lambda_2 + \cdots + \lambda_r$ is the associated maximum. Conversely, a minimization of $J(W)$ is reached with $W = [u_{N-r+1}, \ldots, u_{N-1}, u_N]$. The associated minimum is $\lambda_{N-r+1} + \cdots + \lambda_{N-1} + \lambda_N$.

Suppose that $C$ is a function of time. In this case, the maximization problem requires a tracking of the principal subspace spanned by $u_1, \ldots, u_r$, whereas the minimization problem requires a tracking of the so-called "minor" subspace spanned by $u_{N-r+1}, \ldots, u_N$. The associated dominant (largest) or minor (smallest) eigenvalues are eventually tracked as well.

In recent years, considerable interest has grown around the principal subspace tracking problem and tracking algorithms. An early review paper often cited in this context is [2]. One of the most powerful concepts for tracking the principal subspace is orthogonal or simultaneous iteration [3], [4]. Owsley [5] was probably the first to propose the application of this method in its classical form for principal subspace tracking with a single iteration and $O(N^2 r)$ operations per time update. In a fully recursive form, "fast" simultaneous iteration subspace trackers with a complexity of $O(Nr^2)$ and $O(Nr)$ operations per time update can be derived [6]–[8].

In this paper, the simultaneous iteration concept is applied to the minor subspace tracking problem. We exploit the fact that the minor subspace is the principal subspace of the inverse covariance matrix. Hence, a tracking of the minor subspace can be achieved by applying simultaneous iteration to the inverse covariance matrix. The method is called inverse iteration [3]. Of course, the inverse is never computed or updated explicitly in an inverse iteration concept. All operations can be led back to solving systems of linear equations. Note, therefore, that "Owsley-type" variants of inverse iteration require on the order of $O(N^2 r)$ operations per time update. We propose a sequential square-root form of the method with a reduced complexity of $O(N^2)$ operations per time update. The core of the algorithm is a sequential QR-factor tracker for the covariance matrix, as is known from array processing [9], [10]. The overall algorithm is very regular in structure and, besides the recursive QR-factor tracker, requires only standard operations like back-substitution and QR-factorization.

The results of detailed computer simulations have revealed that this method performs significantly better than the recently introduced fast minor subspace tracker of Douglas et al. [1]. Unlike in principal subspace tracking, where it was often conjectured that the step to fast $O(Nr^2)$ and $O(Nr)$ algorithms does not cause any significant loss in performance, this does not seem to hold true in the area of minor subspace tracking. We observed a severe performance degradation in minor subspace tracking when fast algorithms are used, in comparison with the new square-root QR inverse iteration minor subspace tracker.

This paper is organized as follows. In Section II, we develop the new algorithm. In Section III, representative results of computer simulations and comparisons are shown. Section IV summarizes the conclusions.

II. SQUARE-ROOT QR INVERSE ITERATION MINOR SUBSPACE TRACKER

Inverse iteration for computing the $r$ smallest eigenvalues and eigenvectors of a symmetric, positive definite matrix $C$ of dimension $N \times N$, $r < N$, is constituted by the following two-term recurrence:
for $i = 1, 2, 3, \ldots$ until convergence, iterate

$$C Z_i = Q_{i-1} \quad \text{(solve for } Z_i\text{)}, \qquad Z_i = Q_i R_i \quad \text{(QR-factorization)} \tag{1}$$
where $Z_i$ is an auxiliary matrix of dimension $N \times r$, and $Q_i$ is an $N \times r$ matrix with orthonormal column vectors that converge toward an orthonormal set of minor subspace basis vectors. The initial basis $Q_0$ can be any arbitrary orthonormal set that satisfies $Q_0^T Q_0 = I$. The triangular matrix $R_i$ converges toward the diagonal matrix of the inverse minor eigenvalues as follows:

$$R_i \to \operatorname{diag}\bigl(\lambda_N^{-1}, \lambda_{N-1}^{-1}, \ldots, \lambda_{N-r+1}^{-1}\bigr). \tag{2}$$

The method is easily identified as the equivalent of a classical orthogonal iteration over the inverse matrix $C^{-1}$ with known convergence properties [3], [4]. Suppose that $C$ is a function of time and is updated recursively according to

$$C(t) = \alpha C(t-1) + x(t) x^T(t). \tag{3}$$

In practice, the exponential forgetting factor $\alpha$ will be in the range $0.9 < \alpha < 1$. In this case, the eigenvectors of $C(t)$ are smoothly changing functions of time. A single inverse iteration per time update is therefore sufficient for tracking the minor subspace.

The classical routine (1) requires on the order of $O(N^3)$ operations per iteration. A sequential variant of the algorithm is therefore introduced. This method reduces the principal complexity to $O(N^2)$ operations per time update. Consider the following triangular square-root factorization of $C(t)$:

$$C(t) = R^T(t) R(t) \tag{4}$$

where $R(t)$ is an $N \times N$ upper-right triangular matrix. Observe that, according to (3), a time updating of the factorization can be expressed as

$$C(t) = \bigl[\alpha^{1/2} R^T(t-1) \;\; x(t)\bigr] \begin{bmatrix} \alpha^{1/2} R(t-1) \\ x^T(t) \end{bmatrix}. \tag{5}$$

We now introduce a sequence of Givens plane rotations represented by an $(N+1) \times (N+1)$ matrix $G(t)$ that satisfies $G^T(t) G(t) = I$. We can write

$$C(t) = \bigl[\alpha^{1/2} R^T(t-1) \;\; x(t)\bigr] G^T(t)\, G(t) \begin{bmatrix} \alpha^{1/2} R(t-1) \\ x^T(t) \end{bmatrix}. \tag{6}$$

The rotors in $G(t)$ can now be determined so that

$$G(t) \begin{bmatrix} \alpha^{1/2} R(t-1) \\ x^T(t) \end{bmatrix} = \begin{bmatrix} R(t) \\ 0^T \end{bmatrix}. \tag{7}$$

The procedure is easily identified as a classical QR-factor update of the type "annihilate update vector by circular rotation," as known from array processing [9]–[11]. The matrix $G(t)$, hence, comprises elementary rotors of the type

$$G_n(t) = \begin{bmatrix} I_{n-1} & & & \\ & c_n & & s_n \\ & & I_{N-n} & \\ & -s_n & & c_n \end{bmatrix} \tag{8}$$

where $c_n = \cos \vartheta_n$, $s_n = \sin \vartheta_n$, and $c_n^2 + s_n^2 = 1$. The application of these rotors is illustrated in the following example of $N = 3$, where the symbol $\times$ denotes a nonzero matrix element:

$$
\begin{bmatrix}
\times & \times & \times \\
 & \times & \times \\
 & & \times \\
\times & \times & \times
\end{bmatrix}
\xrightarrow{G_1(t)}
\begin{bmatrix}
\times & \times & \times \\
 & \times & \times \\
 & & \times \\
 & \times & \times
\end{bmatrix}
\xrightarrow{G_2(t)}
\begin{bmatrix}
\times & \times & \times \\
 & \times & \times \\
 & & \times \\
 & & \times
\end{bmatrix}
\xrightarrow{G_3(t)}
\begin{bmatrix}
\times & \times & \times \\
 & \times & \times \\
 & & \times \\
 & &
\end{bmatrix}
$$

As illustrated above, the overall Givens rotation matrix is defined as a chain product of individual plane rotations:

$$G(t) = G_N(t) \cdots G_2(t) G_1(t).$$

Table I is a quasicode listing of this square-root QR inverse iteration minor subspace tracker. Note that $C(t)$ is never formed explicitly by this algorithm. The algorithm can be initialized with a zero triangular matrix, provided the forward- and back-substitution routines, as well as the QR factorizer, are equipped with the usual exception processing to handle degenerate cases. This will be required anyway in cases of nonpersistent excitation, i.e., temporarily vanishing input signals. Of course, the algorithm could also be started from a batch solution, if desired.
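As an illustration of one complete time update, the following is a minimal NumPy sketch assembled from (1) and (3)–(7). It is a sketch under the stated assumptions, not a reproduction of the Table I listing: the function and variable names are mine, and the exception processing for degenerate cases is reduced to a simple zero-pivot guard.

```python
import numpy as np
from scipy.linalg import solve_triangular

def qri_mst_update(R, Q, x, alpha=0.99):
    """One time update of a square-root QR inverse iteration minor
    subspace tracker (illustrative sketch). R is the N x N upper
    triangular factor of C(t-1), Q is the N x r orthonormal minor
    subspace estimate, and x is the new data vector."""
    N, r = Q.shape
    # QR-factor time update (5)-(7): annihilate the update vector x
    # against the scaled triangular factor by N Givens plane rotations.
    R = np.sqrt(alpha) * R
    x = x.astype(float)
    for n in range(N):
        rho = np.hypot(R[n, n], x[n])
        if rho == 0.0:
            continue  # degenerate case, e.g., nonpersistent excitation
        c, s = R[n, n] / rho, x[n] / rho
        Rn, xn = R[n, n:].copy(), x[n:].copy()
        R[n, n:] = c * Rn + s * xn   # rotated row n of R(t)
        x[n:] = -s * Rn + c * xn     # element x[n] is annihilated
    # Single inverse iteration (1): solve C(t) Z = Q with C(t) = R^T R
    # by forward and back substitution; C(t) is never formed explicitly.
    Y = solve_triangular(R, Q, trans='T', lower=False)  # R^T Y = Q
    Z = solve_triangular(R, Y, lower=False)             # R Z = Y
    Q, Rr = np.linalg.qr(Z)  # re-orthonormalization of the iterate
    return R, Q, Rr
```

According to (2), the magnitudes of the diagonal elements of the returned triangular factor Rr estimate the inverse minor eigenvalues in reverse order, so their reciprocals directly yield minor eigenvalue estimates.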
Like most numerical linear algebra routines, this algorithm is also easily generalized to complex data.
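As a usage sketch, the short driver below generates a stationary test sequence with a prescribed covariance, in the spirit of the data generation method (9)–(11) of Section III-A, and feeds it to the qri_mst_update function sketched above. The dimensions, eigenvalues, and seed are illustrative choices, not the exact experimental conditions used later.

```python
import numpy as np

rng = np.random.default_rng(0)
N, r, alpha = 4, 2, 0.99

# Prescribed covariance C_x = U diag(lam) U^T; the values are illustrative.
lam = np.array([4.0, 3.0, 2.0, 1.0])
U, _ = np.linalg.qr(rng.standard_normal((N, N)))  # random orthonormal basis

R = 1e-3 * np.eye(N)   # small nonzero start instead of the zero matrix
Q = np.eye(N, r)       # arbitrary orthonormal initial basis

for t in range(3000):
    x = U @ (np.sqrt(lam) * rng.standard_normal(N))  # x(t) = U_x Lam^(1/2) w(t)
    R, Q, Rr = qri_mst_update(R, Q, x, alpha)

# Q should now span the minor subspace spanned by the last r columns of U;
# the singular values below are the principal angle cosines (near 1).
print(np.linalg.svd(U[:, -r:].T @ Q, compute_uv=False))
```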
III. SIMULATION RESULTS AND COMPARISONS

Simulation results are shown in this section for the new square-root QR inverse iteration minor subspace tracker (QRI-MST) and for the recently introduced Douglas-Kung-Amari minor subspace tracker (DKA-MST) of [1, eq. (3)]. The DKA-MST is used as a reference method for comparisons.

A. Specification of Simulation Data Sequences

We first describe the generation of the simulation data sequences. A first method is the given data covariance case. In this case, a test data sequence $x(t)$ is specified by a given covariance matrix $C_x$. The EVD of $C_x$ is computed, yielding the eigenvalues in $\Lambda_x$ and a basis set $U_x$ as follows:

$$C_x = U_x \Lambda_x U_x^T. \tag{9}$$

A zero-mean white Gaussian vector process $w(t)$ is generated as

$$E\{w(t) w^T(t)\} = I. \tag{10}$$

A simulation data sequence $x(t)$ is generated by

$$x(t) = U_x \Lambda_x^{1/2} w(t). \tag{11}$$

This sequence satisfies $E\{x(t) x^T(t)\} = C_x$, as desired.

In a second series of experiments, a simulation data sequence is only specified by the eigenvalues of its covariance matrix. A zero-mean random matrix $Y$ is generated, and an "eigenvector" matrix $U_x$ is computed via QR factorization of $Y$:

$$Y = U_x R_Y \quad \text{(QR-factorization)}. \tag{12}$$

Again, a simulation sequence with covariance $C_x = U_x \Lambda_x U_x^T$ can be generated, as shown above.

B. Performance Criterion

The distance between the tracked minor subspace spanned by $Q(t)$ and the true minor subspace spanned by $W$ must be monitored in each time step. As a performance criterion, we use the so-called subspace SNR (SSNR), as introduced in [12]. In summary, this criterion is defined as follows. Let $W$ and $Q$ denote orthonormal basis sets of two subspaces of dimension $N \times r$. Introduce

$$S = W W^T Q \tag{13}$$

and define

$$\mathrm{SSNR} = 10 \log_{10} \frac{\operatorname{tr}(S^T S)}{\operatorname{tr}\bigl[(Q - S)^T (Q - S)\bigr]}. \tag{14}$$

The SSNR is a particularly meaningful criterion for estimating the quality of subspace estimates because it represents the log-ratio of true signal and error powers in a subspace estimate and is hence expressed in decibels like a conventional signal-to-noise ratio (SNR). To see this, consider the orthogonal decomposition of $Q$ with respect to the true minor subspace spanned by the column vectors of $W$:

$$Q = S + E. \tag{15}$$

In this expression, $S$ denotes the true signal component of $Q$ that can be represented in the true subspace spanned by $W$. Additionally, we have $S^T E = 0$. Hence, $E = Q - S$ represents an orthogonal error subspace. Consequently, the SSNR criterion comprises the ratio of the squared Frobenius norms of signal and noise components in the estimated subspace as follows:

$$\mathrm{SSNR} = 10 \log_{10} \frac{\operatorname{tr}(Q^T W W^T Q)}{\operatorname{tr}(Q^T Q) - \operatorname{tr}(Q^T W W^T Q)}. \tag{16}$$

Another distance function often used in linear algebra for comparing the closeness of subspaces is the set of principal angles between the two subspaces [3]. Sometimes, only the dominant principal angle performance is investigated and displayed. However, this is generally a rather incomplete analysis, as all the smaller principal angles represent error components in a subspace estimate as well. Consequently, all principal angles must be considered in a complete analysis of subspace estimator performance. Since the SSNR is a criterion that comprises all error components of a subspace estimate, it must be possible to express it in terms of the complete set of principal angles. Let $\theta_1, \theta_2, \ldots, \theta_r$ denote the principal angles between the subspaces spanned by $W$ and $Q$. It can be shown (see [12, Appendix]) that the SSNR comprises all principal angles in the following form:

$$\mathrm{SSNR} = 10 \log_{10} \frac{\sum_{k=1}^{r} \cos^2 \theta_k}{\sum_{k=1}^{r} \sin^2 \theta_k}. \tag{17}$$

C. Results

We show results from two experiments. In a first series of experiments, we duplicated the experimental conditions from [1] and generated a vector process $x(t)$ with covariance

(18)

The eigenvalues of this matrix are $\lambda_1, \ldots, \lambda_4$ as given in [1]. Each trial run comprises a sequence of consecutive realizations of $x(t)$. Following the experiments in [1], the tracked minor subspace dimension was fixed to a value of $r = 2$.

Besides the usual start-up characteristics from an empty memory, we also wanted to study the adaptation characteristics of the algorithms. For this purpose, a nonstationarity was implemented at a fixed time instant in terms of an abrupt 90° subspace rotation.
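The following small sketch makes the performance criteria concrete: the SSNR in both forms (16) and (17), together with the deviation-from-orthogonality criterion (19) used further below. The function names are mine; the formulas follow (13)–(17) and (19).

```python
import numpy as np

def ssnr_db(W, Q):
    """Subspace SNR (16); W (true) and Q (estimate) are N x r
    orthonormal basis matrices."""
    S = W @ (W.T @ Q)              # signal component of Q, cf. (15)
    sig = np.trace(S.T @ S)        # squared Frobenius norm of S
    err = np.trace(Q.T @ Q) - sig  # squared Frobenius norm of E = Q - S
    return 10.0 * np.log10(sig / err)

def ssnr_db_angles(W, Q):
    """Equivalent principal-angle form (17); the cosines of the
    principal angles are the singular values of W^T Q [3]."""
    c2 = np.linalg.svd(W.T @ Q, compute_uv=False) ** 2
    return 10.0 * np.log10(c2.sum() / (1.0 - c2).sum())

def devortho(W):
    """Deviation from orthogonality (19): off-diagonal absolute mass of
    W^T W relative to its diagonal absolute mass."""
    G = np.abs(W.T @ W)
    return (G.sum() - np.trace(G)) / np.trace(G)
```

For orthonormal bases, the two SSNR forms agree to machine precision, which serves as a quick consistency check.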
Fig. 1. Average SSNR characteristics of the DKA-MST for different values of the learning factor in an experiment with N = 4 and r = 2.
Fig. 2. Average SSNR characteristics of the QRI-MST for different values of the forgetting factor in the same experiment with N = 4 and r = 2.
This subspace rotation was implemented as a simple eigenvector exchange in the eigenvector matrix of the random simulation sequence generator (9). Hence, all elements in the underlying $C_x$ are changed abruptly at the switching instant, while the eigenvalues in $\Lambda_x$ remain unchanged. The tracking characteristics of both the QRI-MST and the DKA-MST were studied using this data. Three experiments were carried out for each algorithm using different values of the exponential forgetting factor $\alpha$ in the case of the QRI-MST and different values of the learning factor in the case of the DKA-MST. Each experiment comprises 32 statistically independent trial runs. The SSNR values were computed in each time step of each trial run. The average SSNR of the 32 independent trial runs in each experiment was monitored as a measure of tracking and steady-state performance.

Fig. 1 shows the three average SSNR tracks of the DKA-MST algorithm for three values of the learning factor. Fig. 2 shows the corresponding average SSNR tracks of the QRI-MST algorithm for three values of the exponential forgetting factor $\alpha$. We can see that the operation of the DKA-MST is dominated by a tradeoff between adaptation speed and steady-state performance, as expected with gradient techniques. Small values of the learning factor yield a better steady-state performance, manifested in higher average
Fig. 3. Minor eigenvalue tracks of the QRI-MST algorithm in the N = 4 and r = 2 experiment. Tracks of 16 independent trial runs are overlaid. The exponential forgetting factor was α = 0.993. Dashed lines indicate true eigenvalues.
SSNR values. The price paid is a significant loss in start-up and tracking speed. Comparing this with the SSNR tracks of the QRI-MST algorithm shown in Fig. 2, we see that this algorithm produces subspace estimates of a significantly better quality. For comparable tracking characteristics, the gain in steady-state SSNR performance is approximately 6 dB.

Next, we compare the start-up characteristics of the two algorithms. The empty-memory start-up characteristic of the QRI-MST is practically constant and is not a function of the particular value of the exponential forgetting factor $\alpha$. Comparing this with the DKA-MST, we see that this algorithm is significantly slower; additionally, its start-up characteristic is a function of the learning factor.

Besides a tracking of the minor subspace basis, a simultaneous tracking of the associated minor subspace eigenvalues can be of interest as well. We can see from (2) that the QRI-MST produces estimates of the inverse minor eigenvalues in reverse order on the main diagonal of the triangular factor $R_i$. Fig. 3 shows these estimated minor eigenvalue tracks of the QRI-MST algorithm for 16 independent trial runs with exponential forgetting factor $\alpha = 0.993$. The dashed lines indicate the true eigenvalues. The DKA-MST algorithm produces no minor eigenvalue estimates.

In this first example, a fast algorithm does not speed up the computations because the order/rank ratio $N/r$ is simply too small. We therefore consider a larger problem with $N = 20$ and $r = 4$. In this case, we specified the eigenvalues of the data covariance matrix as a linearly descending sequence in steps of 0.1. The associated eigenvectors were generated according to (12) as the orthonormal matrix in the QR-factorization of a 20 × 20 random matrix. Fig. 4 shows the average SSNR tracks of the DKA-MST, again computed from 32 independent trial runs in three independent experiments with different learning factors. In all cases, we observed very slow start-up and tracking characteristics of the DKA-MST. For the smallest learning factor, we can expect an acceptable steady-state performance, but the start-up and tracking characteristics will be unacceptably slow. On the other hand, if the learning factor is increased, the algorithm exhibits only a slightly faster start-up and tracking
Fig. 4. Average SSNR characteristics of the DKA-MST for different values of the learning factor in an experiment with N = 20 and r = 4.
Fig. 6. Minor eigenvalue tracks of the QRI-MST algorithm in the N = 20 and r = 4 experiment. Tracks of eight independent trial runs are overlaid. The exponential forgetting factor was α = 0.996. Dashed lines indicate true eigenvalues.
Fig. 5. Average SSNR characteristics of the QRI-MST for different values of the forgetting factor in the same experiment with N = 20 and r = 4.
Fig. 7. Deviation from orthogonality tracks as a function of the learning factor for the DKA-MST in the N = 20 and r = 4 experiment.
characteristics; the steady-state performance, however, is severely degraded. For still larger values of the learning factor, the algorithm diverged with a numerical data processor (NDP) error. Additionally, we found that the learning characteristics of the DKA-MST are not independent of the input data power. Even slight temporal fluctuations of the input data power can drive the DKA-MST into an unstable state when the value of the learning factor is too large. In the QRI-MST, on the other hand, the adaptation characteristics are completely independent of the input data power. Additionally, the QRI-MST is unconditionally stable in all cases. Fig. 5 displays the average SSNR characteristics of the QRI-MST algorithm using the same data with $N = 20$ and $r = 4$. A comparison with Fig. 4 reveals that the QRI-MST performance is superior to that of the DKA-MST in every respect. Generally, our experiments have shown that the performance loss of the DKA-MST becomes more drastic for increasing problem size $N$ and increasing order/rank ratios $N/r$.

Fig. 6 shows the four minor eigenvalue tracks as obtained from eight independent trial runs of the QRI-MST with an exponential forgetting factor of $\alpha = 0.996$.

Finally, we should note that the DKA-MST does not produce unconditionally orthonormalized minor subspace basis vectors. We observed that the deviation from orthogonality increases as we step to higher values of the learning factor. This is visualized
in Fig. 7, where we display a criterion devortho (deviation from orthogonality) that is defined as follows:

$$\mathrm{devortho} = \frac{\operatorname{sum}\bigl|W^T(t) W(t)\bigr| - \operatorname{sumdiag}\bigl|W^T(t) W(t)\bigr|}{\operatorname{sumdiag}\bigl|W^T(t) W(t)\bigr|} \tag{19}$$

where sum denotes the absolute sum of all elements of the argument matrix, sumdiag denotes the absolute sum of all diagonal elements of the argument matrix, and $W(t)$ denotes the basis estimate produced by the tracker. We therefore applied an additional column orthonormalization to all minor subspace basis estimates produced by the DKA-MST before we computed the SSNR, for a correct assessment.

A question sometimes raised with subspace-iteration-type algorithms like the QRI-MST is whether an application of more than a single iteration in each time step could improve the overall estimation and tracking results. The answer is that no further improvement can be expected beyond the first iteration in each time step. To see this, we computed the true minor subspace of the exponentially updated covariance matrix (3) up to machine accuracy in each time step of the $N = 20$, $r = 4$ experiment and used it as a reference for the estimated subspace of the QRI-MST tracker (see Table I). The gap between this estimated minor subspace of the updated covariance matrix
and the exact minor subspace of the same updated covariance matrix is displayed in Fig. 8 in terms of the dominant principal angle in each time step for the three exponential forgetting factors. It can be seen that after a very short initial transient, the dominant principal angle drops down to very small values. This remarkable accuracy of the tracker holds for all reasonable choices of the exponential forgetting factor and holds even in the case of sudden 90° subspace rotations. Again, we emphasize that this single-iteration-per-time-step optimality is characteristic for all trackers of the subspace iteration class [6]–[8], [12].

TABLE I
QUASICODE OF THE SQUARE-ROOT QR INVERSE ITERATION MINOR SUBSPACE TRACKER

Fig. 8. Dominant principal angle trajectories between true and QRI-MST estimated subspaces in the N = 20 and r = 4 experiment. Three curves for different exponential forgetting factors are displayed in one diagram.

IV. CONCLUSIONS

The recent advent of fast algorithms for minor subspace tracking has motivated this study. We introduced a rather simple, "nonfast" QR-based recursive algorithm, the QRI-MST, for minor subspace tracking and used it as a basis for detailed experimental comparisons with the DKA-MST algorithm, a fast minor subspace tracker proposed in [1]. Our experiments have shown that it can be rather worthwhile to expend the $O(N^2)$ operations of the QR-based tracker because of its superior performance, excellent stability, and versatile characteristics. All operations required in the QRI-MST algorithm are standards in numerical data processing, and most of them have even been cast in systolic arrays.

REFERENCES
[1] S. C. Douglas, S.-Y. Kung, and S. Amari, "A self-stabilized minor subspace rule," IEEE Signal Processing Lett., vol. 5, no. 12, pp. 328–330, Dec. 1998.
[2] P. Comon and G. H. Golub, "Tracking a few extreme singular values and vectors in signal processing," Proc. IEEE, vol. 78, pp. 1327–1343, Aug. 1990.
[3] G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd ed. Baltimore, MD: Johns Hopkins Univ. Press, 1989.
[4] G. W. Stewart, "Methods of simultaneous iteration for calculating eigenvectors of matrices," in Topics in Numerical Analysis II, J. J. H. Miller, Ed. New York: Academic, 1975, pp. 169–185.
[5] N. L. Owsley, "Adaptive data orthogonalization," in Proc. IEEE ICASSP, Tampa, FL, 1978, pp. 109–112.
[6] P. Strobach, "Low-rank adaptive filters," IEEE Trans. Signal Processing, vol. 44, pp. 2932–2947, Dec. 1996.
[7] P. Strobach, "Bi-iteration SVD subspace tracking algorithms," IEEE Trans. Signal Processing, vol. 45, pp. 1222–1240, May 1997.
[8] P. Strobach, "Square Hankel SVD subspace tracking algorithms," Signal Process., vol. 57, no. 1, pp. 1–18, Feb. 1997.
[9] W. M. Gentleman and H. T. Kung, "Matrix triangularization by systolic arrays," in Proc. SPIE Int. Soc. Opt. Eng., vol. 298, 1981, p. 16.
[10] J. G. McWhirter, "Recursive least squares minimization using a systolic array," in Proc. SPIE Int. Soc. Opt. Eng., vol. 431, 1983, pp. 18–26.
[11] P. Strobach, Linear Prediction Theory: A Mathematical Basis for Adaptive Systems. Berlin, Germany: Springer-Verlag, 1990, vol. 21.
[12] P. Strobach, "Equirotational stack parametrization in subspace estimation and tracking," IEEE Trans. Signal Processing, vol. 48, pp. 712–722, Mar. 2000.
Peter Strobach (M'86–SM'91) received the Engineer's degree in electrical engineering from Fachhochschule Regensburg, Regensburg, Germany, in 1978, the Dipl.-Ing. degree from Technical University Munich, Munich, Germany, in 1983, and the Dr.-Ing. (Ph.D.) degree from Bundeswehr University, Munich, in 1985.

From 1976 to 1977, he was with CERN Nuclear Research, Geneva, Switzerland. From 1978 to 1982, he was with Messerschmitt-Bölkow-Blohm GmbH, Munich. From May 1986 to December 1992, he was with Siemens AG, Zentralabteilung Forschung und Entwicklung (ZFE), Munich. In the summer of 1990, he taught the first adaptive filter course ever held in Germany, at the University of Erlangen, Erlangen, Germany. In January 1993, he joined the Faculty of Fachhochschule Furtwangen (Black Forest), Furtwangen, Germany. From March to September 1998, he was a Visiting Professor with the Department of Mathematics, University of Passau, Passau, Germany.

Dr. Strobach is listed in all major Who's Whos, including Who's Who in the World.