Paper to appear in Prat, A. (ed.) COMPSTAT '96 Proceedings in Computational Statistics. Physica Verlag, Heidelberg, 1996.

An iterative projection algorithm and some simulation results

Michael G. Schimek

Medical Biometrics Group, University of Graz Medical School, A-8036 Graz, Austria, and Department of Mathematics and Statistics, University of Klagenfurt, A-9020 Klagenfurt, Austria

Abstract. An iterative projection method for large linear equation systems is described. It has favourable properties with respect to many statistical applications. A major advantage is that convergence can be established without restrictions on the system matrix; hence neither diagonal dominance nor regularity is required. The reason why this numerical method has not been much used in computational statistics is its slow convergence behaviour. In this paper we introduce a relaxation concept, and the optimal choice of the relaxation parameter, even for nearly singular systems, is studied in a simulation experiment.

Keywords. Iterative projections, lack of diagonal dominance, linear equation systems, non-parametrics, regression, regression fitting, relaxation, singularity, simulations, time series fitting

1 Introduction

The estimation of many parametric, non- as well as semiparametric statistical models involves the solution of large linear equation systems. Up to now iterative procedures like Jacobi, Gauss-Seidel or backfitting have been applied. As Schimek, Neubauer and Stettner (1994) pointed out, only a small portion of the established numerical methods are actually implemented in statistical software packages. In particular there is a lack of acceleration techniques. Our main criticism of the above-mentioned methods for the solution of linear equations is the fact that they require certain characteristics of the system matrix. Features like diagonal dominance or regularity cannot be taken for granted for all statistical estimation problems of interest. Very often we have a situation of near-singularity. For instance, when generalized additive models (Hastie and Tibshirani 1990) or nonlinear additive autoregressive models (Härdle and Chen 1995, p.379ff) are evaluated by means of linear scatterplot smoothers such as smoothing splines, ill-posed normal equations cannot be ruled out. The weighting scheme imposed on the data by the smoothing spline (its equivalent kernel) is likely to cause such problems. We propose a column-oriented iterative projection method for which convergence can be established independently of such features of the system matrix. Hence we can deal with many applications where e.g. backfitting does not always provide a proper solution. Such applications are the non-parametric estimation of additive and some semiparametric regression models as well as certain time series models.

2 An iterative projection method

The linear equation systems we have to solve are of the form $Ax = b$ in $x$. In most statistical applications we can assume a square $(n \times n)$ system matrix $A$ and $n$-dimensional vectors $x$ and $b$. The matrix $A$ is usually large and sparse. Different from standard iterative procedures, which are equation-oriented, the iterative projection method is column-oriented. This idea was first brought up by de la Garza (1951). Modern computing power has made the projection concept interesting again. Schimek, Stettner and Haberl (1995) developed the idea further. They could give an algorithm, but the convergence behaviour remained a problem. In this paper we introduce relaxation from the start to gain a higher efficiency of the algorithm.

We assume the matrix $A = (a_1, a_2, \ldots, a_n)$ to consist of column vectors $a_i$ for $i = 1, 2, \ldots, n$. The vectors $x$ and $b$ are defined as above. Let us have a linear space $sp(A)$ generated by the columns of $A$ and

$$b \in sp(A).$$

We define two real sequences. The one is $(\mu_j)$ with

$$\Bigl\| b - \sum_{i=1}^{j} \mu_i a_i \Bigr\| \to 0, \qquad j = 1, 2, \ldots,$$

where the $a_i$ are the column vectors of $A$ as defined above, continued cyclically by $a_i := a_p$ with $p = 1 + (i-1) \bmod n$ for $i > n$. But we could also consider a permutation $a_{\pi(1)}, a_{\pi(2)}, \ldots, a_{\pi(n)}$ of the $a_1, a_2, \ldots, a_n$ in each cycle. According to Murty (1983, p.457) this should improve the speed of convergence. The other sequence is $(s_{jk})$ defined by

$$s_{jk} = \sum_{\substack{i \le nk \\ a_i = a_j}} \mu_i, \qquad j = 1, \ldots, n; \quad k = 1, 2, \ldots \tag{1}$$

In the terminology of Maess (1988, p.113ff) this method produces an instationary iteration process. Further, it is geometrically motivated. In the $j$-th iteration step $\mu_j$ is determined by the orthogonal ("perpendicular") projection of the previous "unexplained" residual component $u_{j-1}$ onto the dimension $a_j$. This means that the coefficients $\mu_j$ can be calculated by dot (inner) products. Hence

$$\mu_j = \frac{(u_{j-1}, a_j)}{(a_j, a_j)}, \tag{2}$$

where

$$u_{j-1} = b - \sum_{i=1}^{j-1} \mu_i a_i,$$

which makes the geometric interpretation clear. Because the norm (length) of $u_j$ is shrinking, convergence can be expected. Because of (1) each element $x_i$ of the solution vector $x$ can be calculated by

$$x_i = \sum_{l} \mu_l, \qquad l = i + nk, \quad k = 0, 1, 2, \ldots$$

The necessary condition is that the residual components $u_j$ tend to zero. For a formal proof see Stettner (1994, p.156f).

In summary, the proposed iterative projection method can be characterized as follows. Firstly, it always converges, because convergence does not depend on special features of the system matrix $A$, such as positive definiteness or diagonal dominance. Many statistical computations meet these requirements, but they cannot always be guaranteed. Secondly, even for singular systems a solution can be obtained. This is a highly important aspect because the condition of a matrix $A$ can deteriorate in a complicated algorithm. As already pointed out, some statistical problems tend to produce ill-conditioned systems. These strong points go along with one drawback: the iterative projection method is slower than the usual iterative procedures. But to improve the speed of convergence an acceleration concept can be brought into effect.

3 Relaxation

Most recently there has been some effort to establish acceleration techniques. The simplest idea, again geometrically motivated, is to introduce a relaxation parameter $\omega$ in equation (2). This leads to

$$\mu_j' = \frac{(\omega u_{j-1}, a_j)}{(a_j, a_j)}, \tag{3}$$

where

$$u_{j-1} = b - \sum_{i=1}^{j-1} \mu_i' a_i.$$

$\omega$ should influence the sequence of orthogonal projections (dot products) in such a way that fewer iteration steps are necessary until convergence (Euclidean norm $\|u_j\|$ less than some $\epsilon$). The crucial point is whether the iterative projection method maintains its desirable characteristics, first of all convergence. In addition we have to learn how to choose appropriate $\omega$ values. As a matter of fact, Stettner (1994) could prove convergence under relaxation for an admissible interval of the relaxation parameters $\omega$. The admissible values of $\omega$ can be estimated from

$$\|u'\|^2 = \Bigl\| u - \omega \frac{(u, a)}{(a, a)}\, a \Bigr\|^2 = \|u\|^2 - (2\omega - \omega^2)\, \frac{(u, a)^2}{(a, a)}. \tag{4}$$

Because of (4), $0 < \omega < 2$ is required in the case of convergence.

4 Algorithm and implementation

Let us have the following starting values: u = b, x = 0, where u and x are vectors, and mu = 0, k = 0, where mu holds the coefficients of one sweep and k is the iteration counter. For the relaxed iterative projection method the algorithm can be written like this:

    while not break
        uTemp = u
        for i = 1 to n
            mu(i) = InnerProd(omega * uTemp, a(i)) / InnerProd(a(i), a(i))
            uTemp = uTemp - mu(i) * a(i)
        for j = 1 to n
            x(j) = x(j) + mu(j)
        u = uTemp
        term = EuclidNorm(u)
        if term < 1.0e-12
            break
        k = k + 1
        if k > MaxIter
            break

It has a recursive structure and its primary calculation is the inner (dot) product. The dot products are accumulated in double precision, and the dot product itself has a good relative numerical error (see Golub and van Loan 1989, p.65 for details). Hence our algorithm is very reliable. In addition we can take advantage of patterns in the system matrix A (e.g. structural zeros) during the calculation of the inner products. For instance, for bandlimited systems substantial computer time can be saved. The program is coded using Microsoft Visual C++. It is based on the Microsoft foundation classes and the document view architecture. For the purpose of using the relaxed iterative projection algorithm on platforms other than 486 or Pentium under Windows it is implemented in a separate class.

5 Outline of simulations

Because there is no mathematical theory how to choose the optimal parameter $\hat\omega$, a simulation experiment was undertaken. For a range of feasible $\omega$ values extensive simulations were carried out for non-singular as well as singular system matrices. We had the dimension $n$ of the system vary between 3 and 50. Permutations were not applied to the column vectors $a_i$ of $A$; better performance of the algorithm was solely achieved through relaxation. Applying equation (4), we took 20 equally spaced values covering the interval $0 < \omega < 2$. The value $\omega = 1$ for unrelaxed iterations was included. The emphasis was on arbitrary system matrices $A$, either diagonally non-dominant or nearly singular. Singular (square) systems in a strict sense are of no relevance from a statistical point of view. Further, a few bandlimited systems have been studied too; they are most frequent in statistical applications. Double precision arithmetic was used throughout. The iterative solutions were calculated up to machine precision (Euclidean norm term in the algorithm less than $\epsilon = 1.0\mathrm{e}{-12}$) on a Pentium platform under Microsoft Windows. The criterion for the evaluation of the relaxed iterative projection algorithm was the number of iterations until convergence. The minimum number was identified and the associated estimate compared with the unrelaxed result. As an additional indicator, computer time in milliseconds (Windows does not allow for a more precise measuring) was calculated.

6 Simulation results

We first consider an example where $A$ is regular and does not have diagonal dominance:

$$\begin{pmatrix} 1 & 1 & 2 & 1 \\ 1 & -1 & 1 & 1 \\ 3 & -1 & 0 & 1 \\ 2 & -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 3 \\ -3 \\ -2 \\ 4 \end{pmatrix}.$$

The minimum iteration number was obtained for $\hat\omega = 1.7$. In Table 1 the exact and the estimated results for the vector $x$ are displayed.

Table 1: Results for standard and relaxed iterative projections

          exact x    estimated x^ (omega = 1, l = 1905)    estimated x^ (omega^ = 1.7, l = 268)
    x1    -6         -5.999999999866273                    -5.999999999849062
    x2     9.5        9.499999999853367                     9.499999999971044
    x3   -13        -12.999999999750250                   -12.999999999854770
    x4    25.5       25.499999999505280                    25.499999999958805

This example has been chosen because it asks for a highly efficient algorithm comprising an acceleration technique. It turns out that relaxation works extremely well. The required number of iterations $l$ is reduced from 1905 to 268 for $\omega = 1.7$. At this point it should be mentioned that the majority of equation systems require significantly fewer iterations.

As a second example we analyse the above equation system with column vector $a_3$ modified such that $A$ becomes almost singular:

$$\begin{pmatrix} 1 & 1 & 2.2 & 1 \\ 1 & -1 & 0.1 & 1 \\ 3 & -1 & -0.1 & 1 \\ 2 & -1 & 0.2 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 3 \\ -3 \\ -2 \\ 4 \end{pmatrix}.$$

The minimum iteration number was obtained for $\hat\omega = 1.8$. In Table 2 the exact versus the estimated results for $\omega = 1.0$ and $\hat\omega = 1.8$ are shown.

Table 2: Results for standard and relaxed iterative projections

          exact x      estimated x^ (omega = 1, l = 7095)    estimated x^ (omega^ = 1.8, l = 631)
    x1     3.750        3.749999999878557                     3.749999999659472
    x2   -31.125      -31.124999998740990                   -31.124999999893610
    x3    32.500       32.499999998768710                    32.499999999689920
    x4   -41.125      -41.124999998419210                   -41.124999998419210

Again we find an excellent approximation to the exact solution vector $x$. Of course the required number of iterations has drastically increased due to near-singularity.

Moreover, the standard and the relaxed result are essentially the same. In our simulation experiment it became obvious that this is not true for singular systems in a strict sense: there, relaxed iterations tend to converge to different solutions. Summing up all the simulation results, we can say that relaxation always reduces the computational burden. The range of $\hat\omega$ values actually seen in the simulations was between 1.3 and 1.8. Hence it is sufficient in practice to consider this smaller interval compared to the theoretical result of equation (4). For nearly singular cases larger $\omega$ values can be recommended. For arbitrary regular systems the improvement in convergence speed is so essential that the performance of the iterative projection algorithm can be compared to the overall performance of classical iterative techniques. Moreover, it is also a powerful tool for the solution of bandlimited linear equation systems. As pointed out earlier, ill-posed linear equation systems should never be solved with classical techniques, as regularity of the system matrix is required throughout. The proposed algorithm has the potential to bridge this gap in numerical methodology. As a matter of fact, it should become standard in modern regression- and time series-oriented statistical software.

7 References

De la Garza, A. (1951) An iterative method for solving linear equations. Oak Ridge Gaseous Diffusion Plant, Rep. K-731, Oak Ridge, TN.

Golub, G. H. and van Loan, C. F. (1989) Matrix computations. Johns Hopkins University Press, Baltimore.

Hastie, T. J. and Tibshirani, R. J. (1990) Generalized additive models. Chapman and Hall, London.

Härdle, W. and Chen, R. (1995) Nonparametric time series analysis, a selective review with examples. Bulletin of the International Statistical Institute, LVI, 1, 375-394.

Maess, G. (1988) Projection methods solving rectangular systems of linear equations. J. Comp. Appl. Math. 24, 107-119.

Murty, K. G. (1983) Linear programming. Wiley, New York.

Schimek, M. G., Neubauer, G. and Stettner, H. (1994) Backfitting and related procedures for non-parametric smoothing regression: A comparative view. In Grossmann, W. and Dutter, R. (eds.) COMPSTAT '94 Proceedings in Computational Statistics. Physica, Heidelberg, 63-68.

Schimek, M. G., Stettner, H. and Haberl, J. (1995) An iterative projection method for nonparametric additive regression modelling. In Sall, J. and Lehman, A. (eds.) Computing Science and Statistics, 26. Interface Foundation of North America, 192-195.

Stettner, H. (1994) Iterierte Projektionen bei großen linearen Systemen. In Friedl, H. (ed.) Was ist angewandte Statistik? Festkolloquium anläßlich des 65. Geburtstages von Universitätsprofessor Dr. Josef Gölles. Grazer Mathematische Berichte, 324, 155-158.

Acknowledgement: The implementation of the algorithm in Microsoft Visual C++ by G. Orasche is gratefully acknowledged.
