Solution of large-scale sparse least squares problems ... - CiteSeerX

3 4456 0452343 5

3

.

ORNL/CSD-63

UCC-ND

S oIut ion of La rge-S c ale Sparse Least Squares Problems Using Auxiliary Storage A . George M. T. Heath R . J. Plemmons

Printed in the United States of America. Available from National Technical Information Service U.S. Department of Commerce 5285 Port Royal Road, Springfield, Virginia 22161 NTlS price codes-Printed Copy: A03; Microfiche A01

This report was prepared as an account of work sponsored by an agency of the United StatesGovernment Neither theU nited StatesGovernment nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United StatesGovernment or any agency thereof The views and opinions of authors expressed herein do not necessarily state or reflect those of theunited StatesGoverrirnent or any agency thereof

ORNL/CSD-63 D i s t r i b u t i o n Category UC-32

SOLUTION OF LARGE-SCALE SPARSE LEAST SQUARES PROBLEMS USING AUXILIARY STORAGE

A. George' M. T. Heath R. J . Plemmons2

Sponsor: Originator:

D. A. Gardiner M. T. Heath

' U n i v e r s i t y o f Waterloo--Ontario 2 U n i v e r s i t y o f Tennessee--Knoxvi 1 l e

Date Published

-

August 1980

COMPUTER SCIENCES D I V I S I O N at Oak Ridge N a t i o n a l L a b o r a t o r y Post O f f i c e Box Y Oak Ridge, Tennessee 37830

Research j o i n t l y sponsored by t h e A p p l i e d Mathematical Sciences Research Program, O f f i c e o f Energy Research; Canadian N a t u r a l Sciences and Engineering Research Council Grant A8111; The U.S. Army Research O f f i c e

Union Carbide Corporation, Nuclear D i v i s i o n operating the Oak Ridge Gaseous D i f f u s i o n P l a n t Paducah Gaseous D i f f u s i o n P l a n t Oak Ridge N a t i o n a l L a b o r a t o r y Oak Ridge Y-12 P l a n t under C o n t r a c t No. W-7405-eng-26 f o r the Department o f Energy

3 4456 0452341 5

iii CONTENTS L

Page

............... TABLES . . . . . . . . . . . . . . . . . . ABSTRACT . . . . . . . . . . . . . . . . . 1. INTRODUCTION AND OVERVIEW . . . . . . 1.1 I n t r o d u c t i o n . . . . . . . . . . 1.3

........... ........... ........... ............ ........... Examples o f Large-Scale Least Squares Problems . . . . Numerical Methods f o r Sparse Least Squares . . . . . .

1.4

The Computational Module

ILLUSTRATIONS

1.2

2.

DECOMPOSITION OF A

............... USING AUXILIARY STORAGE . . . . . . . . . .....................

2.1

Introduction

2.2

F i n d i n g an A p p r o p r i a t e O r d e r i n g and P a r t i t i o n i n g

. . . . . . . . . . . . . . . . . 2.3 Reduction o f A Using A u x i l i a r y Storage . . . . . . . . 3. NUMERICAL EXPERIMENTS AND OBSERVATIONS . . . . . . . . . . . 3.1 I n t r o d u c t i o n . . . . . . . . . . . . . . . . . . . . . 3.2 T e s t Problems . . . . . . . . . . . . . . . . . . . . . 3.3 Numerical Experiments . . . . . . . . . . . . . . . . . 3.4 Observations . . . . . . . . . . . . . . . . . . . . . REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . f o r t h e Columns o f A

V

vii 1 2 2 2 3 5

7 7

7 13 21 21 21 24

27 29

.

V

.

ILLUSTRATIONS Page

Figure 2.1

An example of a nested dissection partitioning of a graph and an induced numbering of i t s nodes . . .

..

10

2.2

A 23 by 9 sparse matrix A partitioned by columns according t o P and the induced row partitioning R

. .

11

2.3

Pictorial description of the f i r s t step of the IPI step reduction of A t o upper trapezoidal form .

. . .

14

2.4

Depiction of the second step of the block-reduction of A t o upper trapezoidal form . . . . . . . . .

15

2.5

Block s t r u c t u r e of Ai

16

2.6

Diagram of the data flow of the algorithm f o r reducing A t o upper trapezoidal form . . . . . . . . . . . . .

20

3.1

A 3 by 3 f i n i t e element grid and i t s associated 16 by 9 l e a s t squares c o e f f i c i e n t matrix a r i s i n g in the natural f a c t o r formulation of the f i n i t e element method . . . . . . . . . . . . . . . . . . . . . . .

22

Idealization o f a 3 by 3 geodetic network having connecting survey chains of length 4 . . . . .

23

3.2

.. .................

...

vi i

TABLES

Tab1 e 3.1

3.2

Page Summary of t e s t r e s u l t s showing storage and execution times f o r a sequence of problems, where the maximum t o t a l amount of f a s t storage i s fixed a t a b o u t 30,000words. . . . . . . . . . . . . . . . . . . . .

25

Summary of r e s u l t s f o r the solution of a single problem from each c l a s s , using d i f f e r e n t levels of blocking . . . . . . . . . . . . . . . . . . . .

26

.

SOLUTION OF LARGE-SCALE SPARSE LEAST SQUARES PROBLEMS USING AUXILIARY STORAGE

A. George M. T. Heath R. J . Plemmons ABSTRACT Very l a r g e sparse l i n e a r l e a s t squares problems a r i s e i n a v a r i e t y o f a p p l i c a t i o n s , such as geodetic network adjustments, photogrammetry, earthquake s t u d i e s , and c e r t a i n types o f f i n i t e element analysis.

Many o f these problems a r e so l a r g e t h a t i t i s i m p o s s i b l e t o

s o l v e them w i t h o u t u s i n g a u x i l i a r y s t o r a g e devices.

Some problems a r e

so massive t h a t t h e s t o r a g e needed f o r t h e i r s o l u t i o n exceeds t h e v i r t u a l address space o f t h e l a r g e s t machines.

I n t h i s paper we de-

s c r i b e a method f o r s o l v i n g such problems on a t y p i c a l ( l a r g e ) computer and p r o v i d e t h e r e s u l t s o f some experiments i l l u s t r a t i n g t h e e f f e c t i v e -

ness o f o u r approach.

The method i n c l u d e s an automatic p a r t i t i o n i n g

scheme which i s e s s e n t i a l t o t h e e f f i c i e n t management o f t h e data on auxiliary files.

1

2 1,

Introduction and Overview

1.1 Introduction

I n this paper a method i s presented f o r solving the l i n e a r l e a s t squares problem

when the m

rank n.

x

n matrix A i s very large and sparse and has f u l l column

The problems we w i s h t o solve a r e so large t h a t the use of

a u x i l i a r y storage i s essential regardless of the numerical method employed.

I n some cases storage requirements may even exceed the v i r t u a l

address space o f the l a r g e s t machines, and therefore a u x i l i a r y space cannot be managed i m p l i c i t l y by a paging algorithm.

O u r approach i s t o

break the large problem u p i n t o smaller subproblems which a r e processed sequentially, eventually producing the solution t o the original problem. Such an approach requires a method f o r p a r t i t i o n i n g the large problem,

a computational module f o r processing the subproblems, and an algorithm f o r managing external f i l e s containing intermediate r e s u l t s .

I n the

remainder o f t h i s s e c t i o n , a f t e r g i v i n g some s p e c i f i c examples o f largescal e 1e a s t squares probl ems , we survey possi bl e numerical methods and then describe the p a r t i c u l a r technique we have chosen f o r the computational module.

In Section 2 , problem p a r t i t i o n i n g and data management

a r e discussed.

Section 3 presents numerical t e s t r e s u l t s and observa-

tions. 1 . 2 Examples o f Large-Scale Least Squares Problems

In recent years l e a s t squares problems o f ever-increasing s i z e have a r i s e n with ever-increasing frequency.

One reason f o r this i s

3

t h a t modern data acquisition technology allows the collection of massive amounts of data.

Another f a c t o r i s the tendency of s c i e n t i s t s t o

formulate more and more complex and comprehensive models i n order t o obtain f i n e r resolution and more r e a l i s t i c d e t a i l i n describing physical systems.

P a r t i c u l a r areas i n which such large-scale l e a s t squares

problems occur include geodetic surveying (Avila and Tomlin [1979], Go1 u b and Plemmons [ 1980]), photogrammetry (Go1ub , L u k and Pagano [1980]), earthquake s t u d i e s (Vanicek, E l l i o t t and C a s t i l e [1979]) , and i n the natural f a c t o r formulation o f the f i n i t e element method (Argyris

and Bronlund [1975],

Argyris e t a1 [1978]).

An example of t r u l y spec-

t a c u l a r s i z e i s the l e a s t squares adjustment of coordinates ( l a t i t u d e s and longitudes) o f s t a t i o n s comprising the North American Datum, t o be completed i n 1983 by the U.S.

National Geodetic Survey (Kolata [1978]).

T h i s enormous task requires solving, perhaps several times, a l e a s t

squares problem having six million equations i n four hundred thousand

unknowns. 1.3 Numerical Methods f o r Sparse Least Squares Several methods have been proposed f o r solving sparse l i n e a r l e a s t squares problems ( D u f f and Reid [1976], [1976]).

Bjorck [1976], Gill and Murray

For our purposes the most important q u a l i t i e s i n a numerical

method will be storage requirements, numerical s t a b i l i t y , and convenience i n u t i l i z i n g a u x i l i a r y storage. The c l a s s i c a l approach t o solving the l i n e a r l e a s t squares problem

i s via the system of normal equations T A Ax = AT b.

4 T The n by n symmetric p o s i t i v e d e f i n i t e m a t r i x B = A A i s f a c t o r e d u s i n g T Cholesky's method i n t o R R, where R i s upper t r i a n g u l a r , and t h e n x i s

T T computed by s o l v i n g t h e two t r i a n g u l a r systems R y = A b and Rx = y. T h i s a l g o r i t h m has s e v e r a l a t t r a c t i v e f e a t u r e s f o r l a r g e sparse problems. The Cholesky f a c t o r i z a t i o n does n o t r e q u i r e p i v o t i n g f o r s t a b i l i t y so t h a t the ordering f o r B (i.e.,

column o r d e r i n g f o r A) can be chosen

based on s p a r s i t y c o n s i d e r a t i o n s alone.

Moreover, t h e r e e x i s t s w e l l -

developed s o f t w a r e f o r d e t e r m i n i n g a good o r d e r i n g i n advance o f any numerical computation, thereby a l l o w i n g use o f a s t a t i c d a t a s t r u c t u r e . Another advantage i s t h a t t h e row o r d e r i n g o f A i s

r r e l e v a n t so t h a t

t h e rows o f A can be processed s e q u e n t i a l l y from an a u x i l i a r y i n p u t f i l e i n a r b i t r a r y o r d e r , and A need never be represented i n f a s t s t o r a g e i n i t s e n t i r e t y a t any one time. may be n u m e r i c a l l y u n s t a b l e .

U n f o r t u n a t e l y t h e nc:mal

equations method

T h i s i s due t o t h e p o t e n t i a l l o s s o f i n f o r -

T T mation i n e x p l i c i t l y f o r m i n g A A and A b y and t o t h e f a c t t h a t t h e cond i t i o n number of B i s t h e square o f t h a t o f A.

A well-known s t a b l e a l t e r n a t i v e t o t h e normal equations i s p r o v i d e d by orthogonal f a c t o r i z a t i o n (Golub [1965]).

An orthogonal m a t r i x Q i s

computed which reduces A t o upper t r a p e z o i d a l f o r m

(1.3)

QA =

EJ

,

Qb =

k] ,

where R i s n by n and upper t r i a n g u l a r . two-norm, we have

(1.4)

Since Q does n o t change the

and t h e r e f o r e t h e s o l u t i o n t o (1.1) i s o b t a i n e d by s o l v i n g t h e t r i a n g u l a r system Rx = y.

The m a t r i x Q u s u a l l y r e s u l t s from Gram-Schmidt

o r t h o g o n a l i z a t i o n o r from a sequence o f Householder o r Givens transformations.

Both t h e Gram-Schmidt and Householder a l g o r i t h m s process t h e

unreduced p a r t o f t h e m a t r i x A by columns and can cause severe intermediate fill-in.

The use o f Givens r o t a t i o n s i s much more a t t r a c t i v e i n

t h a t t h e m a t r i x i s processed by rows, g r a d u a l l y b u i l d i n g up R, and i n t e r m e d i a t e f i l l - i n i s c o n f i n e d t o t h e working row.

T h i s approach, i m -

plemented w i t h a good column o r d e r i n g and an e f f i c i e n t data s t r u c t u r e , i s t h e b a s i s f o r t h e computational module d e s c r i b e d i n s e c t i o n 1.4. Other d i r e c t non-normal -equations methods f o r sparse l e a s t squares , i n c l u d i n g those o f Peters and W i l k i n s o n [1970] (as implemented by B j o r c k and D u f f [1980])

and Hachtel [1976],

were considered, b u t these e l i m i n a -

t i o n methods r e q u i r e row and column p i v o t i n g f o r s t a b i l i t y as w e l l as s p a r s i t y , n e c e s s i t a t i n g access t o t h e e n t i r e m a t r i x .

This feature

g r e a t l y i n h i b i t s t h e p a r t i t i o n i n g o f l a r g e problems and t h e f l e x i b l e use o f a u x i l i a r y storage. 1.4 The Computational Module The numerical method we use i s developed i n d e t a i l i n George and Heath [1980].

Our m o t i v a t i o n i s t o combine t h e f l e x i b i l i t y , convenience

and low s t o r a g e requirements of t h e normal equations w i t h t h e s t a b i l i t y o f orthogonal f a c t o r i z a t i o n .

The b a s i c steps o f t h e a l g o r i t h m a r e as

f o l 1ows : Algorithm 1 1.

Orthogonal Decomposition o f A

T Determine t h e s t r u c t u r e ( n o t t h e numerical values) o f B = A A.

6 2.

F i n d an o r d e r i n g f o r B (column o r d e r i n g f o r A ) which has a

sparse Cholesky f a c t o r

3.

R.

S y m b o l i c a l l y f a c t o r i z e t h e r e o r d e r e d B y g e n e r a t i n g a row-

oriented data s t r u c t u r e f o r

4.

R.

Compute R by processing t h e rows o f A one by one u s i n g Givens

rotations. Steps 1 through 3 o f A l g o r i t h m 1 a r e t h e same as would be used i n a good implementation o f t h e normal equations method.

These steps may

be c a r r i e d o u t v e r y e f f i c i e n t l y u s i n g e x i s t i n g well-developed sparse m a t r i x s o f t w a r e (George and L i u [1979]). t h a t t h e data s t r u c t u r e f o r

I t i s i m p o r t a n t t o emphasize

R i s generated i n advance o f any numerical

computation, and t h e r e f o r e dynamic s t o r a g e a l l o c a t i o n t o accommodate f i l l - i n d u r i n g t h e numerical computation i s unnecessary.

The o r d e r i n

which t h e rows o f A a r e processed i n Step 4 does n o t a f f e c t t h e s t r u c t u r e o f R o r t h e s t a b i l i t y o f computing i t .

Therefore t h e rows may be

accessed from an e x t e r n a l f i l e one a t a t i m e i n a r b i t r a r y o r d e r .

A

suboptimal row o r d e r i n g t o reduce t h e amount o f computation a s s o c i a t e d w i t h i n t e r m e d i a t e f i l l - i n d u r i n g t h e orthogonal decomposition phase was shown t o be e f f e c t i v e i n George and Heath [1980] and i s used here. Thus A l g o r i t h m 1 r e q u i r e s t h e same s t o r a g e and e x p l o i t s s p a r s i t y a t l e a s t t o t h e same degree as t h e normal equations, a l l o w s convenient use o f a u x i l i a r y storage, and i n a d d i t i o n i s n u m e r i c a l l y s t a b l e .

7 2.

Decomposition o f A Using A u x i l i a r y Storage

2.1 I n t r o d u c t i o n T h i s s e c t i o n c o n s i s t s o f two p a r t s .

The f i r s t describes an a l g o -

r i t h m f o r f i n d i n g a column o r d e r i n g and p a r t i t i o n i n g o f A which lends i t s e l f t o t h e e f f i c i e n t use o f a u x i l i a r y storage.

The method uses

s l i g h t l y m o d i f i e d ideas and techniques which have a l r e a d y been described i n d e t a i l by George and L i u [1978],

so o u r p r e s e n t a t i o n i s b r i e f and

l i m i t e d t o showing o u r m o d i f i c a t i o n s and t h e relevance o f t h e scheme t o t h e l e a s t squares problem.

I t i s i m p o r t a n t t o n o t e t h a t i n some con-

t e x t s such p a r t i t i o n i n g s / o r d e r i n g s a r i s e as a n a t u r a l by-product o f t h e m o d e l l i n g procedure ( f i n i t e element a n a l y s i s ) o r data a c q u i s i t i o n (geodesy).

I n these cases, t h i s "preprocessing" a l g o r i t h m would n o t be

r e q u i red. The second p a r t o f t h i s s e c t i o n deals w i t h t h e u t i l i z a t i o n o f t h e s o f t w a r e d e s c r i b e d i n S e c t i o n 1.4 and t h e p a r t i t i o n i n g p r o v i d e d by A l g o r i t h m 2 o f S e c t i o n 2.2,

a l o n g w i t h f i l e s on a u x i l i a r y storage, t o

s o l v e very l a r g e l e a s t squares problems. 2.2 F i n d i n g an A p p r o p r i a t e O r d e r i n g and P a r t i t i o n i n g f o r t h e Columns o f A

I n t h e sequel i t i s convenient t o work w i t h t h e symmetric graph T a s s o c i a t e d w i t h t h e normal equations m a t r i x B = A A. G = (X,E)

a s s o c i a t e d w i t h B has n nodes xi,

i = 1, 2,

The graph

..., n,which

form

X, and an edge s e t E c o n s i s t i n g o f unordered p a i r s o f nodes w i t h x . } € E i f and o n l y i f Bij = Bji # 0, i # j . Thus t h e nodes xi J correspond t o t h e v a r i a b l e s o f t h e l e a s t squares problem; i . e . , t o t h e {xi,

columns o f A.

I m p l i c i t here i s t h e assumption t h a t t h e nodes o f G have

been l a b e l l e d as t h e columns of A.

I

Thus, a r e l a b e l l i n g o f t h e nodes o f

8

G corresponds t o a symmetric permutation o f t h e m a t r i x B y o r e q u i v a l e n t l y , a column p e r m u t a t i o n o f A.

Given G w i t h o u t any l a b e l s , f i n d i n g an

a p p r o p r i a t e permutation o f t h e columns o f A can be viewed as f i n d i n g an a p p r o p r i a t e l a b e l l i n g f o r G. A graph G ' = (X',E') For Y

C

i s a subgraph o f G = (X,E) i f X ' c X and E ' C E.

X, G(Y) r e f e r s t o t h e subgraph (Y,E(Y)) o f G, where

E(Y) = {{u,v) f E(u,v € V I . Nodes x and y a r e s a i d t o be a d j a c e n t i f {x,y)

i s an edge i n E.

For

Y c X, t h e a d j a c e n t s e t o f Y i s d e f i n e d as

A p a t h o f l e n g t h R i s a sequence o f R edges {xo,xl

1 , {xl ,x21y

...

w h e r e a l l t h e nodes a r e d i s t i n c t except f o r p o s s i b l y xo and xR.

A graph G i s connected i f t h e r e i s a p a t h j o i n i n g each p a i r o f d i s t i n c t nodes.

A p a r t i t i o n i n g P o f G i s an ordered c o l l e c t i o n o f node s e t s

where Yi fl Y

=

0,

i f j, and

P

u

Yi

= X.

i=1

j

Given a p a r t i t i o n i n g P o f G, t h e o n l y numberings ( l a b e l l i n g s ) we

w i l l be concerned w i t h i n t h i s paper w i l l be compatible w i t h P. i s , each Yi

i s numbered c o n s e c u t i v e l y , and nodes i n Yi

That

a r e numbered

before those i n Yi+l.

A subset Y

C

X o f t h e node s e t o f G i s a s e p a r a t o r o f G i f G(X-Y)

c o n s i s t s o f two o r more connected components.

ifno subset o f i t i s a separator.

A s e p a r a t o r i s minimal

9

AS we shall see below, a desirable column ordering and p a r t i t i o n i n g f o r A i s provided by a nested dissection p a r t i t i o n i n g of the graph of B

The algorithm we use here can be

(George, Poole and Voigt [1978]).

i s the graph of B and u i s a user-

described as follows, where G = (X,E) supplied parameter. Algorithm 2

Incomplete Nested Dissection Partitioning 0 , and n =

1.

Set V = X, p

2.

I f V = fl, go t o step 4.

=

If IT1

1x1.

Otherwise, l e t G(T) be a connected com-

s e t S = T ; otherwise, f i n d S a minimal P P' separator of G(T) which disconnects i t i n t o two o r more components of ponent of G ( V ) .

z p ,

approximately equal s i z e .

-

3.

Set V = V

4.

Set P = ( Y 1 ,

S

P' Y2,

n

=

n - ISp\, p

..., Y P I ,

= p

+ 1 , and go t o s t e p 2 .

where Y i = Sp+l -i *

Apart from the inclusion of the threshold

1~.,

and the omission of any

our implementation corresponds P' exactly t o t h a t described by George and L i u [1978, pp. 1054-10601, so s p e c i f i c l a b e l l i n g s t r a t e g y f o r each S

the reader i s referred there f o r d e t a i l s .

For our purposes, any order-

i n g compatible w i t h ? i s acceptable, and the choice of

Section 3 .

I t should be obvious t h a t

ness" of the dissection procedure.

1.1

p

i s discussed i n

governs the r e l a t i v e "complete-

An example of a nested dissection

p a r t i t i o n i n g along w i t h a compatible ordering i s g i v e n i n Figure 2.1. In order t o avoid unnecessarily complicated notation i n the follow-

i n g s e c t i o n , we assume from now on t h a t A has been reordered by columns and t h a t the Yi simply c o n s i s t of the appropriate consecutive subsequences of the f i r s t n integers.

been rep1 aced by thei r 1abel s .

In other words, the nodes of Y i have

10

s4

P = (Y

p = 7 F i g u r e 2.1 -

1’

Y2,

...,

Y 7 1 where Yi = S8-i.

An example o f a nested d i s s e c t i o n p a r t i t i o n i n g o f a graph

and an induced numbering o f i t s nodes.

We now d e f i n e a p a r t i t i o n i n g R = {Z1, Z2,

..., Z P 1

o f t h e row

numbers o f A, induced by t h e column p a r t i t i o n i n g P , as f o l l o w s .

w i t h Zo =

8.

Let

I t i s a l s o h e l p f u l t o d e f i n e t h e column s e t s

The manner i n which t h e Zi a r e d e f i n e d above i m p l i e s t h a t i n general, f o r a sparse m a t r i x A, i f t h e rows of A a r e permuted so t h a t those s p e c i f i e d by Zi

appear above those s p e c i f i e d by Zi+l,

t h e n A w i l l have

11

b l o c k upper t r a p e z o i d a l form, as d e p i c t e d i n F i g u r e 2.2,

f o r three diag-

Again, t o keep t h e p r e s e n t a t i o n simple, we assume t h a t t h e

onal b l o c k s .

rows of A have been r e l a b e l l e d so t h a t t h e Zi i n t e g e r s , w i t h those i n Zi

consist o f consecutive

preceding those i n Zi+l.

We denote t h e sub-

m a t r i c e s o f A by Ai j, 1 5 i, j 5 p, and o u r o b j e c t i v e i s t o f i n d a f o r m

f o r A such t h a t Aij

= 0 f o r i > j.

X

X

x x

X

-

Z1 = {1,2,...,8}

X

I

I

X

I

I

1

I I I

X

I

I

1

A F i g u r e 2.2

A 23 b y 9 sparse m a t r i x A p a r t i t i o n e d by columns a c c o r d i n g

t o P and t h e induced row p a r t i t i o n i n g R .

12

As mentioned e a r l i e r , t h i s process o f d i s s e c t i o n o f t h e problem v a r i a b l e s i n t o independent subsets may a r i s e more o r l e s s a u t o m a t i c a l l y o r may be c a r r i e d o u t by some means o t h e r t h a n t h e one proposed above. For example, i n geodesy, observations between o r among c o n t r o l p o i n t s a r e c o l l e c t e d and t a b u l a t e d by geographic r e g i o n , so t h e v a r i ables ( c o o r d i n a t e s ) and observations a r e n a t u r a l l y arranged i n a h i e r archical structure.

Moreover, automatic s u b d i v i s i o n can be implemented

on t h e b a s i s o f s p e c i f y i n g l a t i t u d e and l o n g i t u d e boundaries as separat o r s f o r the coordinate sets.

For a more d e t a i l e d d e s c r i p t i o n o f t h i s

process, c a l l e d Helmert b l o c k i n g by geodesists, see Golub and Plemmons [1980] and t h e r e f e r e n c e s c i t e d t h e r e i n . I n t h e a n a l y s i s of s t r u c t u r e s , t h e v a r i a b l e s i n t h e model a r e o f t e n s u b d i v i d e d a c c o r d i n g t o subassemblies, w i t h each component b e i n g processed by an i n d i v i d u a l design group.

T h i s may occur a t s e v e r a l l e v e l s ,

l e a d i n g t o "mu1 t i - l e v e l s u b s t r u c t u r i n g . "

T h i s p a r t i t i o n i n g procedure

t o g e t h e r w i t h t h e n a t u r a l f a c t o r f o r m u l a t i o n of t h e f i n i t e element method ( A r g y r i s e t a1 [1978])

leads t o a sparse l e a s t squares problem

having b l o c k upper t r a p e z o i d a l s t r u c t u r e , as i l l u s t r a t e d i n F i g u r e 2.2.

A s p e c i a l case o f t h e b l o c k upper t r a p e z o i d a l form i s t h e s o - c a l l e d b l o c k a n g u l a r form, where Aij

= 0 unless j = i o r j = p.

T h i s form

a r i s e s n a t u r a l l y i n many mathematical programming contexts, b u t more i m p o r t a n t from o u r p o i n t o f view i s t h a t Weil and K e t t l e r [1971] have p r o v i d e d a h e u r i s t i c a l g o r i t h m f o r permuting a general sparse m a t r i x i n t o b l o c k a n g u l a r form.

We have n o t used t h e i r a l g o r i t h m , b u t i n some

c o n t e x t s i t m i g h t serve as an a1t e r n a t e "preprocessor" ( p a r t i t i o n generat o r ) t o t h e one proposed i n t h i s s e c t i o n .

Golub and Plemmons C1980-J

13 have e x p l o i t e d t h i s b l o c k a n g u l a r form s t r u c t u r e i n connection w i t h computing orthogonal decompositions o f problems a r i s i n g i n geodesy. 2.3 Reduction o f A Using A u x i l i a r y Storage Before d e s c r i b i n g i n d e t a i l how t h e p a r t i t i o n i n g P i s used i n e x p l o i t i n g a u x i l i a r y storage, we f i r s t r e v i e w t h e b a s i c computational procedure, w i t h o u t any r e f e r e n c e t o t h e a c t u a l implementation.

For

d e f i n i t e n e s s , we assume \PI = 3 and t h a t A has t h e form shown i n F i g u r e 2.3.

The computation c o n s i s t s o f p major steps; t h e i - t h s t e p

r e s u l t s i n t h e g e n e r a t i o n o f IYi o f A.

I

rows o f t h e upper t r i a n g u l a r f a c t o r R

The computation i n v o l v e d i n t h e f i r s t s t e p i s d e p i c t e d i n

F i g u r e 2.3,

and a l l t h e o t h e r major steps a r e s i m i l a r , g e n e r a t i n g t h e

sequence o f s u c c e s s i v e l y s m a l l e r m a t r i c e s A1

, A2 , A3,

. .. , A P ' 1%

f i r s t step, which t r a n s f o r m s A = A1 t o A2, t h e m a t r i x [All;A1]

A t the i s re-

duced t o upper t r i a n g u l a r form u s i n g A l g o r i t h m 1 d e s c r i b e d i n s e c t i o n 1.4.

%

The m a t r i x A1 c o n s i s t s o f those columns o f A12 and A13 which a r e 'L

(Thus, A1 i s s i m p l y a "compressed" v e r s i o n o f A12 and A13.1

non-null.

The f i r s t IYl f i r s t IY1

I

I

rows o f t h e r e s u l t i n g upper t r i a n g u l a r m a t r i x a r e t h e

rows o f R, except t h a t t h e l a s t I Y ; / 'L

columns a r e "compressed,"

b e a r i n g t h e same r e l a t i o n s h i p t o R as A1 does t o [Al2;Al3].

IYiI

The l a s t

rows o f t h e upper t r i a n g u l a r m a t r i x a r e "expanded" and p u t on t h e

t o p o f t h e second row-block o f A, y i e l d i n g A2. computation i s d e p i c t e d i n F i g u r e 2.4, features.

The n e x t s t e p o f t h e

and t h e f i n a l s t e p has no s p e c i a l

14

A11

A 1 2 A13

0

yl

*22

0

A23

A33

A2

Figure 2 . 3

P i c t o r i a l description of the f i r s t s t e p of the IPI step

reduction o f A t o upper trapezoidal form.

15

R (COMPRESSED)

Figure 2.4

Depiction of the second step of the block-reduction o f A

t o upper trapezoidal. form.

16

Thus in the general case, a f t e r i-1 steps o f the reduction process, the matrix Ai will have the form shown in Figure 2.5.

....

... Y

I

1

h , i Ai-l,i+l

'

...

ii Ai,ii+l

... .. ... . .

....

.. Ai =

0

0

Figure 2.5

Block s t r u c t u r e o f A i .

17

Note t h a t since we assume t h a t the rows of Ai a r e stored on a u x i l i ary storage a t a l l times, the only storage needed a t the i - t h stage of the computation i s t h a t required f o r the presumably sparse upper triangul a r matrix Ri

, where

Another important practical observation is t h a t i n Algorithm 1 o f Section 1.4, and i n the one t h a t i s described i n the sequel, there i s

no r e s t r i c t i o n on the order t h a t the rows of any of the Ai a r e stored

so t h a t any beneficial row ordering may be used. In what follows i t i s helpful t o have names for c e r t a i n subsets of

the rows of Ai.

We define the s e t o f rows of Ai having nonzeros i n

columns which i n t e r s e c t Yi

rows of A i .

by

ri, and we use

Z; t o denote the remaining

T h u s , i t i s precisely those rows i n

Ti which a r e

involved

i n the i - t h major step of the computation.

For s i m p l i c i t y , we have not included the r i g h t hand side i n our Figures 2.3-2.5,

b u t we intend t h a t i t be processed simultaneously w i t h

the rows of A , and carried along i n p a r a l l e l .

In p a r t i c u l a r , throughout

t h i s paper, when we r e f e r t o processing a ''row of A" o r an "equation," we i m p l i c i t l y include the corresponding element of b y the weight ( i f any), and so on.

Our program allows f o r weights, and i n order t o handle

them uniformly ( i . e . , t o avoid d i s t i n g u i s h i n g between v i r g i n rows in Ai and those t h a t have been transformed), a weight of 1 i s associated w i t h transformed equations.

18

We a r e now ready t o d e s c r i b e t h e a l g o r i t h m f o r r e d u c i n g A, which i n v o l v e s t h e use o f f o u r f i l e s .

These f i l e s may be s t o r e d on tapes, The f i l e s

d i s k s , drums; o n l y s e r i a l access t o t h e f i l e s i s necessary. are: 1.

R-file:

accepts t h e rows of R as t h e y a r e computed.

2.

C-file:

( -c u r r e n t f i l e )

a t t h e b e g i n n i n g o f t h e i - t h major

s t e p o f t h e computation, c o n t a i n s t h e rows o f Ai

in

a r b i t r a r y order. 3.

Z-file:

d u r i n g t h e i - t h s t e p o f t h e computation, c o n t a i n s t h e rows i n t h e s e t

Ti; t h a t i s , those rows o f

A t h a t a r e i n v o l v e d i n t h e computation a t t h e i - t h major step.

4.

Z'-file:

a t t h e beginning o f t h e i - t h step, t h i s f i l e i s empty. The C - f i l e i s read, and s p l i t i n t o t h e 7 - f i l e and t h i s f i l e , which r e c e i v e s t h e rows o f Ai s e t 2;.

i n the

A f t e r t h e computation o f Ri i s p e r -

formed, t h e rows o f

Xii

a r e w r i t t e n on t h i s

f i l e , and t h e Z ' - f i l e becomes t h e C - f i l e f o r t h e n e x t s t e p o f t h e computation. of A i + l

(It now c o n t a i n s t h e rows

-1

Thus t h e 7 - f i l e i s a s c r a t c h f i l e , and t h e r o l e s of t h e C - f i l e and Z ' f i l e a l t e r n a t e a t each succeeding major s t e p o f t h e computation. Algorithm 3

Block Reduction o f A t o Upper T r i a n g u l a r Form

For i = 1, 2, 1.

..., p

do t h e f o l l o w i n g :

Rewind t h e C - f i l e , 7 - f i l e , and Z ' - f i l e s .

Read t h e equations

from t h e C - f i l e one by one, and f o r each do t h e f o l l o w i n g :

.

19

1.1.

If the equation number i s in

Ti, write the equation on

the 7 - f i l e and record the s t r u c t u r e i t contributes t o the T normal equations matrix H i H i , corresponding t o the matrix Hi 1.2

'L

E

[Aii[Ai].

If the equation i s in Z ; , write the equation on the Z ' f i l e .

2.

Order the equations so t h a t HTi H i s u f f e r s low f i l l - i n , r e s t r i c t -

ing the ordering so t h a t the variables in Y; appear l a s t . 3.

Create the data s t r u c t u r e f o r R i .

4.

Rewind the z - f i l e .

Read the equations from i t and compute R i ,

a l s o applying the transformations t o b. 5.

Write the rows of R i i on the R-file along with the transformed

elements o f b . %

6. Write the rows of R i i on the Z ' - f i l e , along with the transformed elements of b . 7.

Reverse the names of the C-file and Z ' - f i l e s .

I t . s h o u l d be c l e a r t h a t the combination of the s t r u c t u r e recording p a r t of S t e p 1.1 along with Steps 2 , 3 , and 4 i s e s s e n t i a l l y Algorithm 1 , described in Section 1.4.

The only modification i s the adjustment of

the ordering provided by Algorithm 1 so t h a t variables in Y; l a s t ; a l l others remain i n t h e i r same r e l a t i v e position.

appear

The ordering

o f the variables of Y i t h a t i s provided in Step 2 i s recorded s o t h a t

the rows of R written on the R f i l e can be processed in the correct order i n the back s u b s t i t u t i o n .

.

(Note t h a t Algorithm 2 in Section 2.1

only provided t h e ' p a r t i t i o n i n g and d i d not provide an ordering f o r each Y..) 1

20

Z

FILE BECOMES C-FILE FOR THE :*' NEXT STEP (STEP 7 ) .**

...**'

STEP iOFTHE REDUCTION

'

-

Z i ROWS

ry

-

R ii

F i g u r e 2.6

STEPS 2-6 OF THE REDUCTION PROCEDURE

Diagram o f t h e d a t a f l o w o f t h e a l g o r i t h m f o r r e d u c i n g A

t o upper t r a p e z o i d a l form.

A t t h e c o n c l u s i o n o f t h e r e d u c t i o n o f A t o upper t r a p e z o i d a l form,

a l o n g w i t h t h e simultaneous r e d u c t i o n o f b, t h e rows o f R and correspondi n g r i g h t - h a n d s i d e elements (y i n (1.3) o f S e c t i o n 1 ) w i l l be on t h e R - f i l e , w i t h t h e " w r i t e head" p o s i t i o n e d a t t h e end o f t h e l a s t r e c o r d . The f o l l o w i n g s i m p l e back s u b s t i t u t i o n i s t h e n used t o compute x. For i = n, n

- 1,

..., 1 do

the following:

1.

backspace t h e R - f i l e ;

2.

r e a d t h e i - t h row o f R and corresponding r i g h t - h a n d s i d e element yi and compute xi;

3.

backspace t h e R - f i l e .

.

21 3.

Numerical Experiments and Observations

3.1 Introduction T h i s section contains r e s u l t s of some experiments performed using

an implementation of the algorithms described in Section 1 . 4 and Section 2 .

Our t e s t problems are of various s i z e s (ranging from 1,444

equations and 400 unknowns up t o 17,946 equations and 4,554 unknowns) and of two d i f f e r e n t types.

One c l a s s of problems i s typical of those

t h a t would a r i s e in the natural f a c t o r formulation of the f i n i t e e l e ment method (Argyris and Br'dnlund [1975]), and the second c l a s s typif i e s those a r i s i n g in geodetic adjustment problems (Golub and Plemmons [1980]).

The descriptions of the problems we provide include only the

d e t a i l s necessary t o characterize t h e i r s i z e and s t r u c t u r e ; f o r import a n t infGrmationon the physical origins and t h e i r mathematical models,

the reader i s referred t o the references. 3 . 2 Test Problems

O u r f i n i t e element t e s t problems a r e associated with a q by q grid consisting of (9-1)

2

small squares, as shown i n Figure 3.1 w i t h q = 3 .

Associated with each of the n = q2 grid points i s a variable, and associated with each small square are four equations (observations) involving the four corner grid points ( v a r i a b l e s ) of the square. T h u s , 2 2 the associated c o e f f i c i e n t matrix i s n = q by m = 4(q-1) , as i l l u s t r a t e d in Figure 3.1.

Our second set of t e s t problems were a l s o derived from a q by q mesh, b u t of a r a t h e r d i f f e r e n t type.

.

The purpose here i s t o construct

problems typical of those a r i s i n g in geodetic adjustments.

The mesh

can be viewed as b e i n g composed of q2 ''junction boxes," connected t o

22

1

2

3

4

5

6

7

8

9

7

8

9

1

2

3

rr

1

x

X

2

x

X

3

x

X

4

x

X

5

X

6

X

7

X

8

X

9

X

10

X

11

X

12

X

13

X

14

X

15

X

16

F i n i t e element g r i d

X L

A F i g u r e 3.1

A 3 by 3 f i n i t e element g r i d and i t s a s s o c i a t e d 16 by 9

l e a s t squares c o e f f i c i e n t m a t r i x a r i s i n g i n t h e n a t u r a l f a c t o r f o r m u l a t i o n o f t h e f i n i t e element method. t h e i r neighbors by chains o f l e n g t h R , as shown i n F i g u r e 3.2 where q = 3 and R = 4. There a r e two v a r i a b l e s a s s o c i a t e d w i t h each o f t h e 2 5q + Z q ( q - 1 ) ( 3 ( ~ - 1 ) + 1 ) v e r t i c e s i n t h e mesh, TI o b s e r v a t i o n s a s s o c i a t e d w i t h each p a i r of nodes j o i n e d by an edge ( i n v o l v i n g f o u r v a r i a b l e s ) ,

23

F i g u r e 3.2

I d e a l i z a t i o n o f a 3 by 3 g e o d e t i c network h a v i n g c o n n e c t i n g

survey c h a i n s o f l e n g t h 4. and

TI

o b s e r v a t i o n s a s s o c i a t e d w i t h each t r i a n g l e i n t h e mesh ( i n v o l v i n g

s i x variables).

Thus, t h e a s s o c i a t e d l e a s t squares problems have

2 n = 1Oq + 4 q ( q - 1 ) ( 3 ( Q - l ) + l ) observations. R

v a r i a b l e s , and m = 12q2 + Z q ( q - l ) ( l l I I - l )

I n t y p i c a l r e a l problems

II i s around 5 or 6, so we s e t

= 5 i n o u r experiments, y i e l d i n g n = 62q2

I n o u r experiements we s e t

TI

-

= 2, y i e l d ng m/n

52q and m = Q(120q2 -108q). 4 f o r l a r g e q.

24

3.3 Numerical Experiments Since one of o u r main o b j e c t ves i s t o p r o v de a means o f s o l v i n g v e r y l a r g e problems on computers h a v i n g 1 i m i t e d main storage, o u r f i r s t experiments i n v o l v e s o l v i n g a sequence o f problems o f i n c r e a s i n g s i z e , u s i n g a f i x e d amount o f a r r a y s t o r a g e .

O f course, as t h e problems i n -

crease i n s i z e , t h e d i s s e c t i o n process which p r o v i d e s t h e p a r t i t i o n i n g ( A l g o r i t h m 2 i n S e c t i o n 2) must be a l l o w e d t o proceed f u r t h e r , c r e a t i n g The r e s u l t s o f these experiments a r e summarized i n

more b l o c k s . Table 3.1.

A l l t e s t s were made on an IBM 3033 a t t h e Oak Ridge N a t i o n a l

Laboratory.

The times r e p o r t e d a r e i n seconds and t h e s t o r a g e i n words.

The f a c t o r and s o l v e t i m e i n c l u d e s t h e o r t h o g o n a l decomposition t i m e i n a p p l y i n g A l g o r i t h m 1 , t o g e t h e r w i t h t h e f i n a l back s u b s t i t u t i o n t i m e i n computing t h e l e a s t squares s o l u t i o n .

The t o t a l elapsed t i m e a l s o

i n c l u d e s 1/0 p r o c e s s i n g and r e p r e s e n t s t h e t o t a l amount o f IBM 3033 t i m e i n s o l v i n g t h e problem. The maximum s t o r a g e used i n c l u d e s a l l a r r a y storage, s t o r a g e f o r permutations, bookkeeping, e t c . ,

i n words on t h e IBM 3033.

Our second s e t o f experiments was designed t o demonstrate t h e influence o f the block s i z e l i m i t requirements.

p

on e x e c u t i o n times and t o t a l s t o r a g e

We s o l v e d one moderately l a r g e - s c a l e problem from each

o f t h e two c l a s s e s a number o f times, w i t h d i f f e r e n t values o f p, l e a d i n g t o d i f f e r e n t values f o r t h e r e s u l t i n g number o f b l o c k s p. r e s u l t s of these experiments a r e summarized i n Table 3.2.

The

25

1.

Geodetic Network 'Block s i z e

J u n c t ion ' l i m i t i n Jnknowns Equations n m

boxes 9

words

u

r

Factor IMaximum and s o l v e 1 block: s t o r a g e t i m e i n P used seconds

-

Total elapse( time i r second:

402

1,512

3

500

1

7,430

1.69

4.69

1,290

4,920

5

1,500

1

25,367

7.21

22.77

2,674

10,248

7

1,500

3

32,070

22.27

57.69

4,554

17,469

9

1,500

7

30,141

46.18

107.40

2.

n knowns Equations n m

Fi n i t e Element G r i d Problems

Nodes

A

Ilock s i z e lumber vlax imum imit in words )locks torage used u P

-actor m d solve time i n seconds

Total e l apsec time i r second!

400

1,444

20

500

1

10,029

3.10

5.97

1,225

4,624

35

1,000

3

30,547

20.78

32.23

2,500

9,604

50

1,000

7

28,462

72.98

96.10

4,225

16,384

65

500

23

28 ,987

142.81

208.20

-----

Table 3.1.

Summary o f t e s t r e s u l t s showing s t o r a g e and e x e c u t i o n times

f o r a sequence o f problems, where t h e maximum t o t a l amount o f f a s t s t o r a g e i s f i x e d a t about 30,000 words.

26

Geodetic Network

1.

n

=

2,674 unknowns

m = 10,248 equations !lock size 1 imi t

Factor and solve time in seconds

Total elapsed time in seconds

n words

Number blocks

IJ

P

Maxi mum storage used

3,000

1

53,858

17.73

71.40

1,500

3

32,070

22.27

57.69

800

7

17,966

23.66

52.71

500

18

13,699

27.56

61.80

2.

F i n i t e Element Grid

n = 2,500 unknowns

.

m = 9,604 equations Block s i z e 1 imi t i n words

Number bl ocks

Factor and solve time in seconds

Total elapsed time i n seconds

v

A

Maximum storage used

2,000

3

73,002

72.58

106.80

1,500

5

46,178

68.73

96.00

1,000

7

28,462

72.98

96.10

500

15

23,332

Table 3.2

I

69.11

I

100.80

Summary of r e s u l t s f o r the solution of a s i n g l e problem from

each class, using d i f f e r e n t levels of blocking.

27

3.4 Observations 1.

In Table 3.1, the f a c t o r and solve times , as well as the t o t a l

elapsed times, increase in a roughly l i n e a r fashion as the problem s i z e s go u p under the c o n s t r a i n t of a fixed maximum amount of in-core storage. 2.

For the two problems represented

n Table 3.2, the f a c t o r and

solve times a r e r e l a t i v e l y i n s e n s i t i v e t o the block s i z e l i m i t

1-1.

These

times gradually increase f o r the geodetic network and o s c i l l a t e f o r the f i n i t e element g r i d .

Also, the maximum s orage used drops considerably

a t f i r s t and then l e v e l s o u t as the block s i z e l i m i t decreases in each problem. 3.

The f a c t t h a t f i n i t e element problems generally r e s u l t in a

l e s s sparse observation matrix than geodetic network problems has the obvious r e s u l t .

In each of our t a b l e s the storage and execution times

a r e l a r g e r f o r the f i n i t e element problems. 4. program

One possible disadvantage of the use of our automatic blocking (Algorithm 2 ) i s t h a t the e n t i r e s t r u c t u r e of ATA must be

i n i t i a l l y stored in-core in order t o apply the nested dissection scheme. However, in practice t h i s would not often be a serious limitation since some preliminary blocking i s usually provided f o r the l a r g e r problems

a s a by-product of the modelling procedure ( f i n i t e element a n a l y s i s ) or data acquisition (geodesy). 5.

I n each of our problems the time required f o r the automatic

blocking provided by Algorithm 2 turns o u t t o be small in comparison

t o the t o t a l elapsed time.

These automatic blocking times a r e n o t

reported separately in Tables 3.1 and 3.2; however, Algorithm 2

28

requires only .78 seconds out o f a t o t a l elapsed time o f 208.20 seconds

for our 1arges t probl em. 6.

Our scheme i s storage e f f e c t i v e .

In each t e s t problem the

maximum storage used i s a modest multiple of the number o f variables n . In summary, our numerical experiments demonstrate t h a t the approach taken i n this paper can be used t o solve large-scale l e a s t squares problems using a r e l a t i v e l y small amount o f in-core storage.

29

REFERENCES J . H . Argyris and 0 . E. Bronlund [1975], "The natural f a c t o r formulation of the s t i f f n e s s matrix displacement method," Computer Methods i n Applied Mech. and Eng. 5 , 97-119. J . H . Argyris, T. L . Johnsen and H . P-Mlejnek [1978], "On the natural

f a c t o r i n nonlinear analysis

,I'

Computer Methods i n Applied Mech.

and Eng. 15, 389-406. J . K. Avila and J . A . Tomlin [1979], "Solution of very large l e a s t squares problems by nested dissection on a p a r a l l e l processor," Proc. Computer Science and S t a t i s t i c s : Twelfth Annual Symposium on the I n t e r f a c e , Waterloo, Canada. A. Bjorck [1976] , "Methods f o r sparse l i n e a r l e a s t squares problems," i n Sparse Matrix Computations, J . R. Bunch and D. J . Rose, eds.,

.

Academic Press , N Y , 177-1 94. A . Bjorck and I . S. Duff [1980], "A d i r e c t method f o r the solution of

sparse l i n e a r l e a s t squares problems," Linear Algebra and I t s

e, t o appear. I . S. Duff and J . K. Reid [1976], "A comparison o f some methods for

the solution of sparse overdetermined systems o f l i n e a r equations," J . I n s t . Math. Applic. 7 , 267-280. A. George and M. T . Heath [1980], "Solution of sparse l e a s t squares

problems u s i n g Givens r o t a t i o n s , " Linear Algebra and I t s

Applic.,

t o appear. A . George and J . L i u [1978], "An automatic nested dissection algorithm f o r i r r e g u l a r f i n i t e element problems," SIAM J . Numer. Anal. 1053-1 069.

15,

30 A . George and J . L i u [1979], "The design of a user i n t e r f a c e f o r a sparse matrix package," ACM Trans. on Math. Software 5 , 134-162. A. George and J . L i u [1980], Computer solution of large sparse symmetric p o s i t i v e d e f i n i t e systems of l i n e a r equations, Prentice Hall, I n c . , Englewood C l i f f s , NJ. A. George, W. Poole and R. Voigt [1978], "Incomplete nested dissection f o r solving n by n g r i d problems," SIAM J . Numer. Anal.

15,662-673.

P. Gill and W. Murray, [1976], "The orthogonal f a c t o r i z a t i o n of a large

sparse matrix," i n Sparse Matrix Computations, J . R. Bunch and D . J .

Rose, eds. , Academic Press , N Y , 201-212. G . H . Golub [1965], "Numerical methods f o r solving l i n e a r l e a s t squares

problems," Numer. Math. 7 , 206-216. G . H . Golub and R. J . Plemons [1980], "Large s c a l e geodetic l e a s t

squares adjustments by dissection and orthogonal decomposition," Linear Algebra and I t s Applic., t o appear. G. H. Golub, F . T. L u k and M. Pagano [1979], "A sparse l e a s t squares

problem i n photogrammetry," Proc. Computer Science and S t a t i s t i c s : Twelfth Annual Symposium on the Interface, Waterloo, Canada. G . Hachtel [1976], "The sparse tableau approach t o f i n i t e element

assembly," i n Sparse Matrix Computations, J . R. Bunch and D. J . Rose, e d s . , 344-364. G. B. Kolata [1978], "Geodesy: Dealing w i t h an enormous computer task,"

Science 200, 421-422, 466. G. Peters and J . H. Wilkinson [1970], "The l e a s t squares problem and

pseudo-i nverses

,'I

The Computer Journal

12 , 309-31 6.

.

31

P. Vanicek, M. R. E l l i o t t and R. 0. C a s t l e [1979], "Four-dimensional modeling o f r e c e n t v e r t i c a l movements i n t h e area o f t h e Southern C a l i f o r n i a u p l i f t , " Tectonophysics 52, 287-300.

R. L. Weil and P. C. K e t t l e r [1971],

"Rearranging m a t r i c e s t o block-

a n g u l a r f o r m f o r decomposition (and o t h e r ) a l g o r i t h m s ," Management

- 98-108. Science 18,

.

33 ORNL/CSD-63 D i s t r i b u t i o n Category UC-32

INTERNAL DISTRIBUTION 1. 2. 3.

4. 5- 7. 8. 9. 10.

11. 12. 13. 1.4 15. 16.

.

17. 18.

.

C e n t r a l Research L i b r a r y Patent O f f i c e ORNL Technical L i b r a r y , Document Reference S e c t i o n L a b o r a t o r y Records - RC L a b o r a t o r y Records Department A. A. Brooks J. A. Carpenter H. P. Carter/CSD X-10 L i b r a r y S.-J. Chang J. E. Cope P. M. D i Z i l l o - B e n o i t J. B. Drake R. E . F u n d e r l i c P. W. Gaffney D. A. Gardiner K. E. Gipson/Biometrics L i b r a r y

19-23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35. 36. 37.

M. T. S. W. J. S. J. D. D. C. J. R. S. G. D.

T. L. K. E. 6. A. E. S. J. A. N. C. 6. W. G.

Heath Hebble Iskander Lawrence Mankin McGuire Park Scott Strickler Serbin Tunstall Ward Watson Westley Wilson

EXTERNAL DISTRIBUTION 38.

Professor V . Ashkenazi, Department o f C i v i l Engineering, The U n i v e r s i t y o f Nottingham, Nottingham, England NG7 2RD

39.

P r o f e s s o r Ake Bjorck, Department o f Mathematics, L i n k o p i n g U n i v e r s i t y , L i n k o p i n g 58183, Sweden

40.

D r . Paul Boggs, Mathematics D i v i s i o n , U.S. 12211, Research T r i a n g l e Park, NC 27709

41.

M r . John Bossler, N a t i o n a l Geodetic Survey, NOAA, U.S. Department o f Commerce, 6001 E x e c u t i v e Boulevard, R o c k v i l l e , MD 20852

42.

Professor James Bunch, UCSD, C-012, APM B u i l d i n g , La J o l l a , CA 92093

43.

D r . T. D. B u t l e r , T-3, Hydrodynamics, Los Alamos S c i e n t i f i c Laboratory, P.O. Box 1663, Los Alamos, NM 87545

44.

D r . B i l l L. Buzbee, C-3, A p p l i c a t i o n s Support & Research, Los Alamos S c i e n t i f i c Laboratory, P.O. Box 1663, Los Alamos, NM 87545

45.

M r . Bernard Chovitz, N a t i o n a l Geodetic Survey, NOAA, U.S. Department o f Commerce, 6001 E x e c u t i v e Boulevard, R o c k v i l l e , MD 20852

Army Research O f f i c e , Box

34

46.

D r . L. Lynn Cleland, Engineering Research D i v i s i o n , Lawrence L i v e r more Laboratory, P.O. Box 808, Livermore, CA 94550

47.

D r . James Corones, Ames Laboratory, Iowa S t a t e U n i v e r s i t y , Ames, I A

50011 48.

D r . I a i n S. D u f f , CSS D i v i s i o n , B u i l d i n g 8.9, AERE-Harwell, Didcot, Oxon, Engl and

49.

D r . Marvin D. Erickson, Computer Technology, Systems Department, P a c i f i c Northwest Laboratory, P.O. Box 999, Rich1 and, WA 99352

50.

D r . A l b e r t M. Erisman, Boeing Computer Services, 565 Andover Park, W. , Tukwila, WA 98188

51.

D r . K i r b y W. Fong, L-560, Lawrence Livermore Laboratory, P.O. 5509, Livermore, CA 94550

52.

D r . David M. Gay, MIT--CCKEMS, Cambridge, MA 02142

5357.

Box

Rm E38-278, 292 Main S t r e e t ,

D r . J. Alan George, Department o f Computer Science, U n i v e r s i t y of Waterloo, Waterloo, Ontario, Canada N2L 361

58.

M r . John Gergen, N a t i o n a l Geodetic Survey, NOAA, U.S. Department o f Commerce, 6001 Executive Boulevard, R o c k v i l l e , MD 20852

59.

Professor Max Goldstein, Courant I n s t i t u t e o f Mathematical Sciences, New York U n i v e r s i t y , 251 Mercer S t r e e t , New York, NY 10012

60.

Professor Gene H. Golub, Department o f Computer Science, S t a n f o r d U n i v e r s i t y , Stanford, CA 94305

61.

D r . Richard J. Hanson, Numerical Mathematics D i v i s i o n , 5122, Sandia L a b o r a t o r i e s , Albuquerque, NM 87115

62.

D r . Richard L. Hooper, S t a t i s t i c s , Systems Department, P a c i f i c Northwest Laboratory, P.O. Box 999, Richland, WA 99352

63.

D r . Robert E. Huddleston, Applied Mathematics D i v i s i o n , 8332, Sandia L a b o r a t o r i e s , Livermore, CA 94550

64.

D r . Robert J. Kee, Applied Mathematics D i v i s i o n , 8331, Sandia Laborat o r i e s , Livermore, CA 94550

65.

Professor Peter D. Lax, D i r e c t o r , Courant I n s t i t u t e o f Mathematical Sciences, New York U n i v e r s i t y , 251 Mercer S t r e e t , New York, NY 10012

66.

D r . Charles Lawson, J e t P r o p u l s i o n Laboratory, Pasadena, CA 91103

35

67.

Professor Peter Meissl, I n s t i t u t f u r Mathematische und Numerische Geodasie, Technical U n i v e r s i t y Graz, T e c h n i k e r s t r a 4, A-8010, Graz, Austria

68.

D r . Paul C. Messina, A p p l i e d Mathematics D i v i s i o n , Argonne N a t i o n a l Laboratory, Argonne, I L 60439

69.

D r . George Michael, Computation Department, Lawrence Livermore Laboratory, P.O. Box 808, Livermore, CA 94550

70.

Professor Cleve Moler, Department o f Computer Science,lJniversity o f New Mexico, Albuquerque, NM 87131

71.

D r . B a s i l Nichols, T-7, Mathematical Modeling and Analysis, Los Alamos S c i e n t i f i c Laboratory, P.O. Box 1663, Los Alamos, NM 87545

72.

Professor James Ortega, A p p l i e d Mathematics Department, U n i v e r s i t y o f V i r g i n i a , C h a r l o t t e s v i l l e , VA 22903

73.

D r . Carl E . O l i v e r , 7305 Fathom Court, Burke, VA 22015

74.

Professor C h r i s Paige, Computer Science Department, M c G i l l U n i v e r s i t y , 805 Sherbrooke S t r e e t , W., Montreal, Quebec, Canada H3A 2K6

75.

D r . Ronald P e i e r l s , A p p l i e d Mathematics Department, Brookhaven N a t i o n a l Laboratory, Upton, NY 11973

76.

D r . Ralph Perhac, E l e c t r i c Power Research I n s t i t u t e , P.O. P a l o A l t o , CA 94303

7781.

Box 10412,

Professor Robert J. Plemmons, Computer Science Department, U n i v e r s i t y o f Tennessee, K n o x v i l l e , TN 37916

82.

D r . James C. T. Pool, O f f i c e o f Basic Energy Sciences, ER-17, GTN, U.S. Department o f Energy, Washington, DC 20545

83.

D r . W i l l i a m G. Poole, Boeing Computer Services, 565 Andover Park, W., Tukwila, WA 98188

84.

M r . Alan Pope, N a t i o n a l Geodetic Survey, NOAA, U.S. Department o f Commerce, 6001 E x e c u t i v e Boulevard, R o c k v i l l e , MD 20852

85.

D r . C a r l Quong, Computer Science and A p p l i e d Mathematics Department, Lawrence B e r k e l e y Laboratory, Berkeley, CA 94720

86.

D r . John K. Reid, CSS D i v i s i o n , B u i l d i n g 8.9, Oxon, Engl and

AERE-Harwell,

5-309,

Didcot,

36

87.

D r . Donald J. Rose, B e l l L a b o r a t o r i e s , Murray H i l l , NJ 07974

88.

D r . M i l t o n E. Rose, D i r e c t o r , ICASE, M/S 132C, NASA Langley Research Center, Hampton, VA 28665

89.

D r . B e r t W. Rust, S c i e n t i f i c Computing D i v i s i o n , Technology B u i l d i n g , N a t i o n a l Bureau o f Standards, Washington, DC 20234

90.

D r . Michael Saunders, Systems O p t i m i z a t i o n Laboratory, Operations Research Department, S t a n f o r d U n i v e r s i t y , Stanford, C A 94305

91.

D r . Lawrence F . Shampine, Numerical Mathematics D i v i s i o n , 5642, Sandia L a b o r a t o r i e s , P.O. Box 5800, Albuquerque, NM 87115

92.

D r . John N. Shoosmith, M/S 125, NASA Langley Research Center, Hampton, VA 23366

93.

Professor G. W. Stewart, Computer Science Department, U n i v e r s i t y o f Maryland, C o l l e g e Park, MD 20742

94.

D r . William 97230

95.

Professor Charles Van Loan, Department o f Computer Science, Cornel1 U n i v e r s i t y , I t h a c a , NY 14853

96.

M r . Ted Vincenty, N a t i o n a l Geodetic Survey, NOAA, U.S. Department o f Commerce, 6001 Executive Boulevard, R o c k v i l l e , MD 20852

97.

D r . Ray A. Waller, S-1, S t a t i s t i c s , Los Alamos S c i e n t i f i c Laboratory, P.O. Box 1663, Los Alamos, NM 87545

98.

D r . Ralph A. Willoughby, Mathematical Sciences Department, I B M Thomas J. Watson Research Center, Yorktown Heights, NY 10598

99.

M r . Robert Z i e g l e r , Defense Mapping Agency, DMAHTC--GSGC, Brookes Lane, Washington, DC 20315

100. 101301.

F. Tinney, B o n n e v i l l e Power A d m i n i s t r a t i o n , P o r t l a n d , OR

6500

O f f i c e o f A s s i s t a n t Manager f o r Energy Research and Development, Department o f Energy, Oak Ridge Operations O f f i c e , Oak Ridge, TN 37830 Given D i s t r i b u t i o n as shown i n TID-4500 under Mathematics and Computers Category

Solution of large-scale sparse least squares problems ... - CiteSeerX

Solution of large-scale sparse least squares problems ... - CiteSeerX

Suggest Documents

ITERATIVE SOLUTION OF LEAST-SQUARES PROBLEMS APPLIED ...

Iterative solution of sparse linear least squares ... - NC State University

Iterative Tomographic Solution of Integer Least Squares Problems with ...

Bayesian Sparse Partial Least Squares - Computational Intelligence ...

Sparse Least Squares Support Vector Regressors

A Sparse Least Squares Support Vector Machine Classifier - CiteSeerX

SMO-Based Pruning Methods for Sparse Least Squares ... - CiteSeerX

a sparse least squares support vector machine classifier - CiteSeerX

On the condition number of linear least squares problems ... - CiteSeerX

Regularization of Linear Least Squares Problems by Total ... - CiteSeerX

Least Squares Solution and Pseudo- Inverse

Alternatives to the Least Squares Solution to

S1 Appendix The ordinary least squares solution

MINIMUM NORM LEAST-SQUARES SOLUTION TO GENERAL

condition numbers for structured least squares problems

Research Article Least Squares Problems with

Solving linear least squares problems by Gram-Schmidt ... - CiteSeerX

Separable Nonlinear Least Squares - CiteSeerX

Sparse recovery via Orthogonal Least-Squares under presence of Noise

Sparse Non-linear Least Squares Optimization for Geometric Vision

Solution of the Least Squares Method problem of

A Sparse Regularized Least-Squares Preference Learning Algorithm

decimated least mean squares for frequency sparse channel estimation

Analysis of solution of the least squares problemâ

Solution of large-scale sparse least squares problems ... - CiteSeerX