Kriging Interpolation on High-Performance Computers

K.E. Kerry* and K.A. Hawick
Email: {katrina, [email protected]}

Department of Computer Science, University of Adelaide, SA 5005, Australia

Technical Report DHPC-035

Abstract

We discuss Kriging Interpolation on high-performance computers as a method for spatial data interpolation. We analyse algorithms for implementing Kriging on high-performance and distributed architectures. In addition to a number of test problems, we focus on an application of comparing rainfall measurements with satellite imagery. We discuss our hardware and software system and the resulting performance on the basis of the Kriging algorithm complexity. We also discuss our results in relation to selection of an appropriate execution target according to the data parameter sizes. We consider the implications of providing computational servers for processing data using the data interpolation method we describe. We describe the project context for this work, which involves prototyping a data processing and delivery system making use of on-line data archives and processing services made available on-demand using World Wide Web protocols.

Keywords: Distributed systems, parallel computing, Kriging, spatial interpolation, middleware

1 Introduction

Fusion of large remotely sensed datasets with ground-truth, directly observed data is an important but computationally costly challenge. Satellite imagery data generally represents a data gridding pattern completely different from ground truth data. A typical instance of this problem arises for the Soil and Land Management Cooperative Research Centre in a project to investigate correlations between vegetative growth cover and rainfall conditions at high ground resolution. Data sources for rain prediction purposes are the GMS5 geostationary meteorological satellite and the NOAA polar orbiting satellite. Ground truthing data is available in the form of the Australian "Rainman" data set, representing Bureau of Meteorology ground station measurements of rainfall for the whole of Australia. Figure 1 shows the location of Rainman collection sites around Australia.

Fig. 1. Location points for Rainman data collection.

To carry out a useful comparison between multiple datasets it is necessary to have techniques to interpolate spatial data from one set of arbitrary sample points to another, possibly gridded, layout. Linear polynomial techniques provide a method for sampling arbitrary points from gridded data, but the problem is more complex if the source data is sampled at arbitrary points.

* Author for correspondence, Phone: +61 8 8303 4487, Fax: +61 8 8303 4366

One technique for tackling this is known as Kriging, and it can be formulated as a matrix problem connecting the datasets. The Kriging problem is an interesting example of a computationally intensive processing component in a geographic information system (GIS). Many of the spatial data manipulation and visualisation operations in a GIS may be relatively lightweight and capable of running on a PC or a low performance computing platform, but Kriging a large data set is too computationally demanding and therefore needs to be run on a separate "accelerator platform".

As part of the components technology investigation for the DISCWorld project, we have investigated the use of a networked supercomputer such as a 128-node Connection Machine CM5 as a suitable accelerator for carrying out the Kriging process. The CM5 runs as a server and provides the Kriging service to client applications running on networked lightweight computing platforms. Interesting issues in the construction of such a system are the tradeoffs between the communications cost of transferring data between client and server and the speed advantages of using a powerful computational server instead of running the Kriging on the client. In the case of Kriging we are able to analyse the problem complexity in terms of the problem size specified by the number of input and output dataset points. The parallelism for dense matrix problems is well understood, so it is possible to parameterise the problem scaling with different numbers of CM5 processors allocated.

It is also possible to allow the client application to choose which computational accelerator to use for the task required. There may be a real monetary cost associated with choosing between fast, medium and slow servers, all of which are functionally capable of providing the same service. Other issues include network reachability of the servers and estimated time to deliver the solution compared to the end-user requirements. The problem complexity also varies as other pre-processing and post-processing activities are required by the end-user. The cost tradeoff space can be explored using relatively simple linear models and stored coefficients from previous problem runs. Work to date has implemented the computational components and carried out simple performance analysis. The problem tradeoff space has been mapped out in more detail, varying the problem size and the number of processors. Future work will involve experimenting with different computational servers available to the client.

We discuss the Kriging algorithm in more detail in Section 2 of this document. In Sections 3 and 4 we discuss the computing systems and software environments we have available to us for tackling Kriging problems in a near-interactive time. Section 5 describes the testing process and testing data used for both validation testing and performance analysis. We also analyse the results of this testing and draw conclusions about the suitability of the different implementations. Section 6 presents a model for integration of the Kriging implementations into the DISCWorld. Section 7 presents a summary of the results and concludes this paper.
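The tradeoff can be made concrete with a simple linear model: a remote server wins only when its compute-time saving exceeds the extra cost of shipping the data. The Fortran 90 sketch below illustrates this selection logic; the bandwidth figure, record size and polynomial coefficients are invented placeholders for illustration, not measured values from our system.

    program choose_target
      ! Sketch: choose client or server execution from simple cost models.
      ! All numeric constants here are illustrative placeholders.
      implicit none
      real, parameter :: bandwidth = 1.0e6   ! client-server link, bytes/second
      real :: n, bytes, t_client, t_server

      n = 1000.0                   ! number of known data points
      bytes = 3.0 * 4.0 * n        ! (x, y, Z) stored as 4-byte reals

      ! Polynomial cost models with stored coefficients from previous runs.
      t_client = cost(n, 0.0, 0.1, 1.0e-4, 1.0e-6)
      t_server = bytes / bandwidth + cost(n, 1.0, 0.01, 1.0e-5, 1.0e-8)

      if (t_server < t_client) then
         print *, 'run on server; estimated seconds:', t_server
      else
         print *, 'run on client; estimated seconds:', t_client
      end if

    contains
      real function cost(m, a0, a1, a2, a3)
        real, intent(in) :: m, a0, a1, a2, a3
        cost = a0 + a1*m + a2*m**2 + a3*m**3
      end function cost
    end program choose_target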

2 Kriging

Given a set of known data values for certain points we can interpolate from these points to estimate values for other points. By placing an evenly spaced grid over the area for which we have known values we can obtain an estimated surface. Kriging interpolation [11, 12], developed by Matheron and Krige [15], is based on the theory of regionalized variables. The basic premise of Kriging interpolation is that every unknown point can be estimated by a weighted sum of the known points. Kriging interpolation also provides a mechanism for estimating the error for an approximated point. Where known values map onto grid points the error is known to be zero. This method works best for known values that are not evenly scattered [11]. The rainfall data that we wish to interpolate is clustered, as can be seen in Figure 1 (for example, there are more known data values in and around cities), and is hence well suited to Kriging interpolation.

Generally the estimation of an unknown point takes only a limited range of the known values into consideration. This is done for two reasons: firstly, known values at a great distance from the unknown point are unlikely to be of great benefit to the accuracy of the estimate, and secondly, the operation is less expensive. However, Border [2] states that in general Kriging produces the best results when the largest possible number of known points is used in estimating each unknown point. Obviously this is the most expensive option, and hence it is desirable to develop high-performance means of implementing the Kriging algorithm.

2.1 The Kriging algorithm

Suppose we have a set of K known points P. Each point P_i is of the form (x_i, y_i, Z_i) where x_i and y_i are the coordinates of the known point and Z_i the value. We can then estimate the value of an unknown point E_ij by calculating the weighted sum of the known points:

    E(x_i, y_j) = \sum_{k=1}^{K} w_k Z_k                                 (1)

where w_k is the weighting given to the k-th known point. We can do this for the set of unknown points, or the target area E. The body of the Kriging algorithm is involved in the selection of the appropriate weights. A separate set of weights w_ij must be calculated for each estimation. To calculate the weight set or vector w_ij we first construct a variogram V of the known points, where each element D_ij is the distance from the known point P_i to the known point P_j:

    V = \begin{pmatrix}
          D_{11} & D_{12} & \cdots & D_{1K} \\
          D_{21} & D_{22} & \cdots & D_{2K} \\
          \vdots & \vdots & \ddots & \vdots \\
          D_{K1} & D_{K2} & \cdots & D_{KK}
        \end{pmatrix}                                                    (2)

By also constructing a distance vector d_ij of the distances from the unknown point to each known point, we can calculate the weight vector by w_ij = d_ij V^{-1} and hence E_ij = w_ij P.
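To make the linear-algebra structure concrete, the following Fortran 90 sketch assembles the variogram matrix and estimates one unknown point. It is a minimal serial sketch: the routine solve is hypothetical and stands in for whatever linear solver is available (we use the CMSSL LU routines on the CM5), and raw Euclidean distances stand in for a fitted variogram model.

    ! Minimal serial sketch of the Kriging estimate for one unknown point.
    ! Assumes a hypothetical routine solve(A, b, x) that solves A x = b,
    ! e.g. via LU factorisation; since V is symmetric, solving V w = d is
    ! equivalent to forming w = d V**(-1) without an explicit inverse.
    subroutine krige_point(K, px, py, pz, tx, ty, estimate)
      implicit none
      integer, intent(in) :: K                    ! number of known points
      real, intent(in)    :: px(K), py(K), pz(K)  ! known points (x, y, Z)
      real, intent(in)    :: tx, ty               ! unknown (target) point
      real, intent(out)   :: estimate
      real    :: V(K,K)                           ! variogram (distance) matrix
      real    :: d(K), w(K)                       ! distance and weight vectors
      integer :: i, j

      ! Element (i,j) of V is the distance from known point i to known point j.
      do j = 1, K
         do i = 1, K
            V(i,j) = sqrt((px(i) - px(j))**2 + (py(i) - py(j))**2)
         end do
         d(j) = sqrt((px(j) - tx)**2 + (py(j) - ty)**2)
      end do

      call solve(V, d, w)                  ! weight vector: V w = d

      estimate = dot_product(w, pz)        ! weighted sum of equation (1)
    end subroutine krige_point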

3 Computing Environment

We have implemented the Kriging algorithm described in Section 2 on a number of high-performance and distributed computing resources. Two platforms are particularly worthy of description as they represent the two major classes of computational resources we envisage will be available in our DISCWorld system: a large parallel computer and a cluster of workstations.

3.1 The Connection Machine CM5

The Connection Machine CM5 is a multiprocessor computing system configured with fast vector nodes controlling their own distributed memory system and a front end Unix platform providing file access services. A scalable disk array also provides a fast (but less portable) storage system. Our system is configured with 128 compute nodes, each with 32MB of memory, which must hold operating system program code and program data. Fast applications on the CM5 generally make use of a mature compilation system based around the Connection Machine Fortran (CMF) language [19], an early precursor of the High Performance Fortran (HPF) [20] programming language standard. The CM5 is also provided with a powerful library of well optimised matrix and linear algebra operations. The software environment for the CM5 is not ideal for ready integration into a more complex application with potentially distributed components and user interface requirements. One solution is to employ the CM5 as a solver engine, providing a specific service of solving matrices on demand by other remote applications. The remote method invocation and infrastructure for this is discussed briefly in Section 6 and in more detail in [17].

3.2 Farm of Alpha Workstations

A more easily integrable hardware platform is a farm of distributed workstations. We employ a farm of twelve DEC Alpha workstations as a parallel computing resource for implementing the Kriging application. High Performance Fortran is implemented on our system both as a native language provided by DEC, utilising a proprietary DEC messaging system for the underlying communications, and via a portable compilation system from the Portland Group which utilises the MPICH message passing implementation. We are investigating the performance of this system as an alternative to a dedicated parallel supercomputer. Networks of workstations become a viable alternative if the communications costs of running a distributed or parallel job on them can be ameliorated. We employ a fast, optical-fibre-based networking interface using Asynchronous Transfer Mode (ATM) protocols to interconnect our workstation farm. Preliminary data indicates this does provide good performance; using conventional Ethernet technology as an interconnect gives communication costs that are too high in relation to the computational load. As part of our testing of the Kriging implementation we have also gathered test data for execution on single Alpha workstations as well as on differing numbers of processors within our Alpha farm of 12 workstations.

4 Implementation Techniques

The Kriging algorithm has been implemented using a variety of implementation techniques to suit the varying computational platforms. More information on the implementations of the Kriging algorithm can be found in [18].

CMFortran is a variety of Fortran used for programming the CM5. It provides mechanisms for partitioning data across the nodes of the CM5 and for communication between the nodes. CMFortran provides parallel language constructs to perform simple array operations across partitioned data. Available for use with CMFortran are several libraries which include operations for performing more complex parallel array tasks and communication patterns. One of these libraries, the CMSSL library [19], provides operations for inverting large, sparse matrices and for performing efficient multiplication of matrices and vectors. We employ the LU factorisation utility in the CMSSL library for solving the matrix component of the Kriging algorithm.

Fortran 90 is a variety of Fortran which also includes several parallel language constructs but which can also be run serially, in which case the benefit of the language constructs is simplified programming. Fortran 90 extends Fortran 77 by providing both these parallel constructs and more complex programming constructs. Fortran 90 was used in the implementation of the Kriging algorithm for execution on a single Alpha workstation.
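To illustrate the array-level style shared by CMFortran, HPF and Fortran 90, the fragment below builds the full pairwise distance matrix with whole-array operations instead of explicit loops. It is a sketch using the standard Fortran 90 intrinsic spread, not a transcription of our CMFortran code, but a data-parallel compiler can partition exactly this kind of expression across processors.

    program distance_matrix
      implicit none
      integer, parameter :: K = 4
      real :: x(K), y(K)      ! coordinates of the K known points
      real :: V(K,K)          ! pairwise distance matrix

      call random_number(x)
      call random_number(y)

      ! spread replicates a rank-1 array along a new dimension, so all
      ! pairwise coordinate differences are formed in single array
      ! expressions with no explicit loop or communication code.
      V = sqrt((spread(x, dim=1, ncopies=K) - spread(x, dim=2, ncopies=K))**2 &
             + (spread(y, dim=1, ncopies=K) - spread(y, dim=2, ncopies=K))**2)

      print *, V
    end program distance_matrix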

5 Test Problems and Scalability

For development and testing of the Kriging algorithm, several simple test problems were used to test the development of the variance matrix and the final interpolated rainfall surface. For each test case, 100 random points within an interpolation target range were chosen and the Kriging algorithm was executed on the produced data with a 50 × 50 element target grid. Upon successful completion of the simple test problems, functions of increasing difficulty were given as input to the Kriging implementation. These functions were chosen because of their large degree of spatial dependency. Finally, real rainfall data was used to produce an approximated rainfall surface for the Rainman data. Many of these problems go beyond the complexity of the interpolation accuracy required for interpolating the rainfall data surface but are still interesting as a test of the power of the interpolation algorithms. Once this validation testing was completed on all platforms, performance testing was undertaken and the results analysed in terms of which implementation gave the best performance for different data sizes. As some of the target platforms were specifically optimised for matrix operations such as inversion, it was expected that the best platform would vary according to the ratio of variance matrix size to target matrix size.
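As a sketch of this test harness, the program below draws 100 random known points in the target range, samples the first validation function at those points, Kriges onto the 50 × 50 grid and reports the worst-case error. It assumes the hypothetical krige_point routine sketched in Section 2.1.

    program validation_test
      implicit none
      integer, parameter :: K = 100, E = 50  ! known points; target grid edge
      real :: px(K), py(K), pz(K)            ! random known points
      real :: est, err, maxerr
      integer :: i, j

      call random_number(px); px = 50.0 * px
      call random_number(py); py = 50.0 * py
      pz = 6.0 + 2.0*px + py                 ! validation function Z(x,y) = 6 + 2x + y

      ! Krige onto the E x E target grid and track the worst-case error.
      maxerr = 0.0
      do j = 1, E
         do i = 1, E
            call krige_point(K, px, py, pz, real(i), real(j), est)
            err = abs(est - (6.0 + 2.0*real(i) + real(j)))
            maxerr = max(maxerr, err)
         end do
      end do
      print *, 'maximum absolute error over the grid:', maxerr
    end program validation_test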

5.1 Complexity Analysis

It is important in computationally intensive applications to consider the computational complexity of the entire algorithm to identify which part dominates the computational load. It is not atypical in applications built on linear algebra and matrix methods to find that different components dominate the compute time depending on the problem size. For the Kriging problems described here, the compute time can be described by a polynomial in the number of non-gridded data points, for a fixed desired grid output resolution. This can be written as:

    T = A_0 + A_1 N + A_2 N^2 + A_3 N^3                                  (3)

where N is the variable data component. In matrix problems the matrix assembly component is often accounted for by the quadratic term and the matrix solution time by the cubic term. However, it is fairly common for $A_2 \gg A_3$, so that matrix assembly and disassembly dominate for small problem sizes and the solver stage dominates only for very large problems. In addition, parallel solver methods, such as those embodied in the CMSSL library, can reduce the impact of the cubic term for matrix solving by adding processors when the problem size is large. This can reduce the cubic term by a factor proportional to the number of parallel processors [6] and can result in the solver phase and the matrix assembly/disassembly phases being equally important. The matrices that we produce are effectively dense and need to be solved using a full matrix method. We analyse the timing coefficients in Section 5.2 and report on measured values for test case problems. We envisage that a large problem size N will be needed to successfully Krige the Rainman data for the entire Australian landmass.
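Schematically, allocating P processors to the solver turns the model of equation (3) into

    T(N, P) \approx A_0 + A_1 N + A_2 N^2 + \frac{A_3 N^3}{P}

(an idealised scaling assumption, not a fitted result), so the solver term outweighs the assembly term only when $A_3 N^3 / P \gtrsim A_2 N^2$, that is, when $N \gtrsim (A_2 / A_3) P$.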

5.2 Results and Timing

5.2.1 Validation Testing

Figure 2 shows results of the Kriging algorithm for two of the test functions used for validation testing. The first column shows the function, the second shows the approximated surface produced by the Kriging algorithm, and the third column shows the error between the approximation and the function value. As can be seen in Figure 2, the interpolation method produces good results for simple test functions. The second test function shown is quite complex, more complex than predicted for the Rainman data, and hence the Kriging algorithm performs poorly. It can be seen in the interpolated surfaces produced for the first test function that the error increases in value at the edges of the target interpolation area. This is explained by observing that all known values are within the target area, and therefore the weighted sum of these values will tend towards these values rather than values outside the bounds of the target area. A solution to this problem is to select known values from a larger area than the target area.

Fig. 2. Sample test functions, approximated surface and error. The two test functions shown are $Z(x, y) = 6 + 2x + y$ and $Z(\mathbf{x}) = 1 - \prod_{i=1}^{2} \cos(x_i / \sqrt{i}) + \sum_{i=1}^{2} x_i^2 / 4000$.

Figure 3 shows the interpolated surface produced using Kriging for a selection of Rainman data for South Australia.

Fig. 3. Interpolated rainfall surface for real rainfall values (left: Rainman data; right: interpolated surface; axes are longitude, latitude and rainfall in mm).

5.2.2 Performance Testing

Performance results for the Kriging algorithm need to be analysed in terms of both the size of the variance matrix to be inverted (i.e. the number of known data points that we have) and the size of the target surface. Figure 4 shows the timing results for inverting varied size variance matrices on a single Alpha using Fortran 90 and on the CM5 using CMFortran with the CMSSL libraries. These times were gathered by removing the execution of the Kriging approximation. Timing was performed using the Unix command time on generated test data.

Fig. 4. Timing results for inversion of the N × N variance matrix, with N = number of known points (time in seconds against number of known points, for 'f90' and 'CMFortran').

Figure 4 shows only partial results for the single Alpha. This is due to memory allocation restrictions on the Alphas preventing execution for larger variance matrices. As can be seen, the results for the CM5 are considerably better than those for the single Alpha for large problem sizes. Figure 5 shows the timing results for complete execution of the Kriging algorithm on the platforms used in Figure 4. As can be seen in Figure 5, the times for the CM5 are preferable to those of a single Alpha for large variance matrices and large target matrices. For smaller sizes, execution is preferable on the Alphas. This is because for small target matrices and small variance matrices the benefits of inversion on the CM5 do not outweigh the extra communication costs. For large data sizes, execution is always preferable on the CM5 due to memory limitations on a single processor. It is expected that most interpolation requests for the Rainman data will be large enough to require execution on the CM5.

Fig. 5. Timing results for the Kriging algorithm against target surface size E × E, with the number of known points fixed at 100 (time in seconds, for 'f90' and 'CMFortran').

Figure 6 shows timing results for large variance sizes and large target surface areas. Due to the memory restrictions for execution on a single Alpha, execution on the CM5 would be required for certain larger data sizes. It can be seen clearly in Figure 6 that for small variance sizes and large target areas, the CM5 is comparable to an Alpha. We estimate that a target surface size of 2000 × 2000 points with the number of known points varying from 1000 to 4000 would be a standard real application for the Kriging algorithm. The Rainman data set contains approximately 4000 potential known data points. By fitting a degree-3 polynomial to the result data gathered we can establish polynomial coefficients for equation (3) shown in Section 5.1. The coefficients are shown in Tables 1 and 2. Table 1 shows the fitted coefficients for matrix inversion only; it is clearly seen that the cubic ($N^3$) component has a much smaller effect for the CM5 than for the Alpha. Table 2 shows the fitted coefficients for the full Kriging algorithm, which includes a larger communications component.

          A0        A1       A2       A3
    CMF   2.5908   -0.2005   0.005    0.0037
    F90  -1.5088    2.0505  -0.579    0.081

Table 1. Fitted coefficients for variance matrix inversion.

Fig. 6. Timing results for the Kriging algorithm for large tests: comparison of F90 and CMF with large target areas (time in minutes against number of known points, for 'f90', 'CMF:2000' and 'CMF:3000').

          A0        A1       A2       A3
    CMF   13.99   -16.737   7.0784  -0.0404
    F90  -23.93    12.202   9.0459   0.9247

Table 2. Fitted coefficients for the full Kriging algorithm.
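The fitted coefficients can be plugged straight back into equation (3) to act as the completion-time predictor discussed in Section 6. The Fortran 90 sketch below compares the Table 2 models for the two platforms; the problem size n is an arbitrary illustrative value, and since the units and scaling of N follow those used in the fit, the comparison rather than the absolute numbers is the point.

    program predict_platform
      implicit none
      ! Fitted coefficients (A0, A1, A2, A3) for the full Kriging
      ! algorithm, taken from Table 2.
      real, parameter :: cmf(4) = (/ 13.99, -16.737, 7.0784, -0.0404 /)
      real, parameter :: f90(4) = (/ -23.93, 12.202, 9.0459,  0.9247 /)
      real :: n, t_cmf, t_f90

      n = 10.0                 ! illustrative problem size in the fitted units
      t_cmf = model(cmf, n)
      t_f90 = model(f90, n)

      if (t_cmf < t_f90) then
         print *, 'predicted faster on the CM5 (CMF model):', t_cmf
      else
         print *, 'predicted faster on an Alpha (F90 model):', t_f90
      end if

    contains
      real function model(a, m)
        real, intent(in) :: a(4), m
        model = a(1) + a(2)*m + a(3)*m**2 + a(4)*m**3
      end function model
    end program predict_platform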

6 Interpolation Services within DISCWorld

As described in Section 5.2, there is the potential for the Kriging algorithm to be executed on different platforms depending on the data sizes of the problem. This requires the development of a middleware environment that will select the appropriate execution platform and perform communication and scheduling with the executing node. Selection could be performed using the fitted polynomials established from previous runs, producing an estimate of completion time. Hence, offering the Kriging operation as a service in DISCWorld would be desirable.

The combination of the Alpha Farm and the CM5 together represents a heterogeneous network of computational power with differing interface requirements. Using Java as the glue between these architectures is desirable, as Java is platform independent and Java programs can therefore be executed in any environment for which the Java Virtual Machine has been implemented. The CM5 does not currently have an implementation of the Java Virtual Machine, so other communication mechanisms must be sought in this case. Integration with DISCWorld would also provide facilities for data retrieval from spatial and Rainman databases and an easy to use interface for selection of the known data points and the target area required.

Figure 7 shows the process involved in processing the Kriging application. Four stages are involved:

- Collection of the satellite and Rainman data
- User request and interpretation of the request
- Execution of the application on the selected data
- Return of the application results to the user

Figure 7 shows the specification of the storage equipment available and the access paths for user communication and data access. The Chain Processing Manager shown in the diagram performs the interpretation of the user request and handles scheduling of execution upon the selected resource. A cache is also shown connected to the Chain Processing Manager; it holds intermediate and final results from operations to increase efficiency for repeated requests. Production of an accurate interpolated rainfall surface requires several steps: rectification of the satellite data, classification, interpolation of the rainfall data, and integration with spatial data provided by the satellite data. It is hoped that integration with the spatial data produces a more accurate estimate of unknown values.

Fig. 7. Framework environment. (The diagram shows satellite and national dataset sources, including the NOAA satellite and the Rainman dataset; processing stages for classification of satellite data, interpolation of rainfall data and integration with spatial data; storage on RAID and tape silos at UoA and Canberra and an FTP site with a hierarchical access mechanism; the Chain Processing Manager with its cache; and delivery to end users, initially project collaborators including Flinders University, via WWW protocols.)

7 Conclusions

Generation of an interpolated rainfall surface is a computationally intensive task which requires the use of high performance technologies. Using Kriging as a demonstrator application for the DISCWorld system, we can motivate the need for a framework which provides access to high performance distributed and parallel technologies as well as providing means for data retrieval and easy user access. Using WWW protocols as a delivery mechanism provides a portable and easy to use means of accepting user requests and returning application results. Kriging interpolation is a successful method for interpolating rainfall data. However, Kriging does not scale well, which further motivates the need for optimised solutions that make the best use of the available resources.

8 Acknowledgments

It is a great pleasure to thank Duncan Stevenson for his help in explaining the Kriging process to us and for other discussions on remote sensing issues. It is also a pleasure to thank Kim Bryceson for suggesting this application, and the Soil and Land Management Cooperative Research Centre (CRC) for making available the rainfall data described in this paper. Thanks also to Francis Vaughan, Paul Coddington and Jesudas Mathew for helpful discussions of some of the ideas presented here. We acknowledge the support provided by the Research Data Networks and Advanced Computational Systems Cooperative Research Centres (CRCs), which are established under the Australian Government's CRC Program.

References

1. Anderson, E., Bai, Z., Bischof, C., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., Ostrouchov, S., Sorensen, D., "LAPACK Users' Guide, Second Edition", SIAM, ISBN 0-89871-345-5.
2. Border, S., "The Use Of Indicator Kriging As A Biased Estimator To Discriminate Between Ore And Waste", Applications of Computers in the Mineral Industry, University of Wollongong, N.S.W., October 1993.
3. Bogucz, E.A., Fox, G.C., Haupt, T., Hawick, K.A., Ranka, S., "Preliminary Evaluation of High-Performance Fortran as a Language for Computational Fluid Dynamics", Paper AIAA-94-2262, presented at the 25th AIAA Fluid Dynamics Conference, Colorado Springs, CO, 20-23 June 1994.
4. Bryceson, K., Bryant, M., "The GIS/Rainfall Connection", GIS User, No. 4, August 1993, pp. 32-35.
5. Bryceson, K.P., "Locust Survey and Drought Monitoring - two case studies of the operational use of satellite data in Australian Government Agencies", ITC Journal, 1993-3, pp. 267-275.
6. Cheng, G., Hawick, K.A., Mortensen, G., Fox, G.C., "Distributed Computational Electromagnetics Systems", Proc. of the 7th SIAM Conference on Parallel Processing for Scientific Computing, Feb. 15-17, 1995.
7. Gosling, J., Joy, B., Steele, G., "The Java Language Reference", Addison-Wesley, 1996, ISBN 0-201-63451-1.
8. Hawick, K.A., Bogucz, E.A., Degani, A.T., Fox, G.C., Robinson, G., "Computational Fluid Dynamics Algorithms in High-Performance Fortran", Proc. AIAA 25th Computational Fluid Dynamics Conference, June 1995.
9. Hawick, K.A., James, H.A., "Distributed High-Performance Computation for Remote Sensing", DHPC Technical Report DHPC-009, Department of Computer Science, University of Adelaide, 1997. Published in Proc. of Supercomputing '97, San Jose, November 1997.
10. Hawick, K.A., Stuart Bell, R., Dickinson, A., Surry, P.D., Wylie, B.J.N., "Parallelisation of the Unified Weather and Climate Model Data Assimilation Scheme", Proc. Fifth ECMWF Workshop on Use of Parallel Processors in Meteorology, European Centre for Medium Range Weather Forecasting, Reading, November 1992. (Invited paper)
11. Oliver, M.A., Webster, R., "Kriging: a Method of Interpolation for Geographical Information Systems", Int. J. Geographical Information Systems, 1990, Vol. 4, No. 3, pp. 313-332.
12. Mason, D.C., O'Conaill, M., McKendrick, I., "Variable Resolution Block Kriging Using a Hierarchical Spatial Data Structure", Int. J. Geographical Information Systems, 1994, Vol. 8, No. 5, pp. 429-449.
13. "The GMS User's Guide", Meteorological Satellite Center, 3-235 Nakakiyoto, Kiyose, Tokyo 204, Japan, Second Edition, 1989.
14. Lang, C., "Kriging Interpolation", Department of Computer Science, Cornell University, 1995.
15. Cressie, N.A., "Statistics for Spatial Data", Wiley, New York, 1993.
16. Schofield, N., "Determining Optimal Drilling Densities For Near Mine Resources", Applications of Computers in the Mineral Industry, University of Wollongong, N.S.W., October 1993.
17. Silis, A.J., Hawick, K.A., "World Wide Web Server Technology and Interfaces for Distributed, High-Performance Computing Systems", DHPC Technical Report DHPC-017, Department of Computer Science, University of Adelaide, 1997.
18. Kerry, K.E., Hawick, K.A., "Spatial Interpolation on Distributed, High-Performance Computers", DHPC Technical Report DHPC-015, Department of Computer Science, University of Adelaide, 1997.
19. "The Connection Machine CM5 Technical Summary", Thinking Machines Corporation, 1991.
20. Koelbel, C.H., Loveman, D.B., Schreiber, R.S., Steele, G.L., Zosel, M.E., "The High Performance Fortran Handbook", MIT Press, 1994.