Kriging Interpolation on High-Performance Computers

K.E. Kerry* and K.A. Hawick
Email: {katrina, [email protected]}
Department of Computer Science, University of Adelaide, SA 5005, Australia
Technical Report DHPC-035
Abstract
We discuss Kriging Interpolation on high-performance computers as a method for spatial data interpolation. We analyse algorithms for implementing Kriging on high-performance and distributed architectures. In addition to a number of test problems, we focus on an application of comparing rainfall measurements with satellite imagery. We discuss our hardware and software system and the resulting performance on the basis of the Kriging algorithm complexity. We also discuss our results in relation to selection of an appropriate execution target according to the data parameter sizes. We consider the implications of providing computational servers for processing data using the data interpolation method we describe. We describe the project context for this work, which involves prototyping a data processing and delivery system making use of on-line data archives and processing services made available on-demand using World Wide Web protocols.

Keywords: Distributed systems, parallel computing, Kriging, spatial interpolation, middleware
1 Introduction
Fusion of large remotely sensed datasets with ground-truthing, directly observed data is an important but computationally costly challenge. Satellite imagery data generally represents a data gridding pattern completely different from ground truth data. A typical instance of this problem arises for the Soil and Land Management Cooperative Research Centre in a project to investigate correlations between vegetative growth cover and rainfall conditions at high ground resolution. Data sources for rain prediction purposes are from the GMS5 geostationary meteorological satellite and the NOAA polar orbiting satellite. Ground truthing data is available in the form of the Australian "Rainman" data set, representing Bureau of Meteorology ground station measurements of rainfall for the whole of Australia. Figure 1 shows the location of Rainman collection sites around Australia.
Fig. 1. Location points for Rainman Data Collection.
To carry out a useful comparison between multiple datasets it is necessary to have techniques to interpolate spatial data from one set of arbitrary sample points to another possibly gridded layout. Linear polynomial techniques provide a method for sampling arbitrary points from gridded data, but the problem is more complex if the source data is sampled at arbitrary points.
* Author for correspondence. Phone: +61 8 8303 4487, Fax: +61 8 8303 4366
One technique for tackling this is known as Kriging, and it can be formulated as a matrix problem connecting the datasets. The Kriging problem is an interesting example of a computationally intensive processing component in a geographic information system (GIS). Many of the spatial data manipulation and visualisation operations in a GIS may be relatively lightweight and capable of running on a PC or a low performance computing platform, but Kriging a large data set is too computationally demanding and therefore needs to be run on a separate "accelerator platform". As part of the components technology investigation for the DISCWorld project, we have investigated the use of a networked supercomputer such as a 128-node Connection Machine CM5 as a suitable accelerator for carrying out the Kriging process. The CM5 runs as a server and provides the Kriging service to client applications running on networked lightweight computing platforms.

Interesting issues in the construction of such a system are the tradeoffs between the communications cost of transferring data between client and server and the speed advantage of using a powerful computational server instead of running the Kriging on the client. In the case of Kriging we are able to analyse the problem complexity in terms of the problem size, specified by the number of input and output dataset points. The parallelism for dense matrix problems is well understood, so it is possible to parameterise the problem scaling with different numbers of CM5 processors allocated. It is also possible to allow the client application to choose which computational accelerator to use for the task required. There may be a real monetary cost associated with fast, medium and slow servers, all of which are functionally capable of providing the same service. Other issues include network reachability of the servers and the estimated time to deliver the solution compared to the end-user requirements. The problem complexity also varies as other pre-processing and post-processing activities are required by the end-user. The cost tradeoff space can be explored using relatively simple linear models and stored coefficients from previous problem runs. Work to date has implemented the computational components and carried out simple performance analysis. The problem tradeoff space has been mapped out in more detail, varying the problem size and the number of processors. Future work will involve experimenting with the different computational servers available to the client.

We discuss the Kriging algorithm in more detail in Section 2 of this document. In Sections 3 and 4 we discuss the computing systems and software environments we have available for tackling Kriging problems in near-interactive time. Section 5 describes the testing process and the test data used for both validation testing and performance analysis; we also analyse the results of this testing and draw conclusions about the suitability of the different implementations. Section 6 presents a model for integration of the Kriging implementations into DISCWorld. Section 7 summarises the results and concludes this paper.
2 Kriging
Given a set of known data values for certain points we can interpolate from these points to estimate values for other points. By placing an evenly spaced grid over the area for which we have known values we can obtain an estimated surface. Kriging interpolation [11, 12], developed by Matheron and Krige [15], is based on the theory of regionalized variables. The basic premise of Kriging interpolation is that every unknown point can be estimated by a weighted sum of the known points. Kriging interpolation also provides a mechanism for estimating the error of an approximated point. Where known values map onto grid points the error is known to be zero. This method works best for known values that are not evenly scattered [11]. The rainfall data that we wish to interpolate is clustered, as can be seen in Figure 1 (for example, there are more known data values in and around cities), and is hence well suited to Kriging interpolation.

Generally the estimation of an unknown point takes only a limited range of the known values into consideration. This is done for two reasons: firstly, known values at a great distance from the unknown point are unlikely to be of great benefit to the accuracy of the estimate, and secondly, the operation is less expensive. However, Border [2] states that in general Kriging produces the best results when the largest possible number of known points is used in estimating each unknown point. Obviously this is the most expensive option, and hence it is desirable to develop high-performance means of implementing the Kriging algorithm.
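As an aside, if a limited neighbourhood were used, the selection step itself is cheap relative to the solve. The following Fortran 90 fragment is a minimal, hypothetical sketch of picking the m known points nearest to a target location; it only illustrates the idea and is not the approach taken in this work, which uses all available known points.

program neighbourhood_sketch
  ! Hedged sketch: choose the m known points nearest to a target location,
  ! so that only a small m x m system need be solved rather than one over
  ! all known points.  Sizes and data here are illustrative only.
  implicit none
  integer, parameter :: K = 1000, m = 50
  real    :: x(K), y(K), dist(K)
  integer :: idx(m), j
  real    :: xt, yt

  call random_number(x)
  call random_number(y)
  xt = 0.5
  yt = 0.5

  dist = sqrt((x - xt)**2 + (y - yt)**2)   ! distance from the target to each known point
  do j = 1, m
     idx(j) = minloc(dist, dim=1)          ! index of the closest remaining point
     dist(idx(j)) = huge(1.0)              ! exclude it from subsequent passes
  end do

  print *, 'selected', m, 'of', K, 'known points for the local estimate'
end program neighbourhood_sketch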
2.1 The Kriging algorithm
Suppose we have a set of K known points P. Each point P_i is of the form (x_i, y_i, Z_i), where x_i and y_i are the coordinates of the known point and Z_i is the value. We can then estimate the value of an unknown point E_{ij} by calculating a weighted sum of the known points:

E(x_i, y_j) = \sum_{k=1}^{K} w_k Z_k    (1)

where w_k is the weighting given to the k-th known point. We can do this for the whole set of unknown points, the target area E. The body of the Kriging algorithm is concerned with the selection of the appropriate weights; a separate set of weights w_{ij} must be calculated for each estimation. To calculate the weight vector w_{ij} we first construct a variogram matrix V of the known points, where each element D_{ij} is the distance from the known point P_i to the known point P_j:

V = \begin{pmatrix} D_{11} & D_{12} & \cdots & D_{1K} \\ D_{21} & D_{22} & \cdots & D_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ D_{K1} & D_{K2} & \cdots & D_{KK} \end{pmatrix}    (2)

By also constructing a distance vector d_{ij} of the distances from the unknown point to each known point, we can calculate the weight vector as w_{ij} = d_{ij} V^{-1}, and hence, by Equation 1, E_{ij} = w_{ij} \cdot Z, where Z is the vector of known values.
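To make the preceding steps concrete, the Fortran 90 program below is a minimal sketch of the estimate for a single target point, using the distance-based variogram of this section and the first validation function of Section 5.2.1 as synthetic data. The built-in Gaussian elimination routine merely stands in for the CMSSL and library solvers used in the real implementations; single precision is used purely for brevity.

program kriging_sketch
  ! Minimal, illustrative sketch of the Kriging estimate for one target point.
  implicit none
  integer, parameter :: K = 100          ! number of known points
  real :: x(K), y(K), z(K)               ! known points (x_i, y_i, Z_i)
  real :: V(K,K), d(K), w(K)             ! variogram matrix, distances, weights
  real :: xt, yt, estimate
  integer :: i, j

  call random_number(x)
  call random_number(y)
  z = 6.0 + 2.0*x + y                    ! Z(x,y) = 6 + 2x + y, a validation function
  xt = 0.5
  yt = 0.5                               ! target point to be estimated

  ! Assemble V: pairwise distances between the known points (Equation 2).
  do j = 1, K
     do i = 1, K
        V(i,j) = sqrt((x(i) - x(j))**2 + (y(i) - y(j))**2)
     end do
  end do

  ! Distance vector from the target point to every known point.
  d = sqrt((x - xt)**2 + (y - yt)**2)

  ! Weight vector w = V^{-1} d, then the weighted sum gives the estimate (Equation 1).
  call solve_dense(K, V, d, w)
  estimate = sum(w * z)
  print *, 'estimated value at (0.5, 0.5) =', estimate

contains

  subroutine solve_dense(n, A, b, sol)
     ! Gaussian elimination with partial pivoting; A and b are overwritten.
     ! Stands in for a tuned library solver such as the CMSSL LU routines.
     integer, intent(in)    :: n
     real,    intent(inout) :: A(n,n), b(n)
     real,    intent(out)   :: sol(n)
     integer :: p, r, c
     real :: factor, row(n), tmp
     do c = 1, n - 1
        p = maxloc(abs(A(c:n, c)), dim=1) + c - 1     ! pivot row
        if (p /= c) then
           row = A(c,:);  A(c,:) = A(p,:);  A(p,:) = row
           tmp = b(c);    b(c)   = b(p);    b(p)   = tmp
        end if
        do r = c + 1, n
           factor = A(r,c) / A(c,c)
           A(r,c:n) = A(r,c:n) - factor * A(c,c:n)
           b(r)     = b(r)     - factor * b(c)
        end do
     end do
     do r = n, 1, -1                                   ! back substitution
        sol(r) = (b(r) - dot_product(A(r,r+1:n), sol(r+1:n))) / A(r,r)
     end do
  end subroutine solve_dense

end program kriging_sketch

In practice this weight calculation is repeated for every point of the target grid, which is where the computational cost analysed in Section 5.1 arises, and a production code would use double precision and a tuned library solver rather than the naive routine above.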
3 Computing Environment
We have implemented the Kriging algorithm described in section 2 on a number of high-performance and distributed computing resources. Two platforms are particularly worthy of description as they represent the two major classes of computational resources we envisage will be available in our DISCWorld system - a large parallel computer and a cluster of workstations.
3.1 The Connection Machine CM5
The Connection Machine CM5 is a multiprocessor computing system configured with fast vector nodes controlling their own distributed memory system and a front end Unix platform providing file access services. A scalable disk array also provides a fast (but less portable) storage system. Our system is configured with 128 compute nodes, each with 32MB of memory, which must hold operating system program code and program data. Fast applications on the CM5 generally make use of a mature compilation system based around the Connection Machine Fortran (CMF) language [19] - an early precursor of the High Performance Fortran (HPF) [20] programming language standard. The CM5 is also provided with a powerful library of well optimised matrix and linear algebra operations. The software environment for the CM5 is not ideal for ready integration into a more complex application with potentially distributed components and user interface requirements. One solution is to employ the CM5 as a solver engine, providing a specific service of solving matrices on-demand for other remote applications. The remote method invocation and infrastructure for this is discussed briefly in Section 6 and in more detail in [17].
3.2 Farm of Alpha Workstations
A more easily integrable hardware platform is a farm of distributed workstations. We employ a farm of twelve DEC Alpha workstations as a parallel computing resource for implementing the Kriging application. High Performance Fortran is implemented on our system both as a native language provided by DEC, utilising a proprietary DEC messaging system for the underlying communications, and as a portable compilation system from the Portland Group which utilises the MPICH message passing implementation. We are investigating the performance of this system as an alternative to a dedicated parallel supercomputer. Networks of workstations become a viable alternative if the communications costs of running a distributed or parallel job on them can be ameliorated. We employ an optical fibre based, fast networking interface using Asynchronous Transfer Mode (ATM) protocols to interconnect our workstation farm. Preliminary data indicates this does provide good performance, whereas using conventional Ethernet technology as an interconnect gives communication costs that are too high in relation to the computational load. As part of our testing of the Kriging implementation we have also gathered test data for execution on single Alpha workstations as well as on differing numbers of processors within our Alpha farm of 12 workstations.
4 Implementation Techniques
The Kriging algorithm has been implemented using a variety of implementation techniques owing to the varying computational platforms. More information on the implementations of the Kriging algorithm can be found in [18].

CMFortran is a variety of Fortran used for programming the CM5. It provides mechanisms for partitioning data across the nodes of the CM5 and for communication between the nodes. CMFortran provides parallel language constructs to perform simple array operations across partitioned data. Available for use with CMFortran are several libraries which include operations for performing more complex parallel array tasks and communication patterns. One of these libraries, the CMSSL library [19], provides operations for inverting large, sparse matrices and for performing efficient multiplication of matrices and vectors. We employ the LU factorisation utility in the CMSSL library for solving the matrix component of the Kriging algorithm.

Fortran 90 is a variety of Fortran which also includes several parallel language constructs but which can also be run serially, in which case the benefit of the language constructs lies in simplified programming. Fortran 90 extends Fortran 77 by providing both these parallel constructs and more complex programming constructs. Fortran 90 was used in the implementation of the Kriging algorithm for execution on a single Alpha workstation.
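As a hedged illustration of the whole-array style these languages encourage (and not a fragment of the actual CM5 or Alpha codes), the short Fortran 90 program below expresses the variogram assembly of Section 2.1 entirely as array operations, the form that a data-parallel compiler can partition across processing nodes; an analogous formulation is possible in CMFortran's array syntax.

program array_syntax_sketch
  ! Whole-array formulation of the K x K variogram (pairwise distance) matrix.
  implicit none
  integer, parameter :: K = 500
  real :: x(K), y(K)     ! coordinates of the known points
  real :: V(K,K)         ! V(i,j) = distance between known points i and j

  call random_number(x)
  call random_number(y)

  ! spread() replicates the coordinate vectors along rows and columns, so the
  ! distance matrix becomes a single array expression with no explicit loops.
  V = sqrt( (spread(x, dim=2, ncopies=K) - spread(x, dim=1, ncopies=K))**2 &
          + (spread(y, dim=2, ncopies=K) - spread(y, dim=1, ncopies=K))**2 )

  print *, 'assembled', K, 'x', K, 'variogram; maximum pairwise distance =', maxval(V)
end program array_syntax_sketch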
5 Test Problems and Scalability
For development and testing of the Kriging algorithm, several simple test problems were used to test the development of the variance matrix and the final interpolated rainfall surface. For each test case, 100 random points within an interpolation target range were chosen and the Kriging algorithm was then executed upon the resulting data with a 50 × 50 element target grid. Upon successful completion of these test problems, functions of increasing difficulty were given as input to the Kriging implementation. These functions were chosen because of their large degree of spatial dependency. Finally, real rainfall data was used to produce an approximated rainfall surface for the Rainman data. Many of these problems go beyond the complexity of the interpolation accuracy required for interpolating the rainfall data surface but are still interesting as a test of the power of the interpolation algorithms. Once this validation testing was completed on all platforms, performance testing was undertaken and the results analysed in terms of which implementation gave the best performance for different data sizes. As some of the target platforms were specifically optimised for matrix operations such as inversion, it was expected that the best platform would vary according to the ratio of variance matrix size to target matrix size.
5.1 Complexity Analysis
It is important in computationally intensive applications to consider the computational complexity of the entire algorithm to identify which part dominates the computational load. It is not atypical in applications built on linear algebra and matrix methods to find that different components dominate the compute time depending on the problem size. For the Kriging problems described here, the compute time can be described by a polynomial in the number of non-gridded data points, for a fixed desired grid output resolution. This can be written as:
T = A_0 + A_1 N + A_2 N^2 + A_3 N^3    (3)
where N is the variable data component. In matrix problems the matrix assembly component is often accounted for by the quadratic term and the matrix solution time is accounted for by the cubic term. However, it is fairly common for A_2 >> A_3, so that matrix assembly and disassembly dominate for small problem sizes and the solver stage dominates only for very large problems. In addition, parallel solver methods, such as the solvers embodied in the CMSSL library, can reduce the impact of the cubic term for matrix solving by adding processors when the problem size is large. This can reduce the cubic term by a factor proportional to the number of parallel processors [6] and can result in the solver phase and the matrix assembly/disassembly phases being equally important. The matrices that we produce are effectively dense and need to be solved using a full matrix method. We analyse the timing coefficients in Section 5.2 and report on measured values for test case problems. We envisage that a large problem size N will be needed to successfully Krige the Rainman data for the entire Australian landmass.
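As a hedged illustration of that scaling argument (assuming an ideal speed-up proportional to the processor count, rather than a measured law), absorbing the parallel solver speed-up into the cubic term gives a timing model of the form

T_P(N) \approx A_0 + A_1 N + A_2 N^2 + (A_3 / P) N^3

where P is the number of processors allocated to the solve; the crossover point at which the solver term overtakes the assembly term therefore moves to larger N as P grows.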
5.2 Results and Timing
5.2.1 Validation Testing
Figure 2 shows results of the Kriging algorithm for two of the test functions used for validation testing. The first column shows the function, the second shows the approximated surface produced by the Kriging algorithm and the third column shows the error between the approximation and the function value. As can be seen in Figure 2, the interpolation method produces good results for simple test functions. The second test function shown is quite complex, more complex than predicted for the Rainman data, and hence the Kriging algorithm performs poorly. It can be seen in the interpolated surfaces produced for the first test function that the error increases in value at the edges of the target interpolation area. This is explained by observing that all known values are within the target area and therefore the weighted sum of these values will tend towards these values rather than outside of the bounds of the target area. A solution to this problem is to select known values from a larger area than the target area.

Fig. 2. Sample test functions, approximated surface and error. The two test functions shown are Z(x, y) = 6 + 2x + y and Z(x) = 1 - \prod_{i=1}^{2} \cos(x_i / \sqrt{i}) + \sum_{i=1}^{2} x_i^2 / 4000.

Figure 3 shows the interpolated surface produced using Kriging for a selection of Rainman data for South Australia.
Fig. 3. Interpolated rainfall surface for real rainfall values (Rainman data and interpolated surface; axes are longitude, latitude and rainfall in mm).

5.2.2 Performance Testing
Performance results for the Kriging algorithm need to be analysed in terms of both the size of the variance matrix to be inverted, i.e. the number of known data points that we have, and the size of the target surface. Figure 4 shows the timing results for inverting variance matrices of varying size on a single Alpha using Fortran 90 and on the CM5 using CMFortran with the CMSSL libraries. These times were gathered by removing the execution of the Kriging approximation. Timing was performed using the Unix command time on generated test data.

Fig. 4. Timing results for inversion of the N × N variance matrix, with N = number of known points (time in seconds, for the f90 and CMFortran implementations).
Figure 4 shows what appear to be only partial results for the single Alpha. This is due to memory allocation restrictions on the Alphas preventing execution for larger variance matrices. As can be seen, the results for the CM5 are considerably better than those for the single Alpha for large problem sizes. Figure 5 shows the timing results for complete execution of the Kriging algorithm on the platforms used in Figure 4. As can be seen in Figure 5, the times for the CM5 are preferable to those of a single Alpha for large variance matrices and large target matrices. For smaller sizes, execution is preferable on the Alphas; for small target matrices and small variance matrices the benefits of inversion on the CM5 do not outweigh the extra communication costs. For large data sizes, execution is always preferable on the CM5 due to memory limitations on a single processor. It is expected that most interpolation requests for the Rainman data will be large enough to require execution on the CM5.

Fig. 5. Timing results for the Kriging algorithm with an E × E target surface, with E = target surface size and number of known points = 100 (time in seconds, for the f90 and CMFortran implementations).
Figure 6 shows timing results for large variance sizes and large target surface areas. Due to the memory restrictions for execution on a single Alpha, execution on the CM5 would be required for certain larger data sizes. It can be seen clearly in Figure 6 that for small variance sizes and large target areas, the CM5 is comparable to an Alpha. We estimate that a target surface size of 2000x2000 points with the number of known points varying from 1000 to 4000 would be a standard real application for the Kriging algorithm. The Rainman data set contains approximately 4000 potential known data points. By fitting a degree-3 polynomial to the result data gathered we can establish the polynomial coefficients of Equation 3 in Section 5.1. The coefficients are shown in Tables 1 and 2. Table 1 shows the fitted coefficients for matrix inversion only; it can be seen clearly that the cubic (N^3) component for the CM5 has a much smaller effect than that for the Alpha. Table 2 shows the fitted coefficients for the full Kriging algorithm, which has a larger communications component.
        A0        A1        A2       A3
CMF   2.5908   -0.2005    0.005    0.0037
F90  -1.5088    2.0505   -0.579    0.081

Table 1. Fitted coefficients for variance matrix inversion.
Fig. 6. Timing results for the Kriging algorithm for large tests, comparing the f90 implementation with CMF runs at target surface sizes of 2000 and 3000 (time in minutes against number of known points, 0 to 4000).
        A0        A1        A2       A3
CMF    13.99   -16.737    7.0784  -0.0404
F90   -23.93    12.202    9.0459   0.9247

Table 2. Fitted coefficients for the full Kriging algorithm.
6 Interpolation Services within DISCWorld
As described in Section 5.2, there is the potential for the Kriging algorithm to be executed on different platforms depending on the data sizes of the problem. This requires the development of a middleware environment that will select the appropriate execution platform and perform communication and scheduling with the executing node. Selection could be performed using the fitted polynomials established from previous runs, producing an estimate of completion time (a minimal sketch of such a selection heuristic follows the list below). Hence it would be desirable to offer the Kriging operation as a service in DISCWorld. The combination of the Alpha Farm and the CM5 represents a heterogeneous network of computational power with differing interface requirements. Using Java as the glue between these architectures is desirable, as Java is platform independent and therefore Java programs can be executed on any environment for which the Java Virtual Machine has been implemented. The CM5 does not currently have an implementation of the Java Virtual Machine, so other communication mechanisms must be sought for this case. Integration with DISCWorld would also provide facilities for data retrieval from spatial and Rainman databases and an easy to use interface for selection of the known data points and the target area required. Figure 7 shows the process involved in processing the Kriging application. Four stages are involved:
- Collection of the satellite and Rainman data
- User request and interpretation of the request
- Execution of the application on the selected data
- Return of the application results to the user
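The selection heuristic mentioned above could be as simple as evaluating the fitted timing polynomial of Equation 3 for each candidate platform and scheduling the request on the one with the smaller predicted completion time. The Fortran 90 sketch below illustrates this; the predict routine and the coefficient values are placeholders for illustration only, and a production scheduler would draw its coefficients from fits such as those in Tables 1 and 2, including any communication terms.

program choose_target
  ! Hedged sketch of execution-target selection from fitted timing models.
  implicit none

  type :: timing_model
     character(len=8) :: name
     real :: a0, a1, a2, a3       ! coefficients of T(N) = a0 + a1 N + a2 N^2 + a3 N^3
  end type timing_model

  type(timing_model) :: cm5, alpha
  integer :: n

  ! Placeholder coefficients only; real values would come from prior runs.
  cm5%name   = 'CM5'
  cm5%a0   = 5.0;  cm5%a1   = 0.02;  cm5%a2   = 1.0e-4;  cm5%a3   = 5.0e-8
  alpha%name = 'Alpha'
  alpha%a0 = 0.5;  alpha%a1 = 0.01;  alpha%a2 = 2.0e-4;  alpha%a3 = 8.0e-7

  do n = 250, 4000, 1250
     if (predict(cm5, n) < predict(alpha, n)) then
        print *, n, ' known points: schedule on ', cm5%name
     else
        print *, n, ' known points: schedule on ', alpha%name
     end if
  end do

contains

  real function predict(m, npts)
     ! Evaluate the fitted cubic timing polynomial for npts known points.
     type(timing_model), intent(in) :: m
     integer, intent(in) :: npts
     real :: rn
     rn = real(npts)
     predict = m%a0 + m%a1*rn + m%a2*rn**2 + m%a3*rn**3
  end function predict

end program choose_target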
Figure 7 shows the specification of the storage equipment available and the access paths for user communication and data access. The Chain Processing Manager shown in the diagram performs the interpretation of the user request and handles scheduling of execution upon the selected resource. A cache is also shown connected to the chain processing stage; it holds intermediate and final results from operations to increase efficiency for repeated requests. Production of an accurate interpolated rainfall surface requires several steps: rectification of the satellite data, classification, interpolation of the rainfall data and integration with the spatial data provided by the satellite data. It is hoped that integration with the spatial data produces a more accurate estimate of unknown values.
Fig. 7. Framework environment: data sources (satellite data including the NOAA satellite, national datasets and the Rainman dataset), processing stages (classification of satellite data, interpolation of rainfall data, integration with spatial data), hierarchical access storage (RAID and tape silos at Adelaide and Canberra, an FTP site and a cache), the Chain Processing Manager, and a WWW-protocol delivery mechanism to end users (initially project collaborators, including Flinders University).
7 Conclusions
Generation of an interpolated rainfall surface is a computationally intensive task which requires the use of high performance technologies. Using Kriging as a demonstrator application for the DISCWorld system, we can motivate the need for a framework which provides access to high performance distributed and parallel technologies as well as providing means for data retrieval and easy user access. Using WWW protocols as a delivery mechanism provides a portable and easy to use means for accepting user requests and returning application results. Kriging interpolation is a successful method for interpolating rainfall data. However, Kriging does not scale well, which further motivates the need for optimised solutions that make the best use of the available resources.
8 Acknowledgments
It is a great pleasure to thank Duncan Stevenson for his help in explaining the Kriging process to us and for other discussions on remote sensing issues. It is also a pleasure to thank Kim Bryceson for suggesting this application and the Soil and Land Management Cooperative Research Centre (CRC) for making available the rainfall data described in this paper. Thanks also to Francis Vaughan, Paul Coddington and Jesudas Mathew for helpful discussions of some of the ideas presented here. We acknowledge the support provided by the Research Data Networks and Advanced Computational Systems Cooperative Research Centres (CRCs), which are established under the Australian Government's CRC Program.
References
1. Anderson, E., Bai, Z., Bischof, C., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., Ostrouchov, S., Sorensen, D., "LAPACK Users' Guide, Second Edition", Pub. SIAM, ISBN 0-89871-345-5.
2. Border, S., "The Use Of Indicator Kriging As A Biased Estimator To Discriminate Between Ore And Waste", Applications of Computers in the Mineral Industry, University of Wollongong, N.S.W., October 1993.
3. Bogucz, E.A., Fox, G.C., Haupt, T., Hawick, K.A., Ranka, S., "Preliminary Evaluation of High-Performance Fortran as a Language for Computational Fluid Dynamics", Paper AIAA-94-2262, presented at the 25th AIAA Fluid Dynamics Conference, Colorado Springs, CO, 20-23 June 1994.
4. Bryceson, K., Bryant, M., "The GIS/Rainfall Connection", GIS User, No. 4, August 1993, pp. 32-35.
5. Bryceson, K.P., "Locust Survey and Drought Monitoring - two case studies of the operational use of satellite data in Australian Government Agencies", ITC Journal, 1993-3, pp. 267-275.
6. Cheng, G., Hawick, K.A., Mortensen, G., Fox, G.C., "Distributed Computational Electromagnetics Systems", to appear in Proc. of the 7th SIAM Conference on Parallel Processing for Scientific Computing, Feb. 15-17, 1995.
7. Gosling, J., Joy, B., Steele, G., "The Java Language Reference", Pub. Addison-Wesley, 1996, ISBN 0-201-63451-1.
8. Hawick, K.A., Bogucz, E.A., Degani, A.T., Fox, G.C., Robinson, G., "Computational Fluid Dynamics Algorithms in High-Performance Fortran", Proc. AIAA 25th Computational Fluid Dynamics Conference, June 1995.
9. Hawick, K.A., James, H.A., "Distributed High-Performance Computation for Remote Sensing", DHPC Technical Report DHPC-009, Department of Computer Science, University of Adelaide, 1997. Published in Proc. of Supercomputing '97, San Jose, November 1997.
10. Hawick, K.A., Stuart Bell, R., Dickinson, A., Surry, P.D., Wylie, B.J.N., "Parallelisation of the Unified Weather and Climate Model Data Assimilation Scheme", Proc. Fifth ECMWF Workshop on Use of Parallel Processors in Meteorology, European Centre for Medium Range Weather Forecasting, Reading, November 1992. (Invited paper)
11. Oliver, M.A., Webster, R., "Kriging: a Method of Interpolation for Geographical Information Systems", Int. J. Geographical Information Systems, 1990, Vol. 4, No. 3, pp. 313-332.
12. Mason, D.C., O'Conaill, M., McKendrick, I., "Variable Resolution Block Kriging Using a Hierarchical Spatial Data Structure", Int. J. Geographical Information Systems, 1994, Vol. 8, No. 5, pp. 429-449.
13. "The GMS User's Guide", Pub. Meteorological Satellite Center, 3-235 Nakakiyoto, Kiyose, Tokyo 204, Japan. Second Edition, 1989.
14. Lang, C., "Kriging Interpolation", Department of Computer Science, Cornell University, 1995.
15. Cressie, N.A., "Statistics for Spatial Data", Wiley, New York, 1993.
16. Schofield, N., "Determining Optimal Drilling Densities For Near Mine Resources", Applications of Computers in the Mineral Industry, University of Wollongong, N.S.W., October 1993.
17. Silis, A.J., Hawick, K.A., "World Wide Web Server Technology and Interfaces for Distributed, High-Performance Computing Systems", DHPC Technical Report DHPC-017, Department of Computer Science, University of Adelaide, 1997.
18. Kerry, K.E., Hawick, K.A., "Spatial Interpolation on Distributed, High-Performance Computers", DHPC Technical Report DHPC-015, Department of Computer Science, University of Adelaide, 1997.
19. "The Connection Machine CM5 Technical Summary", Thinking Machines Corporation, 1991.
20. Koelbel, C.H., Loveman, D.B., Schreiber, R.S., Steele, G.L., Zosel, M.E., "The High Performance Fortran Handbook", MIT Press, 1994.