SURFACE INTERPOLATION AS A VORONOI SPATIAL ADJACENCY PROBLEM

Christopher M. Gold
Dept. des Sciences Géodésiques et de Télédétection
Université Laval, Ste-Foy, Québec, Canada G1K 7P4
Tel: (418) 656-3308; E-mail: [email protected]

Spatial information may be viewed in two ways: as coordinates measured with a ruler along imaginary axes, or as the adjacency relationships between objects. The former was the more common historically, but the latter appears more relevant to the handling of spatially distributed objects in a computer. The interpolation problem, as implemented using weighted-average techniques, depends on the selection of appropriate neighbours to the sample location being estimated. Traditionally this was done using "metric" methods, with sometimes unsatisfactory results. An alternative "adjacency relationship" approach can be used, based on the construction of the Voronoi regions about each point, that removes many of the inconsistencies of the earlier methods and provides an integrated weighting method for each adjacent point. Interpolants can then provide appropriate levels of continuity. The technique may be extended to data objects besides points, in particular to line segments, permitting consistent integration of elevation information at break-lines, contour segments, etc., as well as at point locations. The concepts discussed are here applied to two spatial dimensions plus "elevation", but the Voronoi method may readily be applied to higher dimensions.
Spatial information can be viewed in two ways: as coordinates measured with a ruler along imaginary axes, or as adjacency relationships between objects. The former was historically the more widespread, but the latter appears more useful for handling spatially distributed objects in a computer. The interpolation problem, implemented by means of weighted-average techniques, rests on the appropriate choice of the neighbours of the sample location being estimated. Traditionally, "metric" methods were used for this, sometimes with poor results. An alternative approach, that of the "adjacency relationship", can be taken, based on delimiting the Voronoi regions around each point; it eliminates many of the shortcomings of the earlier methods and allows an integrated weighting for each adjacent point. With the interpolated values one can then obtain the appropriate degree of continuity. The technique can be applied to data objects other than points, in particular to line segments, giving a consistent integration of elevation information at break-lines, contour segments, etc., as well as at points. The concepts examined here apply to two (independent) spatial dimensions plus "elevation", but the Voronoi method can easily be used in higher dimensions.
INTRODUCTION

In the world of GIS, "interpolation" is taken to mean a rather limited operation applicable to tasks such as terrain modelling, contouring, or TIN (Triangulated Irregular Network) construction. This paper attempts to show that interpolation is a more general operation than this. In order to do so we must first introduce the idea of "Proximal Query Space" and its breadth of application, and then describe some techniques that are sufficiently general to make the approach operational for various map types and numbers of dimensions.

PROXIMAL QUERY SPACE

If one points at some location on a map (any kind of map) it is reasonable to expect some meaningful response. The human observer would not be satisfied with "nothing" as an answer, except in extraordinary circumstances. Historically, of course, if large areas of the map were "terra incognita" then a variety of fanciful icons were employed, such as whales, ships or imaginary beasts. Alternatively, if a textual description was deemed necessary, "Here Be Dragons!" was an appropriate response. In the GIS situation it is assumed that large portions of the map are not completely unknown, and hence the "HBD" response is invalid. Nevertheless, for many GISs and map types, HBD would be the response to most available query points. A road or stream network, for example, has no meaningful response for queries that fall between line segments, although for a polygon map the query would be meaningful. A set of survey points would also usually return a value of HBD in response to a query. This is clearly inadequate, if only because it is at variance with human experience, which persists in believing that space is not empty, and will frequently attempt to reply with the name of the nearest map object. "Proximal Query Space" is therefore a "natural" mode of interaction between the user and the map data and, as will be shown, a reasonable method of posing location queries within the spatial algorithms themselves. How does this relate to the "Interpolation Problem"?

INTERPOLATION

In traditional interpolation a variety of global and local methods may be used to estimate the value of the "elevation" (or some other dependent variable) at the desired query location. We will ignore for the purposes of this discussion the global techniques, such as trend surface or polynomial trend analysis, that use all the data to estimate any one query. Local methods of interpolation must therefore select a set of nearby data points that will be used to estimate the elevation value at the query point.
At this stage either a local polynomial patch or a weighted average is used to make the estimate - more frequently a weighted average, due to the problems of higher-order continuity between adjacent polynomial patches (see Gold 1989).

The simplest form of interpolation - one that is still used in remote sensing - is to query the map and return the "elevation" value of the nearest data point. (This is used for some simple forms of image rectification or resampling.) This is equivalent to the previously defined proximal query space: the interpolation is a form of query, returning the value (radiosity) of the nearest pixel. This also goes by the name of "proximal mapping" in the computer mapping literature. A sufficiently dense grid of queries against an arbitrarily distributed set of data points would produce a map with polygons coloured according to the nearest data point. These polygonal regions are called Thiessen polygons, Dirichlet tessellations or, more commonly, the Voronoi diagram of the point data set. The simple proximal query is here denoted a "Type I" query. Figure 1 shows a Voronoi diagram of a simple point set.
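As an illustration, a Type I query reduces to returning the attribute of the nearest data point. The following minimal sketch (in Python, with purely illustrative coordinates, elevations and grid extent) evaluates such queries over a dense grid, giving a raster approximation of the Thiessen/Voronoi map just described:

import numpy as np

def type_i_query(points, values, queries):
    """Type I (proximal) query: return the value of the nearest data point.

    points  : (n, 2) array of data point coordinates
    values  : (n,)   array of attribute values (e.g. elevations)
    queries : (m, 2) array of query locations
    """
    # Squared distance from every query location to every data point.
    d2 = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    return values[d2.argmin(axis=1)]

# Illustrative data: five scattered elevation samples.
pts = np.array([[1.0, 1.0], [4.0, 2.0], [2.5, 4.0], [0.5, 3.5], [4.5, 4.5]])
elev = np.array([10.0, 14.0, 12.0, 9.0, 16.0])

# A dense grid of queries: colouring each cell according to its nearest data
# point yields a raster approximation of the Thiessen/Voronoi (proximal) map.
gx, gy = np.meshgrid(np.linspace(0.0, 5.0, 50), np.linspace(0.0, 5.0, 50))
grid = np.column_stack([gx.ravel(), gy.ravel()])
proximal_map = type_i_query(pts, elev, grid).reshape(gx.shape)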
Perhaps the most common form of interpolation assumes that the data point elevation values are samples from a continuous smooth surface. It attempts to generate query replies that, if asked for a dense grid over the map, would generate a continuous, smooth surface based, for each query point, on some form of weighted average of the values of a reasonable set of neighbouring data points. Rephrased as a query, when some particular location is designated, the resulting reply is a set of nearby data points and elevation values, together with the appropriate weighting values for each. This is here referred to as a Type II query. Thus a small perturbation of the query point might result in the same set of neighbours but with slightly different weights, while a larger perturbation would change the set of neighbouring points. In principle, a query point that coincided with a data point would return a neighbour set consisting only of that one data point, with a weight of 100%. Most interpolation schemes, however, fail to honour the data point values precisely in this fashion, or else the surface itself is neither smooth nor continuous. Surprisingly, this is true for most of the common local techniques, as well as for global methods. Details are given in Gold (1989). This will be summarized below, firstly for polynomial methods, and then for weighted averages.

Polynomial Patches

It is not surprising that a global polynomial, although smooth and continuous, does not pass through every data point on a map unless it has a great many terms - and hence is very unreliable in between data points. Local polynomial patches can be made to fit the data points in a local region, but it is rare that they can be made with greater than first-order smoothness between patches. It would seem that having a continuous slope between patches would be sufficient, but experience shows that rapid changes in curvature, even when slope is continuous, are easily detected by the readers of a contour map - leading to reading fatigue, irritation and misleading interpretations. In other words, the algorithm generates artifacts: features that are not necessary to fit the surface to the data. Often the patch used is a triangle connecting three data points, and a cubic polynomial is employed. This gives the level of continuity or smoothness described above. Where a linear polynomial is used the result is a planar triangle between the three elevation values - the common TIN. In both of these cases the selection of the triangulation algorithm affects the resulting surface significantly, although in the last ten years or so the Delaunay triangulation has been widely adopted. This, the dual of the Voronoi diagram of the data point set, is the only known definition of a triangulation that is guaranteed to remain unchanged (except for degenerate conditions) no matter what the order of data point entry, and is also guaranteed to change only locally when a data point is inserted, deleted or perturbed.

Weighted Averages

Weighted average methods are also subject to the problems of preserving continuity and smoothness, and of honouring the data points. The most common traditional techniques use a distance-based weighting, decaying either as 1/d^a (for some power a) or as a Gaussian "bell curve". In the first case the weighting is infinite when d = 0 (i.e. the query point is on the data point), and hence the weightings of all other neighbouring points become insignificant. In the second case the weighting becomes 1 when d = 0, and the contributions of neighbouring points with non-zero weightings prevent the surface from honouring the data point. A far more serious problem, however, is that the weighting functions decay towards zero at great distances from the data point, but are by no means insignificant for most arrangements of data. Not only may this prevent data points from being honoured, but it causes discontinuities - cliffs - in the interpolated surface whenever a data point is added to or deleted from the set of neighbours of a moving query point. Ideally a data point's weight should reach zero at the moment that it is rejected from or accepted into the set of neighbours, but this is impossible for weighting functions that only reach zero asymptotically. The weighting function and the neighbour selection criteria must be much more strongly coupled: most computer programs use a variety of arbitrary neighbour selection criteria (search radius, quadrant search, etc.) that often make the discontinuity problem worse (e.g. when a data point is rejected not because of distance, but because it falls outside a particular octant). See Gold (1984, 1989) for further details.
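The two traditional weighting functions described above, combined with an arbitrary nearest-k neighbour selection, can be sketched as follows. The exponent, the Gaussian width and the choice of k are illustrative assumptions rather than values taken from this paper; the point is that neither weight falls to zero at the moment a data point enters or leaves the selected set, which is the source of the cliffs just described:

import numpy as np

def inverse_distance_weights(d, a=2.0):
    # 1/d^a weighting: infinite at d = 0 (the data point swamps all others)
    # but never exactly zero at any finite distance.
    return 1.0 / np.maximum(d, 1e-12) ** a

def gaussian_weights(d, width=1.0):
    # "Bell curve" weighting: equal to 1 at d = 0, so other non-zero weights
    # keep the surface from honouring the data point, and it only approaches
    # zero asymptotically.
    return np.exp(-(d / width) ** 2)

def idw_estimate(points, values, query, k=4, a=2.0):
    """Conventional inverse-distance interpolation with a nearest-k selection.

    When the moving query point causes a data point to enter or leave the
    k-nearest set, that point's weight jumps between zero and a non-zero
    value, producing a cliff in the interpolated surface.
    """
    d = np.linalg.norm(points - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(d)[:k]                 # arbitrary neighbour selection
    w = inverse_distance_weights(d[nearest], a)
    return float(np.sum(w * values[nearest]) / np.sum(w))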
A solution promoted for several years by this writer directly couples a neighbour selection criterion with a finite-range weighting function. It is based on a more detailed definition of "neighbour" and "adjacency".

Voronoi Neighbours

The Delaunay triangulation is one way of defining the set of data points that are neighbours to any particular point: if they are connected by a triangle edge they are adjacent. This is equivalent to saying that if the Voronoi zones of two data points have a common boundary, the two points are adjacent. Figure 1 shows the Voronoi regions mentioned previously and the related Delaunay triangles. Thus, in order to determine the (Voronoi) adjacency of any pair of data points, first construct the Delaunay triangulation or Voronoi diagram (the two operations are equivalent). It follows that in order to determine the data points neighbouring some query point, first insert the query point into the Voronoi/Delaunay structure, and then examine its neighbours. (It should be noted that in order to prepare, say, a contour map, many query points need to be inserted - and deleted individually, immediately after use, in order not to interfere with subsequent nearby queries. This demands a dynamic Voronoi/Delaunay data structure that permits incremental point insertion and deletion. See Gold (1992, these proceedings).)

Voronoi Weighting

If the Voronoi definition of adjacency of data points is accepted, then a Type II proximal query does not return the closest object, as did the Type I, but instead returns the set of Voronoi neighbours. However, some neighbours are closer than others - but none will be hidden behind other neighbours, as may be the case with simple distance-based methods. How are we to select the more "important" points? This is equivalent to specifying the interpolation weighting function described previously. The insertion of the query point Q into the Voronoi diagram has clearly modified only the immediately surrounding portion of the data structure - those Voronoi regions that are now adjacent to the newly inserted query point. They are adjacent precisely because a portion of their previous extent has been lost to provide the new point with its own region (see Figure 2). If Q is mid-way between several data points, the "stolen areas" of each will be approximately the same. If Q is then moved until it coincides with a data point P, all its stolen area will be taken from the Voronoi zone of P. If Q is moved halfway across the map from P, then it will cease to steal any of its area from P, but will take it all from closer data points instead. The location at which Q ceases to steal any area from P is precisely that location at which the Voronoi regions of Q and P cease to have a common boundary - that is, they cease to be Voronoi neighbours. We thus have a "stolen area" weighting scheme that corresponds to our requirements for a finite range. As previously discussed, this is required for the surface to honour the data points, and also to prevent surface discontinuities as data points are added to and subtracted from the neighbouring set. In addition, the weighting reaches 100% when the query point coincides with a data point, ensuring that the surface honours the data points.
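The stolen-area weights can be illustrated without an exact Voronoi construction by a brute-force raster approximation: label every cell of a fine grid with its nearest data point, insert the query point, and count how many cells each data point loses to it. The map extent and grid resolution below are illustrative assumptions, and a real implementation would use the dynamic Voronoi/Delaunay structure of Gold (1992) rather than this sketch, but the essential behaviour is the same: a data point's weight is exactly zero from the moment it ceases to be a Voronoi neighbour of the query.

import numpy as np

def stolen_area_weights(points, query, extent=(0.0, 5.0, 0.0, 5.0), n=200):
    """Approximate the area each data point loses to the inserted query point.

    The Voronoi regions are clipped to the rectangular map extent and
    approximated by counting raster cells, so the weights are only as good
    as the grid resolution n.  Weights sum to 1 over the Voronoi neighbours
    of the query; non-neighbours receive exactly zero.
    """
    xmin, xmax, ymin, ymax = extent
    gx, gy = np.meshgrid(np.linspace(xmin, xmax, n), np.linspace(ymin, ymax, n))
    cells = np.column_stack([gx.ravel(), gy.ravel()])

    # Owner of each cell before insertion: the nearest original data point.
    d2 = ((cells[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    owner = d2.argmin(axis=1)

    # A cell is "stolen" if the query point is now closer than its old owner.
    d2_query = ((cells - query) ** 2).sum(axis=1)
    stolen = d2_query < d2[np.arange(len(cells)), owner]

    counts = np.bincount(owner[stolen], minlength=len(points)).astype(float)
    if counts.sum() == 0.0:
        # Degenerate case: the query coincides with a data point, which
        # should then carry the full weight.
        counts[((points - query) ** 2).sum(axis=1).argmin()] = 1.0
    return counts / counts.sum()

def type_ii_estimate(points, values, query):
    # Weighted average of the data values using the stolen-area weights.
    w = stolen_area_weights(points, np.asarray(query, dtype=float))
    return float(np.sum(w * values))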
The remaining problem is the smoothness of the resulting surface. This weighting function approach guarantees that, as the query point approaches a data point, the weight approaches 100%. It does not guarantee that the same data point, approached from the other side, will be approached with the same slope. Similarly, there will be a break in slope precisely where the weighting function becomes zero. The result is a surface somewhat like a series of overlapping volcanic cones, one at each data point. Sibson (1982) used these "linear" weighting functions in a final "spherical polynomial function". Gold (1989) applied a simple Hermitian polynomial to "smooth" each weight before it was applied, guaranteeing zero slope of the modified weighting function wherever the original weight reached zero or unity. This preserved the advantage of using weighted averages throughout. He also applied these weights to the function at each data point, thus permitting the preservation of original or derived slopes at each data point. Watson and Philip (1987) blended the linear weights and the slope data to achieve the same objectives. In all of these cases, applied to achieve slope continuity, the more fundamental properties of surface continuity and data point honouring are preserved, due to the finite range of the weighting functions employed. One particular advantage of the area-stealing method is that it is adaptive to the data distribution, as with a TIN, but for awkward data distributions, such as flight-line data, a far greater potential number of data points is involved. Figure 3 shows the result of performing full area-stealing interpolation on a set of seismic data using Gold's method. This compared well with available commercial systems.
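One cubic Hermitian polynomial with the property described above, zero slope wherever the weight reaches zero or unity, is h(w) = 3w^2 - 2w^3. Whether this is the exact polynomial used in Gold (1989) is not stated here; the sketch below, which reuses the stolen_area_weights function given earlier, simply illustrates reshaping each weight before the weighted average is taken:

import numpy as np

def hermite_smooth(w):
    # Cubic with h(0) = 0, h(1) = 1 and zero slope at both ends, so the
    # reshaped weight changes smoothly as a neighbour enters or leaves the
    # neighbour set and as the query point approaches a data point.
    return 3.0 * w ** 2 - 2.0 * w ** 3

def smoothed_type_ii_estimate(points, values, query):
    w = hermite_smooth(stolen_area_weights(points, np.asarray(query, dtype=float)))
    w = w / w.sum()                  # renormalise after reshaping
    return float(np.sum(w * values))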
OTHER FORMS OF INTERPOLATION
To summarize, "Type I" interpolation returns the name of the nearest map object, and its value is used. "Type II" interpolation returns the set of neighbouring map objects and their associated weights. These weights may be modified (e.g. to produce satisfactory slope continuity) and are then applied to the functions or values of the associated map objects. In our examples so far these "map objects" have been data points.

One of our early examples of Proximal Query Space was a stream network. A Type I query or interpolation returns the name of the nearest map object - in this case a stream segment. In the same fashion that it is possible to define the Voronoi zones around a set of points, it is also possible to define the Voronoi zones (i.e. the proximal region) around line segments. This is described in Gold (1990), which uses a dynamic incremental technique: points are created by splitting a nearby point and moving the newly created point to the desired location; similarly, deleting a point consists of moving it onto a nearby point and then merging the two. Line segments are generated as the locus of the moving point. In all these cases the movement of one point at a time, while maintaining the Voronoi spatial relationships, is central. The result of this operation on our digitized stream network is a set of Voronoi zones about each point or line segment forming the network. Using this we may derive our Type I query.
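A Type I query against such a network needs only a point-to-segment distance. The brute-force sketch below (not the incremental Voronoi construction of Gold (1990)) returns the index of the nearest digitised segment; the stream coordinates and query location are illustrative:

import numpy as np

def point_segment_distance(q, a, b):
    """Distance from query point q to the line segment with endpoints a, b."""
    ab, aq = b - a, q - a
    denom = float(np.dot(ab, ab))
    # Parameter of the closest point on the supporting line, clamped to [0, 1]
    # so that it stays on the segment; degenerate segments collapse to a point.
    t = 0.0 if denom == 0.0 else float(np.clip(np.dot(aq, ab) / denom, 0.0, 1.0))
    return float(np.linalg.norm(q - (a + t * ab)))

def nearest_segment(query, segments):
    """Type I query against a network: index of the closest digitised segment."""
    return int(np.argmin([point_segment_distance(query, a, b) for a, b in segments]))

# Illustrative stream network of three digitised segments.
streams = [(np.array([0.0, 0.0]), np.array([2.0, 1.0])),
           (np.array([2.0, 1.0]), np.array([3.0, 3.0])),
           (np.array([1.0, 3.0]), np.array([2.0, 1.0]))]
print(nearest_segment(np.array([2.5, 2.0]), streams))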
As described in the same paper, it is also possible to apply a Type II query to the same data. In this case the inserted query point would steal areas from the adjacent Voronoi zones of points and line segments, and return the set of neighbouring map objects and weights. If elevation values were assigned to the digitized network (and, in order to make this useful, to a ridge network or other surveyed spot heights) then weighted-average interpolation may be applied to the results. Thus interpolation consists of applying either a Type I or a Type II query at the desired location, and then performing the desired weighted averaging as required. This process is particularly useful in the case of digitized contour input.

A good example of the difference between Type I and Type II queries for categorical data concerns the rock types on a geological map. Frequently the original data consist of the identification of a rock type at each of a set of widely separated outcrops. A Type I interpolation consists of the proximal zones around each outcrop, giving an estimate of the map regions covered by each rock type. A Type II interpolation, using an area-stealing weighted average, will assign a particular rock type to a location if that type has the largest fraction of the total weights (or some other criterion). Either approach may be used, depending on the desired effect.

Polygon Maps

If the points and line segments are used to form a polygonal map and the associated Voronoi diagram, as in Figure 4, it is then necessary to assign the polygon name to each point and half-line, since each side of a line segment is given the "colour" of the associated polygon. This is equivalent to the elevation value used previously as the attribute associated with each map object. A Type I query would then return the name of the nearest object, which will be one of the edges or vertices forming the polygon. The associated value is therefore the desired polygon name. A Type II query will achieve the same end - at greater effort. The areas stolen by the query point will all be taken from edges and vertices forming the same polygon, and so no matter what the weighting, the resulting interpolated value will be the polygon name again! This shows that the point-in-polygon problem is indeed a form of interpolation - but interpolation of a surface with cliffs in it at all polygon boundaries. This idea is directly extendable to a continuous surface with cliffs (as in a geological map with faults): each side of the line may have a different associated attribute or function, and within any individual region not separated by line segments regular smooth interpolation would apply.
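Both the rock-type example and the polygon query above reduce to the same categorical decision rule: sum the weights by class and take the class with the largest total. A minimal sketch, again reusing the stolen_area_weights function from the earlier sketch, with purely illustrative outcrop data:

import numpy as np

def classify_type_ii(points, classes, query):
    """Assign the class carrying the largest total stolen-area weight."""
    w = stolen_area_weights(points, np.asarray(query, dtype=float))
    totals = {}
    for cls, weight in zip(classes, w):
        totals[cls] = totals.get(cls, 0.0) + float(weight)
    return max(totals, key=totals.get)

# Illustrative outcrop observations and their rock types.
outcrops = np.array([[0.5, 0.5], [1.5, 3.0], [3.5, 1.0], [4.0, 4.0]])
rock_types = ["granite", "shale", "granite", "basalt"]
print(classify_type_ii(outcrops, rock_types, [2.0, 2.0]))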
Another form of polygon map problem occurs when a particular attribute value is associated with each polygon (e.g. census data). It is therefore assumed that the whole polygon has the same value. This is equivalent to the previous case, where a single attribute value is assigned to all the points or edges forming the polygon. Any Type I or II interpolation will return the polygon attribute value. If, however, the attribute value for a particular polygon is unavailable, how may it be estimated from its neighbours? One solution to this problem (Tobler and Kennedy, 1986) weights the neighbouring polygons by their respective boundary lengths, but one highly convoluted boundary would invalidate this method. An alternative, using the Voronoi approach, is to imagine the polygon itself deleted, leaving a hole: the neighbouring polygons would then have Voronoi regions within this unoccupied hole (see Figure 5). (These regions are readily determined using the usual Voronoi algorithm.) Replacing the polygon would therefore "steal" these Voronoi regions, and the weighted average could proceed as usual. Thus the same technique is appropriate as for the other interpolation problems, but here the "map objects" are the remaining polygons themselves.

CONCLUSIONS

"Interpolation" is a more general procedure than is generally acknowledged, and is closely associated with the idea of a "spatial query" at some desired location. If the Voronoi concept is adopted, then a "Proximal Query Space" may be established, defining proximal zones about each map object. These map objects may be of any form, but only those constructed of points and line segments have been implemented. Within this query space two query types, I and II, have been defined: the first returns the nearest object only, the second returns a set of neighbouring objects and their weights, based on "area-stealing". (These can be referred to as "non-invasive" and "invasive" respectively.) Either, but Type II in particular, may be considered the basis for interpolation. Various applications have been suggested for Voronoi-based interpolation, indicating the wide range of possibilities, the generality of the method and the ability to mix data types and queries that were previously handled by widely disparate techniques. For smooth interpolation of continuous data the methods described appear to have significant advantages. The methods described, besides being applicable in principle to many types of map object, may be applied in various metrics and dimensions. In particular, three-dimensional point interpolation, based on the same principles as the two-dimensional case (but stealing volumes instead of areas), appears to be of considerable interest.
ACKNOWLEDGEMENTS

The funding for this research was made possible in part by an Energy, Mines and Resources Canada research agreement, and also by the foundation of an Industrial Research Chair in Geomatics at Laval University, jointly funded by the Natural Sciences and Engineering Research Council of Canada and the Association de l'Industrie Forestière du Québec.

REFERENCES

Gold, C.M., 1984. Common sense automated contouring - some generalizations. Cartographica, v. 21, pp. 121-129.
Gold, C.M., 1989. Surface interpolation, spatial adjacency and G.I.S. IN: Three Dimensional Applications in Geographical Information Systems, ed. J. Raper. London: Taylor and Francis Ltd., pp. 21-35.

Gold, C.M., 1990. Spatial data structures - the extension from one to two dimensions. IN: NATO Advanced Research Workshop, Mapping and spatial modelling for navigation, ed. L.F. Pau, Fano, Denmark, August 1989.

Gold, C.M., 1992. Dynamic spatial data structures - the Voronoi approach. Proceedings, The Canadian Conference on GIS - 1992.

Sibson, R., 1982. A brief description of natural neighbour interpolation. IN: Interpreting Multivariate Data, ed. V. Barnett. London: Wiley, pp. 21-26.

Tobler, W.R. and S. Kennedy, 1986. Smooth multidimensional interpolation. Geographical Analysis, v. 17, pp. 251-257.

Watson, D.F. and G.M. Philip, 1987. Neighbourhood-based interpolation. Geobyte, v. 2, no. 6, pp. 12-16.
Figure 1. a) Simple point data set, and b) resulting Voronoi diagram and Delaunay triangles.

Figure 2. a) Simple data set with Voronoi regions; b) Voronoi region after insertion of point X; c) Neighbouring areas stolen by point X.

Figure 3. Seismic data: area-stealing interpolation.