Multiresolution Query Optimization in an Online Environment

Kai Xu (1) and Xiaofang Zhou (2)

(1) IMAGEN group, National ICT Australia, Sydney, Australia, [email protected]
(2) School of ITEE, The University of Queensland, Brisbane, Australia, [email protected]

Abstract. Multiresolution (or multi-scale) techniques make it possible for Web-based GIS applications to access large datasets. The performance of such systems relies on data transmission over the network and on multiresolution query processing. In the literature the latter has received little research attention so far, and the existing methods are not capable of processing large datasets. In this paper, we aim to improve multiresolution query processing in an online environment. A cost model for such queries is proposed first, followed by three strategies for its optimization. Significant theoretical improvement can be observed when compared against available methods. The application of these strategies is also discussed, and a similar performance enhancement can be expected if they are implemented in online GIS applications.

1 Introduction

The Internet provides a revolutionary way to access geographical data. Significant efforts, such as Microsoft's TerraServer [1] and NASA's "World Wind" project [2], have been put into the development of core technologies for Web-based GIS. Geographical data is more accessible than ever before with increasing data availability, client processing power and network bandwidth. However, the large size of terrain data remains a major challenge. The size of a data repository, such as the Digital Elevation Model of America (available from the U.S. Geological Survey [3]), is measured in terabytes, and the size of accessible datasets is increasing quickly as data collection continues; the "Shuttle Radar Topography Mission" by NASA [4] is collecting data for most of the land on Earth. Applications based on such datasets require excessive resources, yet the resolution can be unnecessarily high in many cases. Figure 1(a) shows a small terrain model with only 10,000 triangles, but the top-right part already appears to be unnecessarily dense. Multiresolution (or multi-scale) techniques are introduced to address this problem [5]. The intuition is to construct a simplified terrain approximation (mesh) as a substitute for the original model, with accuracy guaranteed according to application requirements. Therefore, the resource requirement can be significantly reduced without sacrificing necessary quality. A simplified terrain with 1,000 triangles is shown in Figure 1(c).

Fig. 1. Terrain at multiple resolutions: (a) 10,000 triangles; (b) 10,000 triangles with texture; (c) 1,000 triangles; (d) 1,000 triangles with texture

For visualization purposes, the difference is hardly noticeable after texture is applied (Figure 1(b) and 1(d)). Constructing a mesh directly from the original model is expensive: it involves the data of the original model, which is usually extremely large, as well as intermediate data generated during simplification. The possible benefit of using a mesh can be outweighed by its expensive construction. In order to reduce the cost, we need to represent it as a Multiresolution Triangular Mesh (MTM), a terrain representation that supports the reconstruction of meshes at variable resolutions with acceptable overhead. The fact that terrain data can be used at any resolution, and that data of different resolutions may be used together to construct a single mesh, means that it is not feasible to pre-generate terrain data at a fixed number of resolutions. Instead, most MTM methods adopt a tree-like data structure, where each node stores a lower resolution approximation of its child nodes' data. The construction of a mesh starts from the root, which contains the least detailed data, and refines this coarse mesh progressively by replacing it with the data stored at its child nodes until the desired resolution is achieved (details are provided in Section 2). Since the low-resolution data is substantially smaller than the original terrain model, such an MTM can significantly reduce the mesh construction cost. Such a structure can also provide a large number of possible meshes, including meshes with varying resolution, because the initial coarse mesh can be refined to any available resolution and each part can have a different resolution. The ability to reduce the amount of data, and thus the resource requirement, makes it possible to perform visualization and analysis on large datasets in an online environment. Recently, a number of works have been published on visualizing large terrains over networks [6–11].

These methods focus on "compressing the MTM", i.e., using compact MTM data structures to reduce the amount of data transferred. However, none of them considers the role of the database system. There has also been significant research attention on multiresolution terrain databases [12–17]. These methods employ various multiresolution access methods to improve the performance of multiresolution queries, i.e., to efficiently identify and retrieve the data required for mesh construction. Unfortunately, these methods do not take into account any constraints imposed by the network environment. We think multiresolution query processing, which is mainly performed at the server end, is critical to the performance of online GIS applications. Such queries are quite resource intensive because they involve large amounts of data, and a large number of simultaneous queries can easily exceed server capacity. Since the available bandwidth is increasing much faster than server processing power, it is possible that query processing will replace bandwidth as the bottleneck of online GIS applications. In this paper, we aim to improve multiresolution query processing in an online environment. We start by examining existing work and propose a cost model for current multiresolution query processing. An optimal cost, which serves as a bound for possible improvement, is derived from this framework. Three optimization strategies are then proposed to address different problems identified from the performance analysis. The details of these strategies are discussed by applying them to existing methods, which also confirms the feasibility of our strategies in real applications. The remainder of this paper is organized as follows. In Section 2 we provide a brief overview of multiresolution query processing. The cost model is introduced in Section 3. In Section 4, the optimal cost is derived, followed by three optimization strategies, whose application to existing methods is also included. The paper is concluded in Section 5.

2 Multiresolution Query Processing

A multiresolution query retrieves a mesh within a given area at a required resolution, i.e., it can be specified by two parameters: the Level Of Detail (LOD) condition and the Region Of Interest (ROI) condition. For a mesh polygon t, the LOD condition returns a value e(t), which is the required LOD value for polygon t. In a mesh m, we denote the resolution of a polygon t as l(t). We say t is LOD feasible with respect to e if l(t) ≥ e(t). A LOD feasible polygon is detailed enough for the current application, and no further refinement is required. There are two types of LOD conditions: uniform LOD, where e(t) is a constant, i.e., the LOD of every polygon t is at least a constant value; and variable LOD, where e(t) is a function of polygon attribute(s). For example, a LOD condition can be a function of the distance from the viewpoint, i.e., the further a polygon is from the viewpoint, the lower the required LOD value. While the LOD condition defines mesh resolution, the ROI condition defines mesh location, size and shape. It is specified by a collection of two-dimensional spatial objects (points, lines and/or polygons).

Given a ROI condition r, a polygon t is ROI feasible if its two-dimensional projection P(t) has at least one point in common with r, i.e., r ∩ P(t) ≠ ∅. For a multiresolution query, a polygon that is not ROI feasible is irrelevant. Therefore, only the polygons that are ROI feasible are checked against the LOD condition, whereas the LOD of polygons that are not ROI feasible can be arbitrarily low. Given LOD and ROI conditions, a multiresolution query returns a mesh with a minimal number of polygons that satisfies both. Formally, given an MTM M, a multiresolution query Q(M, r, e) returns a mesh m from M that satisfies both the ROI condition r and the LOD condition e with a minimal number of polygons. A multiresolution query is a uniform-LOD query if its LOD condition is a constant, and a variable-LOD query if its LOD condition is a function of polygon attribute(s). Here we use the progressive meshes [18], one of the most popular MTMs, to explain multiresolution query processing. The structure of the progressive meshes is an unbalanced binary tree (Figure 2(a)). The fundamental step in mesh construction is the vertex split, i.e., one point is replaced by its two children (Figure 2(b)). To answer a multiresolution query, a mesh that contains only the root is created first; it is then refined by applying vertex splits progressively following the tree structure, until both the ROI and LOD conditions are met.
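As a concrete illustration of the two conditions, the following Python sketch implements the LOD and ROI feasibility tests for a single triangle. The names (Triangle, lod_feasible, roi_feasible) are ours, the ROI is simplified to an axis-aligned rectangle, and the intersection test uses bounding boxes, which is only a conservative approximation of the exact r ∩ P(t) ≠ ∅ test.

```python
# Sketch of the LOD and ROI feasibility tests from this section (names are ours).
from dataclasses import dataclass
from typing import Callable, Tuple

Point3D = Tuple[float, float, float]
Rect = Tuple[float, float, float, float]          # (xmin, ymin, xmax, ymax)

@dataclass
class Triangle:
    vertices: Tuple[Point3D, Point3D, Point3D]
    lod: float                                    # l(t): resolution of this polygon

def lod_feasible(t: Triangle, e: Callable[[Triangle], float]) -> bool:
    """t is LOD feasible with respect to e if l(t) >= e(t)."""
    return t.lod >= e(t)

def roi_feasible(t: Triangle, roi: Rect) -> bool:
    """t is ROI feasible if its 2D projection P(t) shares at least one point with the ROI
    (approximated here by a bounding-box intersection test)."""
    xs = [v[0] for v in t.vertices]
    ys = [v[1] for v in t.vertices]
    xmin, ymin, xmax, ymax = roi
    return not (max(xs) < xmin or min(xs) > xmax or max(ys) < ymin or min(ys) > ymax)

def view_dependent_lod(viewpoint: Tuple[float, float],
                       max_lod: float = 10.0,
                       falloff: float = 0.01) -> Callable[[Triangle], float]:
    """A variable-LOD condition: the required LOD decreases with distance from the viewpoint."""
    def e(t: Triangle) -> float:
        cx = sum(v[0] for v in t.vertices) / 3.0
        cy = sum(v[1] for v in t.vertices) / 3.0
        dist = ((cx - viewpoint[0]) ** 2 + (cy - viewpoint[1]) ** 2) ** 0.5
        return max(0.0, max_lod - falloff * dist)
    return e
```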

Fig. 2. Progressive meshes: (a) tree structure of progressive meshes; (b) vertex split

In the progressive meshes, each node has the following information: (ID, x, y, z, l, parent, child1, child2, wing1, wing2, MBR), where ID is the unique ID of the point, (x, y, z) are its three-dimensional coordinates, l is the LOD, parent, child1 and child2 are the IDs of its parent, left child and right child, wing1 and wing2 are the IDs of the left and right points connecting to both children, and MBR is the minimal bounding rectangle that encloses this node and all its descendants. The wing points (i.e., wing1 and wing2) are essential to form a correct triangulation after a vertex split (v4 and v7 for v9 in Figure 2(b)). The MBR is used to identify the ancestors of ROI feasible nodes.

Due to the progressive refinement process of mesh construction, all the ancestors of a point are necessary if that point appears in the final mesh. While such a point is ROI feasible, some of its ancestor nodes may not be. With the MBR, all such ancestor nodes can still be identified, because their MBRs intersect with the ROI. The fact that the necessity of every node (except the root) depends on its predecessor makes it difficult to retrieve all necessary data together. In fact, the data for every vertex split (two child nodes) has to be fetched individually. Therefore, the data fetch of a multiresolution query is composed of many retrievals, each involving a very small amount of data, which makes it intrinsically inefficient for query processing.
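The sketch below illustrates this retrieval pattern for a uniform-LOD query. The node record mirrors the tuple listed above, but the storage interface (fetch_root, fetch_children) is hypothetical; the point is that every vertex split triggers its own small retrieval.

```python
# Sketch of query processing on the progressive meshes; the storage interface is assumed.
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

MBR = Tuple[float, float, float, float]           # (xmin, ymin, xmax, ymax)

@dataclass
class PMNode:
    id: int
    x: float
    y: float
    z: float
    lod: float                                    # l: LOD value of this vertex
    parent: Optional[int]
    child1: Optional[int]
    child2: Optional[int]
    wing1: Optional[int]                          # left wing vertex id
    wing2: Optional[int]                          # right wing vertex id
    mbr: MBR                                      # encloses the node and all its descendants

def intersects(a: MBR, b: MBR) -> bool:
    return not (a[2] < b[0] or a[0] > b[2] or a[3] < b[1] or a[1] > b[3])

def construct_mesh(fetch_root: Callable[[], PMNode],
                   fetch_children: Callable[[int], Tuple[PMNode, PMNode]],
                   roi: MBR, required_lod: float) -> List[PMNode]:
    """Progressive refinement for a uniform-LOD query: starting from the root, split
    every vertex whose MBR intersects the ROI and whose LOD is still below the
    requirement.  Each split needs its own index scan and data retrieval."""
    root = fetch_root()
    mesh = {root.id: root}
    frontier = [root]
    retrievals = 1                                # fetching the root is the first retrieval
    while frontier:
        node = frontier.pop()
        if node.child1 is None or node.child2 is None:   # leaf: cannot be refined further
            continue
        if not intersects(node.mbr, roi):         # neither the node nor its descendants are ROI feasible
            continue
        if node.lod >= required_lod:              # already LOD feasible, stop refining here
            continue
        c1, c2 = fetch_children(node.id)          # one more small retrieval (one index scan, one page)
        retrievals += 1
        del mesh[node.id]                         # vertex split: the point is replaced by its children
        mesh[c1.id], mesh[c2.id] = c1, c2
        frontier.extend((c1, c2))
    print(f"data fetched in {retrievals} separate retrievals")
    return list(mesh.values())
```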

3 Cost Model for Multiresolution Query

The I/O cost of a multiresolution query Q(M, r, e) has two major components: the cost of retrieving the index and the cost of retrieving data. In other words, the total I/O cost D(Q) of a multiresolution query Q(M, r, e) is the sum of the index retrieval cost Ci and the data retrieval cost Cd:

D(Q) = Ci + Cd    (1)

The cost of index retrieval depends on two factors: the average cost of one index scan i0 and the number of index scans Ns. The value of i0 depends on the indexing method used, while the value of Ns is decided by the query processing method. The total index retrieval cost (Ci) is the product of i0 and Ns:

Ci = i0 × Ns    (2)

Ideally, the data retrieval cost (Cd) should only include the cost of retrieving data necessary for the multiresolution query. However, in many cases redundant data is also retrieved during query processing (this is further explained in the next section). Therefore, the value of Cd is the sum of the cost of retrieving necessary data Cn and unnecessary data Cu:

Cd = Cn + Cu    (3)

Combining the previous equations, we have a cost model for multiresolution queries:

D(Q) = i0 × Ns + Cn + Cu    (4)

Based on the discussion in the last section, the I/O cost of a multiresolution query on the progressive meshes is:

Dp(Q) = i0 × Ns + Cn    (5)

Here Cu = 0 because only required data is retrieved. Without any caching strategy, the number of index scans (Ns) depends on the number of retrievals (Nr), i.e., one index scan is required for each retrieval:

Ns = Nr    (6)

As mentioned, each retrieval fetches two child nodes; all data is fetched this way, except the root, which is retrieved by itself at the very beginning. Therefore, given the total number of nodes needed (Np), the number of retrievals (Nr) is:

Nr = (Np + 1) / 2    (7)

The cost of necessary data retrieval (Cn) is determined by the number of retrievals (Nr) and the number of disk page accesses in each retrieval (Nd), i.e.,

Cn = Nr × Nd × t0    (8)

where t0 is the cost of retrieving one disk page. Since there are only two nodes involved in each refinement and each node has little associated data, it is unlikely that the data for one refinement exceeds the disk-page size. If we assume the nodes for one refinement are always stored together in one disk page (which can be achieved by clustering progressive meshes nodes on disk), the number of disk page accesses for each retrieval (Nd) is one, i.e.,

Nd = 1    (9)

Combining Equations 5, 6, 7, 8 and 9, the I/O cost of a multiresolution query on the progressive meshes is:

Dp(Q) = (i0 + t0) × (Np + 1) / 2    (10)

The discussion so far is based on the progressive meshes; we now extend the cost model to more general multiresolution queries, where a node can have more than two children. If each internal node has F child nodes, the number of retrievals (Nr) is now:

Nr = (Np − 1) / F + 1    (11)

Therefore, the total cost of index scans Ci is:

Ci = i0 × Ns = i0 × Nr = i0 × [(Np − 1) / F + 1]    (12)

Similarly, the cost of necessary data retrieval Cn is:

Cn = Nr × Nd × t0 = [(Np − 1) / F + 1] × t0    (13)

Combining Equations 12 and 13, the total cost D(Q) is:

D(Q) = Ci + Cn = (i0 + t0) × [(Np − 1) / F + 1]    (14)

The cost model for a general multiresolution query (Equation 14) has a structure similar to that for the progressive meshes (Equation 10):

1. The total cost is proportional to the number of required nodes (Np), since the values of the other variables (F, i0 and t0) are decided by either the MTM (F) or the database system (i0 and t0).
2. The cost of index scans accounts for a large portion of the total cost. From Equation 14 we can see that the number of index scans is the same as the number of disk pages retrieved (both are (Np − 1) / F + 1), but the cost of the former (i0) is much larger than that of the latter, because the indexing structure is several orders of magnitude larger than a disk page.

These observations lead to our strategies for multiresolution query optimization.
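For a rough numerical feel of the model, the sketch below implements Equations 10 and 14 directly; the parameter values are arbitrary assumptions chosen only to show how the cost scales with Np.

```python
# Illustrative evaluation of the cost model (Equations 10 and 14); parameter values are assumed.

def cost_progressive_meshes(n_p: int, i0: float, t0: float) -> float:
    """Equation 10: Dp(Q) = (i0 + t0) * (Np + 1) / 2."""
    return (i0 + t0) * (n_p + 1) / 2

def cost_general_mtm(n_p: int, fanout: int, i0: float, t0: float) -> float:
    """Equation 14: D(Q) = (i0 + t0) * [(Np - 1) / F + 1]."""
    return (i0 + t0) * ((n_p - 1) / fanout + 1)

if __name__ == "__main__":
    i0, t0 = 50.0, 1.0            # an index scan is far more expensive than one page read
    n_p = 100_000                 # number of nodes needed by the query
    print(cost_progressive_meshes(n_p, i0, t0))   # binary MTM, F = 2
    print(cost_general_mtm(n_p, 4, i0, t0))       # e.g. a quadtree-like MTM, F = 4
```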

4 Multiresolution Query Optimization

There are several possible approaches to reduce the cost of multiresolution query processing. For simplicity, we use the cost model for the progressive meshes instead of the general one, since they share a similar structure. From Equation 10, we know that the costs of index scans and data retrieval are i0 × (Np + 1) / 2 and t0 × (Np + 1) / 2 respectively. We think neither of them is optimal. In the ideal case, one index scan should identify all necessary data, i.e.,

min(Ci) = i0    (15)

Regarding the data retrieval cost, the optimal case is that all required data is stored contiguously on the disk, i.e., it can be as small as:

min(Cd) = ⌈Np / B⌉ × t0    (16)

where B is the number of points each disk page can hold. Therefore, the optimal I/O cost of a multiresolution query is:

min(D(Q)) = min(Ci) + min(Cd) = i0 + ⌈Np / B⌉ × t0    (17)

Comparing the minimal cost with the cost of the current processing method (Equation 10), we can see that there is a significant gap between the two, in other words, considerable potential for improvement.
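The following sketch puts numbers to this gap by evaluating Equations 10 and 17 side by side; the values of i0, t0 and B are illustrative assumptions, not measurements.

```python
# Side-by-side evaluation of the current cost (Equation 10) and the optimal bound (Equation 17).
import math

def current_cost(n_p: int, i0: float, t0: float) -> float:
    return (i0 + t0) * (n_p + 1) / 2              # Equation 10

def optimal_cost(n_p: int, b: int, i0: float, t0: float) -> float:
    return i0 + math.ceil(n_p / b) * t0           # Equation 17

if __name__ == "__main__":
    i0, t0, b = 50.0, 1.0, 100    # assumed index scan cost, page read cost, points per page
    for n_p in (1_000, 100_000):
        c, o = current_cost(n_p, i0, t0), optimal_cost(n_p, b, i0, t0)
        print(f"Np = {n_p}: current = {c:.0f}, optimal = {o:.0f}, gap = {c / o:.0f}x")
```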

4.1 Optimization strategy 1 - Reducing Retrieval Number

Our first strategy aims to decrease the number of index scans in order to reduce the total cost. A possible solution is to retrieve all necessary data before mesh construction, which avoids data fetches during mesh construction and thus reduces the number of index scans. Moreover, it has the potential to improve data retrieval efficiency, since all data is fetched together instead of bit by bit during construction. With this strategy, it is possible to complete data retrieval with one index scan, which is the optimal value as shown in Equation 17. However, there are a few challenges:

1. It is difficult to identify the spatial extent. Among the data needed by a multiresolution query, nodes that appear in the mesh are within the ROI, but their ancestors may not be. A naive method that encodes each node in an index as an MBR enclosing all its descendants will cause severe overlapping in the indexing structure, because the nodes in the upper part of the MTM have large MBRs covering many nodes. This can significantly degrade the search performance of the multiresolution access method. Another possible solution is to enlarge the ROI by some extent to include such ancestor nodes. However, it is difficult to estimate the extent of the enlargement, and this also introduces redundant data;

2. It is difficult to identify the LOD interval. The requirement to support variable-LOD meshes implies that the mesh LOD can be a function of spatial location. It is well known that such a functional condition cannot be handled efficiently in a database system. A naive solution is to retrieve all data at the most detailed level, which inevitably includes redundant data.

The existence of these challenges results in possible redundant data fetches (Cu), i.e., unnecessary data may also be retrieved in order to include all required data. However, the overall performance can still be better if the amount of redundant data is acceptable. The I/O cost of this strategy can be described as:

D1(Q) = i0' + Cn + Cu    (18)

where:

1. The value of i0' varies depending on the multiresolution access method used, but it should be much smaller than the previous index scan cost (i0 × (Np + 1) / 2);
2. It is difficult for the value of Cn to be as small as the optimal value (⌈Np / B⌉ × t0), because it is impossible for all necessary data to always be stored contiguously on the disk for different queries;
3. As a side effect, such a method may incur unnecessary data retrieval. In some cases this cost (Cu) is comparable to that of the necessary data.

To further illustrate our approach, we apply it to existing work. Since there is no method that exactly implements our strategy, we choose similar approaches to illustrate the idea. The secondary-storage progressive meshes [12] is one of the first attempts to employ a multiresolution access method in an MTM. It builds a two-dimensional quadtree [19] into the progressive meshes (Figure 3). The bottom-up construction starts by dividing the terrain into blocks, which serve as the leaf level of the quadtree. Every four simplified blocks are merged to form a larger block (serving as an internal node of the quadtree), which is further simplified. This process repeats until only one block is left.

Fig. 3. Secondary-storage PM

The LOD-R-tree [13] and the HDoV-tree [15] adopt a similar approach, but use an R-tree instead. For these methods, a multiresolution query is translated into a range query whose query window is the ROI. During processing, the query traverses down the index tree from the root, retrieves nodes that intersect with the ROI, and stops at the level where the resolution is sufficient.
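The sketch below captures the flavour of this traversal. The Block class and its fields are our simplification, not the actual structures of the secondary-storage progressive meshes, LOD-R-tree or HDoV-tree.

```python
# Sketch of strategy 1 in the style of the block-based methods above (structures are assumed).
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]          # (xmin, ymin, xmax, ymax)

@dataclass
class Block:
    mbr: Rect                                     # spatial extent of the block
    lod: float                                    # resolution of the simplified data stored here
    data_pages: List[int]                         # disk pages holding the block's terrain data
    children: List["Block"] = field(default_factory=list)

def intersects(a: Rect, b: Rect) -> bool:
    return not (a[2] < b[0] or a[0] > b[2] or a[3] < b[1] or a[1] > b[3])

def fetch_before_construction(root: Block, roi: Rect, required_lod: float) -> List[int]:
    """Single top-down traversal: collect every data page whose block intersects the ROI,
    descending until the block resolution reaches the requirement.  All data is retrieved
    before mesh construction starts, at the price of redundant data near the ROI boundary."""
    pages: List[int] = []
    stack = [root]
    while stack:
        block = stack.pop()
        if not intersects(block.mbr, roi):
            continue
        pages.extend(block.data_pages)            # coarse data of this block is needed anyway
        if block.lod < required_lod and block.children:
            stack.extend(block.children)          # not detailed enough yet: go down one level
    return pages
```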

The performance improvement reported in the test results of these methods confirms that reducing the retrieval number can considerably improve overall performance, even if it may introduce some redundant data. It is reasonable to expect a similar enhancement if applied in an online environment.

4.2 Optimization strategy 2 - Balancing Retrieval Number and Redundant Data

Besides the difficulty of identifying the data extent, there is another factor that contributes to redundant data fetches, the "boundary effect": the boundary of the ROI intersects with some data blocks (nodes of the index tree); these blocks need to be retrieved because they contain required data, but they also include unnecessary data outside the ROI. The boundary effect is inevitable, since the query window rarely matches the data blocks perfectly. However, its negative impact can be reduced by using smaller data blocks. Most spatial access methods use one disk page, which is the minimal possible value, as the block size to reduce redundant data. However, the block size of the previous methods that are based on MTM partitioning cannot be small: extra edges are generated at block boundaries during partitioning, and too many edges would be introduced if the block size were very small. The fact that the boundary effect happens at every level when traversing down the MTM tree worsens the problem. For this problem, we propose our second optimization strategy: balancing the retrieval number and the redundant data. This strategy retrieves most data before mesh construction and leaves the rest to be fetched during construction if this can reduce redundant data. Obvious candidates are those nodes that are outside the ROI but still necessary. Excluding such nodes also makes it possible to apply well-studied spatial access methods, which incur a much smaller boundary effect, directly to the MTM, because the MBR, which was a major hurdle, is no longer needed. A similar idea appears in a multiresolution access method, the LOD-quadtree, recently proposed by Xu [16]. An MTM is indexed in x-y-LOD space using a modified three-dimensional quadtree. A multiresolution query is translated into a three-dimensional range query that retrieves the nodes within the ROI whose resolution is between the minimal LOD (the root) and the LOD condition. The LOD-quadtree treats every node in an MTM as a point, and extra queries are needed during mesh construction to find missing nodes. The cost of this strategy can be modeled as:

D2(Q) = i0 × Ns' + Cn + Cu    (19)

Experimental results show that the number of index scans (Ns') is usually small, and the value of Cu is much less than that of the previous methods (the first strategy), thus giving even better performance in most cases. We think the reason is that this method achieves a better balance between the number of index scans and the amount of redundant data: the current query processing method, which retrieves all data during mesh construction, is one end; our first strategy, which fetches all data before mesh construction, is the opposite end; the second strategy is somewhere in between and manages to ease the problems of both. The ability to use spatial access methods also helps to reduce the boundary effect.
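A minimal sketch of this translation is shown below; the brute-force filter stands in for the actual quadtree search, and the class and function names are ours.

```python
# Sketch of strategy 2 in the spirit of the LOD-quadtree: nodes as points in (x, y, LOD) space.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class MTMPoint:
    id: int
    x: float
    y: float
    lod: float

def range_query_3d(points: List[MTMPoint],
                   roi: Tuple[float, float, float, float],
                   min_lod: float, max_lod: float) -> List[MTMPoint]:
    """Retrieve nodes inside the ROI whose LOD lies between the minimal LOD (the root)
    and the LOD condition.  Nodes outside the ROI that are still needed for a correct
    triangulation are fetched later, during construction, by a few extra queries."""
    xmin, ymin, xmax, ymax = roi
    return [p for p in points
            if xmin <= p.x <= xmax and ymin <= p.y <= ymax
            and min_lod <= p.lod <= max_lod]
```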

4.3 Optimization strategy 3 - Integrating MTM and Query Processing

The cost of the second strategy (Equation 19) is considerably better than that of the current processing method (Equation 10), but still not close to the optimal cost (Equation 17). It is rather difficult to reduce the cost further because of the two challenges inherent in the MTM (mentioned in the first strategy). Therefore, our third optimization strategy takes a different perspective and tries to integrate the MTM and query processing, i.e., to include query processing information in MTM nodes, to preclude the inherent problems. For instance, the progressive refinement of mesh construction is well known to be unsuitable for query processing; if it can be avoided entirely, the processing cost can be reduced significantly. Our strategy can be better explained with a recent work, the direct mesh [17], by Xu et al. It encodes topological data in every node of an MTM, which makes it possible to start mesh construction from the level specified by the LOD condition. This avoids fetching any ancestor nodes and substantially reduces the amount of data required, and thus the I/O cost. To avoid introducing too much topological data, the direct mesh only encodes relations among nodes with "similar" LOD, which means it cannot solve the problem completely for variable-LOD queries, which may have nodes whose LOD changes dramatically. The cost of this strategy can be modeled as:

D3(Q) = i0 + Cn + Cu    (20)

where

Cu = 0 for a uniform-LOD query    (21)
Cu > 0 for a variable-LOD query    (22)

For a uniform-LOD query, the direct mesh only retrieves the nodes that appear in the mesh, and the total cost is:

D(Q) = i0 + Cn ≈ i0 + ⌈Nm / B⌉ × t0    (23)

This is very close to the optimal value, or even better in some cases, since the number of nodes here (Nm) is much less than that in Equation 17 (Np): the ancestors are no longer needed. However, this does not mean that Equation 17 is not optimal; the reason is that the direct mesh breaks the assumption that all the ancestors of mesh nodes are also required. According to the test results of the direct mesh, this strategy significantly outperforms the others for both uniform- and variable-LOD queries. Due to the scope of this paper, we only optimize multiresolution query processing qualitatively rather than quantitatively, which would require a more detailed examination of specific indexing structures. In many cases this is essentially a range query based on spatial access methods. The cost of such queries has been well studied in the literature; interested readers may refer to [20–22] for range queries on the quadtree family and [23–25] for range queries on the R-tree family.
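To illustrate the idea behind this strategy (not the actual direct mesh encoding), the sketch below assumes each stored node also carries the IDs of its neighbours at a similar LOD, so that a uniform-LOD mesh can be assembled from a single range query without visiting any ancestors.

```python
# Sketch of strategy 3: topology stored with each node (record layout is assumed, not the direct mesh's).
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

@dataclass
class DirectNode:
    id: int
    x: float
    y: float
    z: float
    lod: float
    neighbours: List[int]                         # ids of nodes it connects to at similar LODs

def assemble_uniform_lod(nodes_at_lod: List[DirectNode]) -> List[Tuple[int, int]]:
    """Given only the nodes that appear in the final mesh (retrieved by a single range query
    on ROI and LOD), rebuild the mesh edges from the stored topology; no ancestor nodes and
    no progressive refinement are required."""
    present: Dict[int, DirectNode] = {n.id: n for n in nodes_at_lod}
    edges: Set[Tuple[int, int]] = set()
    for n in nodes_at_lod:
        for m in n.neighbours:
            if m in present:                      # keep edges whose endpoints are both in the mesh
                edges.add((min(n.id, m), max(n.id, m)))
    return sorted(edges)
```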

5 Conclusions

In this paper, we proposed a cost model for multiresolution query processing in an online environment and three strategies for its optimization. Among them:

1. Reducing the retrieval number is an effective method to reduce the total I/O cost, but it also introduces redundant data;
2. Balancing the retrieval number and redundant data seems to be a more sensible approach in many cases. The ability to utilize existing spatial access methods also helps to ease the boundary effect;
3. The limitations of the MTM, especially the progressive refinement routine of mesh construction, are a major hurdle for query performance. Integrating query processing information into the MTM is a potential solution, and it has surprisingly good performance.

We think there is no single best strategy among the three; which one to adopt should be based on the specific application. Our optimization strategies are proposed based on theoretical analysis. Their feasibility in practice is confirmed by applying them to existing methods. It is reasonable to expect that they can significantly improve performance if employed in real online GIS applications.

References

1. Barclay, T., Gray, J., Slutz, D.: Microsoft TerraServer: a spatial data warehouse. In: ACM SIGMOD International Conference on Management of Data, Dallas, Texas, United States, ACM Press (2000) 307–318
2. NASA: World Wind project (2004) http://learn.arc.nasa.gov/worldwind/.
3. U.S. Geological Survey: Earth Resources Observation Systems (EROS) Data Center (2004) http://edc.usgs.gov/.
4. Rabus, B., Eineder, M., Roth, A., Bamler, R.: The shuttle radar topography mission - a new class of digital elevation models acquired by spaceborne radar. ISPRS Journal of Photogrammetry and Remote Sensing 57 (2003) 241–262
5. Garland, M.: Multiresolution modeling: Survey and future opportunities. In: Eurographics'99, Aire-la-Ville (CH) (1999) 111–131
6. Danovaro, E., De Floriani, L., Magillo, P., Puppo, E.: Compressing multiresolution triangle meshes. In: Advances in Spatial and Temporal Databases, 7th International Symposium SSTD 2001, Springer Verlag (2001) 345–364
7. Aasgaard, R., Sevaldrud, T.: Distributed handling of level of detail surfaces with binary triangle trees. In: 8th Scandinavian Research Conference on Geographical Information Science, Norway (2001) 45–58
8. Gerstner, T.: Multiresolution visualization and compression of global topographic data. GeoInformatica, to appear (2001)
9. DeCoro, C., Pajarola, R.: XFastMesh: fast view-dependent meshing from external memory. In: Conference on Visualization, Boston, Massachusetts, IEEE Computer Society (2002) 363–370
10. Lindstrom, P.: Out-of-core construction and visualization of multiresolution surfaces. In: Symposium on Interactive 3D Graphics, Monterey, California, ACM Press (2003) 93–102

11. Wartell, Z., Kang, E., Wasilewski, T., Ribarsky, W., Faust, N.: Rendering vector data over global, multi-resolution 3D terrain. In: Symposium on Data Visualisation, Grenoble, France (2003) 213–222
12. Hoppe, H.: Smooth view-dependent level-of-detail control and its application to terrain rendering. In: IEEE Visualization '98, Research Triangle Park, NC, USA (1998) 35–42
13. Kofler, M., Gervautz, M., Gruber, M.: R-trees for organizing and visualizing 3D GIS databases. Journal of Visualization and Computer Animation (2000) 129–143
14. Shou, L., Chionh, C., Ruan, Y., Huang, Z., Tan, K.L.: Walking through a very large virtual environment in real-time. In: 27th International Conference on Very Large Data Bases, Roma, Italy (2001) 401–410
15. Shou, L., Huang, Z., Tan, K.L.: HDoV-tree: The structure, the storage, the speed. In: 19th International Conference on Data Engineering (ICDE) 2003, Bangalore, India (2003) 557–568
16. Xu, K.: Database support for multiresolution terrain visualization. In: The 14th Australasian Database Conference, ADC 2003, Adelaide, Australia, Australian Computer Society (2003) 153–160
17. Xu, K., Zhou, X., Lin, X.: Direct mesh: a multiresolution approach to terrain visualisation. In: ICDE (2004) 766–777
18. Hoppe, H.: Progressive meshes. In: 23rd International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH'96), New Orleans, LA, USA, ACM Press (1996) 99–108
19. Samet, H.: The quadtree and related hierarchical data structures. ACM Computing Surveys 16 (1984) 187–260
20. Aref, W.G., Samet, H.: Efficient window block retrieval in quadtree-based spatial databases. GeoInformatica 1 (1997) 59–91
21. Faloutsos, C., Jagadish, H., Manolopoulos, Y.: Analysis of the n-dimensional quadtree decomposition for arbitrary hyperrectangles. IEEE Transactions on Knowledge and Data Engineering 9 (1997) 373–383
22. Aboulnaga, A., Aref, W.G.: Window query processing in linear quadtrees. Distributed and Parallel Databases 10 (2001) 111–126
23. Theodoridis, Y., Sellis, T.: A model for the prediction of R-tree performance. In: 15th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, Montreal, Canada, ACM Press (1996) 161–171
24. Proietti, G., Faloutsos, C.: I/O complexity for range queries on region data stored using an R-tree. In: 15th International Conference on Data Engineering, Sydney, Australia, IEEE Computer Society (1999) 628–635
25. Jin, J., An, N., Sivasubramaniam, A.: Analyzing range queries on spatial data. In: 16th International Conference on Data Engineering, San Diego, California (2000) 525–534