age and retrieval of spatial data. Several research reports have ... R.H. Güting, D. Papadias, F. Lochovsky (Eds.): SSD'99, LNCS 1651, pp. 355â359, 1999.
Oracle8i Spatial: Experiences with Extensible Databases Siva Ravada and Jayant Sharma Spatial Products Division Oracle Corporation One Oracle Drive Nashua NH-03062 {sravada,jsharma}@us.oracle.com
1
Introduction
Conventional relational databases often do not have the technology required to handle complex data like spatial data. Unlike the traditional applications of databases, spatial applications require that databases understand complex data types like points, lines, and polygons. Typically, operations on these types are complex when compared to the operations on simple types. Hence relational database systems need to be extended in several areas to facilitate the storage and retrieval of spatial data. Several research reports have described the requirements from a database system and prioritized the research needs in this area. A broad survey of spatial database requirements and an overview of research results is provided in [3,4,6,10]. Research needed to improve the performance of spatial databases in the context of object relational databases was listed in [4]. The primary research needs identified were extensible indexing and optimizer, concurrency control techniques for spatial indexing methods, development of cost models for query processing, and the development of new spatial join algorithms. Many of the system requirements identified in [4] have since been addressed in some commercial systems [1,8,9]. In this context, we describe our experiences in implementing a spatial database on top of Oracle’s extensible architecture. 1.1
Requirements of a Spatial Database System
Any database system that attempts to deal with spatial applications has to provide the following features: – A set of spatial data types to represent the primitive spatial data types (point, line, area), complex spatial data types (polygons with holes, collections) and operations on these data types like intersection, distance, etc. – The spatial types and operations on top of them should be part of the standard query language that is used to access and manipulate non spatial data in the system. For example, SQL in case of relational database systems should be extended to be able to support spatial types and operations. R.H. G¨ uting, D. Papadias, F. Lochovsky (Eds.): SSD’99, LNCS 1651, pp. 355–359, 1999. c Springer-Verlag Berlin Heidelberg 1999
356
Siva Ravada and Jayant Sharma
– The systems should also provide performance enhancements like indexes to process spatial queries (range and join queries), parallel processing, etc. which are available for non spatial data.’
2
Oracle’s Spatial
Oracle8i Spatial [7] provides a completely open, standards based architecture for the management of spatial data within a database management system. Users can use the same query language (industry standard SQL) to access the spatial data and all other data in the database. The functionality provided by Oracle8i Spatial is completely integrated within the Oracle database server. Users of spatial data gain access to standard Oracle8i features, such as a flexible client/server architecture, object capabilities, and robust data management utilities, ensuring data integrity, recovery, and security features that are virtually impossible to obtain with other architectures. Oracle8i Spatial enables merging GIS (Geographic Information System) and MIS (Management Information System) data stores and implementing a unified data management architecture for all data across the enterprise. The Oracle8i Spatial provides a scalable, integrated solution for managing structured and spatial data inside the Oracle server. 2.1
Spatial Data Modeling
Oracle Spatial supports three primitive geometric types and geometries composed of collections of these types. The three primitive types are: (i) Point, (ii) Line String, (iii) and N-point polygon where all these primitive types are in 2Dimensions. A 2-D point is an element composed of two ordinates, X and Y. Line strings are composed of one or more pairs of points that define line segments. Any two points in the line segment can be connected either by a straight line or a circular arc. That means line strings can be composed of straight line segments, arc segments or a mixture of both. Polygons are composed of connected line strings that form a closed ring and the interior of the polygon is implied. A geometry is the representation of a spatial feature, modeled as a set of primitive elements. A geometry can consist of a single element or a homogeneous or heterogeneous collection of primitive types. A layer is a collection of geometries which share the same attribute set. For example, one layer in a GIS might include topographical features, while another describes population density, and a third describes network of roads and bridges in an area. 2.2
Operations on the Spatial Data Types
The binary topological relationships between two spatial objects A and B in the euclidean space is based on how the two objects A and B interact with respect to their interior, boundary and exterior. This is called the 9-intersection model [2] for the topological relationships between two objects. In this model, one can theoretically distinguish between 29 = 512 binary relationships between
Oracle8i Spatial: Experiences with Extensible Databases
357
A and B. In case of 2-dimensional objects, only eight relations can be realized which provide mutually exclusive and complete coverage for A and B. These relationships are contains, coveredby, covers, disjoint, equal, inside, overlap, touch. Oracle Spatial supports this 9-intersection [2] model for determining the topological relationships between two objects. In addition the system can also support other relationships derived as a combination of the above 8 relations. For example, OVERLAPBDYDISJOINT can be defined as the relation where the objects overlap but the boundaries are disjoint. Oracle Spatial also provides a within distance function where the distances are calculated in the Euclidean space. In addition, this system also provides set theoretical operations like UNION, INTERSECTION, DIFFERENCE and SYMMETRIC-DIFFERENCE. For example, given two spatial objects A and B, one can compute and return a new object C which is the UNION of A and B. 2.3
SQL Support for Spatial Data
Query language is the principal interface to the data stored in a relational database system. A popular commercial language used for accessing data in a RDBMS is SQL. Traditional SQL has been extended recently to be able to support access for new data types. In case of Oracle8i Spatial, SQL is extended in two ways: SQL can be used to define and create objects of spatial types. SQL can also be used to insert, delete, update spatial types in addition to being able to query the spatial data with the help of spatial functions. For example to find out all the parks in city which overlap the rivers in the city can be found using the SQL query: SELECT A.feature FROM parks A, rivers B WHERE sdo geom.relate(A.geometry, B.geometry, ‘OVERLAP’) = TRUE; 2.4
Spatial Indexing
The introduction of spatial indexing capabilities into the Oracle database engine through Oracle Spatial is a key feature. A spatial index acts much as any other index as a mechanism to limit searches within tables (or data spaces) based on spatial criteria. An index is required to be able to efficiently process queries like find objects within a data space that overlap a query area (usually defined by a query polygon) and find pairs of objects from within two data spaces that spatially interact with one another (spatial join). A spatial index in spatial cartridge is a logical index. The entries in the index are dependent on the location of the geometries in a coordinate space, but the index values are in a different domain. Index entries take on values from a linearly ordered integer domain while coordinates for a geometries may be pairs of integer, floating, or double-precision numbers. Spatial cartridge uses a linear quadtree based indexing scheme, also known as z-ordering which maps geometric objects to a set of numbered tiles. Point data can be very well indexed by a recursive decomposition of space. Spatial object with extent, such as area
358
Siva Ravada and Jayant Sharma
or line features create a problem for this sort of index, because they are highly likely to cross index cell partition boundaries. Alternative indexing mechanism, such as R-trees, have been proposed based on overlapping index cells (a nonhierarchical decomposition). Oracle Spatial chooses to take another approach to the problem. Each item is allowed multiple entries in the index. This allows one to index features with extent by covering them with the decomposition tiles from a hierarchical decomposition. Extensible Indexing in Oracle With Oracle’s extensible indexing framework, applications can defines the structure and access methods for the application specific data. (This is called the domain index in Oracle.) The application can store the index data either inside the Oracle database (e.g. in the form tables) or outside the Oracle database (in the form of files). And the application can define routines that manage and manipulate the index to evaluate SQL queries. In effect, the application controls the structure and semantic content of the domain index. The database system interacts with the application to build, maintain, and employ the domain index. The main advantage of this extensible indexing framework is that the index is always in sync with the data. That is once the index is build, all the updates on the base table will automatically result in updates in the index data. Thus the users are not required to worry about the data integrity and correctness issues. Once the domain index is built, it is treated like a regular B-tree index. The database server knows the existence of this domain index and thus manages all the index related work using user defined functions. The extensible indexing framework also provides hooks into the optimizer to let the domain index creator educate the optimizer about the cost functions and selectivity functions associated with the domain index. The optimizer can then generate execution plans that make educated choices regarding domain indexes. Oracle Spatial built an indexing mechanism using this extensible indexing framework which is completely integrated with the database system. This also provides full concurrency control that is available to non spatial data and b-tree indexes in the database. 2.5
Query Processing
Queries and data manipulation statements can involve application-specific operators, like the Overlaps operator in the spatial domain. Oracle’s extensible framework lets applications/users define operators and tie the operators to a domain index. This lets the optimizer choose a domain index in evaluating a user defined operator. Oracle Spatial defined operators which are very common for many of the spatial applications. The spatial queries are evaluated using the popular two-step method: a filter step and a refinement step. A spatial index is used during the filter step and the actual geometries are used in the refinement step. This two-step process is used in both the window-query case and the spatial join case. For example, Oracle Spatial provides an SDO RELATE operator which can be used compute if two geometries overlap with each other. If we want to find
Oracle8i Spatial: Experiences with Extensible Databases
359
all the roads through a county where the road intersects the county boundary, the query will look like this: SELECT a.id FROM roads A, counties B WHERE B.name = ’MIDDLESEX’ AND SDO RELATE(A.geometry, B.geometry, ’MASK=OVERLAP’) = ’TRUE’;
This query shows a simple example where a non spatial attribute and a spatial attribute is used in the same query. Assume that there is only one row in the counties table that satisfies the predicate on counties.name column. Then optimizer in this case will be able to choose a B-tree index on counties.name column and use the spatial index to evaluate the SDO RELATE operator as a window query on the roads table.
3
Conclusions
In this paper, we described our experiences in implementing a spatial database on top of Oracle’s extensible framework. We described how the query language, data modeling and query processing issues are addressed in this system. However, there is still more research required in areas like partitioning techniques to support parallel query processing and bulk loading, and spatial clustering. In addition, there is a growing need for a industry wide benchmark for measuring performance of different database systems supporting spatial databases.
References 1. S. Defazio et al. Integrating IR and RDBMS Using Cooperative Indexing. Proceedings of SIGIR, Seattle, Washington, 1995. 2. M.J. Egenhofer. What’s Special About Spatial? Database requirements for Vehicle Navigation in Geographic space. Proceedings of ACM SIGMOD, 1993. 3. R.H. Guting. An Introduction to Spatial Database Systems. VLDB, 3:357:399, 1994. 4. W. Kim, J. Garza, and A. Keskin. Spatial data management in Database Systems. pp1-13, 3rd Intl. Symposium on Advances in Spatial Databases, 1993. 5. OGC. The Open GIS Consortium. http://www.opengis.com. 6. Spatial Databases- Accomplishments and Research Needs. S. Shekhar, S. Chawla, S. Ravada, A. Fetterer, X. Liu and C.-t. Lu. IEEE Transactions on Knowledge and Data Engineering. Vol 1, Number 1, January 1999. 7. Oracle8i Spatial User’s Guide and Reference, Release 8i, 1999. 8. M. Stonebreaker and G. Kennitz. POSTGRES Next-Generation Database Management System. Communications of the ACM, 34(10):78-92, 1993. 9. M. Stonebreaker and D. Moore. Object Relational DBMSs: The next great wave. Morgan Kaufmann, 1997. 10. M. F. Worboys. GIS: A computing perspective. Taylor and Francis, 1995.