MULTIPLE REPRESENTATION SPATIAL DATABASES AND THE CONCEPT OF VUEL

Yvan Bédard, Eveline Bernier, Thierry Badard

Yvan Bédard, PhD
Department of Geomatics Sciences
Centre for Research in Geomatics
Canada NSERC Industrial Research Chair in Geospatial Databases for Decision-Support
Laval University, Quebec City, Qc, Canada G1K 7P4
Phone: 418.656.2131 (3694); Fax: 418.656.7411

Eveline Bernier, MSc
Research Assistant
Centre for Research in Geomatics
Canada NSERC Industrial Research Chair in Geospatial Databases for Decision-Support
Laval University, Quebec City, Qc, Canada G1K 7P4
Phone: 418.656.2131 (6734); Fax: 418.656.3607

Thierry Badard, PhD
Department of Geomatics Sciences
Centre for Research in Geomatics
Laval University, Quebec City, Qc, Canada G1K 7P4
Phone: 418.656.2131 (7116); Fax: 418.656.7411

INTRODUCTION

The 1970s witnessed the advent of digital geographic information. Gradually, paper maps were replaced by digital products. Mapping agencies rapidly started to build spatial databases, each intended for specific ends and unrelated to the others. Such databases multiplied among and within mapping agencies, which made their management more complex and cumbersome: the same real-world features were created over and over again, or the same geometric primitives were duplicated to represent different features. To facilitate the production, update and management of spatial data, it was proposed at the end of the 1980s to explicitly link these existing databases and to build multiple representation databases (MRDB) (Buttenfield & Delotto, 1989). This kind of spatial database typically stores several geometric representations of the same real-world feature. Different approaches and structures have been proposed to build such MRDB and, typically, they address solely the geometric aspect of multiple representation, as this is the most common requirement. However, in order to enrich the theoretical corpus related to multiple representations or to better support certain categories of applications such as automatic map generalization, automatic propagation of updates and Spatial On-Line Analytical Processing (SOLAP), different authors have proposed to extend the concept of multiple representations to the semantics of the observed objects as well as to their graphical display (symbology and semiology, including visual variables such as color, pattern, weight, etc.).

Very similar concepts have been proposed for multiple semantics, i.e. supporting multiple definitions or multiple interpretations of the observed reality (e.g. Devogele, 1997; Jones et al., 1996; Martel, 1999; Rigaux & Scholl, 1995; Vangenot, 2004), and several papers have used the term without defining the concept. However, since the semantics of the observed feature, i.e. its very meaning, could change, the concept of multiple semantics raised a fundamental question: “the multiple semantics of what?”. With the proposed concepts, identifying the “what” immediately fixed the semantics of the observed object, thus eliminating the possibility of multiple semantics and leading to a faulty theoretical concept. In other words, an observed object or real-world phenomenon could not remain the “same object or phenomenon” if its very essence, i.e. its semantics, changed. This fundamental question remained unanswered until the development of the VUEL concept (“View Element”; Bédard & Bernier, 2002), which was based on the three multiplicities defined by Martel (1999): geometric multiplicity, semantic multiplicity and graphical multiplicity. The present chapter discusses issues associated with multiple representation databases and gives an overview of existing approaches and structures. It then defines the vuel concept and its underlying structure. Finally, the chapter presents future trends in this specific research area.

ISSUES RELATED TO MULTIPLE REPRESENTATION

Multiple representation is commonly defined as the creation and maintenance of different digital versions of the same geographical phenomenon within a single system (Kidner & Jones, 1994; Timpf & Devogele, 1997; Weibel & Dutton, 1999). This concept is often presented together with automatic cartographic generalization. The latter aims at deriving a simplified representation from the detailed representation of a phenomenon using specific algorithms. This is usually required when reducing a map scale. In spite of significant advances in automatic generalization over the last two decades, it remains a complex operation that cannot be carried out entirely without human intervention. Accordingly, several projects store the results of this complex process in an MRDB to avoid repeating it. However, MRDB raise specific issues, such as the ones discussed in the following paragraphs.

Multiple Representation vs. Data Redundancy

The concept of multiple representation may, at first glance, appear to be data redundancy, but this is not the case. While data redundancy involves the direct (a value is a copy of another) or indirect (a value can be derived from one or several other values) duplication of data, multiple representation involves the storage of different aspects of an object that cannot be derived automatically from a database point of view (thus, in the case of map generalization, an MRDB is appropriate when the generalization is not performed 100% automatically). In fact, the characteristics that are common to the different versions should ideally be stored only once and shared by the occurrences concerned. For example, if an object may be represented by two different shapes (e.g. a city represented by a polygon at larger scales and by a point located downtown at smaller scales, i.e. at a location that cannot be derived), there should be only one semantic occurrence (of city), associated with two geometric occurrences, i.e. the point and the polygon. A less efficient data structure would store the same semantic occurrence twice (i.e. duplicating the city and its attributes), each copy with its specific geometry. Typically, the most efficient MRDB are those that have been designed right from the start with multiple representation in mind. The implemented data structure can then provide strong mechanisms to efficiently support the required relationships and to avoid uncontrolled redundancy, which can lead to inconsistent states and conflicts.
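The city example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `usage` labels and the coordinates are hypothetical and do not come from any of the cited systems. It shows one semantic occurrence shared by two geometric occurrences, so that attributes are stored once.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GeometricOccurrence:
    kind: str      # e.g. "polygon" or "point"
    coords: tuple  # illustrative coordinates only
    usage: str     # hypothetical label: "large_scale" or "small_scale"


@dataclass
class SemanticOccurrence:
    name: str
    attributes: dict
    geometries: list = field(default_factory=list)

    def geometry_for(self, usage):
        """Return the geometry registered for a usage label, or None."""
        for geometry in self.geometries:
            if geometry.usage == usage:
                return geometry
        return None


# One semantic occurrence (the city), stored once...
city = SemanticOccurrence("Sample City", {"category": "city"})

# ...associated with two geometric occurrences.
city.geometries.append(
    GeometricOccurrence("polygon", ((0, 0), (4, 0), (4, 3), (0, 3)), "large_scale")
)
city.geometries.append(GeometricOccurrence("point", ((1.5, 1.0),), "small_scale"))

# An attribute update touches a single record, whichever geometry is displayed.
city.attributes["category"] = "regional capital"
```

With the less efficient structure described above, the same attribute update would have to be applied to two copies of the city, which is where inconsistencies can creep in.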

Data Volume

Database volume may increase significantly when dealing with multiple representation. In order to limit this increase, the concept of geometric patterns has been proposed (Cardenas, 2004; Sabo et al., 2005). Geometric patterns are defined as shapes that are common to several cartographic objects but that can be adapted as needed. This approach relies on the fact that, at certain scales, several cartographic objects share geometric similarities. For example, several buildings, at small scales, are displayed using an L shape. Thus, storing only the pattern ID and transformation parameters of an object requires less space than storing all the similar geometries of this object for different scales. Furthermore, as a pattern can be represented by a short string of digits (e.g. 7 digits for the building pattern), it becomes possible to embed very simple generalization algorithms applicable to this string. This approach is at the centre of the Self-Generalizing Object (SGO) concept (Sabo et al., 2005), a highly efficient on-the-fly map generalization solution using a pattern-based MRDB with very little increase in data volume. Another way to minimize data volume is to combine, in a more traditional way, an MRDB with generalization processes that can be automated 100% and to build intermediate levels of detail (LoDs). Cecconi (2003) proposes to use a multi-scale database (MSDB) with a minimum of two LoDs that then form the basis for the generalization processes. Depending on the users’ specifications, object classes are selected from the LoD that is closest to the final result and are then refined using automatic generalization processes.
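To make the storage argument behind geometric patterns concrete, the following Python sketch rebuilds building outlines from a shared pattern and a few parameters. It is an assumption-laden illustration, not the SGO encoding: the `PATTERNS` library, the `instantiate` function and the choice of translation, rotation and scale as parameters are all hypothetical, and the digit-string representation mentioned above is not reproduced.

```python
import math

# Hypothetical pattern library: each ID maps to a small reference outline.
PATTERNS = {
    "L": [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)],
    "RECT": [(0, 0), (2, 0), (2, 1), (0, 1)],
}


def instantiate(pattern_id, tx, ty, angle_deg=0.0, scale=1.0):
    """Rebuild an object's outline from a shared pattern and its parameters."""
    angle = math.radians(angle_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    outline = []
    for x, y in PATTERNS[pattern_id]:
        xs, ys = x * scale, y * scale  # scale first, then rotate and translate
        outline.append((tx + xs * cos_a - ys * sin_a,
                        ty + xs * sin_a + ys * cos_a))
    return outline


# Each building stores only a pattern ID and four numbers, not a full geometry.
buildings = [
    ("L", 100.0, 200.0, 0.0, 5.0),
    ("L", 340.0, 120.0, 90.0, 4.0),
]
outlines = [instantiate(*building) for building in buildings]
```

The saving grows with the number of objects that share a pattern, since the vertex list is stored once in the library rather than once per object and per scale.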

Update Process

Update processes must be revisited for MRDB, as updating is one of their most obvious problems. In an MRDB, an update may affect every version of an object, but does not necessarily do so. Consequently, it is important to manage MR relations efficiently in order to propagate the required updates automatically; otherwise, inconsistencies may appear. Several researchers have tackled this update process for MRDB. For instance, Kilpeläinen (1994, 1995) proposed the concept of incremental generalization, where updates are propagated using generalization processes only for the affected representations. This approach suits only the specific case of MRDB where the generalization processes between all representations are explicitly defined. To overcome this limitation, Badard (1999, 2000) defined a generic mechanism dedicated to the consistent propagation of updates in MRDB. Based on geographic data matching tools, it supports the updating of geospatial databases from the retrieval of changes in reference data sets, through the integration of the detected changes, to the propagation of their effects in other databases, whether derived or not. Consistency between the different representations is controlled at each step of the process, so that the resulting MRDB remains consistent.
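The limitation discussed above can be illustrated with a small Python sketch. It is not the mechanism of either cited author: the function names, the `links` structure and the toy generalization rule are hypothetical, and the links are assumed to be acyclic. An update is pushed along MR links wherever an explicit generalization process exists, and representations without one are only flagged for other handling.

```python
def propagate_update(representations, links, source_id, new_geometry):
    """Push an update from one representation to those derived from it.

    links maps a source ID to a list of (target_id, generalize) pairs, where
    generalize is a function, or None when no explicit process is defined.
    Links are assumed to be acyclic. Returns (updated_ids, flagged_ids).
    """
    representations[source_id] = new_geometry
    updated, flagged = [source_id], []
    queue = [source_id]
    while queue:
        current = queue.pop(0)
        for target_id, generalize in links.get(current, []):
            if generalize is None:
                # No explicit process: leave the data alone and report it.
                flagged.append(target_id)
                continue
            representations[target_id] = generalize(representations[current])
            updated.append(target_id)
            queue.append(target_id)
    return updated, flagged


def every_other_vertex(line):
    """Toy generalization: keep every second vertex plus the last one."""
    kept = line[::2]
    return kept if kept[-1] == line[-1] else kept + [line[-1]]


reps = {
    "road_10k": [(0, 0), (5, 1)],
    "road_50k": [(0, 0), (5, 1)],
    "road_250k": [(0, 0), (5, 1)],
}
links = {
    "road_10k": [("road_50k", every_other_vertex)],
    "road_50k": [("road_250k", None)],
}
updated, flagged = propagate_update(
    reps, links, "road_10k",
    [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0), (5, 1)],
)
```

Here `road_50k` is regenerated automatically, while `road_250k` ends up in `flagged`: this is the situation where a matching-based mechanism or an operator has to take over.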

Populating an MRDB

An MRDB can be populated in two different ways: 1) by generalizing detailed data and storing the results, or 2) by integrating existing data from different maps at different scales. Although the first method uses generalization algorithms to create simplified data from more detailed data, this process is rarely fully automated. Nevertheless, the links between the different representations of an object can usually be created automatically during a computer-assisted generalization process. The second approach aims at integrating data from different spatial data sources. Depending on these sources, this may lead to a richer MRDB, as it may involve different types of multiplicities (geometric, semantic and graphic). Moreover, this approach is well suited to multiple representations at the same level of detail (LoD) for different purposes. However, the limitations stated by Badard (1999) remain valid today: creating the links between the different representations of an object with geographic data matching or conflation processes cannot be fully automated and remains a research area.

EXISTING APPROACHES AND STRUCTURES

Amongst the earliest research projects on multiple representation was an initiative by the National Center for Geographic Information and Analysis (NCGIA) back in 1988 (Buttenfield & Delotto, 1989). Since these early days, several researchers have proposed ways of storing MR in spatial databases. Several projects were influenced by the hierarchical nature of scale transitions (Vangenot, 2004), leading to the development of hierarchical structures to support multiple representations in spatial databases (Devogele, 1997; Francalanci & Pernici, 1994; Jones, 1991; Kidner & Jones, 1994; Timpf & Frank, 1995; Timpf, 1998). Other researchers proposed a special type of hierarchical structure called a tree structure (van Oosterom, 1989; van Oosterom & Schenkelaars, 1993, 1996). Its underlying idea is to combine a spatial index with a hierarchical structure that differentiates the representations of an object according to the map scale. Another approach was presented by Vangenot (2004). It is a stamping approach that defines the geometry as a spatial attribute with n values and provides access to the appropriate value according to the level of detail. This stamping can also be applied to attributes and relationships. Other similar methods have been proposed, one of which embraces the MR paradigm in a unique way: the VUEL.
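The stamping idea described above, an attribute holding n values with the appropriate one selected by level of detail, can be sketched as follows. This is a minimal Python illustration, not the syntax of the conceptual model cited above: the `StampedAttribute` class and the use of scale denominators as stamps are assumptions made for the example.

```python
class StampedAttribute:
    """An attribute holding several values, each valid for a range of scales."""

    def __init__(self):
        # List of (min_denominator, max_denominator, value) stamps.
        self._values = []

    def add(self, min_denominator, max_denominator, value):
        self._values.append((min_denominator, max_denominator, value))

    def value_at(self, scale_denominator):
        """Return the value stamped for this scale, or None if none applies."""
        for low, high, value in self._values:
            if low <= scale_denominator < high:
                return value
        return None


# The geometry of a city is one attribute with two stamped values.
geometry = StampedAttribute()
geometry.add(0, 50_000, ("polygon", ((0, 0), (4, 0), (4, 3), (0, 3))))
geometry.add(50_000, 1_000_000, ("point", ((1.5, 1.0),)))

detailed = geometry.value_at(20_000)      # the polygon value
overview = geometry.value_at(250_000)     # the point value
too_small = geometry.value_at(2_000_000)  # no stamp covers this scale
```

Because the stamp is attached to the value rather than to the object, the same class could in principle be reused for thematic attributes or relationships, as the text notes.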

THE VUEL CONCEPT AND STRUCTURE

Stemming from theoretical concepts in the field of spatial data cubes and SOLAP (Spatial On-Line Analytical Processing), the vuel was developed by Bédard and Bernier (2002) to support the three types of multiple representation that are inherent to spatial data, namely geometric, semantic and graphic. Fundamentally, a vuel is a View Element and is defined as the primitive of an object-based graphical view (e.g. map, chart, table), in the same way that a pixel (picture element) is defined as the primitive of a digital image. More precisely, a vuel is a unique combination of a geometry (e.g. polygon A), a graphical portrayal (e.g. red plain pattern with black border) and a semantics (e.g. fire station). Any change in one of these three vuel components creates a new occurrence of vuel. Figure 1 shows the high-level model of this concept.

[Figure 1: diagram relating the SEMANTIC, GEOMETRY and GRAPHIC components to the VUEL and the resulting VIEW]

Figure 1. The three VUEL components and the resulting view

Accordingly, vuels are used to represent and define the phenomena observed in reality. A number n of representations of an observed phenomenon leads to n vuels, independently of the nature of the multiplicity behind this multiple representation: geometric, semantic or graphic. Thus, supporting two geometries for the same phenomenon results in two occurrences of vuels with relationships to the same semantics and graphics, but with relationships to different geometric primitives. Figure 2 shows a real-world phenomenon represented and defined by three different vuels. The first two vuels have the same semantics but differ in their geometric and graphic aspects, while the first and third vuels have the same geometry but different semantic and graphic properties. In fact, all the possible multiplicities are supported by managing these combinations at the occurrence level. Such flexibility offers new possibilities to emerging technologies such as Spatial On-Line Analytical Processing (Rivest et al., 2005) and occurrence-based Map-on-Demand (Bernier, Bédard & Hubert, 2005).

[Figure 2: diagram of one REALITY represented by VUEL 1 (HOUSE, VIEW A), VUEL 2 (HOUSE, VIEW B) and VUEL 3 (BUILDING, VIEW C)]

Figure 2. The same reality defined by three different vuels (Bédard & Bernier, 2002)

From that concept, we have developed an MRDB structure. The design of this structure was inspired by today’s multidimensional structures (Inmon, 2002), such as those used in SOLAP. The vuel is the central table (i.e. the fact table) linked to three other tables (i.e. the dimension tables): geometric, semantic and graphic. Each vuel (i.e. a fact) is thus a unique combination of one instance from each dimension table (figure 3).

Figure 3. The Vuel conceptual model with the hierarchical relationships for multi-granularity databases
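The fact-table arrangement described above can be approximated with a small star schema. The sketch below uses Python's standard `sqlite3` module; the table and column names, the `phenomenon` key used to group vuels, and the sample rows (which mimic the three vuels of figure 2) are assumptions made for illustration and are not the authors' actual schema. The hierarchical relationships mentioned in the figure 3 caption are not modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE geometry (id INTEGER PRIMARY KEY, wkt TEXT NOT NULL);
CREATE TABLE semantic (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE graphic  (id INTEGER PRIMARY KEY, style TEXT NOT NULL);
-- Fact table: one row per vuel, i.e. per unique combination of dimensions.
CREATE TABLE vuel (
    id          INTEGER PRIMARY KEY,
    phenomenon  TEXT NOT NULL,  -- hypothetical key for the observed reality
    geometry_id INTEGER NOT NULL REFERENCES geometry(id),
    semantic_id INTEGER NOT NULL REFERENCES semantic(id),
    graphic_id  INTEGER NOT NULL REFERENCES graphic(id),
    UNIQUE (geometry_id, semantic_id, graphic_id)
);
""")
conn.executemany("INSERT INTO geometry VALUES (?, ?)", [
    (1, "POLYGON((0 0, 4 0, 4 3, 0 3, 0 0))"),
    (2, "POINT(2 1.5)"),
])
conn.executemany("INSERT INTO semantic VALUES (?, ?)",
                 [(1, "house"), (2, "building")])
conn.executemany("INSERT INTO graphic VALUES (?, ?)", [
    (1, "red fill, black border"), (2, "black dot"), (3, "grey fill"),
])
conn.executemany("INSERT INTO vuel VALUES (?, ?, ?, ?, ?)", [
    (1, "P1", 1, 1, 1),  # vuel 1: house, polygon
    (2, "P1", 2, 1, 2),  # vuel 2: same semantics, other geometry and graphic
    (3, "P1", 1, 2, 3),  # vuel 3: same geometry as vuel 1, other semantics
])

# All vuels that represent the same observed phenomenon.
rows = conn.execute("""
    SELECT v.id, s.label, g.wkt, gr.style
    FROM vuel AS v
    JOIN geometry AS g  ON g.id  = v.geometry_id
    JOIN semantic AS s  ON s.id  = v.semantic_id
    JOIN graphic  AS gr ON gr.id = v.graphic_id
    WHERE v.phenomenon = ?
    ORDER BY v.id
""", ("P1",)).fetchall()
```

The `UNIQUE` constraint is one way to express that a vuel is a unique combination of its three components: inserting a second row with the same geometry, semantics and graphic is rejected with `sqlite3.IntegrityError`.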

FUTURE TRENDS

Populating and updating an MRDB remain a challenge. Today’s methods rely on generalization processes or on data matching and conflation processes, which will likely improve as research continues. Among the expected improvements, approaches based on geospatial web Service Oriented Architectures seem very promising. By defining reusable, interoperable and real-time geospatial data processing web services (data matching, generalization, updating, etc.), it will become possible to implement distributed MRDB over the Internet by federating heterogeneous online data sources (Web Feature Services or other sources). In the meantime, a new way of populating an MRDB is emerging: producing the different geometries and semantics of the same observed phenomenon during the initial data acquisition process. For example, a first project aims at extending the 2D data digitization process to insert SGO and geometric patterns (Sabo et al., 2005). A second project aims at developing an enriched photogrammetric method and tool for multi-scale 3D data acquisition (Frédéricque et al., 2005). Improvements to existing hierarchy-based algorithms and structures can also be expected to contribute to better MRDB. Finally, the demand for faster and more customizable web and mobile mapping applications requiring occurrence-based flexibility is likely to increase the need for MRDB (as a substitute for the map-replacement approach of today’s web mapping solutions). Such a capability has recently been investigated with an on-demand web mapping application running on top of a Vuel MRDB. This application, called UMapIt, can display information at different LoDs (geometrically or semantically) for a complete class of objects at once or for a single occurrence, in a completely interactive manner (figure 4) (Bernier et al., 2005).

Figure 4. UMapIt, an on-demand web mapping application based on a Vuel MRDB

Future developments of UMapIt deal with the coupling of the present application with web services in order to provide users with augmented and innovative MR web and mobile mapping functionalities.

CONCLUSION

Multiple representation databases can store different representations and definitions of the same geographical phenomenon observed in reality. Such databases arose from the needs of mapping agencies to produce generalized maps more efficiently, to integrate independent and heterogeneous data in a single database in order to facilitate data management and update processes, and to meet the challenges of emerging applications such as map-on-demand and SOLAP. Several solutions have been proposed, each with its own strengths for given contexts and datasets. A unique solution, the vuel, has provided answers to some fundamental questions and offers new capabilities. The biggest challenge for MRDB remains their efficient population and updating, but solutions are being developed.

REFERENCES

Badard, T. (1999). On the automatic retrieval of updates in geographic databases based on geographic data matching tools. In: Proceedings of the 19th International Cartographic Conference (Ottawa’99), ICA/ACI (Eds.), Ottawa, Canada, August 14-21, pp. 47-56.
Badard, T. & Lemarié, C. (2000). Propagating updates between geographic databases with different scales. Chapter 10 of Innovations in GIS VII: GeoComputation, Atkinson, P. and Martin, D. (Eds.), Taylor and Francis, London, UK, 12 pages.
Bédard, Y. & Bernier, E. (2002). Supporting Multiple Representations with Spatial View Management and the Concept of VUEL. Joint Workshop on Multi-Scale Representations of Spatial Data, ISPRS WG IV/3, ICA Com. on Map Generalization, Ottawa, Canada, July 7-8.
Bernier, E., Bédard, Y. & Hubert, F. (2005). UMapIT: An On-Demand Web Mapping Application Based on a Multiple Representation Database. 8th ICA Workshop on Generalization and Multiple Representation, A Coruna, Spain, July 8-9.
Buttenfield, B.P. & Delotto, J.S. (Eds.) (1989). Multiple representations: Scientific Report for the Specialist Meeting. National Center for Geographic Information and Analysis (NCGIA), Technical Paper 89-3, 87 p.
Cardenas, A. (2004). Utilisation de patrons géométriques comme support à la généralisation automatique. MSc thesis, Laval University, Dept. Geomatics Sciences, Quebec, Canada, 77 pp.
Cecconi, A. (2003). Integration of Cartographic Generalization and Multi-Scale Databases for Enhanced Web Mapping. PhD thesis, University of Zurich, Zurich.
Devogele, T. (1997). Processus d'intégration et d'appariement de Bases de Données Géographiques: Application à une base de données routières multi-échelles. PhD thesis, Université de Versailles, Dept. Méthodes Informatiques, France, 207 pages.
Francalanci, C. & Pernici, B. (1994). Abstraction levels for entity-relationship schemas. In Loucopoulos, pp. 456-473.
Frédéricque, B., Daniel, S., Bédard, Y. & Paparoditis, N. (2005). Knowledge based processes management to support databases population with 3D multi-representation of buildings. International Society for Photogrammetry and Remote Sensing (ISPRS) Hannover Workshop, 17-20 May, Hannover, Germany.
Inmon, W.H. (2002). Building the Data Warehouse, 3rd Edition. John Wiley & Sons, 412 p.
Jones, C.B. (1991). Database architecture for multi-scale GIS. Proceedings Auto-Carto 10, Baltimore, ACSM/ASPRS, pp. 1-14.
Jones, C.B., Kidner, D.B., Luo, L.Q., Bundy, G.L. & Ware, J.M. (1996). Database design for a multiscale spatial information system. International Journal of Geographical Information Systems, 10(8), 901-920.
Kidner, D.B. & Jones, C.B. (1994). A deductive object-oriented GIS for handling multiple representations. Advances in GIS Research, Proceedings of the 6th International Symposium on Spatial Data Handling, Taylor & Francis, London, pp. 882-900.
Kilpeläinen, T. (1994). Updating multiple representation geodata bases by incremental generalization. Int. Arch. of Photogrammetry and Remote Sensing, Commission III/IV, Munich, Vol. 30, part 3/1, 440-447.
Kilpeläinen, T. (1995). Requirements of a multiple representation database for a topographical data with emphasis on incremental generalization. Proceedings of 17th International Cartographic Conference, Barcelona, Vol. 2, 1815-1825.
Martel, C. (1999). Développement d’un cadre théorique pour la gestion des représentations multiples dans les bases de données spatiales. MSc thesis, Univ. Laval, Dept. Geomatics Sc., Quebec, Canada, 128 pages.
Rigaux, P. & Scholl, M. (1995). Multi-scale partitions: Applications to spatial and statistical databases. 4th International Symposium on Advances in Spatial Databases, SSD'95, Portland, Maine, USA, pp. 170-183.
Rivest, S., Bédard, Y., Proulx, M.-J., Nadeau, M., Hubert, F. & Pastor, J. (2005). SOLAP: Merging Business Intelligence with Geospatial Technology for Interactive Spatio-Temporal Exploration and Analysis of Data. Journal of International Society for Photogrammetry and Remote Sensing (ISPRS), Vol. 60 (1), pp. 17-33.
Sabo, M.N., Bédard, Y., Bernier, E. & Cardenas, A. (2005). Methodology for developing a database of geometric patterns to better support on-the-fly map generalization. International Cartographic Conference, 9-16 July 2005, Coruna, Spain.
Sabo, M.N., Cardenas, A., Bédard, Y. & Bernier, E. (2005). Introduction du concept de patron géométrique et application aux bâtiments afin de faciliter leur généralisation cartographique à la volée. Geomatica, Journal of the Canadian Institute of Geomatics, Vol. 59, No. 3, pp. 295-311.
Timpf, S. & Frank, A. (1995). A multi-scale data structure for cartographic objects. Proceedings of ICC’95 (ICA, Ed.), Barcelona, Vol. 1, pp. 1389-1396.
Timpf, S. (1998). Hierarchical structures in map series. PhD thesis, Technical University of Vienna, Department of Geoinformation, http://www.geoinfo.tuwien.ac.at/publications/timpf
Timpf, S. & Devogele, T. (1997). New tools for multiple representations. ICC'97, Stockholm, Editor: Lars Ottoson, pp. 1381-1386.
van Oosterom, P. (1989). A reactive data structure for geographic information systems. Proceedings AutoCarto 9, Baltimore, pp. 665-674.
van Oosterom, P. & Schenkelaars, V. (1993). The design and implementation of a multi-scale GIS. Proceedings of EGIS’93, Italy, pp. 712-722.
van Oosterom, P. & Schenkelaars, V. (1996). Applying reactive data structure in an interactive multi-scale GIS. In: Molenaar, Martien (Ed.), Methods for the generalization of geo-databases, Netherlands Geodetic Commission, No. 43, pp. 37-56.
Vangenot, C. (2004). Multi-representation in spatial databases using the MADS conceptual model. ICA Workshop on Generalisation and Multiple Representation, 20-21 August 2004, Leicester.
Weibel, R. & Dutton, G. (1999). Generalising spatial data and dealing with multiple representations. In: Longley, Paul A., Michael F. Goodchild, David J. Maguire & David W. Rhind (Eds.), Geographical Information Systems: Principles and Technical Issues, Second Edition, Vol. 1, Wiley, pp. 125-155.

TERMS AND DEFINITIONS

Geometric Multiplicity: Occurs when two or more different geometries are stored in a database to represent the same observed phenomenon.

Graphic Multiplicity: Occurs when two or more different sets of visual variables are stored in a database to graphically portray the same observed phenomenon.

Semantic Multiplicity: Occurs when two or more different meanings are stored in a database to define the same observed phenomenon.

Multiple Representation Database: A spatial database that includes and supports, at least, geometric multiplicities but that can also include semantic and graphic multiplicities.

Vuel (View Element): The primitive of an object-based graphical view; it is a unique combination of a geometry, a semantics and a graphical portrayal.

View: An object-based graphical view such as a map, a statistical chart, a table, a legend, etc.

ACKNOWLEDGEMENTS

We acknowledge the financial support of the Canada NSERC Industrial Research Chair in Geospatial Databases for Decision-Support and its partners (http://mdspatialdb.chair.scg.ulaval.ca/).