Earth Sci Inform DOI 10.1007/s12145-012-0104-0
RESEARCH ARTICLE
Service-oriented approach for geospatial feature discovery
Peng Yue & Liping Di & Weiguo Han & Peisheng Zhao & Wenli Yang & Lianlian He
Received: 22 December 2011 / Accepted: 8 July 2012 © Springer-Verlag 2012
Abstract
Rapid increases in remote sensing capability have made remotely sensed images an important source for intelligence analysts to discover geospatial features. The overwhelming volume of routine image acquisition has greatly outpaced the increase in the capacity of manual image interpretation by intelligence analysts, and has prompted the development of automated methods for geospatial feature extraction from high spatial resolution images. Nevertheless, existing methods focus on automatic extraction of isolated or elementary features, such as buildings and roads. A compound geospatial feature, such as a Weapon of Mass Destruction (WMD) proliferation facility, is spatially composed of elementary features (e.g., containment buildings, cooling ponds, and fences). The spatial relations among elementary features can assist the detection of compound features from images. This paper proposes a service-oriented approach for discovering compound geospatial features. The approach includes both a chaining strategy and an architecture. The chaining strategy is to discover sites of facilities by orchestrating services that compute spatial relations among elementary features. The architecture is a service-oriented framework to support the chaining for feature discovery. The approach not only takes advantage of the spatial characteristics of complex features, but also enjoys the openness and flexibility of the Service-Oriented Architecture (SOA). A prototypical implementation is provided to illustrate the applicability of the approach.
Keywords Image mining . Geospatial services . Workflow . Service chain . Feature discovery . GIS
Communicated by: Hassan Babaie
P. Yue : L. Di (*) : W. Han : P. Zhao : W. Yang : L. He Center for Spatial Information Science and Systems (CSISS), George Mason University, 10519 Braddock Road STE 2900, Fairfax, VA 22032, USA e-mail: [email protected]
L. Di e-mail: [email protected]
P. Yue State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, 430079, Wuhan, China
Introduction Rapid increases in remote sensing capability in recent years, especially in high spatial resolution imaging, have shown great promise for identification of geospatial features and their changes over time (Di and Ramapriyan 2010). However, the overwhelming volume of routine image acquisition has greatly outpaced the increase in the capacity of intelligence analysts to interpret them. When intelligence analysts are interested in using these geospatial data, the limited accessibility of geoprocessing resources such as analysis tools or computing facilities hampers their activities. Even if they develop their own geoprocessing algorithms or workflows, those geoprocessing resources cannot be readily accessible to the public for reuse. Web service technologies can significantly reduce the data volume, computing steps, and resources at the end-user side (Di and McDonald 1999; Zhao and Di 2010; Vannan et al. 2011). Using Web service technologies, large volumes of data and powerful computing resources are available to all users, thus significantly enhancing their ability to use online/near-line data over the Web and allowing the widespread automation of data analysis and computation. Current methods in geospatial image mining and feature extraction focus on the manual or automated processing of images to detect isolated or elementary features, such as buildings and roads (Mena 2003; Sohn and Dowmana 2007; Mohammadzadeh and Zoej 2010). Classification is
often performed on a per-pixel basis, although region-based characterization has received increasing attention in recent years (Frome et al. 2007; Karantzalos and Argialas 2009). On the other hand, complex (compound) features often exist in geospatial images. There are spatial relationships (metric, topological, etc.) among the elementary features in these complex features. For example, a manufacturing facility might be composed of elementary features such as buildings, ponds, and tanks. A school facility is spatially composed of buildings, fields, and courts. Traditional image analysis approaches mainly exploit image features, such as color and texture, and, to some extent, size and shape. These image features ignore important spatial relationships (Vatsavai et al. 2010a), without which complex (compound) features that relate to facilities such as manufacturing or school facilities cannot be discovered accurately. The role of spatial relations has been investigated in annotating geospatial data using complex concepts (Klien and Lutz 2005; Klien 2007). Spatial analysis methods are associated with spatial relations in ontologies to derive annotations for datasets. An analysis algorithm is provided to combine primitive GIS operators to implement a set of spatial relations characterizing the complex concept. However, how to generate and formalize the analysis algorithm remains an open problem, and a workflow description might be useful for this task (Klien and Lutz 2005). The work in this paper extends the previous work in two ways. The first is to combine the classical simple feature extraction from remote sensing with the spatial relation approach of Klien and Lutz (2005). The second is to introduce a workflow and service approach for computing spatial relations. The paper proposes a service-oriented approach for discovering compound geospatial features. It advocates the use of spatial relations to orchestrate services as chains.
A service-oriented framework is proposed to support the chaining for feature discovery. The framework uses Web services to access elementary features and spatial computation algorithms, and composes workflow-based service chains for on-demand feature discovery. The contribution is discovery of geospatial compound features by orchestrating services that compute spatial relationships among elementary features. It takes advantage of spatial characteristics of complex features and leverages the workflow and chainable service technologies under a Service-Oriented Architecture (SOA) (Papazoglou 2003). The proposed system framework is open, interoperable, and flexible. A prototypical implementation is provided to illustrate the feasibility of the approach. The remainder of the paper is organized as follows. Section 2 introduces examples of complex features, and highlights issues to be investigated in a Web service context.
Section 3 describes the related work. Section 4 presents the strategy on orchestration of services using spatial relations, and Section 5 describes a service-oriented framework to support the service orchestration. A prototypical implementation is presented in Section 6. Section 7 discusses the approach. Conclusions and pointers to future work are given in Section 8.
Complex geospatial feature discovery in a Web service context Complex geospatial features Complex geospatial features are spatially composed of elementary ground features. Figure 1(a) shows a nuclear power plant as a complex feature. It consists of a group of ground features (e.g., buildings for hosting fuel concentration machines, cooling towers, transportation roads, and fences). Detecting such a facility from geospatial imagery is a promising approach for characterizing proliferation of Weapons of Mass Destruction (WMD) (including nuclear). Figure 1 (b) illustrates a school facility composed of buildings as classrooms, sporting fields, and tennis courts. In both cases, the elementary features are spatially arranged and connected, and are located in a specific area. In the case of the nuclear power plant, containment buildings are surrounded by a fence and are near the cooling tower and switch yard; in the case of the school facility, buildings are near both the field and court. An intelligence analyst can identify components and their spatial arrangement patterns from remotely sensed images to detect a possible complex feature manually. Detecting a facility such as a nuclear power plant or school on images can be conducted in the following steps: (1) Extracting ground features from the images; (2) Having knowledge of the necessary components and their spatial arrangement pattern for specific types of complex features; (3) Finding the spatial relationships among the features that match the specific pattern for a type of complex feature. The current state of the art in image analysis approaches mainly exploits image features, such as color, texture, and shape (Vatsavai et al. 2010a). The algorithms for extracting ground elementary geospatial features, such as buildings, fences, bridges, railways, railway stations, and airports, from satellite images and other sources are mature (e.g., Gruen et al. 
1995; Mena 2003; Sohn and Dowmana 2007; Mohammadzadeh and Zoej 2010). Therefore, step 1 is not the concern in this paper. In addition, we assume that intelligence analysts have the knowledge of the complex features. However, they need powerful software tools to find
Fig. 1 Examples of complex features
spatial relationships among simple features for detection of possible complex features in massive volumes of remotely sensed images. Such findings provide semantic content in understanding the images. The results will provide decision support to intelligence analysts for further investigation. Web services and workflow technologies can be used to develop such software tools.
Issues for complex feature discovery in a service environment
A Web Service is a software system designed to support interoperable machine-to-machine interaction over a network (Booth et al. 2004). Web Service technologies are a set of technologies for the implementation of SOA. SOA is a way of reorganizing a portfolio of previously siloed software applications and supporting infrastructure into an interconnected set of services, each accessible through standard interfaces and messaging protocols (Papazoglou 2003). There are three key actors in SOA: requestor, provider and broker. The requestor is the user who requires the information services. The provider is the standards-based individual service. The broker is a meta-information repository (e.g., a registry, catalog or clearinghouse). The interactions among these actors involve the operations of publishing, finding and binding. Service composition introduces a new operation into SOA, chaining, which combines services into a dependent series to accomplish a complex task. The work in this paper focuses on the use of spatial relations in discovering complex features. The computation of spatial relations is to be conducted in a distributed computing environment using service technologies, known for short as the service computing environment (Papazoglou 2003). In this context, the following two issues need to be addressed primarily:
• How to translate the computation of spatial relations into a sequence of geoprocessing services, which run as a service chain and can be invoked to discover features that conform to a characteristic set of spatial relations? The answer to this question includes the specification of both service components and workflows on chaining service components.
• What is the system framework to support the feature discovery in the service computing environment? In particular, the building blocks such as elementary features, geoprocessing services, and workflows should be well positioned in an architecture to support the feature discovery. They should be published, discovered and integrated together to support the feature discovery process.
The first issue requires the understanding of workflows for complex feature discovery in terms of their inputs, outputs, and service components. This will be satisfied by a chaining strategy proposed in Section 4. The second issue can be addressed by adopting the publish-find-bind paradigm of SOA, which motivates the proposed service-oriented framework for feature discovery in Section 5. There are certainly other issues that can be addressed. For example, the approach uses only spatial relations in
detecting complex geospatial features. There are cases in which semantics of complex features such as non-spatial relations (e.g. taxonomy relations) or attributes (e.g. shape) are required to get more reasonable results (Klien and Lutz 2005). The paper concentrates on workflows driven by spatial relations and on architecture design for feature discovery. Other issues can be addressed in future work.
Related work Feature extraction from remote sensing images can generate geospatial data at the feature level in a vector format and provide up-to-date geospatial data in a Spatial Data Infrastructure (SDI) at low cost and short times (Mansourian et al. 2008). Mansourian et al. (2008) present the design and implementation of a feature extraction Web service. An automatic feature extraction algorithm is provided in the Web service to allow online extraction of roads from satellite images. The extraction of elementary features such as buildings and roads has been studied extensively (Gruen et al. 1995; Baltsavias 2004; Mena 2003; Naouai et al. 2011; Michaelsen et al. 2010). High-resolution satellite images can be fused with data from other sources such as LiDAR for elementary feature extraction (Sohn and Dowmana 2007; Awrangjeb et al. 2010). The discovery of compound geospatial features is still rather understudied. Research from topographic sciences suggests an ontology-based spatial pattern recognition approach (Lüscher et al. 2009). For example, a terraced house is defined by its relations to other concepts such as yard, building, and house. Such a spatial pattern for the terraced house can be modeled using ontologies consisting of spatial entities and spatial predicates (i.e., spatial properties and relations). When spatial predicates are generated from spatial operations in a geographic information system (GIS) and exported in the knowledge base, the pattern classification process, such as Bayesian inference, can be carried out to infer instances of concepts defined in the ontology (Lüscher et al. 2009). Ontologies can be represented using Semantic Web (Berners-Lee et al. 2001) technologies. 
The Web ontology language (OWL) (W3C 2009), recommended by W3C as the standard Web ontology language, is designed to define ontologies based on a flexible graph model composed of Resource Description Framework (RDF) (Klyne and Carroll 2004) triples (subject, predicate, object). Once ontologies are expressed in RDF, the identification and extraction of complex features might depend on techniques such as subgraph isomorphism because the complex feature exists as a subgraph in the broader RDF graph from the Semantic Web
(Varanka and Jerris 2010; Varanka 2011). The work in this paper uses geoprocessing services and workflows for spatial pattern recognition, with an emphasis on distributed computing using service technologies. It provides flexibility for adjusting the implementation (i.e. orchestrating different geoprocessing services) according to the spatial characteristics of complex features. In the context of SOA, spatial analysis functions are provided as loosely-coupled Web services, and chained together as workflows to execute complex geoprocessing tasks (Kiehle et al. 2007; Brauner et al. 2009). The Open Geospatial Consortium (OGC) has defined a series of specifications for geospatial Web services, including Web Feature Service (WFS) (Vretanos 2010), Web Map Service (WMS) (de la Beaujardière 2006), Web Coverage Service (WCS) (Baumann 2010), Sensor Observation Service (SOS) (Bröring et al. 2012), Catalogue Services for Web (CSW) (Nebert et al. 2007), and Web Processing Service (WPS) (Schut 2007). Jager et al. (2005) presented a workflow framework for composing and executing Web services in the Kepler system. Some efforts propose the use of the Web Services Business Process Execution Language (WSBPEL, BPEL for short) (OASIS 2007) to support geospatial service chains (Friis-Christensen et al. 2009; Zhao et al. 2012). Service chaining, or service composition, can use approaches from either service orchestration or service choreography. Service orchestration uses a centralized control to enact an executable business process that interacts with services, while service choreography emphasizes the collaboration and interaction among services (Peltz 2003). The service orchestration approach is widely adopted in industry. This paper relies on one industry-wide service orchestration standard, BPEL for specification of service chains. 
Three types of service chaining are defined in the OGC Abstract Service architecture (Percivall 2002) depending on the level of user control: user-defined (transparent), workflow-managed (translucent), and aggregate (opaque). In transparent chaining, the human user is responsible for invoking service components and controlling the chains such as passing around processing results. Translucent chaining allows a user to define a service chain using a workflow language such as BPEL. The execution of service chains is managed by workflow engines. In opaque chaining, the human user invokes a service that carries out the chain. The user has no awareness of the individual services in the chain. This type of chaining is often addressed by automatic service composition (Yue et al. 2007). Thus, chaining geospatial services can help provide transparent, translucent, and opaque platforms for geospatial feature discovery. The work in this paper focuses on the translucent approach. The service chains are designed by intelligence analysts, represented using BPEL, and executed by a workflow engine.
Service orchestration for complex feature discovery
Complex geospatial features can be identified from the elementary features that make them up, and the spatial relationships among the member features and with the surrounding environment (surrounding ground features). These characteristics describe the spatial pattern of complex features, and thus can be used for feature discovery. Given the assumption that all elementary features (e.g., buildings, roads, ponds, and power lines) can be extracted from images, discovering complex features on images can be decomposed into a series of steps computing spatial relations among elementary ground features in a specific area. Spatial relationships can be grouped into different categories: topological relations, distance relations, and direction relations (Shariff et al. 1998). Topological relations (e.g. disjoint) refer to properties such as adjacency, intersection, connectivity, and containment among spatial objects (Arpinar et al. 2006; Ellul and Haklay 2006). They are invariant under topological transformations, such as rotation, translation, and scaling (Egenhofer 1989). Distance relations (e.g. close and far) express the geographical distance among spatial objects, and reflect the concept of metric (Shariff et al. 1998). They change under scaling but stay invariant under translation and rotation. Direction relations (e.g. east and north), also called cardinal direction relations, denote relative directions among spatial objects (Arpinar et al. 2006). They are based on the existence of a vector space,
and are subject to change under rotation while staying invariant under translation and scaling of the reference frame (Shariff et al. 1998). Spatial relationships are often expressed using natural language terms (Shariff et al. 1998; Arpinar et al. 2006). These terms can be denoted as fuzzy relationships due to the inherent vagueness and imprecision associated with natural language expressions. Fuzzy relationships can be considered as combinations of several independent concepts (Egenhofer 1989). For example, the near relation can be seen as a combination of distance (metric) and intersection (topology), which can be determined using a combination of primitive operators: buffer and overlap. The primitive operators will be implemented by geoprocessing services. We can use functional notations to represent the steps for computing spatial relations among elementary features for identification of possible complex features. Each functional notation consists of a spatial relation predicate (e.g., near) and types of input features (e.g., Building). In the case of the nuclear power plant, let near(Building, CoolingTower) denote buildings near cooling towers, near(Building, SwitchYard) denote buildings near switch yards, and surroundedBy(Building, Fence) denote buildings surrounded by fences. The spatial characteristics of nuclear power plants include a containment building surrounded by a fence, near to the cooling tower and switch yard. The functional form identifying buildings at sites of possible nuclear power plants can be
BuildingInNuclearPlant = near(near(surroundedBy(Building, Fence), CoolingTower), SwitchYard)
Similarly, in the case of a school, the functional form identifying buildings in sites of possible school facilities can be
BuildingInSchool = near(near(Building, Field), Court)
where near(Building, Field) denotes buildings near fields, and near(Building, Court) denotes buildings near courts. An important characteristic of this representation is that the sites of complex features can be identified by one type of its elementary features (e.g., building) following the specific spatial relationships with other types of elementary features. This functional representation is useful in generating service chains. If we define services for implementing each binary functional notation, then a service chain to discover possible school features, called a workflow, can be represented as follows:
Feature_BuildingInSchool = service_near(service_near(Building, Field), Court)
where Feature represents the member features of the complex features and the services are a set of spatial analysis services that select the first type of features by features from the second type. The selections are based on the specific topological spatial relationship. The nine-intersection topological relationship models (Clementini et al. 1993; Egenhofer and Herring 1990) are widely used and adopted by OGC (Herring 2006). Types of primitive relationships (operators) include equals, disjoint, intersects, touches, crosses, within, contains, and overlaps. The functional representation helps in understanding the chaining strategy for feature discovery, i.e., the final features discovered can be described solely in terms of elementary features and a set of feature selection services. The functional representation of the workflow also implies that when elementary features are generated from remote sensing
images, the feature selection services with specific spatial operators need only be defined appropriately in order to find the sites of the complex features. From the service-oriented perspective, a key development is to provide comprehensive atomic services as building blocks for service orchestration. In the context of this paper, feature selection services, which support a set of primitive relationships between any two classes of features in point, line, and polygon types, are such blocks for computing the complex spatial relations among multiple types of features. In the functional notations, some spatial predicates such as the near relation can be thought of as fuzzy relations. Taking the near relation as an example, according to Denofsky (1976), the definition of near can be based on some cutoff point. Two objects can be defined to be near each other whenever their distance is less than some threshold (Denofsky 1976). Thus, near can be a touches relationship or a distance within 100 m; the latter case requires the combination of a buffer operation and an overlap operation. Rather than attempt to represent the fuzziness of terms such as near, one alternative approach is to rely on users to unambiguously specify the meaning of a spatial relation (Robinson 1990). In service orchestration, depending on the context and semantic knowledge of intelligence analysts, selection and combination of services for implementing predicates are adjustable. Thus, empowering intelligence analysts with the orchestration capabilities is a “white box” approach for implementing spatial relation predicates in the functional notations, whereas a “black box” approach associates the predicate with a fixed implementation (Klien 2007). Such a “white box” approach shares the advantages of increased flexibility and transparency with the work by Klien (2007), whose work is in an ontology context and uses rules for adjusting the semantics of spatial predicates.
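The chaining strategy of this section can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the `Feature` class, the bounding-box geometry, and the function names `buffer`, `intersects`, `select`, and `service_near` are assumptions made for illustration, and the box-based buffer only approximates a true distance buffer. The sketch shows near implemented as a buffer followed by an overlap test with an analyst-chosen threshold, and the nesting service_near(service_near(Building, Field), Court).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Feature:
    """Toy elementary feature: a name plus an axis-aligned bounding box (metres)."""
    name: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def buffer(f: Feature, width: float) -> Feature:
    """Grow the box by `width` on every side (a crude stand-in for a buffer service)."""
    return Feature(f.name, f.xmin - width, f.ymin - width, f.xmax + width, f.ymax + width)


def intersects(a: Feature, b: Feature) -> bool:
    """True when the two boxes share at least one point (stand-in for the overlap test)."""
    return a.xmin <= b.xmax and b.xmin <= a.xmax and a.ymin <= b.ymax and b.ymin <= a.ymax


def select(first: List[Feature], second: List[Feature],
           relation: Callable[[Feature, Feature], bool]) -> List[Feature]:
    """Feature selection: keep features of the first type related to any feature of the second."""
    return [a for a in first if any(relation(a, b) for b in second)]


def service_near(first: List[Feature], second: List[Feature],
                 threshold: float = 100.0) -> List[Feature]:
    """'White box' near: buffer by an analyst-chosen threshold, then test overlap."""
    return select(first, second, lambda a, b: intersects(buffer(a, threshold), b))


# Hypothetical elementary features, as if already extracted from imagery.
buildings = [Feature("school_hall", 0, 0, 40, 20),
             Feature("warehouse", 1000, 1000, 1050, 1030)]
fields = [Feature("field", 60, 0, 160, 70)]
courts = [Feature("court", 0, 50, 30, 80)]

# Feature_BuildingInSchool = service_near(service_near(Building, Field), Court)
found = service_near(service_near(buildings, fields), courts)
print([f.name for f in found])
```

Because the threshold is a parameter rather than a fixed constant, the same chain can be rerun with a stricter or looser meaning of near, which mirrors the adjustable predicates described above.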
A service-oriented framework for feature discovery Once it is understood that feature discovery chains can be described in terms of elementary features and a set of spatial analysis services, the next step is to provide an architecture that can support the publishing, discovery, and binding of these features and services to support the feature discovery chains. This is addressed by the service-oriented framework described in this section. In this framework, Web-based distributed geospatial data and geoprocessing services such as feature selection services can be plugged in and chained for feature discovery. Such a plug-and-play system depends on the use of interoperable services. Standards from the OGC, the World Wide Web Consortium (W3C), and the
Organization for the Advancement of Structured Information Standards (OASIS) will be adopted in the system. Figure 2 illustrates the framework of the feature discovery system. It consists of the following components: (1) Decision support client: This module provides graphical user interfaces (GUI) for a series of functions, among them data access, visualization, and analysis, to support users discovering features. The project management component allows users to create, configure, and save projects on feature discovery. The client can not only provide discovery, retrieval, and integration of geospatial data from multiple sources according to users’ needs, but also find and invoke individual Web geoprocessing services. A series of geoprocessing steps, conducted by human users by invocation of each service from the client, can be considered as a service chain defined and managed by human users to support the discovery of complex features. This is a type of transparent service chaining. The workflow designer component allows users to conduct translucent chaining of geoprocessing services in an interactive user interface. The interface provides a drag-and-drop mode to design workflow-based service chains. (2) Catalogue service: In a distributed environment, geospatial resources such as elementary features, images, and services for feature discovery are cataloged in a registry/broker with their descriptive metadata. Elementary features can be discovered by either 1) performing new feature extraction from high resolution remote sensing images or 2) accessing existing feature repositories. Both existing features and the new features extracted from images can be published in the registry. The OGC CSW is an industry consensus regarding an open, standard interface to online catalogs for geographic data and services. It can be used to advertise and discover geospatial resources needed in the feature discovery. 
The ebRIM standard is defined by OASIS and selected by OGC as the information model for specifying how catalogue content is structured and interrelated (Martell 2008). The ebRIM can be extended using ISO 19115 Geographic Information - Metadata (ISO 2003) (including part 2: Extensions for imagery and gridded data) and ISO 19119 Geographic Information - Services (ISO 2005). It can also be used to develop a feature type catalogue (Stock et al. 2010). Metadata for geospatial data and services can then be structured and organized in catalogue services using the ebRIM model. (3) Feature discovery chains: Feature discovery chains are generated by workflow designers. The OASIS BPEL standard can be used to describe service chains, since
Fig. 2 A service-oriented system framework for feature discovery
there is no similar standard available yet in the geospatial domain, and the OGC has adopted BPEL for its interoperability experiments. An executable BPEL process can provide the process description for a service chain using activities, partners, and messages exchanged between these partners. It can be deployed into the BPEL execution engine and works as a Web service. A BPEL process has a corresponding service description document. Therefore, descriptions of a service chain can also be registered in CSW as a type of services (Yue et al. 2011). Once a set of spatial relations that can characterize complex features are determined, users can orchestrate feature discovery chains in workflow designers by dragging and dropping service components. (4) Geoprocessing services: Traditional spatial analysis functions are used in their own proprietary environments. They cannot support geoprocessing in an open Web environment. Web service technologies make them accessible on the Web. The Web Services Description Language (WSDL) is a standard for descriptions of services. It is recommended by W3C and widely used in the business world. In the geospatial domain, the OGC WPS specifies a standard interface and protocol for executing distributed geoprocessing processes. The OGC standards-compliant services can be aligned with the Web service standards from W3C by providing WSDL descriptions. To support feature discovery, spatial analysis services such as feature selection services and buffer services need to be provided. In addition, when only raw images are available, feature extraction services can be provided for on-demand extraction of elementary features. Other
utility services such as coordinate transformation and file format conversion can also be provided. (5) Geospatial data services: The data includes either the features that have been already extracted from images or the images that are to be mined. A geospatial data service allows geospatial services and value-added applications to access diverse data provided by
Fig. 3 Discovery of Web services from CSW
different providers in a standard way independent of their internal handling of data. The interface standards for the geospatial data services are the OGC Web Data Services Specifications such as WFS, WCS, and WMS.
The design of the framework follows the publish-find-bind paradigm of SOA. Although this paper focuses on supporting the application for feature discovery, the generality of the framework is in the use of geoprocessing services as the provider, catalogue services as the broker, and service chaining as value-added services. These rules can be applied to other SOA-based geospatial applications.
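The publish-find-bind roles in the framework can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not the CSW/ebRIM interface: the `Catalogue` and `ServiceRecord` classes, the keyword matching, and the use of plain callables in place of network endpoints are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceRecord:
    """Metadata a provider publishes; `endpoint` stands in for a network binding."""
    name: str
    keywords: List[str]
    endpoint: Callable[..., object]


class Catalogue:
    """Toy broker: holds records and answers keyword queries."""

    def __init__(self) -> None:
        self._records: Dict[str, ServiceRecord] = {}

    def publish(self, record: ServiceRecord) -> None:
        self._records[record.name] = record

    def find(self, keyword: str) -> List[ServiceRecord]:
        wanted = keyword.lower()
        return [r for r in self._records.values()
                if wanted in [k.lower() for k in r.keywords]]


def bind(record: ServiceRecord) -> Callable[..., object]:
    """Requestor side: obtain something invocable from a discovered record."""
    return record.endpoint


# Providers publish two hypothetical geoprocessing services.
catalogue = Catalogue()
catalogue.publish(ServiceRecord("BufferService", ["buffer", "geoprocessing"],
                                lambda width: f"buffer({width} m)"))
catalogue.publish(ServiceRecord("FeatureSelectionService",
                                ["feature selection", "geoprocessing"],
                                lambda op: f"select[{op}]"))

# The requestor finds a service by keyword, binds to it, and invokes it.
matches = catalogue.find("feature selection")
service = bind(matches[0])
result = service("touches")
print(result)
```

In the framework described above, the catalogue role is played by an OGC CSW and the binding is driven by WSDL descriptions; the sketch only mirrors the division of responsibilities among requestor, provider, and broker.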
Implementation The geoprocessing services are implemented by using the Apache Axis toolkit and wrapping analysis components of the legacy GIS – the Geographic Resources Analysis Support System, commonly referred to as GRASS GIS (GRASS 2011). The services implement Web request/response handlers and invoke the GRASS scripts for algorithm execution. The services are deployed in the Jakarta Tomcat server. The development framework for the GRASS
Fig. 4 Invocation of the feature selection service in GeOnAS
service has been introduced in Li et al. (2010). Here we use this framework and have added the feature selection and buffer services. The GEOS (Geometry Engine - Open Source) software (GEOS 2011) must be installed with GRASS in the Web server to enable the nine-intersection topological relationship operators in the feature selection service. The registration and discovery of simple features and geospatial services use an existing catalogue service (Wei et al. 2005). Figure 3 shows the discovery user interface. Users can input keywords (e.g. feature selection) or select one among the tree items to find the relevant services registered in the database. The GeoBrain Online Analysis System (GeOnAS) (Han et al. 2008) is used for interactive data access, visualization, and feature analysis. GeOnAS provides Asynchronous JavaScript and XML (AJAX) based GUI components to create a better Web experience for the end users. Figure 4 shows the invocation interface of the feature selection service in GeOnAS. The interface is generated automatically by parsing the WSDL description of the service. Using this GUI, it is possible to answer the following sample queries in feature discovery by using different spatial operators in the feature selection services:
Fig. 5 Service chaining for the feature discovery
• Find buildings surrounded by fences
• Find buildings near railways
• Find roads across forests
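To make the role of the spatial operator concrete, the sketch below answers two of the sample queries with simplified geometry. It is an illustration under strong assumptions and not the GRASS/GEOS-based service itself: areal features are axis-aligned boxes `(minx, miny, maxx, maxy)`, a fence is represented by the area it encloses, a road is a single straight segment, and all names (`within`, `crosses`, `select`) are hypothetical.

```python
def within(inner, outer):
    """True if box `inner` lies strictly inside box `outer`."""
    return (outer[0] < inner[0] and outer[1] < inner[1]
            and inner[2] < outer[2] and inner[3] < outer[3])


def hits_interior(seg, box):
    """True if the segment passes through the open interior of the box."""
    (x0, y0), (x1, y1) = seg
    lo, hi = 0.0, 1.0  # parameter range of the segment still inside the box
    for p, d, bmin, bmax in ((x0, x1 - x0, box[0], box[2]),
                             (y0, y1 - y0, box[1], box[3])):
        if d == 0:
            # Segment is parallel to this axis: it must lie strictly between the sides.
            if not (bmin < p < bmax):
                return False
        else:
            ta, tb = (bmin - p) / d, (bmax - p) / d
            lo, hi = max(lo, min(ta, tb)), min(hi, max(ta, tb))
    return lo < hi


def crosses(seg, box):
    """True if the segment runs through the interior and also leaves the box."""
    def outside(pt):
        return not (box[0] <= pt[0] <= box[2] and box[1] <= pt[1] <= box[3])
    return hits_interior(seg, box) and (outside(seg[0]) or outside(seg[1]))


def select(targets, references, predicate):
    """Names of target features related to at least one reference feature."""
    return [name for name, geom in targets.items()
            if any(predicate(geom, ref) for ref in references.values())]


# Illustrative data only.
buildings = {"b1": (2, 2, 4, 4), "b2": (20, 20, 22, 22)}
fences = {"f1": (0, 0, 10, 10)}
roads = {"r1": ((-5, 5), (15, 5)), "r2": ((-5, 12), (15, 12))}
forests = {"w1": (0, 0, 10, 10)}

print(select(buildings, fences, within))  # buildings surrounded by fences
print(select(roads, forests, crosses))    # roads across forests
```

The same `select` function serves both queries; only the operator passed to it changes, which mirrors how a single feature selection service is parameterized by a spatial operator. The "near" query additionally needs a distance and is not covered by this sketch.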
GeOnAS also provides a project management module to allow users to conduct geoprocessing functions in a project step-by-step. It can be used as a transparent chaining platform for geospatial feature discovery when multiple feature analysis services are needed. A more complex query may include complex features. For example, an intelligence analyst who wants to find schools near railways in the Providence District, Fairfax
County, Virginia, U.S., can go through the following steps to design the service chain using the workflow designer. Here, a robust and free designer, the JDeveloper BPEL Designer from Oracle BPEL Process Manager 10.1.2, can be used. If the railway has been extracted from the high-resolution image of Providence District, Fairfax County, Virginia, U.S., the data is accessible through WFS provided by the open source software – GeoServer (GeoServer 2011). The intelligence analyst specifies the use of a buffer service (with a width of 1,000 m) and a feature selection service (using the overlap operator) to implement the fuzzy relation "near".
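The combination of buffer and overlap can be sketched as follows. For a point feature, overlapping the 1,000 m buffer of the railway is equivalent to lying within 1,000 m of the railway, so the sketch computes that distance directly instead of constructing the buffer polygon. It assumes, for illustration only, that buildings are reduced to representative points and that coordinates are in metres in a projected coordinate system; the function names and the data are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polyline(p, line):
    """Shortest distance from point p to a polyline given as a vertex list."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


def select_near(points, line, width):
    """Names of point features that fall inside the buffer of `line`."""
    return [name for name, p in points.items()
            if distance_to_polyline(p, line) <= width]


# Illustrative data only: a railway with one bend and three building points.
railway = [(0, 0), (5000, 0), (5000, 5000)]
buildings = {"b1": (2500, 800), "b2": (2500, 1500), "b3": (5900, 5600)}

print(select_near(buildings, railway, 1000.0))  # ['b1']
```

Building `b3` lies less than 1,000 m from the end of the railway along each axis but about 1,082 m away in a straight line, so it falls outside the rounded end of the buffer. The buffer width is the context-dependent parameter that fixes what "near" means in a given query.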
The buffer and feature selection services, connected with the service chain (two feature selection services using the touches operator) for identifying the school (buildings near both the field and court) in Section 4, can be designed by the intelligence analyst. The data flow and control flow are illustrated in Fig. 5. Maps from each invocation of services in Fig. 5 give a step-by-step understanding of how spatially correlated features are selected. The service chain runs as a Web service in the Oracle BPEL engine and can be invoked by GeOnAS. Figure 6 illustrates the invocation results of the service chain in GeOnAS. It shows the resulting building features, together with the railway, field, and court nearby, on the images. The result can provide decision support in identifying possible sites of school facilities.

Discussion

In an interoperable environment, the service chains/workflows for the discovery of complex features are built from service modules by different vendors and can be executed across the different systems. The capability of the service-oriented approach to feature discovery is not limited to simple feature discovery enabled by the individual service modules for computing spatial relationships (operators). The
Fig. 6 Decision support result
service chains allow on-demand discovery of complex features, and can be reused by others in the community as part of a whole system implementation. The use of geoprocessing service orchestration to discover compound features is not restricted to a specific type of compound feature or application area. It can work for different applications using flexible composition of services that compute spatial relations. The capabilities of the system for feature discovery increase by accommodating new services for feature extraction and new service chains for feature discovery. In the school facility scenario, if a railway is not available, an extraction service for railway features, a raster-to-vector conversion service, and a transactional Web Feature Service (WFS-T) can be chained to allow access to the railway feature. The system is also maintainable through the upgrade or replacement of existing services/chains. The use of service orchestration for feature discovery takes advantage of the spatial characteristics of complex features. The extraction of semantic information and semantic labeling of the features in high-resolution images is a very actively developing area of remote sensing (Tobin et al. 2006; Gleason et al. 2010; Vatsavai et al. 2010a, b). Typically, such algorithms use training data in the form of image segments with known objects and then use various statistics to match the training data with the imagery. Even though
effective for many purposes, such one-step approaches are likely to fail when there are subtle differences between the complex features on an image. For example, the presence of an industrial chimney is a salient feature distinguishing a nuclear power plant from a coal-fired plant. However, a chimney is a relatively small feature in the planar view that is unlikely to produce a distinguishable effect in matching statistics. The approach described in this paper can be considered as a two-step approach, with step 1 being to identify the location and type of elementary ground features (such as buildings and roads) from high-resolution imagery, which has relatively mature technologies, and step 2 being to extract high-level semantic information (such as nuclear fuel concentration sites and school facilities) by discovering compound ground features from spatial relationships among the elementary features. The spatial characteristics of complex features are used when intelligence analysts orchestrate service chains. So far we have only considered the cases of nuclear power plants and schools, and tested the approach on school facilities using the system. The spatial characteristics of facilities are complex. For example, different spatial relationships such as adjacency, intersection, connectivity, and containment might exist among elementary features. Experiments conducted in this paper, although relatively simple, can illustrate the rationale of the approach. The goal here is to propose such vehicles as spatial analysis services and workflows to allow various spatial relations to be implemented. When more kinds of spatial relations are involved, the actual service chain should be designed case by case. The implementation of services and service chains is not limited to the technologies used in this paper. OGC Web services, W3C SOAP-based Web services, and RESTful services are available for implementation. 
Some efforts have been devoted to making them work together, such as defining WSDL for OGC services (Sonnet 2005), and using WSDL 2.0 as the bridge between REST and W3C Web services (W3C 2007; Lucchi et al. 2008). In addition to the BPEL-based service chaining approach, there is an OGC WPS approach for Web Service Orchestration (Stollberg and Zipf 2007). However, a comparative analysis shows that the BPEL-based implementation is more mature (Friis-Christensen et al. 2009).
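Independent of the chaining technology, step 2 of the two-step approach discussed above can be pictured as a sequence of feature selection steps whose intermediate results remain inspectable. The following minimal sketch is not the BPEL workflow used in this paper: it assumes that elementary features have already been extracted and simplified to axis-aligned boxes `(minx, miny, maxx, maxy)`, and the names `touches`, `feature_selection`, and `run_chain` are hypothetical.

```python
def touches(a, b):
    """True if boxes a and b share boundary points but no interior points."""
    closed = a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]
    interior = a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]
    return closed and not interior


def feature_selection(references, predicate):
    """Build a service that keeps features related to any reference feature."""
    def service(features):
        return {name: geom for name, geom in features.items()
                if any(predicate(geom, ref) for ref in references.values())}
    return service


def run_chain(features, steps):
    """Apply each (label, service) step in order and record every intermediate result."""
    trace = []
    for label, service in steps:
        features = service(features)
        trace.append((label, sorted(features)))
    return features, trace


# Illustrative elementary features only.
buildings = {"b1": (0, 0, 10, 10), "b2": (30, 0, 40, 10), "b3": (60, 0, 70, 10)}
fields = {"f1": (10, 0, 25, 10), "f2": (40, 0, 55, 10)}
courts = {"c1": (0, 10, 10, 15)}

steps = [
    ("touches field", feature_selection(fields, touches)),
    ("touches court", feature_selection(courts, touches)),
]
candidates, trace = run_chain(buildings, steps)
for label, names in trace:
    print(label, names)
```

Recording the output of every step corresponds to the step-by-step maps produced by each service invocation, and a new spatial relation can be added to the chain by appending one more step.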
Conclusions and future work

This paper presents a service-oriented approach for discovery of complex features. Geoprocessing services that compute spatial relations are orchestrated by intelligence analysts using the spatial characteristics of complex features. A chaining strategy is proposed to discover sites of facilities by orchestrating services that compute spatial relations among elementary features. A reference
framework and a prototype system for feature discovery are provided. The framework uses Web services to access elementary features and legacy spatial computation algorithms, OGC CSW for locating geospatial data and services, and BPEL workflows for orchestrating geoprocessing services. The implementation incorporated into GeoBrain can be used to support online analysis of features. Both transparent and translucent chaining approaches can be used for geospatial feature discovery, providing flexibility for adjusting the implementation according to the spatial characteristics of complex features. The approach demonstrates how spatial relations can be used to support the discovery of complex features in a service-oriented environment. The openness and interoperability of the system allow either spatial relation computation or new feature extraction services to be plugged in easily. The service chains can be archived and reused for on-demand discovery of features. The vagueness of some spatial relations such as fuzzy relations affects the selection and combination of geoprocessing services to implement them. The parameter values, such as the distance given as an input to the buffer service in implementing the near relation, are also determined by the context. Although the work in this paper relies on users to unambiguously specify the meaning of a fuzzy spatial relation, future work will investigate the expression of fuzziness of spatial relations and its role in service selection and combination. Although intelligence analysts can orchestrate the service chains based on the spatial characteristics of complex features, such characteristics as part of semantic knowledge could be formalized and expressed explicitly to assist automatic or semi-automatic service orchestration. The spatial characteristics might also be combined with non-spatial relations such as taxonomy relations to achieve reasonable results. 
For example, schools have sub-categories like high schools and elementary schools. Some high schools have a tennis court, while elementary schools do not. Often, an elementary school has an outdoor playground with equipment. Consequently, knowledge representation approaches such as ontologies and rules can contribute to the establishment of a knowledge base for complex features. Future work will consider a knowledge-driven intelligent service system that can combine ontologies and rules with service computing technologies for service planning and result refinement.

Acknowledgments

We are grateful to the anonymous reviewers, and to Dr. Barry Schlesinger for their valuable comments. This work was funded jointly by the U.S. Department of Energy (grant #DE-NA0001123, PI: Prof. Liping Di), National Basic Research Program of China (2011CB707105), and Project 41023001 supported by NSFC.
References

Arpinar IB, Sheth A, Ramakrishnan C, Usery EL, Azami M, Kwan M (2006) Geospatial ontology development and semantic analytics. Trans GIS 10(4):551–575 Awrangjeb M, Ravanbakhsh M, Fraser CS (2010) Automatic detection of residential buildings using LIDAR data and multispectral imagery. ISPRS J Photogramm Remote Sens 65(5):457–467 Baltsavias EP (2004) Object extraction and revision by image analysis using existing geodata and knowledge: current status and steps towards operational systems. ISPRS J Photogramm Remote Sens 58(3–4):129–151 Baumann P (2010) OGC® WCS 2.0 Interface Standard - Core, Version 2.0.0, OGC 09-110r3, Open Geospatial Consortium, Inc., 53 pp Berners-Lee T, Hendler J, Lassila O (2001) The semantic web. Sci Am 284(5):34–43 Booth D, Haas H, McCabe F, Newcomer E, Champion M, Ferris C, Orchard D (2004) Web services architecture. W3C Working Group Note 11 February 2004, W3C, http://www.w3.org/TR/ws-arch/. Accessed 12 April, 2012 Brauner J, Foerster T, Schaeffer B, Baranski B (2009) Towards a research agenda for geoprocessing services. In: Proceedings 12th AGILE International Conference on Geographic Information Science, Hannover, Germany, pp 1–12 Bröring A, Stasch C, Echterhoff J (2012) OpenGIS® Sensor Observation Service Interface Standard, Version 2.0, OGC 12-006, Open Geospatial Consortium, Inc., 163 pp Clementini E, Felice PD, van Oosterom P (1993) A small set of formal topological relationships suitable for end-user interaction. In: Proceedings International Symposium on Large Spatial Databases, Singapore, pp 277–295 de la Beaujardière J (2006) OpenGIS® Web Map Server Implementation Specification. Version 1.3.0, OGC 06-042, Open Geospatial Consortium, Inc., pp 85 Denofsky ME (1976) How near is near? A near specialist. AI Memo No. 344, MIT AI Lab, Cambridge, Massachusetts, pp 75 Di L, McDonald K (1999) Next generation data and information systems for earth sciences research. 
In: Proceedings first international symposium on digital earth, vol. I., Science Press, Beijing, China, pp 92–101 Di L, Ramapriyan HK (eds) (2010) Standard-based data and information systems for earth observation. Springer publication, Germany, p 248 Egenhofer MJ (1989) A formal definition of binary topological relationships. In: Proceedings of the 3rd international conference on foundations of data organization and algorithms, FODO 1989, Paris, France, Lecture Notes in Computer Science (LNCS) 367, pp 457–472 Egenhofer MJ, Herring J (1990) A mathematical framework for the definition of topological relationships. Proceedings of the fourth international symposium on spatial data handling, Columbus, OH, pp 803–813 Ellul C, Haklay M (2006) Requirements for topology in 3D GIS. Trans GIS 10(2):157–175 Friis-Christensen A, Lucchi R, Lutz M, Ostlinder N (2009) Service chaining architectures for applications implementing distributed geographic information processing. Int J Geogr Inf Sci 23(5):561–580 Frome A, Singer Y, Sha F, Malik J (2007) Learning globally-consistent local distance functions for shape-based image retrieval and classification. In: Proceedings IEEE 11th International Conference on Computer Vision (ICCV 2007), pp 1–8 GEOS (2011) Geometry Engine, Open Source. http://trac.osgeo.org/geos/. Accessed November 07, 2011 GeoServer (2011) Open Source Geospatial Foundation. http://geoserver.org/display/GEOS/Welcome. Accessed November 07, 2011
Gleason S, Ferrell R, Cheriyadat A, Vatsavai RR, De S (2010) Semantic information extraction from multispectral geospatial imagery via a flexible framework. In: Proceedings 2010 IEEE International Geoscience and Remote Sensing Symposium (IGARSS2010), pp 166–169 GRASS (2011) Geographic Resources Analysis Support System (GRASS), Open Source Geospatial Foundation. http://grass.fbk.eu/. Accessed November 07, 2011 Gruen A, Kuebler O, Agouris P (eds) (1995) Automatic extraction of man-made objects from aerial and space images. Birkhäuser Verlag, Basel, Switzerland, p 340 Han W, Di L, Zhao P, Wei Y, Li X (2008) Design and implementation of GeoBrain online analysis system (GeOnAS). In: Bertolotto M, Ray C, Li X (eds) Proceedings 8th International Symposium on Web and Wireless Geographical Information System, Lecture Notes in Computer Science (LNCS) 5373, pp 27–36 Herring JR (ed) (2006) OpenGIS implementation specification for geographic information – simple feature access – Part 1: common architecture. OGC 06-103r4, Open Geospatial Consortium Inc., pp 93 ISO (2003) ISO 19115:2003: geographic information – metadata. International Organization for Standardization, Geneva, Switzerland, p 140 ISO (2005) ISO 19119:2005 geographic information – services. International Organization for Standardization, Geneva, Switzerland, p 67 Jager E, Altintas I, Zhang J, Ludascher B, Pennington D, Michener W (2005) A scientific workflow approach to distributed geospatial data processing using web services. In: Proceedings 17th international conference on scientific and statistical database management, Santa Barbara, USA, pp 87–90 Karantzalos K, Argialas D (2009) A region-based level set segmentation for automatic detection of man-made objects from aerial and satellite images. Photogramm Eng Remote Sens 75(6):667–677 Kiehle C, Heier C, Greve K (2007) Requirements for next generation spatial data infrastructures-standardized web based geoprocessing and web service orchestration. 
Trans GIS 11(6):819–834 Klien E (2007) A rule-based strategy for the semantic annotation of geodata. Trans GIS 11(3):437–452 Klien E, Lutz M (2005) The role of spatial relations in automating the semantic annotation of geodata. In: Proceedings of the Conference on Spatial Information Theory (COSIT’05), Ellicottville, New York, pp 133–148 Klyne G, Carroll JJ (eds) (2004) Resource Description Framework (RDF): concepts and abstract syntax. World Wide Web Consortium (W3C), http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/. Accessed 19 April 2012. Li X, Di L, Han W, Zhao P, Dadi U (2010) Sharing geoscience algorithms in a Web service-oriented environment (GRASS GIS example). Comput Geosci 36(8):1060–1068 Lucchi R, Millot M, Elfers C (2008) Resource oriented architecture and REST. Technical report, European Commission, Joint Research Centre, pp 16 Lüscher P, Weibel R, Burghardt D (2009) Integrating ontological modelling and Bayesian inference for pattern classification in topographic vector data. Comput Environ Urban Syst 33(5):363–374 Mansourian A, Zoje MJV, Mohammadzadeh A, Farnaghi M (2008) Design and implementation of an on-demand feature extraction web service to facilitate development of spatial data infrastructures. Comput Environ Urban Syst 32(5):377–385 Martell R (ed) (2008) CSW-ebRIM registry service – part 1: ebRIM profile of CSW. Version 1.0.0, OGC 07-110r2, Open Geospatial Consortium, Inc., pp 57 Mena JB (2003) State of the art on automatic road extraction for GIS update: a novel classification. Pattern Recognit Lett 24(16):3037–3058 Michaelsen E, Stilla U, Soergel U, Doktorski L (2010) Extraction of building polygons from SAR images: grouping and decision-level in the GESTALT system. Pattern Recognit Lett 31(10):1071–1076
Mohammadzadeh A, Zoej MJV (2010) A self-organizing fuzzy segmentation (SOFS) method for road detection from high resolution satellite images. Photogramm Eng Remote Sens 76(1):27–35 Naouai M, Hamouda A, Akkari A, Weber C (2011) New approach for road extraction from high resolution remotely sensed images using the quaternionic wavelet. Lect Notes Comput Sci 6669:452–459 Nebert D, Whiteside A, Vretanos P (eds) (2007) OpenGIS® catalog services specification. Version 2.0.2, OGC 07-006r1, Open GIS Consortium Inc., pp 218 OASIS (2007) Web services business process execution language. version 2.0. Web Services Business Process Execution Language (WSBPEL) Technical Committee (TC), pp 264 Papazoglou MP (2003) Service-oriented computing: concepts, characteristics and directions. In: Proceedings 4th International Conference on Web Information Systems Engineering (WISE 2003), pp 3–12 Peltz C (2003) Web services orchestration and choreography. Computer 36(10):46–52 Percivall G (ed) (2002) The OpenGIS abstract specification, topic 12: OpenGIS service architecture. Version 4.3. OGC 02-112. Open Geospatial Consortium, Inc., pp 78 Robinson VB (1990) Interactive machine acquisition of a fuzzy spatial relation. Comput Geosci 16(6):857–872 Schut P (2007) OpenGIS® web processing service, version 1.0.0, OGC 05-007r7, Open Geospatial Consortium, Inc., pp 87 Shariff A, Egenhofer M, Mark D (1998) Natural-language spatial relations between linear and areal objects: the topology and metric of English-language terms. Int J Geogr Inf Sci 12(3):215–246 Sohn G, Dowman I (2007) Data fusion of high-resolution satellite imagery and LiDAR data for automatic building extraction. ISPRS J Photogramm Remote Sens 62(1):43–63 Sonnet J (ed) (2005) OWS 2 common architecture: WSDL SOAP UDDI. Version: 1.0.0. OGC 04-060r1. 
Open Geospatial Consortium, Inc., pp 76 Stock K, Atkinson R, Higgins C, Small M, Woolf A, Millard K, Arctur D (2010) A semantic registry using a feature type catalogue instead of ontologies to support spatial data infrastructures. Int J Geogr Inf Sci 24(2):231–252 Stollberg B, Zipf A (2007) OGC web processing service interface for web service orchestration – aggregating geo-processing services in a bomb threat scenario. In: Proceedings 7th International Symposium of Web and Wireless Geographical Information Systems (W2GIS 2007), Cardiff, UK, Lecture Notes in Computer Science (LNCS) 4857, pp 239–251 Tobin KW, Bhaduri BL, Bright EA, Cheriyadat A, Karnowski TP, Palathingal PJ, Potok TE, Price JR (2006) Automated feature
generation in large-scale geospatial libraries for content-based indexing. Photogramm Eng Remote Sens 72(5):531–540 Vannan SKS, Cook RB, Pan JY, Wilson BE (2011) A SOAP web service for accessing MODIS land product subsets. Earth Sci Inform 4(2):97–106 Varanka D (2011) Ontology patterns for complex topographic feature types. Cartogr Geogr Inf Sci 38(2):126–136 Varanka DE, Jerris TJ (2010) Complex topographic feature ontology patterns. In: Proceedings AutoCarto 2010, Orlando, Florida, USA, pp 5 Vatsavai RR, Bhaduri B, Cheriyadat A, Arrowood L, Bright E, Gleason S, Diegert C, Katsaggelos A, Pappas T, Porter R, Bollinger J, Chen B, Hohimer R (2010a) Geospatial image mining for nuclear proliferation detection: Challenges and new opportunities. In: Proceedings 2010 IEEE International Geoscience and Remote Sensing Symposium (IGARSS2010), pp 48–51 Vatsavai RR, Cheriyadat A, Gleason S (2010b) Unsupervised semantic labeling framework for identification of complex facilities in high-resolution remote sensing images. In: Proceedings 2010 IEEE International Conference on Data Mining Workshops (ICDMW), Sydney, Australia, pp 273–280 Vretanos PA (2010) OpenGIS Web Feature Service 2.0 Interface Standard. Version 2.0.0, OGC 09-025r1, Open Geospatial Consortium, Inc., pp 253 W3C (2007) Web Services Description Language (WSDL) 2.0, World Wide Web Consortium (W3C). http://www.w3.org/TR/2007/REC-wsdl20-adjuncts-20070626/#_http_binding_default_rule_method. Accessed 16 October, 2011 W3C (2009) OWL 2 Web Ontology Language Document Overview. World Wide Web Consortium (W3C). http://www.w3.org/TR/owl2-overview/. Accessed 19 April 2012 Wei Y, Di L, Zhao B, Liao G, Chen A, Bai Y, Liu Y (2005) The design and implementation of a grid-enabled catalogue service. 
In: Proceedings 25th Anniversary of IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2005), COEX, Seoul, Korea, pp 4224–4227 Yue P, Di L, Yang W, Yu G, Zhao P (2007) Semantics-based automatic composition of geospatial Web services chains. Comput Geosci 33(5):649–665 Yue P, Gong J, Di L, He L, Wei Y (2011) Integrating semantic web technologies and geospatial catalog services for geospatial information discovery and processing in cyberinfrastructure. GeoInformatica 15(2):273–303 Zhao P, Di L (eds) (2010) Geospatial Web Services: advances in information interoperability. IGI Global publisher, Hershey, p 552 Zhao P, Di L, Yu G (2012) Building asynchronous geospatial processing workflows with web services. Comput Geosci 39(2):34–41