Computers & Geosciences 47 (2012) 119–129
Contents lists available at SciVerse ScienceDirect
Computers & Geosciences journal homepage: www.elsevier.com/locate/cageo
Using SensorML to construct a geoprocessing e-Science workflow model under a sensor web environment Nengcheng Chen n, Chuli Hu, Yao Chen, Chao Wang, Jianya Gong State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan, Hubei 430079, China
a r t i c l e i n f o
abstract
Article history: Received 27 March 2011 Received in revised form 22 November 2011 Accepted 23 November 2011 Available online 6 December 2011
Many achievements in web-based geoprocessing focus on logically chaining Open Geospatial Consortium (OGC) web services, for example using Business Process Execution Language to orchestrate web services that are interfaced through the OGC Web Processing Service. For e-Science application in a sensor web environment, how to internally integrate the sensor system, observation, and processes (physical and non-physical) as a geoprocessing e-Science workflow model is a critical issue. The OGC Sensor Model Language offers the possibility to construct a geoprocessing e-Science workflow model in the form of observation processes. We propose a construction method for a geoprocessing e-Science workflow model that integrates logical and physical processes into a composite process chain for sensor observations. The three phases of geoprocessing e-Science workflow creation are abstract process chain modeling, process chain instantiation, and process chain workflow execution. An experiment on chaining-related sub-processes for deriving the Normalized Difference Vegetation Index of Hubei Province (China) was conducted to verify the feasibility of the proposed workflow model. & 2011 Elsevier Ltd. All rights reserved.
Keywords: Process model Process chain Geoprocessing e-Science workflow NDVI SensorML Sensor web
1. Introduction E-Science applications have grown in complexity in terms of data analysis tasks; workflow has emerged as an important enabling technology. Delin and Jackson (2001) first proposed the sensor web concept; since then it has evolved as a coordinated observation infrastructure with features such as interoperability, intelligence, dynamism, flexibility, and scalability (Liang et al., 2005; Di, 2007). The Sensor Web Enablement (SWE) initiative, conceived by the OGC, provides an API for managing deployed sensors and retrieving sensor observation data. A set of web services and information encodings developed as open standards. Service encoding specifications included the Sensor Observation Service (SOS) (Na and Priest, 2007), Sensor Planning Service (SPS) (Simonis and Dibner, 2007), and Web Notification Service (Simonis and Wytzisk, 2003). Information models included Sensor Model Language (SensorML) (Botts and Robin, 2007a), SWE Common Data (Robin, 2011), and Observation and Measurement (O&M) (Cox, 2007a, b). The developments over the past decade (Sensor Web 2.0 phase) have prompted more discussion about the exploitation of web service-oriented sensor webs (Alameh, 2003; Chu et al., 2006; Di, 2005; Broring et al., 2011).
n
Corresponding author. Tel.: þ86 27 68778586; fax: þ 86 27 68778229. E-mail address:
[email protected] (N. Chen).
0098-3004/$ - see front matter & 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.cageo.2011.11.027
Under the OGC Web Service (OWS)-4 initiative, the OGC developed the GeoProcessing Workflow (GPW), which represents a loosely coupled integrated web service system to handle complex data processing tasks in the SWE environment. The Geographic Resources Analysis Support System provides functionality to chain geospatial processing to the Internet through a Web Processing Service (WPS) (Schut, 2007). WPS defines a standardized interface that facilitates both geospatial process publishing and discovery and binding to those processes by clients. Stollberg and Zipf (2007) presented an approach of ‘‘Composite-WPS’’ using WPS interface itself as the alternative to the use of XML-based Business Process Execution Language (BPEL) (Curbera et al., 2006) for the orchestration of OWS, such as the Web Feature Service (WFS) (Vretanos, 2002), Web Coverage Service (WCS) (Whiteside and Evans, 2008), and Catalog Service for Web (CSW) (Voges and Senkler, 2005). Dadi and Di (2009) chained several atomic services to create a workflow that represents web service chains. In our previous work (Chen et al., 2010a), we proposed a general GPW framework for sensor web data services, which can improve the quality of services for sensor data retrieval and processing after converting an abstract GPW model to an executable description of a BPEL workflow. Chen et al. (2010b) presented a new data service for observations of atmospheric chemical composition based on the reusable SOAbased GPW, which can chain CSW, WCS, and WPS as automatic data services for observations. Furthermore, the authors Chen et al. (2010c) proposed an automatic sensor processing workflow
120
N. Chen et al. / Computers & Geosciences 47 (2012) 119–129
approach to chain EO-1 SPS, EO-1 SOS, GMU WCS, WHU WPS, and WHU WFS-T for updating spatial data and dynamic mapping in Antarctica. These achievements, however, mainly focus on the service chain-based GPW characterized by the separation of geoprocessing from the sensor observation model. SensorML is one of the OGC SWE information-encoding standards in which Botts and Robin (2007a) provide models and encodings to describe any kind of process in sensors or postprocessing systems, including physical and non-physical processes. All types of processes are discoverable and executable. These processes defined through their input, output, parameters, and methods provide relevant metadata. Therefore, SensorML process models are functional models of a sensor system and related observation data processes, which would aid in processing the entire complex sensor observation system. The process chain is a composite process of SensorML, linked with sub-process models (each of which may be single processes or process chains) by their input and output nodes. As SensorML has an intrinsic workflow description, it can serve as a mechanism for describing scientific workflows from the dataflow perspective, thereby facilitating completely distributed workflow descriptions on the web (Van Zyl and Vahed, 2009). Botts and Robin (2007b) proposed that SensorML processes be at least one of the methods for a WPS to describe the process. When invoking a SensorML-based process chain, the process chain execution engine parses the process chain description and runs the sub-process models in sequence. The data flows from one process module to the next as a stream (Regner et al., 2008). As mentioned above, BPEL and SensorML are two kinds of geoprocessing workflow implementation mechanisms in the Sensor Web environment. A web service-based WPS can logically encapsulate all kinds of processing. Hence, BPEL performs service orchestration to integrate external sub-processing encapsulated by WPS into a service chain-based workflow, controlling the entire service chain. This service chain-based GPW therefore cannot construct a model that completely integrates all sensor web resources, including sensors, data, algorithms of specific observation process, and models. For e-Science workflow applications in the sensor web environment involve, in addition to large observation data sets, diverse sensor resources and many geoprocessing algorithms. The sensor web resources and their observations divide into physical process and non-physical process (pure mathematical process). The physical process model describes a sensor observation procedure with the associated physical information (e.g. platform_relatedTo_sensors, time-space reference framework, position and location, interface of service or communication). The significant feature here is enabling rich nonphysical process to generate situational awareness from observations. BPEL cannot satisfy the requirements to handle internally the geospatial aspects of data and sensor observations. Therefore, the means to internally integrate the sensor system, observation, and processes as a geoprocessing workflow model for e-Science application in a sensor web environment is an interesting issue. The objective of this paper is two fold. First, it proposes an approach to constructing a geoprocessing workflow model in the form of dataflow-oriented process chain based on SensorML. The second objective is to verify the feasibility of applying the proposed method in a dataflow-oriented e-Science application. This paper contributes a proposal for a process chain based workflow model using SensorML. This model can internally integrate sensor web resources into a process chain representing a geoprocessing e-Science workflow. The sub-process model represents a sensor, its observation information model and sensor-related atomic observation process. The entire process chain based workflow is composed of those sub-process models facilitating easy discovery and association among sensor, observation data, and process.
The rest of the paper is organized as follows. Section 2 presents the method for modeling, instantiating, and executing the process chain model. Section 3 describes the experiments on Normalized Difference Vegetation Index (NDVI) geoprocessing for Hubei Province (China) to verify the proposed approach. Section 4 discusses the proposed approach in more detail, and Section 5 summarizes the conclusions and future directions.
2. Methodology A complete e-Science workflow contains typical phases such as data scheduling, access, processing/transformation, and visualization. We have studied data scheduling and access services (Chen et al., 2010a, b, c) in a sensor web environment. Van Zyl and Vahed (2009) reported that SensorML can be used to describe scientific workflows in a distributed web service environment. Therefore, we focus primarily on how to construct the geoprocessing e-Science workflow transforming raw data produced by the sensor web environment into a post-processed product using SensorML as the modeling standard. 2.1. Abstract model of the process chain Abstract process chain modeling aims to generalize the top process chain meta-model (Poole and Evans, 2008) containing nodes such as the metadataGroup, input and output, parameters, and process methods. The metadataGroup (i.e., keywords, identifiers, classifiers, constraints, capabilities, properties, contacts, and documentation sources), and parameters are common modules of a SensorML-based process model. These are unrelated to the implementation of the data observation process, but play a fundamental role in facilitating the understanding and discovery of the processes. The basic nodes of the process chain model regard data conversion and process links as the core, thus the basic elements of the process chain can be divided into data nodes and processing nodes. Furthermore, the process chain model defines the connections between data nodes and processing nodes. Therefore, it is possible to describe the logic structure of the process chain that can be described with a directed acyclic graph (DAG), as shown in Fig. 1. A DAG is a total process chain description that consists of directed edges between nodes, and these directed edges represent directions of logic structure and dataflow.
Fig. 1. Relationship between the logic and math models of the process chain.
N. Chen et al. / Computers & Geosciences 47 (2012) 119–129
The mathematical model of the process chain is defined as a triad, expressed as PS ¼ {D(PS), F(PS), C(PS)}. where D(PS) ¼{d1,d2, d3,y,dn}, (D(PS)a|) is the data node collection; F(PS) ¼{f1,f2, f3,y,fk}, (F(PS) a|) is the processing node collection; C(PS) ¼{c1,c2, c3,y,cm} is the connection collection. di is the input or output of these processes, and di can be the aggregate type, which means that the di node can own multi-input or multi-output. fi ¼[di,dj]; di is the input of one sub-process, and dj is the corresponding output. ci ¼[di,dj]; if di is the input of the total process chain and the dj will be the input of the first sub-process. If dj is the output of the total process chain, di will be the output of the last subprocess. If di is the output of the first or the mid sub-process, dj will be the input of its subsequence process. The corresponding relationship between the logic process chain model and mathematical model is shown in Fig. 1. As shown in the figure, the process chain model that comprises data nodes, processing nodes, connection nodes, and their flow reflects the characteristics of the process workflow that links the sub-process models. Users can choose data and combine processes in accordance with their practical needs. Given that the proposed abstract model maintains versatility in all geoprocessing models, it can be applied to any process description in a sensor system. This abstract process chain model describes the logic structure, but cannot be directly analyzed or executed by the driven engine. Only after it is instantiated into a concrete SensorML-based process chain instance (that is, all the flow and nodes have been mapped to a concrete implementation) can it be executed. 2.2. Process chain instantiation The core of this step is the transformation of the abstract, node-based, and logic process model into a SensorML-based executable process instance. The sensor observation process meta-model is used primarily to describe the related metainformation of the sensor observation system, data, and process. As the input/output (I/O) of the total processes is highly relevant to observation data, the data can include raster, feature, shape, and data services (e.g., SOS, WFS, WCS). The process method comprises the processing algorithms, source codes, and related binary compiler source files (Botts and Robin, 2007a). Similarly, the web-based process service can be linked to the abstract process chain to serve as the remote process method. Fig. 2 shows
121
the instantiation from the abstract model to the resource-filled process chain instance. The data (or data service) and local processing implementation (or processing service) are linked to the abstract process chain. The Sensor Observation Resource Manager (SORM) is a prototype developed by our team to provide the following functions: construction of a SensorML-based process/process chain model, retrieval of model-based process, and execution of the total process chain based workflow. The core of the SORM can establish the metadataGroup model for the entire sensor observation system, in addition to retrieve and bind with available data and process. The SORM stores local physical resources such as data models (e.g., GML, O&M, and SHP) and process method models (e.g., local processing implementation file). It also provides the interface such as CSW to enable the remote retrieval of the resources, including SOS, WFS, WCS, and WPS. As shown in Fig. 3, the entire instantiation depends on local SORM and CSW service. Users first need to identify the local resources in the SORM. If no corresponding resources have been satisfied according to the query, then they need to search the registered data and processing services using the CSW interface. If the satisfactory resources remain unavailable, CSW returns an instance error. The method for local resource retrieval is based on the fact that the SORM is a regular GIS query approach that involves ordinary Windows-based and Lucene-based full text queries. The web-based resources, such as WCS and WFS, are registered as ‘‘WCSCoverage’’ and ‘‘WFSLayer,’’ respectively, in GeoBrain, developed and maintained by the Center for Spatial Information Science and Systems (CSISS) of George Mason University (GMU). In the CSISS CSW, ‘‘WCSCoverage’’ and ‘‘WFSLayer’’ objects are the instances of ‘‘DataType’’ providing minimal metadata for registry coverage or feature data. In our previous work (Chen et al., 2009), which is based on this approach, we discussed a ‘‘SOS þCSW’’ mode for sensor observation data registry and discovery. All the retrieved data and process resources are integrated into the abstract process chain using the ‘‘Xlink’’ mechanism (Lizorkin, 2005) that allows one to link objects that are either internal or extend to the instance document. 2.3. Process chain execution A SensorML-based process engine must understand how to execute the sub-process models within a chain and support the
Fig. 2. Abstract process chain instantiation.
122
N. Chen et al. / Computers & Geosciences 47 (2012) 119–129
Fig. 3. Interaction between local and service-based discovery.
flow of data between these sub-process models. The process chain execution is described as follows: Step 1: The main task in this step is to parse the SensorMLbased process instances into an object that the engine can identify. We use the Document Object Model (DOM) parser, which can store variables/information (i.e., metadataGroup, input and output, type of input and output, and an ‘‘Xlink’’ reference to a source outside SensorML, etc.) related to the process chain in array lists, achieving conversion from the XML-based SensorML instance to the object node tree. In this manner, the SORM user can employ the inside application processing interface to more easily access, add, delete, or modify the node, attribute and text content of the tree through the DOM interface function. Step 2: The SensorML-based process chain logically links between these sub-processes through the connection node ‘‘sml:connections’’. The ‘‘sml:connections’’ node uses a ‘‘sml:link’’ object to reference ‘‘sml:source’’ and ‘‘sml:destination’’ of a connector, the ‘‘sml:source’’ is the node from where the data/attribute originates and ‘‘sml:destination’’ is the node that receives the data/
attribute. This step adopts the XPath technology.1 The ‘‘xlink’’ reference description inside of ‘‘sml:source’’ or ‘‘sml:destination’’ as path judges the accessibility of those processes linked in one process chain. It also uses the principle of adjacency matrix2 to verify the transportability of those processes, namely the transportability of data/attributes inside the connectors ‘‘sml:source’’ and ‘‘sml:destination’’. That is, the accessibility and transportability of the dataflow inside the process chain are guaranteed. Step 3: The router of this step records each sub-process node’s condition and its precursor/subsequent position for sorting the processes and running them as needed. Once dthisProcessChain_input passes to dfirstSubProcess_input, ffirstSubProcess_process it activates and generates dfirstSubProcess_output. In this manner, when the first subprocess node has been completely executed, the router triggers 1 XPath technology site, http://en.wikipedia.org/wiki/XPath. (accessed 10th July, 2011). 2 Adjacency matrix site, http://en.wikipedia.org/wiki/Adjacency_matrix. (accessed 10th July, 2011).
N. Chen et al. / Computers & Geosciences 47 (2012) 119–129
the next sub-process, and the dfirstSubProcess_output node is used as dsecondSubProcess_input, then executed by the fsecondSubProcess_process node, and so on. That is, the router controls the entire process workflow by concurrently activating the subsequent processing or data transmission, enabling implementation of the entire SensorML-based process chain. Step 4: Given that the SensorML-based process chain involves the link to the web services, the processing actuators designed in this research were extended to two types: local processing implementation actuators and OGC standard service-based actuators. According to the previously built process list, the independent sub-processes are called when its corresponding process input node referenced in the process chain is in running mode. The input of the process engine is an instantiated process chain model. Whether the actuator is a local implementation or standard service implementation, it will execute the total processes in accordance with the sequence of sub-processes specified in the process chain model. The final output is either a standard coded data model (e.g. scalar or raster data) or the desired results prescribed by the process chain.
3. Construction experiments on the NDVI geoprocessing workflow The NDVI score is a vegetation index (discriminates green areas from water and bare soil) generated from multi-band satellite imagery. The formula is calculated as NDVI ¼(near infrared – red)/(near Infraredþred). MODIS 1B data, retrievable from most CSW registration centers, were selected to calculate the NDVI score. 3.1. Scenario of the geoprocessing workflow: transforming raw remote sensing images to NDVI score calculations First, the coverage data and its selected area must be defined; here we assume that the required data are available or can be found through a search in CSW and retrieved from a WCS. In the experiment, the entire NDVI geoprocessing, involving transformation of the geometric correction of raw MODIS data into a final
123
NDVI result map, was modeled as a composite process chain. The first sub-process is georeferencing, a fundamental processing step before raw coverage data can be useful. To calculate the NDVI results of a given area, the geo-corrected data are clipped by the right zone (Hubei Province was selected as the experimental zone). Clipping is accomplished using the well-known GIS based function ‘‘Clip,’’ available as a WPS process. The valid bands of NDVI calculation are the red and near-infrared bands; thus, the required band was extracted using a band extraction process. Next, a pure mathematical process based on the NDVI calculation formula was executed. In this manner, a composite NDVI geoprocessing workflow was chained by the four sub-processes.
3.2. Experiment on the SensorML process chain to construct the NDVI geoprocessing workflow There are spatial–temporal references involved in the georeferencing and clipping sub-processes. Therefore these two subprocesses are modeled as physical processes. Band extraction and NDVI calculation are modeled as nonphysical processes. Generally, the physical processes are more complex than non-physical processes; the former define a physical model enabling geoprocessing situational awareness, and the latter concerns only the pure mathematical calculation unrelated to physical references. Fig. 4 shows that the entire NDVI calculation presents a geoprocessing workflow that consists of four sub-process models and seven connections. Each process, including an input, output, and process method, is presented in the form of a sensor observation process model. According to the Ps mathematical model elaborated in Section 2.1, the input and output of the four sub-processes constitute data node collection D(Ps), thus, dthisNDVIChain_boundaryinput, dthisNDVIChain_MODIS1Binput, dGeoreference_input, dGeoreference_output, dClipping_input1, dClipping_input2, dClipping_output, dBandExtr_input, dBandExtr_output1, dBandExtr_output2, dNDVICalcu_input1, dNDVICalcu_input2, dNDVICalcu_output, dthisNDVIChain_output. The process methods of the four sub-processes constitute processing node collection F(Ps), namely, fGeoreference_process ¼ [dGeoreference_input, dGeoreference_output], fClipping_process ¼[dClipping_input1 & dClipping_input2, dClipping_output], fBandExtr_process ¼ [dBandExtr_input,
Fig. 4. Relationship between the NDVI mathematical model and process chain model instance.
124
N. Chen et al. / Computers & Geosciences 47 (2012) 119–129
dBandExtr_output1 & dBandExtr_output2], fNDVICalc_process ¼[dNDVICalcu_input1 & dNDVICalcu_input2, dthisNDVIChain_output]. Seven connections constitute connection node collection C(Ps): CA ¼[dthisNDVIChain_MODIS1Binput, dGeoreference_input], CB ¼ [dthisNDVIChain_boundaryinpu, dClipping_input1], CC ¼ [dGeoreference_output, dClipping_input2], CD ¼ [dClipping_output, dBandExtr_input], CE ¼ [dBandExtr_output1, dNDVICalcu_input1], CF ¼[dBandExtr_output2, dNDVICalcu_input2], CG ¼[ dNDVICalcu_output, dthisNDVIChain_output]. The MODIS 1B data and required boundary act as the input to the total process chain. After running these chained sub-processes sequentially, the output of the last process is the result of the entire process chain. 3.2.1. Process modeling There are three kinds of processes for process modeling: physical processes, non-physical processes, and a composite process chain. The process-modeling module of the proposed SORM is an extensible and domain metadata-based system, which can rapidly construct different sensor observation process-models based on the domain template. As shown in Fig. 5, this is the basic structure and content of the SensorML-based process model. For the metadataGroup modeling of the entire sensor observation system, the metadata include the platform ID, sensor name, location, resolution, spectral bands, and swath, as well as the manner for tasking the sensor. Processing algorithms are encapsulated in the SensorML that describes the functions of the algorithms. In this case, web-accessible documents are created so that meta-information about the sensors and algorithms can be discovered over the
Internet. In addition, information on how to access the sensors and algorithms is provided. For the non-physical process model, we only define the metadataGroup, I/O, and the process methods, such as band extraction and algebraic calculation for the NDVI score. In terms of physical processes, the transformation of raw input into final output often involves a complex process, as evidenced by the georeferencing sub-process. Providing the required information to determine the relationship between the position of a remotely sensed pixel in image coordinates and its geoposition is important. ISO 19130 (Di and Kresse, 2004) specifies a sensor description with the associated physical and geometric information necessary to rigorously construct a georeferencing physical sensor model. Fig. 6 shows the georeferencing process model segments embedding the ISO 19130 metadata into the SensorML framework. The composite process chain model is the inlet and outlet of the geoprocessing workflow. Fig. 7 shows the composite process chain that integrates and connects the sub-process models. 3.2.2. NDVI process chain instantiation Neither specific data nor processing algorithms are contained in the NDVI abstract process chain model. The work of instantiation is to search data and related processes using the metadata of the existing sensor observation process model as basis. Instantiation then binds these data and processes. During the data search (Fig. 8), the boundary data on HuBei Province are found in the SORM, whereas the MODIS 1B data are not. We can, furthermore, use the CSW interface of the SORM to search the MODIS 1B data, and submit a request such as
Fig. 5. Common process model based on SensorML.
N. Chen et al. / Computers & Geosciences 47 (2012) 119–129
125
Fig. 6. Physical model segments of the geo-referencing sub-process.
‘‘Instruments Short Name ¼ ‘MODIS’ & Platform Short Name¼ ‘Terra’ & Archive Center ¼‘GMU_LAITS’ & Data Format¼‘GeoTiff’ & North-South Bounding Coordinate range ¼‘291250 -331200 ’ & EastWest Bounding range ¼‘1081210 –1161070 ’’’. The result can be obtained at http://laits.gmu.edu:9099/LAITSCSW2/discovery. In terms of process discovery, the georeferencing process can be retrieved in the SORM local process library. The other relevant sub-processes can be retrieved using the same approach of data retrieval. All the resources retrieved are linked in the abstract process chain through the ‘‘Xlink’’ mechanism, for example, the clipping process resource can be linked in the sml:process method tag like ‘‘/sml:processmethod gml:id¼‘‘clipping_process’’ xlink:href¼‘‘clipping process retrieved URI’’S’’. 3.2.3. Execution of the NDVI process chain instance A process diagram can depict any type of workflow. For the NDVI process chain presented in Section 3.2.1, Business Processing Management Notation is used as a middleware workflow generator for the user interface. The application processing interface of SORM translates the abstract process chain model into generic directives for the SensorML process chain engine. Fig. 9 shows the graphic design of the NDVI geoprocessing workflow, which consists of the backend-supported sub-process models.
In the experiment, we used a top–down recursive approach to establish the NDVI process chain; that is, the output of the precursor process is the input of the successor process, and so on. By running the corresponding processing method in the SORM process driven engine, the final NDVI score result map is represented in ArcGIS software (Fig. 10).
4. Discussion 4.1. Using the process chain for the comprehensive integration of sensor web resources The proposed process chain in the NDVI geoprocessing workflow originates from the platform-sensor of Terra-MODIS that integrates the observation information of the Terra-MODIS system through the metadataGroup (mentioned in Section 2.1). The information includes the platform and hardware information of its sensor, sensor capability, and other relevant geospatial aspects of the data and sensors. Unlike WPS, which only encapsulates the data processing algorithms (including the input, output, and methods employed by the algorithm); our proposed process chain is presented in the form of an integrated infrastructure. It can also
126
N. Chen et al. / Computers & Geosciences 47 (2012) 119–129
Fig. 7. Composite process chain model.
As described in Section 4.1, the SensorML-based process chain can include any algorithm, calculation, observation data, or model that operates on a sensor observation and sensor system. The retrieval function of the SORM can search the process instances by sensor name, sensor capability and characteristic, observation type, algorithm name, and so on. Furthermore, the ‘‘space-time-theme’’ solver function of the SORM searches usable observation data and sub-processes. Thus, accurate discovery of sensor web resources in a specific emergency task is relatively easier. Taking a CO2 annual concentration analysis for example, we retrieve satisfactory resources using the query ‘‘observation¼‘CO2’ & temporal¼‘2010-01-01 to 2011-01-01’ & spatial¼ ‘WuChang district’ & process¼‘concentration analysis’’’ in SORM.
of the BPELþWPS service chain is the logical chaining of these services. The proposed process chain approach not only connects these logical processes, but also supports the physical processes that define a physical model with metadata descriptions enabling user determination of geographic transformation from observations. Taking georeferencing as an example, the WPS-based service lacks ancillary description necessary for geopositioning and analyzing data. Unlike the WPS mode, the proposed process chain can integrate the physical sensor model with metadata description about the sensor and observation system further describing the procedures involved midway into the process, thereby allowing users to have a better understanding/estimate of the entire geoprocessing situation. In dataflow-oriented e-Science applications, users focus more on the dataflow delivered by the executable process model, than on the business control flows and activities. As previously discussed, our proposed approach can integrate the sensor web resources and chain the physical sensor process model. Moreover, it acts as a function model that both chains a series of sub-processes and executes them.
4.3. More suitable e-Science environment to depict geoprocessing
5. Conclusions and outlook
WPS aims to provide the process performed by web services. It enables the user to input data and call the process services. The core
Based on an analysis of SensorML, an executable function model in the form of a process, and the limitation of BPEL- and
be discovered over the Internet, providing information on how to access the sensors and algorithms. 4.2. Use of meta-information in the process model for easy search of sensor web resources
N. Chen et al. / Computers & Geosciences 47 (2012) 119–129
127
Fig. 8. Research result of MODIS for resource management.
Fig. 9. Process diagram of the NDVI geoprocessing instance.
WPS-based web service chains, we proposed a method for constructing a SensorML process chain-based geoprocessing e-Science workflow. The development of the workflow proceeded in three phases: process chain modeling, instantiation, and execution. NDVI geoprocessing workflow construction was adopted as the experiment. The proposed workflow model has the following advantages over the existing workflow implementation: Integration of sensor web resources into the sensor observation process model. It will facilitate reasoning and computing needs for collecting, archiving, processing, integrating, and representing the complex sensor observation process.
Chaining the logical and physical processes into composite geoprocessing. It improves the previously service chain-based workflow mode, which can only logically link to the method employed by processing algorithms. In the sensor web environment, it is vital that a physical process model provide ancillary information necessary for geoprocessing and data analysis. Construction of the dataflow-oriented geoprocessing e-Science workflow model. The proposed approach has succeeded in integrating the geospatial aspects related to NDVI geoprocessing from the Terra MODIS observation system. By constructing the dataflow-oriented process chain, the raw data produced by the MODIS
128
N. Chen et al. / Computers & Geosciences 47 (2012) 119–129
Fig. 10. NDVI result map.
sensor were transformed into the final NDVI result map, thereby verifying that the proposed method is a feasible way to construct a complete geoprocessing workflow based on SensorML. A large part of our efforts was invested in the construction of composite process chains. We modeled the metadataGroup of a sensor observation system; integrating the sensor, observation, and process model together, expressing the physical process model. We chained these resources into a geoprocessing e-Science workflow model with a data-oriented flow and executable process. Our proposed process chain-based workflow model is an important component for achieving sensor web resource real-time collaboration when a complex observation task occurs; however, the efficiency of our method still requires improvement. Yildiz et al. (2009) noted that an e-Science workflow requires complex control-oriented flow modeling in an essential dataoriented flow environment. Thus, our follow-up work will involve designing a robust, error-handling, and controllable workflow used to extend the present approach.
Acknowledgments This work was supported by grants from the National Basic Research Program of China (973 Program) (no. 2011CB707101), National High Technology Research and Development Program of China (863 Program) (no. 2011AA010500), National Nature Science Foundation of China (NSFC) program (nos. 41171315, 41023001, and 41021061), and the Fundamental Research Funds for the Central Universities (no. 201161902020004). We sincerely thank Mr. Steve Mcclure for the language polishing. The authors would like to thank the editors and anonymous reviewers for their valuable comments and insightful ideas. References Alameh, N., 2003. Chaining Geographic Information Web Services. IEEE Internet Computing 7 (5), 22–29. Botts, M., Robin, A. (Eds.), 2007a. Open Geospatial Consortium (OGC) document number: 07-000. Botts, M., Robin, A., 2007b. Bringing the sensor web together. Geosciences 6, 46–53. Broring, A., Echterhoff, J., Jirka, S., Simonis, I., Everding, T., Stasch, C., Liang, S., Lemmens, R., 2011. New generation sensor web enablement. Sensors 11 (3), 2652–2699. Chen, N., Di, L., Gong, J., Yu, G., 2010b. Automatic on-demand data feed service for autochem based on reusable geo-processing workflow. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 3 (4), 418–426. Chen, N., Di, L., Yu, G., Gong, J., 2009. Use of ebRIM-based CSW with SOSs for registry and discovery of remote-sensing observations. Computers and Geosciences 35 (2), 360–372. Chen, N., Di, L., Yu, G., Gong, J., 2010a. Geo-processing workflow driven wildfire hot pixel detection under sensor web environment. Computers and Geosciences 36 (3), 362–372.
Chen, N., Li, D., Di, L., Gong, J., 2010c. An automatic SWILC classification and extraction for the AntSDI under a sensor web environment. Canadian Journal of Remote Sensing 36, S1–S12. Chu, X., Kobialka, T., Durnota, B., Buyya, R., 2006. Open Sensor Web Architecture: Core Services. In: Proceedings of the 4th International Conference on Intelligent Sensing and Information Processing (ICISIP 2006). Piscataway, New Jersey, USA, pp. 98–103. Cox, S. (Ed.), 2007a. Observations and Measurements—Part 1—Observation Schema. Open Geospatial Consortium (OGC) document number: 07-002r1, Wayland, Massachusetts, USA, pp. 85. Cox, S. (Ed.), 2007b. Observations and Measurements—Part 2—sampling features. Open Geospatial Consortium (OGC) document number: 07-002r3, Wayland, Massachusetts, USA, pp. 46. Curbera, F., Khalaf, R., Nagy, W.A., Weerawarana, S., 2006. Implementing BPEL4WS: the architecture of a BPEL4WS implementation. Concurrency and Completion: Practice and Experience 18 (10), 1219–1228. Dadi, U., Di, L., 2009. Creating Web service interfaces and scientific workflows using command line tools: a GRASS example. In: Proceedings of the 17th International Conference on Geoinformatics, Fairfax, VA, USA, pp. 1–6. Delin, K., Jackson, S., 2001. The sensor web: a new instrument concept. In: Proceedings of the International Society for Optical Engineering (SPIE)’s Symposium on Integrated Optics San Jose, California, USA, pp. 1–9. Di, L., 2005. A framework for developing web-service-based intelligent geospatial knowledge systems. Journal of Geographic Information Sciences 11 (1), 24–28. Di, L., 2007. Geospatial sensor web and self-adaptive Earth predictive systems (SEPS). In: Proceedings of the Earth Science Technology Office (ESTO)/ Advanced Information System Technology (AIST) Sensor Web Principal Investigator (PI) Meeting, San Diego, USA, pp. 1–4. /http://esto.nasa.gov/sensor webmeeting/Papers/Di.pdfS. Di, L., Kresse, W., 2004. The current status and future plan of the ISO 19130 project. In: Proceedings of the XXth ISPRScongress, Istanbul, 2004. Liang, S.H.L., Croitoru, A., Tao, C.V., 2005. A distributed geospatial infrastructure for Sensor Web. Computers and Geoscience 31 (2), 221–231. Lizorkin, D., 2005. The query language to XML documents connected by XLink links. Programming and Computer Software 31 (3), 133–148. Na, A., Priest, M. (Eds.), 2007. OpenGISs Sensor Observation Service implementation specification (version 1.0). Open Geospatial Consortium (OGC) document number: 06-009r6, Wayland, Massachusetts, USA, pp. 104. Poole, C., Evans, J., 2008. A meta-model for generalized algorithm and model enablement of sensor web applications. In: Proceedings of IEEE Aerospace Conference (AC 2008), Big Sky, Montana, USA, pp. 2175–2183. Regner, K., Conover, H., Goodman, H., Zavodsky, B., Maskey, M., Jedlovec, G., Li, X., Lu, J., Botts, M., Berthiau, G., 2008. Using sensor web processes and protocol to assimilate satellite data into a forecast model. In: Proceedings of IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2008), Boston, USA, pp. 302–305. Robin, A.(Ed.), 2011. OGCs SWE Common Data Model Encoding Standard (version 2.0.0). (OGC) document number: OGC 08-094r1, Wayland, Massachusetts, USA, pp. 207. Simonis, I., Dibner, P. (Eds.), 2007. OpenGISs Sensor Planning Service implementation specification (version 1.0). Open Geospatial Consortium (OGC) document number: 07-014r3, Wayland, Massachusetts, USA, pp. 186. Simonis, I., Wytzisk, A. (Eds.), 2003. Web Notification Service (version 0.1.0). Open Geospatial Consortium (OGC) document number: 03-008r2, Wayland, Massachusetts, USA, pp. 46. Schut, P. (Ed), 2007. OpenGISs Web Processing Service, OpenGISs Standard (version 1.0.0). Open Geospatial Consortium (OGC) document number: 05-007r7, Wayland, Massachusetts, USA, pp. 73. Stollberg, B., Zipf, A. 2007. OGC web processing service interface for web service orchestration—Aggregating geo-processing services in a bomb threat scenario. In: Proceedings of Web and Wireless Geographical Information Systems, (W2GIS 2007), pp. 239–251.
N. Chen et al. / Computers & Geosciences 47 (2012) 119–129
Van Zyl, T., Vahed, A., 2009. Using SensorML to describe scientific workflows in distributed Web Service environments. In: Proceedings of IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2009), Cape Town, South Africa, pp. 3800–3802. Voges, U., Senkler, K. (Eds.), 2005. OpenGISs Catalogue Services specification 2.0—ISO19115/ISO19119 application profile for CSW 2.0 (version 0.9.3). Open Geospatial Consortium (OGC) document number: 04-038r2, Wayland, Massachusetts, USA, pp. 89. Vretanos, P.A. (Ed.), 2002. OGCTM Web Feature Service implementation specification (version 1.0.0). Open Geospatial Consortium (OGC) document number: 02-058, Wayland, Massachusetts, USA, pp. 105.
129
Whiteside, A., Evans, J. (Eds.), 2008. OGCTM Web Coverage Service implementation standard. Open Geospatial Consortium (OGC) document number: 07-067r5, Wayland, Massachusetts, USA, pp. 133. Yildiz, U., A. Guabtni, A., Ngu, 2009. Business versus Scientific Workflows: A Comparative Study, In: Proceedings of IEEE Congress on Services—1 (Services’ 2009), Washington, DC, USA, pp. 340–343.