Annals of GIS
ISSN: 1947-5683 (Print) 1947-5691 (Online) Journal homepage: http://www.tandfonline.com/loi/tagi20
GeoSquare: collaborative geoprocessing models’ building, execution and sharing on Azure Cloud
Huayi Wu, Lan You, Zhipeng Gui, Kai Hu & Ping Shen
To cite this article: Huayi Wu, Lan You, Zhipeng Gui, Kai Hu & Ping Shen (2015) GeoSquare: collaborative geoprocessing models’ building, execution and sharing on Azure Cloud, Annals of GIS, 21:4, 287-300, DOI: 10.1080/19475683.2015.1098727
To link to this article: http://dx.doi.org/10.1080/19475683.2015.1098727
Published online: 07 Nov 2015.
Annals of GIS, 2015 Vol. 21, No. 4, 287–300, http://dx.doi.org/10.1080/19475683.2015.1098727
GeoSquare: collaborative geoprocessing models’ building, execution and sharing on Azure Cloud
Huayi Wu (a, b), Lan You (c, *), Zhipeng Gui (b, d), Kai Hu (a, b) and Ping Shen (a, b)
(a) LIESMARS, Wuhan University, Wuhan, Hubei Province, 430071, China; (b) Faculty of Computer Science and Information Engineering, Hubei University, Wuhan, Hubei Province, 430062, China; (c) School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, Hubei Province, 430071, China; (d) Collaborative Innovation Center of Geospatial Technology, 129 Luoyu Road, Wuhan, Hubei Province, 430079, China
Downloaded by [Chinese University of Hong Kong] at 18:13 11 November 2015
(Received 5 August 2015; accepted 11 September 2015)
Collaborative geoprocessing models have become one of the major solutions for significantly enhancing the capacity to derive knowledge over a network, and they are critical for supporting comprehensive analyses in a virtual geographic environment (VGE). With the emergence and growing maturity of cloud computing infrastructure, a cloud-based platform for collaborative geoprocessing models promises to provide a pattern for the next generation of geoprocessing collaboration in the GIS realm. However, existing collaborative geoprocessing models still face several problems: heterogeneity in description specifications hinders different geoprocessing services from working collaboratively; heterogeneity in message mechanisms makes cooperation among geoprocessing services difficult; and an integrated geoprocessing model framework centring on the collaborative model’s lifecycle is absent. To address these problems, this article proposes GeoSquare, a cloud-based framework for building, executing and sharing collaborative models: (1) a lifecycle model was designed for convenient and flexible collaborative geoprocessing; (2) a collaboration mechanism was implemented to solve specification heterogeneity; (3) a collaboration method and its proxy were used to resolve the heterogeneity in message communication; and (4) some elastic cloud features were utilized in the framework to achieve better scalability. A GeoSquare prototype was implemented on the Microsoft Azure Cloud to demonstrate its applicability and availability. Results show that users can build, execute, publish and share collaborative geoprocessing models with high efficiency in GeoSquare. GeoSquare provides a novel collaborative geoprocessing pattern enabling further geographic research in a cloud infrastructure.
Keywords: Collaborative geoprocessing; services composition; model; GeoSquare
*Corresponding author. Email: [email protected]
© 2015 Taylor & Francis
1. Introduction
Since geodata has become widely available through spatial data infrastructures (SDI) (Groot and McLaughlin 2000), web geoprocessing based on service collaboration has been receiving increased attention for geoscientific knowledge discovery (Kiehle, Greve, and Heier 2007). By using collaborative workflow technologies, a series of geocomputation and analysis services can be integrated into a geoprocessing model for dealing with more complex tasks (Brauner et al. 2009). Collaborative geoprocessing models have become one of the major solutions for significantly enhancing the capacity to derive geo-information and knowledge over a network (Zhao, Di, and Yu 2012). Collaboration models and the design of related tools are critical for the support of comprehensive analyses in a virtual geographic environment (VGE) (Lin et al. 2013). Improvement of the customizable workflow greatly enhances the collaborative ability in a VGE (Lin, Chen, and Lu 2013). At the same time, the emergence of cloud computing is promoting a transformation from traditional desktop geoprocessing to distributed collaborative geoprocessing (Bian, Xincai, and Jian 2010). Therefore, a cloud-based platform for building, executing and sharing collaborative geoprocessing models promises to provide a pattern for the next generation of geoprocessing collaboration in the GIS realm.
However, collaborative geoprocessing models still face some challenges: heterogeneity in description specifications hinders different geoprocessing services from working collaboratively in geoprocessing models; heterogeneity in message mechanisms makes cooperation among geoprocessing services difficult; and a cloud-based framework centring on the collaborative geoprocessing model’s lifecycle is absent.
Some research on geospatial services frameworks has been conducted. You et al. (2012) proposed a geospatial services composition framework supporting real-time
monitoring. Sun, Yue, and Di (2012) introduced a task-oriented web geoprocessing system that leveraged web service and workflow technologies to design and execute tasks, and to monitor and visualize their execution. SAW-GEO is a prototype framework combining a visual modelling solution based on industry specifications with semantic reasoning (Hobona, Fairbairn, and James 2007). Maosheng and Tianhe (2012) suggested a semantic geospatial web service sharing framework for finding, evaluating, accessing and using services. Li et al. (2011), You et al. (2012), Wu et al. (2011) and Yang et al. (2011) focused on optimizing an integrated framework for supporting geospatial services. Existing research, however, has paid more attention to constructing geospatial service compositions than to collaborative geoprocessing.
To address these problems, this article proposes GeoSquare, a cloud-based framework for building, executing and sharing collaborative geoprocessing models, centred on the collaborative geoprocessing model lifecycle. It has the following features:
(1) A lifecycle of the collaborative geoprocessing model, divided into four stages, was designed in the proposed framework. Centred on this lifecycle, the framework integrates several functions and modules for convenient and flexible collaborative geoprocessing.
(2) To enable geoprocessing services to work together harmoniously, a mechanism was implemented to solve the specification heterogeneity problem.
(3) Considering that message communication heterogeneity is universal in collaborative geoprocessing models, a message proxy and collaboration mechanism was designed for GeoSquare.
(4) To deliver better scalability, some elastic features found in the Azure Cloud were utilized in the proposed framework.
The remainder of this article is organized as follows. Section 2 discusses related work. Section 3 introduces the architecture of GeoSquare. Section 4 describes the key technologies in the GeoSquare framework that enable GIS resources to work collaboratively. Section 5 introduces in detail a prototype of GeoSquare implemented on the Azure Cloud. Finally, Section 6 presents conclusions and plans for future work.
2. Collaborative geoprocessing
Geoprocessing services are web services that encapsulate geocomputation algorithms and can be shared over the internet. As a means to promote the on-demand, instant transformation of geodata into knowledge in the web environment, geoprocessing services have attracted attention from both industry and academia. The World Wide Web Consortium (W3C) defines a Web service generally as a software system designed to support interoperable machine-to-machine interaction over a network (Wikipedia, 2015). A variety of specifications are associated with web services, in varying degrees of maturity, and are maintained or supported by various standards bodies and entities. The basic web services framework was established by first-generation standards such as WSDL (Web Services Description Language), SOAP (Simple Object Access Protocol), UDDI (Universal Description, Discovery and Integration) and XML (Extensible Markup Language). Since these specifications do not include metadata and spatial information standards, the special problems inherent to GIS cannot be easily resolved with them. The Open Geospatial Consortium (OGC) (2007) published the Web Processing Service (WPS) specification 1.0.0, a tangible sign that geoprocessing services are becoming an integral part of standardized GIServices. A number of geoprocessing service resources were thereafter published online to enable collaborative geoprocessing over the network, such as the GeoBrain Processing Web Services (Li et al. 2010), the Algorithm Development and Mining System (ADaM) (http://projects.itsc.uah.edu/datamining/adam/), OpenRS-Cloud (Guo et al. 2010), and so on.
To build large-scale computations such as complex geospatial simulation models, scattered geoprocessing services distributed on the web are integrated into a geoprocessing services workflow (Brauner et al. 2009). Geoprocessing service collaboration can be realized as a collaborative geoprocessing chain model by using workflow technologies (Peltz 2003). Combining scattered individual geoprocessing services into a geoprocessing model enables the design and execution of complex processing models across domains and applications. The collaborative geoprocessing model is important for geoscience research and applications, since their complexity (in both geodata and computation) often requires the functionality of a series of processes. A collaborative geoprocessing model provides a flexible way of implementing cross-application, multi-resource and multi-step complex geoprocessing.
Several research projects, vendors and open source projects have carried out work in the context of collaborative geoprocessing. Alameh (2003) proposed a web services workflow model, allowing users to easily combine web services to create customized geospatial information applications. Deng et al. (2004) implemented a prototype system to build a geoprocessing workflow for image processing. Di et al. (2006) and Granell, Gould, and Francisco (2005) proposed an abstract process and devised elementary workflow patterns as a foundation for reusing existing models and services in
geoprocessing applications. Gui, Wu, and Wang (2008) developed an abstract geospatial service chain model based on a data dependency relationship directed graph and block structure. Friis-Christensen et al. (2009) proposed the concept of DGIP (Distributed Geographic Information Processing) and explored different architectural patterns for collaborative geoprocessing services. The 52° North WPS workflow modeller (http://52north.org/wps) allows the modelling of collaborative geoprocessing services by chaining OGC services. Wu et al. (2014) presented a FAST pattern for collaborative geoprocessing at the message level. Gong et al. (2012) proposed a concept model of a geospatial services web towards an integrated cyberinfrastructure for GIScience. The work discussed above, however, focused on the modelling mechanisms and service composition in a geoprocessing services workflow, without considering how to make GIS resources work together harmoniously.
With the birth and gradual maturing of cloud computing infrastructure, geocomputation is shifting from traditional desktop geoprocessing to distributed collaborative geoprocessing. This article proposes a cloud-based platform for building, executing and sharing collaborative geoprocessing models, which enables geoprocessing services with different specifications and message mechanisms to cooperate with each other. It provides a cloud-based framework prototype for future collaborative geoprocessing initiatives.
Figure 1. Architecture of GeoSquare.
3. Architecture
To make GIS resources (including geodata, geoprocessing services and models) work collaboratively with high efficiency, this article proposes GeoSquare, a cloud-based framework for building, executing and sharing collaborative geoprocessing models. The basic architecture of GeoSquare is web service oriented. GeoSquare demonstrates a feasible way to achieve a geoprocessing computing paradigm for the future. As Figure 1 shows, the GeoSquare framework consists of three tiers: the application tier, the computation tier and the resource tier. For the collaborative geoprocessing models, seven core components were designed in the application tier: the georesource registry centre, geoprocessing model builder, geoprocessing model executer, geoprocessing model monitor, geoprocessing model visualizer, geoprocessing model publisher and user management. The computation tier is in charge of making georesources work collaboratively, smoothly and efficiently at the
backend. The basis of the whole framework is the resource tier, which involves the management and organization of the geodata, geoprocessing services and geoprocessing models. The whole framework is built on the Microsoft Azure Cloud.
3.1. GIS resource registry centre
The GIS resource registry centre is used to register and query the metadata of different kinds of geodata, geoprocessing services and models for further processing and utilization. The GIS resources are divided into three groups (data, services and models). The geodata group includes vector data and image data that can serve as input data for a geoprocessing model. The services group refers to the algorithms for dealing with geodata, usually encapsulated in one of two forms: W3C web services or OGC WPS services. A series of geoprocessing services can be combined into a geoprocessing model. The geoprocessing models are categorized into the model group in the registry centre.
3.2. Geoprocessing model builder
A geoprocessing model is a composition of geoprocessing services, which includes a series of steps for one complex geocomputation task. This module provides users with a graphical user interface (GUI) to build new geoprocessing models or edit old ones. Users specify the properties of a model and design its structure, which can be sequential, parallel or conditional. The model element repository provides the components that are available for a new geoprocessing model. By dragging and dropping model elements in the model editor, a geoprocessing model can be built visually. When a geoprocessing model is finished, it can be stored in the model repository for reuse at another time. Users can either select an old model from the model repository or create a new one. This method is flexible in that users can easily build a geoprocessing model using atomic geoprocessing services.
3.3. Geoprocessing model executer
This module is used to remotely invoke a geoprocessing model previously deployed in the collaborative engine. Once a geoprocessing model is invoked, a model instance is activated and executed in the sequence defined in WS-BPEL. The collaborative engine in the computation tier is an execution engine for WS-BPEL workflows.
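The three model structures named in Section 3.2 (sequential, parallel and conditional) can be illustrated with a small sketch. This is not GeoSquare's model format, which the article describes as WS-BPEL; the class names `Service`, `Sequence`, `Parallel` and `Conditional` and the `execute` function are hypothetical, and remote services are stubbed as local Python functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Service:
    """An atomic geoprocessing service, stubbed here as a local function."""
    name: str
    run: Callable[[object], object]


@dataclass
class Sequence:
    """Steps run one after another; each step receives the previous output."""
    steps: List["Node"] = field(default_factory=list)


@dataclass
class Parallel:
    """Every branch receives the same input; outputs are collected in a list."""
    branches: List["Node"] = field(default_factory=list)


@dataclass
class Conditional:
    """Chooses one of two branches based on a predicate over the input."""
    predicate: Callable[[object], bool]
    if_true: "Node"
    if_false: "Node"


Node = Union[Service, Sequence, Parallel, Conditional]


def execute(node: Node, data: object) -> object:
    """Walk the model tree and return the final result."""
    if isinstance(node, Service):
        return node.run(data)
    if isinstance(node, Sequence):
        for step in node.steps:
            data = execute(step, data)
        return data
    if isinstance(node, Parallel):
        # Branches are evaluated one by one here; a real engine could run
        # them concurrently.
        return [execute(branch, data) for branch in node.branches]
    if isinstance(node, Conditional):
        chosen = node.if_true if node.predicate(data) else node.if_false
        return execute(chosen, data)
    raise TypeError(f"unknown model element: {node!r}")
```

A model built this way is a tree of elements, which loosely mirrors how a visual editor might nest atomic services inside control structures.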
3.4. Geoprocessing model monitor
This module captures the runtime information of the geoprocessing model from the status collector in the backend and shows it to the users. The status collector in the computation tier collects the status information of the model and updates it continually. The monitor acquires the updated status by communicating with the status collector periodically. This approach allows users to view the dynamic process information in real time.
3.5. Geoprocessing result visualizer
Once the geoprocessing model has finished running, the results are generated in the backend. So that users can conveniently browse the results, a geoprocessing result visualizer was designed in the application tier. This module is in charge of visualizing processing results according to their types. It integrates Google Earth for previewing image data by its location information. For statistical results of numeric operations, this module provides statistical charts. For other result types, the general method is to download the file and view it locally using third-party tools.
3.6. Geoprocessing model publisher
When a geoprocessing model is created in the model builder, it must be published on a collaborative engine before users can invoke it. This module is used to publish an existing model on the execution engine, while the model metadata is registered in the resource registry centre.
3.7. User profiles
User profiles manage each user’s private information together with the hosting, sharing and usage of GIS resources and the related privileges. So that users distributed across the internet can share their collaborative geoprocessing models within different scopes, user profiles define three levels of resource sharing scope: public, group and private. Users can deliver geodata or models to the public if they want. GIS resources can also be shared within a specific scope when the user defines a group. The private scope enables users to host data, algorithms and models on the platform while only the owner can use these resources. This method is flexible, as users can share their resources within different scopes as required.
4. Methodologies
This section describes in detail the key technologies in the GeoSquare platform for building, executing and sharing geoprocessing models.
4.1. Lifecycle of collaborative geoprocessing models
From the user perspective, a general geoprocessing model lifecycle can be divided into the following four stages:
(1) Design stage: search GIS resources, build collaborative geoprocessing models and deploy the model to the server engine.
(2) Execution/monitor stage: run the deployed model on the collaborative engine while monitoring the process status.
(3) Optimization stage: view the processing results and adjust the model if necessary.
(4) Publish stage: register the geoprocessing model in the registry centre, put the model into the repository and publish the model for other users.
Figure 2 shows the major operations and relevant modules at each stage in the lifecycle of collaborative geoprocessing models. The lower half of Figure 2 shows the supporting function modules in the application tier; the geoprocessing model can be executed and adjusted repeatedly once it is built. When the optimization of the geoprocessing model is finished, it can be put into the model repository and published to other users for reuse. Users with an interest in specific geoprocessing models can search by keyword in the registry centre and run a model to process their own geodata. A geoprocessing model can be reused without any modification or re-edited iteratively as needed at any stage.
In this lifecycle model, the execution/monitor stage is the key to collaborative geoprocessing. Heterogeneity in implementation specifications and in message transmission must be considered and resolved when different geoprocessing services collaborate. In the execution/monitor stage, the message mediator and asynchronous agent in the computation tier are used to enable the geoprocessing services to work together seamlessly. The related implementation details are introduced in the following sections.
Figure 2. Lifecycle of collaborative geoprocessing models.
In this proposed lifecycle, the collaborative geoprocessing model can be built, used and shared smoothly. The GeoSquare platform is conceptualized and implemented around this lifecycle model and integrates supporting tools and functions for collaborative geoprocessing.
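The four-stage lifecycle can be read as a small state machine. The sketch below is one possible interpretation, not the platform's implementation: the `Stage` and `ModelLifecycle` names are hypothetical, and the transition table is an assumption drawn from the statements that a model can be executed and adjusted repeatedly and re-edited after publishing.

```python
from enum import Enum


class Stage(Enum):
    DESIGN = "design"
    EXECUTION_MONITOR = "execution/monitor"
    OPTIMIZATION = "optimization"
    PUBLISH = "publish"


# Allowed moves between stages. This table is an assumption based on the
# text: models may be re-run or re-edited after optimization or publishing.
TRANSITIONS = {
    Stage.DESIGN: {Stage.EXECUTION_MONITOR},
    Stage.EXECUTION_MONITOR: {Stage.OPTIMIZATION},
    Stage.OPTIMIZATION: {Stage.DESIGN, Stage.EXECUTION_MONITOR, Stage.PUBLISH},
    Stage.PUBLISH: {Stage.DESIGN, Stage.EXECUTION_MONITOR},
}


class ModelLifecycle:
    """Tracks which lifecycle stage a geoprocessing model is in."""

    def __init__(self):
        self.stage = Stage.DESIGN
        self.history = [Stage.DESIGN]

    def advance(self, target):
        """Move to `target` if the transition table allows it."""
        if target not in TRANSITIONS[self.stage]:
            raise ValueError(
                f"cannot move from {self.stage.value} to {target.value}"
            )
        self.stage = target
        self.history.append(target)
```

Keeping the allowed transitions in one table makes the iterative loop between execution and optimization explicit and easy to adjust.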
4.2. Collaboration mechanism for heterogeneity in specification
Once a geoprocessing model is activated, the collaboration engine invokes geoprocessing services in the sequence defined in the model file. Generally, there are two main kinds of geoprocessing services: common web services and WPS services. Common geoprocessing web services use the WSDL specification to describe their functions and operation details, while WPS services publish their capabilities using capabilities documents. Because the two description specifications differ, enabling the two sorts of geoprocessing services to work collaboratively is the primary problem to be solved in a geoprocessing model. This article proposes a collaboration mechanism that treats WPS services as common geoprocessing services in the geoprocessing model.
The WPS capabilities descriptions and WSDL descriptions focus on different aspects. A WPS capabilities document lists the names of operations but does not include the parameters needed for invocation. To invoke a WPS service, the DescribeProcess operation must be executed first to obtain the detailed invocation parameters. WPS capabilities documents focus on metadata description, including layers, coordinate systems, geographical range, supported formats and so on, but lack grammar information. The WSDL description document is just the opposite: it specifies the detailed input/output data as the grammar. For a collaborative geoprocessing model, the WSDL specification is more convenient when constructing an invocation message. The proposed collaboration mechanism first transforms the capabilities description of a WPS service into a WSDL description, and then invokes the WPS service as a common geoprocessing service.
Figure 3 shows an example of a transformation from a capabilities description to a WSDL description. The left-hand side shows the original WPS capabilities fragment and the right-hand side shows the corresponding WSDL fragment after transformation. The original I/O parameters, including the title and data structure, have been mapped into the new WSDL. In this transformation, a simple datatype remains simple in the new WSDL, while a complex datatype becomes a string datatype. An additional parameter, WPSURL, is added to the WSDL document for each operation in the WPS capabilities document and refers to the actual service address. The newly created WSDL description contains enough detail to invoke a WPS service. Thus, the collaboration engine can treat a WPS service as a normal geoprocessing service when invoking it in a geoprocessing model.
Figure 3. A transformation example between WPS capabilities and WSDL description.
Figure 4 shows the collaboration mechanism between the WPS and common geoprocessing services. The collaboration engine can invoke a common geoprocessing service as described by a WSDL document, but cannot generate a WPS request directly. To solve this problem, a WPS mediator is designed to help the engine construct a WPS request. As shown in Figure 4, the WPS mediator consists of four components: WPS request parsing, message transformation, WPS response parsing and message forwarding. The WPS request parsing component captures the request from the engine server. The message transformation component converts the WSDL request format into the WPS request format, and the WPS response format into the WSDL response format. The message forwarding component is responsible for sending the transformed messages (WPS requests or WSDL responses) to the receivers (WPS servers or collaboration engines). All WPS service requests from the collaboration engine are first forwarded to the WPS mediator, which then handles them uniformly. This approach enables common geoprocessing services and WPS services to work together seamlessly in a geoprocessing services workflow.
Figure 4. Collaboration mechanism between the WPS and common geoprocessing services.
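The message transformation step of the WPS mediator can be sketched as a function that turns a flat, WSDL-style request into a WPS 1.0.0 Execute request in key-value-pair (KVP) encoding. This is a minimal illustration under stated assumptions, not the article's Java Servlet implementation: the function name `build_wps_execute_url`, the `operation` key and the example endpoint are hypothetical, and only the `WPSURL` parameter comes from the text.

```python
from urllib.parse import urlencode


def build_wps_execute_url(wsdl_request: dict) -> str:
    """Turn a flat WSDL-style request into a WPS 1.0.0 KVP Execute URL."""
    params = dict(wsdl_request)           # copy so the caller's dict is untouched
    wps_url = params.pop("WPSURL")        # extra parameter added during the WSDL transformation
    identifier = params.pop("operation")  # hypothetical key naming the WPS process
    # WPS 1.0.0 KVP encoding lists inputs as name=value pairs separated by ';'.
    data_inputs = ";".join(f"{name}={value}" for name, value in params.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": identifier,
        "datainputs": data_inputs,
    })
    return f"{wps_url}?{query}"
```

For example, a request such as `{"WPSURL": "http://example.org/wps", "operation": "MedianFilter", "image": "scene.tif"}` would be rewritten into a single Execute URL whose `datainputs` value is percent-encoded. A fuller mediator would also translate the WPS response back into the WSDL response format.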
4.3. Collaboration mechanism for heterogeneity in communication
There are two general communication modes: the synchronous message mechanism and the asynchronous message mechanism. The asynchronous mechanism is more suitable for large-scale geodata computation, while the synchronous mechanism works well with smaller tasks. Usually, a collaborative geoprocessing model includes both synchronous and asynchronous services. To illustrate the collaboration mechanism, a simple image processing model was built that includes a Median Filter geoprocessing service and an SVM geoprocessing service. Figure 5a shows the geoprocessing workflow logic. Figure 5b shows how the synchronous and asynchronous services work collaboratively in this case. In Figure 5b, the geoprocessing model is described in the WS-BPEL specification. The Median Filter geoprocessing service is asynchronous, while the SVM geoprocessing service is synchronous. A pair of asynchronous messages (invoke and receive actions) based on WS-Addressing, defined in the geoprocessing model, is used to request the Median Filter service. A single invoke action defined in the geoprocessing model is used to request the SVM service. All messages from the collaboration engine are sent to the message proxy, which then forwards the requests to the actual services. The message proxy is in charge of identifying and forwarding messages. By distributing different request messages to the geoprocessing services with different communication mechanisms, the message proxy enables the geoprocessing services to work together.
Figure 5. Collaboration among different communication mechanisms.
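The routing role of the message proxy can be sketched in a few lines. This is an in-process simplification, not the WS-BPEL and WS-Addressing exchange used in the article: the `MessageProxy` class and its methods are hypothetical, services are stubbed as local functions, and a plain callback stands in for the reply address of an asynchronous request.

```python
class MessageProxy:
    """Routes requests to synchronous or asynchronous services by name."""

    def __init__(self):
        self._sync = {}      # service name -> handler returning a result directly
        self._async = {}     # service name -> handler whose result is delivered later
        self._pending = []   # queued (handler, payload, reply_to) tuples

    def register_sync(self, name, handler):
        self._sync[name] = handler

    def register_async(self, name, handler):
        self._async[name] = handler

    def send(self, service, payload, reply_to=None):
        """Forward a request; synchronous calls return the result at once."""
        if service in self._sync:
            return self._sync[service](payload)
        if service in self._async:
            if reply_to is None:
                raise ValueError("asynchronous requests need a reply_to callback")
            # The request is only queued here; the answer arrives later
            # through reply_to (the 'receive' half of the message pair).
            self._pending.append((self._async[service], payload, reply_to))
            return None
        raise KeyError(f"unknown service: {service}")

    def deliver_pending(self):
        """Run queued asynchronous jobs and call back with each result."""
        while self._pending:
            handler, payload, reply_to = self._pending.pop(0)
            reply_to(handler(payload))
```

In the article's example, the Median Filter service would be registered as asynchronous and the SVM service as synchronous, so the caller addresses both through the same `send` entry point.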
4.4. Scalable framework mechanism on Azure Cloud
Azure Cloud is a cloud computing platform and infrastructure created by Microsoft for building, deploying and managing applications and services through a global network of Microsoft-managed and Microsoft partner hosted data centres (Microsoft 2015). It provides large amounts of cloud storage and computing resources for hosting applications and services. To implement a scalable geoprocessing mechanism, the GeoSquare platform was built on the Azure Cloud infrastructure and uses some of its features. Figure 6 shows the design of the scalable geoprocessing mechanism in GeoSquare.
Azure load balancer (ALB) is a load-balancing mechanism that acts as a proxy and distributes network or application traffic across a number of servers, increasing the capacity (concurrent users) and reliability of applications on Azure Cloud. Considering the computation-intensive nature of most geoprocessing models, ALB was used in three places in the GeoSquare architecture. The collaboration engine pool hosts several identically configured engine servers for executing geoprocessing models. All the geoprocessing models are put in the model repository and are shared among the engine servers. When a user executes a geoprocessing model to process geodata, the invocation request is delivered to the ALB, which then distributes the invocation to an idle engine server. Since the engine pool on a cloud can host engine servers as needed, the engine pool can scale up as the number of concurrent users increases. The same mechanism is applied to the asynchronous proxies and WPS mediators to meet elastic geoprocessing needs when executing collaborative geoprocessing models. As shown in Figure 6, the ALB was placed at the points in the framework where performance bottlenecks are likely to occur.
The model repository and user workspaces were created on Azure blob storage, which provides elastic storage and acts as a file system with unlimited capacity. The queue feature in Azure Cloud is a message mechanism that stores messages as a sequence. The preprocessing tasks queue is a task queue based on this feature; it hosts a sequence of processing messages waiting to be executed. These elastic mechanisms enable GeoSquare to achieve robust extensibility and scalability.
Figure 6. Scalable collaborative geoprocessing on Azure Cloud.
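The dispatch behaviour described above (send each invocation to an idle engine server, grow the pool when all are busy, and queue work that cannot yet be placed) can be sketched as a toy simulation. It does not use any Azure API; the `EnginePool` and `Dispatcher` classes and the `max_size` limit are hypothetical and only illustrate the idea.

```python
from collections import deque


class EnginePool:
    """A pool of engine servers that can grow up to max_size."""

    def __init__(self, size, max_size):
        self.busy = [False] * size
        self.max_size = max_size

    def acquire(self):
        """Return the index of an idle engine, adding one if allowed, else None."""
        for index, is_busy in enumerate(self.busy):
            if not is_busy:
                self.busy[index] = True
                return index
        if len(self.busy) < self.max_size:
            self.busy.append(True)  # scale: start one more engine server
            return len(self.busy) - 1
        return None

    def release(self, index):
        self.busy[index] = False


class Dispatcher:
    """Plays the load balancer role: place requests or queue them."""

    def __init__(self, pool):
        self.pool = pool
        self.waiting = deque()

    def submit(self, request):
        """Return the engine index handling the request, or None if queued."""
        engine = self.pool.acquire()
        if engine is None:
            self.waiting.append(request)
        return engine

    def finish(self, engine):
        """Free an engine, then hand it the oldest waiting request, if any."""
        self.pool.release(engine)
        if self.waiting:
            request = self.waiting.popleft()
            return request, self.pool.acquire()
        return None
```

The first-in, first-out queue here is only loosely analogous to the preprocessing tasks queue in the text; a cloud deployment would rely on the platform's own load balancer and queue services.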
5. Implementation
To verify the efficiency of the GeoSquare platform, a prototype for building, using and sharing collaborative geoprocessing models was developed on the Microsoft Azure Cloud. The application tier was developed on Eclipse 6.0 IDE. The GIS resources registry centre was built with the Ext Google Web Toolkit (GXT) (Google 2012). The geoprocessing model builder was developed using Java RCP techniques. The geoprocessing model visualizer was developed on the Google Earth API. The ActiveBpel engine (Endpoints 2004) was used as the collaborative engine for the geoprocessing models. Java Servlets were used to implement the WPS mediator, status collector and asynchronous agent. All the GIS resources, including geodata, geoprocessing services and models, were stored in Azure blob storage. The metadata of the GIS resources was organized in SQL Azure, which is a SQL Server database on Azure Cloud.
Figure 7 shows the main GUI of GeoSquare. The left panel is the GIS resources registry centre, which presents the GIS resources as a tree catalogue. The right panel is the metadata viewer for the GIS resources in the registry centre, which presents details and a snapshot. When building collaborative geoprocessing models, users can easily search the GIS resources by location or keywords. At the top-right corner, there is an entrance for user logins. Figure 8 shows the changes after a user logs in. Once logged in, users can share resources within three ranks: Public, Group and Private.
Figure 7. GUI of GeoSquare.
Figure 9 shows the geoprocessing model builder, in which users can build a geoprocessing model composed of several geoprocessing services and geodata. In Figure 9, the main part is the model editor and the right panel presents the model resources stored in the geoprocessing model repository. Once a geoprocessing model has been built, it can be executed on the collaboration engine. Figure 10 shows the GUI of the geoprocessing model executer. The green bar of each geoprocessing service indicates its running status. It can be seen that all the geoprocessing services in the spatial filter model have finished their tasks. The model’s execution results can be preprocessed for visualization on a virtual earth. Figure 11 visualizes the result image on Google Earth using latitude and longitude information.
After building and executing a collaborative geoprocessing model, the owner can publish it to the registry centre and share it with other collaborators. Figure 12 shows the metadata of the geoprocessing model, which includes the name, provider, WSDL URL and other details. These metadata provide enough detail for other users to re-edit and reuse the geoprocessing model. As Figure 13 shows, a group can be created if the user wants to share models within a limited range. In Figure 14, the spatial filtering model in the user’s favourite folder can be shared within the spatial model group or with all public users. In this way, a collaborative geoprocessing model can be shared within a freely chosen scope, which improves model reuse efficiency.
6. Conclusions and future work
Aiming to make GIS resources (including geodata, geoprocessing services and models) work collaboratively at high efficiency, this article proposed GeoSquare, a cloud-based framework for building, executing and sharing collaborative geoprocessing models. The GeoSquare platform has the following features that help to resolve problems associated with collaborative geoprocessing.
Figure 8.
Resource sharing ranks after users’ login.
Figure 9.
Geoprocessing model builder.
Figure 10.
Geoprocessing model executer.
Figure 11.
Geoprocessing results visualizer.
(1) A lifecycle for the collaborative geoprocessing model was proposed that adds flexibility and convenience to the model framework. In this lifecycle, a collaborative geoprocessing model can be built, used and shared smoothly. The GeoSquare platform integrates the supporting tools and functions that a collaborative geoprocessing model needs throughout its lifecycle.
(2) To solve the problem of specification heterogeneity among the geoprocessing services, this
article proposed a collaboration mechanism that allows services described by different specifications to work together. By transforming the WPS capabilities into WSDL, WPS services can be invoked as common geoprocessing services by a standard collaboration engine. A WPS mediator was designed for the message exchange between the collaboration engine and the WPS services. In this way, the two kinds of geoprocessing services can work together in one geoprocessing model.
Figure 12.
Publishing the model on GeoSquare.
Figure 13.
Creating a group for sharing models.
(3) A collaboration method was designed in GeoSquare to resolve the heterogeneity in message communication that is common to collaborative geoprocessing models. A message proxy was added so that all requests coming from an engine server were handled uniformly and distributed to the corresponding services, whether asynchronous or not. The message proxy enabled geoprocessing services with different communication mechanisms to work with each other, which improves the compatibility of the framework.
(4) To achieve improved scalability, some features of the Azure Cloud were utilized in GeoSquare. The ALB mechanism was used to balance the
Figure 14.
Sharing the geoprocessing models within a group.
processing loads in collaboration engines, message proxies and WPS mediators to satisfy increased processing demands. The geoprocessing models were stored centrally in Azure blob storage, a cloud storage service whose capacity scales on demand. By using these elastic cloud features, the proposed platform gains the scalability needed to accommodate future growth.
The prototype of GeoSquare on the Azure Cloud demonstrates that users can build, execute, publish and share collaborative geoprocessing models with high efficiency. The integrated modelling environment provides tools to create a new model or edit an existing one. The collaboration mechanisms make geoprocessing services with different description specifications or communication messages work collaboratively in a geoprocessing model. Furthermore, auxiliary modules such as the model monitor, results visualizer and user profiles are integrated in GeoSquare, helping users to handle geoprocessing tasks smoothly. GeoSquare provides a novel collaborative geoprocessing pattern for further geographic research on the cloud infrastructure. Future research on this framework will emphasize: (1) continually improving the flexibility and availability of the framework and (2) further clarifying the classification of collaborative geoprocessing models.
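The role of the message proxy summarized in point (3) can be illustrated with a small Python sketch. It shows one plausible reading of "handled uniformly": the engine always makes the same call, and the proxy decides whether to invoke the service directly or to submit a job and poll for its status. The article does not give the proxy's implementation, so the class names (`SyncService`, `AsyncService`, `MessageProxy`), the polling strategy and the in-memory job table are all assumptions made for illustration.

```python
import time


class SyncService:
    """Hypothetical service that returns its result in the same call."""

    def __init__(self, func):
        self.func = func

    def invoke(self, payload):
        return self.func(payload)


class AsyncService:
    """Hypothetical service that accepts a job and is polled for status."""

    def __init__(self, func, polls_needed=2):
        self.func = func
        self.polls_needed = polls_needed  # simulated processing time
        self.jobs = {}

    def submit(self, payload):
        job_id = len(self.jobs) + 1
        self.jobs[job_id] = {"payload": payload, "polls": 0}
        return job_id

    def status(self, job_id):
        job = self.jobs[job_id]
        job["polls"] += 1
        if job["polls"] < self.polls_needed:
            return "running", None
        return "done", self.func(job["payload"])


class MessageProxy:
    """Gives the engine one blocking call for both kinds of service."""

    def __init__(self, poll_interval=0.0, max_polls=100):
        self.services = {}
        self.poll_interval = poll_interval
        self.max_polls = max_polls

    def register(self, name, service):
        self.services[name] = service

    def handle(self, name, payload):
        service = self.services[name]
        if isinstance(service, AsyncService):
            # Asynchronous path: submit, then poll until the job is done.
            job_id = service.submit(payload)
            for _ in range(self.max_polls):
                state, result = service.status(job_id)
                if state == "done":
                    return result
                time.sleep(self.poll_interval)
            raise TimeoutError(f"service {name!r} did not finish in time")
        # Synchronous path: the result comes back immediately.
        return service.invoke(payload)


proxy = MessageProxy()
proxy.register("buffer", SyncService(lambda value: value * 2))
proxy.register("filter", AsyncService(lambda value: value + 1))
```

In this sketch the caller uses `proxy.handle(name, payload)` for every service and never needs to know which communication mechanism the service uses. A production proxy would more likely rely on callbacks or a separate status collector than on blocking polls.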
Disclosure statement
No potential conflict of interest was reported by the authors.
Funding
This work was supported by the National Natural Science Foundation of China under Grants [41401464], [41371372]; and MSRA [PO# 96161227].
References
Alameh, N. 2003. “Chaining Geographic Information Web Services.” Internet Computing, IEEE 7: 22–29.
Bian, W., W. Xincai, and H. Jian. 2010. “Geospatial Data Services within Cloud Computing Environment.” International Conference on Audio Language and Image Processing (ICALIP), Shanghai, November 23–25, 1577–1584. (IEEE Catalog Number: CFP1050D-ART and ISBN: 978-1-4244-5858-5).
Brauner, J., T. Foerster, B. Schaeffer, and B. Baranski. 2009. “Towards a Research Agenda for Geoprocessing Services.” 12th AGILE International Conference on Geographic Information Science 1: 1–12.
Deng, M., P. Zhao, Y. Liu, A. Chen, and L. Di. 2004. “The Development of a Prototype Geospatial Web Service System for Remote Sensing Data.” The International Archives of Photogrammetry, Remote Sensing, and Spatial Information Sciences 30 (Part 2): 1–5.
Di, L., P. Zhao, W. Yang, and P. Yue. 2006. “Ontology-Driven Automatic Geospatial-Processing Modeling Based on Web-Service Chaining.” Proceedings of the Sixth Annual NASA Earth Science Technology Conference, College Park, MD, June 27–29.
Endpoints, A. 2004. ActiveBPEL.
Friis-Christensen, A., R. Lucchi, M. Lutz, and N. Ostländer. 2009. “Service Chaining Architectures for Applications Implementing Distributed Geographic Information Processing.” International Journal of Geographical Information Science 23: 561–580. doi:10.1080/13658810802665570.
Gong, J., H. Wu, T. Zhang, Z. Gui, Z. Li, L. You, S. Shen, et al. 2012. “Geospatial Service Web: Towards Integrated Cyberinfrastructure for GIScience.” Geo-spatial Information Science 15 (2): 73–84. doi:10.1080/10095020.2012.714098.
Google. 2012. GXT.
Granell, C., M. Gould, and R. Francisco. 2005. “Service Composition for SDIs: Integrated Components Creation.” Proceedings of the Sixteenth International Workshop on Database and Expert Systems Applications, Copenhagen, Denmark, August 26, 475–479. IEEE.
Groot, R., and J. D. McLaughlin. 2000. Geospatial Data Infrastructure: Concepts, Cases, and Good Practice. Oxford: Oxford University Press.
Gui, Z., H. Wu, and Z. Wang. 2008. “A Data Dependency Relationship Directed Graph and Block Structures Based Abstract Geospatial Information Service Chain Model.” Proceedings of the Fourth International Conference on Networked Computing and Advanced Information Management, Gyeongju, Korea, 2–4 September.
Guo, W., J. Gong, W. Jiang, Y. Liu, and B. She. 2010. “OpenRSCloud: A Remote Sensing Image Processing Platform Based on Cloud Computing Environment.” Science China Technological Sciences 53: 221–230. doi:10.1007/s11431-010-3234-y.
Hobona, G., D. Fairbairn, and P. James. 2007. “Semantically-Assisted Geospatial Workflow Design.” In Proceedings of the 15th Annual ACM International Symposium on Advances in Geographic Information Systems, 26. Seattle, WA: ACM.
Kiehle, C., K. Greve, and C. Heier. 2007. “Requirements for Next Generation Spatial Data Infrastructures-Standardized Web Based Geoprocessing and Web Service Orchestration.” Transactions in GIS 11: 819–834. doi:10.1111/tgis.2007.11.issue-6.
Li, X., L. Di, W. Han, P. Zhao, and U. Dadi. 2010. “Sharing Geoscience Algorithms in a Web Service-Oriented Environment (GRASS GIS Example).” Computers & Geosciences 36: 1060–1068. doi:10.1016/j.cageo.2010.03.004.
Li, Z., C. P. Yang, H. Wu, W. Li, and L. Miao. 2011. “An Optimized Framework for Seamlessly Integrating OGC Web Services to Support Geospatial Sciences.” International Journal of Geographical Information Science 25: 595–613. doi:10.1080/13658816.2010.484811.
Lin, H., M. Chen, and G. Lu. 2013. “Virtual Geographic Environment: A Workspace for Computer-Aided Geographic Experiments.” Annals of the Association of American Geographers 103: 465–482. doi:10.1080/00045608.2012.689234.
Lin, H., M. Chen, G. Lu, et al. 2013. “Virtual Geographic Environments (VGEs): A New Generation of Geographic Analysis Tool.” Earth-Science Reviews 126: 74–84.
Maosheng, H., and C. H. I. Tianhe. 2012. “Semantic Geographic Web Service Sharing Framework.” In Recent Advances in Computer Science and Information Engineering, edited by Z. Qian, L. Cao, and W. Su, et al., 553–559. Berlin: Springer.
Microsoft. 2015. Microsoft Azure.
Open Geospatial Consortium. 2007. OpenGIS Web Processing Service version 1.0.0. Open Geospatial Consortium (OGC).
Peltz, C. 2003. “Web Services Orchestration and Choreography.” Computer 36: 46–52.
Sun, Z., P. Yue, and L. Di. 2012. “GeoPWTManager: A Task-Oriented Web Geoprocessing System.” Computers & Geosciences 47: 34–45. doi:10.1016/j.cageo.2011.11.031.
Wikipedia. 2015. Web services.
Wu, H., Z. Li, H. Zhang, C. Yang, and S. Shen. 2011. “Monitoring and Evaluating the Quality of Web Map Service Resources for Optimizing Map Composition over the Internet to Support Decision Making.” Computers & Geosciences 37: 485–494. doi:10.1016/j.cageo.2010.05.026.
Wu, H., L. You, Z. Gui, S. Gao, Z. Li, and J. Yu. 2014. “FAST: A Fully Asynchronous and Status-Tracking Pattern for Geoprocessing Services Orchestration.” Computers & Geosciences 70: 213–228. doi:10.1016/j.cageo.2014.06.005.
Yang, C., H. Wu, Q. Huang, Z. Li, and J. Li. 2011. “Using Spatial Principles to Optimize Distributed Computing for Enabling the Physical Science Discoveries.” Proceedings of the National Academy of Sciences 108: 5498–5503. doi:10.1073/pnas.0909315108.
You, L., Z. Gui, W. Guo, S. Shen, and H. Wu. 2012. “A Geospatial Web Services Composition Framework Supporting Real-Time Status Monitoring.” International Society for Photogrammetry and Remote Sensing I–4: 175–179. doi:10.5194/isprsannals-I-4-175-2012.
Zhao, P., L. Di, and G. Yu. 2012. “Building Asynchronous Geospatial Processing Workflows with Web Services.” Computers & Geosciences 39: 34–41. doi:10.1016/j.cageo.2011.06.006.