Geoinformatica DOI 10.1007/s10707-016-0252-3
Developing a web-based system for supervised classification of remote sensing images
Ziheng Sun 1 & Hui Fang 2 & Liping Di 1 & Peng Yue 2 & Xicheng Tan 3 & Yuqi Bai 4
Received: 27 June 2014 / Revised: 18 February 2016 / Accepted: 23 March 2016
© Springer Science+Business Media New York 2016
Abstract Web-based image classification systems aim to provide users with easy access to image classification functions. Existing work mainly focuses on web-based unsupervised classification systems. This paper proposes a web-based supervised classification system framework that includes three modules: client, servlet and service. It comprehensively describes how to combine the procedures of supervised classification into the development of a web system. A series of methods are presented to realize the modules. A prototype system of the
* Liping Di: [email protected]
Ziheng Sun: [email protected]
Hui Fang: [email protected]
Peng Yue: [email protected]
Xicheng Tan: [email protected]
Yuqi Bai: [email protected]
1 Center for Spatial Information Science and Systems (CSISS), George Mason University, 4400 University Drive, MSN 6E1, Fairfax, VA 22030, USA
2 Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University, Wuhan, Hubei, China
3 Spatial Information and Digital Technology Department, International School of Software, Wuhan University, Wuhan, Hubei, China
4 Ministry of Education Key Laboratory for Earth System Modeling, Center for Earth System Science, Tsinghua University, Beijing 100084, China
framework is also implemented and a number of remote sensing (RS) images are tested on it. Experiment results show that the prototype is capable of accomplishing supervised classification of RS images on the Web. If appropriate algorithms and parameter values are used, the results of the web-based solution can be as accurate as those of traditional desktop-based systems. This paper lays the foundation, on both theoretical and practical aspects, for the future development of operational web-based supervised classification systems.

Keywords Supervised classification . Geoprocessing web service . Web-based processing system . Remote sensing image . Cyberinfrastructure
1 Introduction

Nowadays, most classification activities on remote sensing (RS) images are conducted on desktop-based systems [1–5]. Web-based systems can be accessed from anywhere with an Internet connection, are independent of client platforms, and allow users to focus on business logic instead of software tools; they therefore offer many obvious and attractive advantages over desktop-based systems [6]. Consequently, building web-based systems has been identified as an important step in facilitating image classification for common users. As unsupervised classification generally has less complex procedures than supervised classification, most existing research has engaged in building web-based systems for unsupervised classification [7–10]. Web-based supervised classification systems therefore remain rarely studied, and special efforts on both theory and practice are urgently needed.

This paper proposes a web-based system framework to realize online supervised classification of RS images. The framework contains a service module to decompose and execute classification procedures, a servlet module to bridge stateful user sessions and stateless web services, and a client module to receive and transfer user requests and display service responses. A series of methods are presented to realize the three modules. We also developed a prototype web-based supervised classification system to demonstrate the feasibility of the framework. The work lays both theoretical and practical foundations for the future development of operational web-based supervised classification systems.

The remainder of this paper is organized as follows. Section 2 lists the general procedures in supervised classification. Section 3 describes the web-based supervised classification framework. The methods for realizing the framework are presented in Section 4. Experiments and results on the prototype system are provided in Section 5. Finally, the conclusion and a discussion on future work are given in Section 6.
2 The procedures of supervised classification

Supervised classification, which has great potential to improve classification accuracy [11], has already been offered as a common image analysis tool in commercial desktop RS/GIS (geographic information system) software, e.g., eCognition, ArcGIS, ERDAS and ENVI. There are two major kinds of supervised classification schemas: pixel-based and object-based. In this paper, we choose the object-based schema as an example. Generally, there are six basic procedures in an object-based supervised classification processing cycle (Fig. 1). First, an image is segmented into objects. Each object is composed of adjacent image pixels. Then a class list is built based on user
Fig. 1 The common procedures in supervised classification methods: Image → Segmentation → Build Class List → Sample Selection → Create Feature Space → Classification → Evaluation → Results, with an Adjust & Redo loop back from Evaluation
requirements. In the subsequent sample selection process, training samples for each class are selected from the objects. A feature space containing several textural, spatial or spectral features is created, and a vector in the feature space is calculated for each object. All the vectors are used as evidence to cluster similar objects into one class. To facilitate the subsequent evaluation and provide more objective information, the probability of each object belonging to other classes can also be calculated and exported to the results. Finally, an overall evaluation is conducted on the classified objects to judge whether the result meets user requirements.

The classification results can be evaluated on multiple aspects. Several criteria are usually adopted as objective metrics, such as overall accuracy, Kappa coefficient, F-1 measure, precision and recall, which reflect the accuracy from different perspectives and have been studied since the 1980s [12, 13]. In the evaluation step, if the criteria values are lower than the minimum threshold, the input parameter values of the corresponding steps are adjusted and the four steps (segmentation, sample selection, classification, and evaluation) are repeated until satisfying results emerge. For instance, if some objects are classified into a wrong class, the results can be corrected by adding a few of the wrongly classified objects to the training samples or by adjusting the thresholds that control the clustering range in the classification. Besides, since there are many algorithms for image segmentation and image classification, each step can replace the adopted algorithm with a new one for a better outcome. The key to resolving inaccuracy in the results is finding the step that is responsible and adjusting it in a targeted way. In addition, subjective human observation is an important way to evaluate the results.
For example, by visually inspecting the results, humans can easily identify whether the scale of segmented objects is too big to distinguish geospatial objects (under-segmented) or too small to keep the completeness of geospatial objects (over-segmented). A good classification result commonly requires considerable human observation and tuning. When designing a web-based supervised classification system, all the reasonable procedures that might occur within a supervised classification process should be taken into
consideration. Based on the procedures of object-based supervised classification, this paper proposes a framework to realize web-based supervised classification in the next section.
3 A framework for web-based supervised classification system

In this section, a novel framework containing three modules, client, servlet and service, is proposed (Fig. 2). The three modules communicate with each other via the Internet. The client module is responsible for interacting with users and sending their requests to the servlet module, where they are sorted, formatted and transferred to the service module. Web services in the service module parse the requests, process images and return messages to the servlet module. The servlet module encodes the messages and transfers them to the client module. The client module decodes them and displays the contained information to users. Each module has several submodules, which are detailed in the following.
3.1 Client module

Similar to desktop supervised classification systems, the client of a web-based supervised classification system essentially includes a data viewer, a sample selector, a class tree builder and a feature tree builder. The data viewer enables browsing images and vector files. The sample selector allows choosing sample objects from a vector file. The class tree and feature tree builders are used to create class hierarchies and feature spaces, respectively. Other function requirements also have respective
Fig. 2 The function modules in the framework. Client: Data Viewer, Sample Selector, Class Tree Builder, Feature Tree Builder, Workflow Invoker, Result Evaluator, Data Uploader, Resource Register, Request Recorder, Request Loader, Message Sender/Receiver. Servlet: Message Transfer, File Uploader, Service Register, Data Register, Request Builder, Response Parser, Workflow Manager, Service Monitor. Service: interfaces (WSDL, WPS, WMS, WCS, WFS, REST) and functions (Image Segmentation, Image Classification, Publish data on WMS, Small Polygon Cleaner, Feature Merger, 3 bands to 1 band, Raster to Vector, Feature Calculator, Result Evaluation, Metadata Getter)
submodules. The workflow invoker invokes the execution of the segmentation post-process workflow and the classification post-process workflow (a workflow involves a series of web services linked in a specific order). The result evaluator assesses the accuracy of the final classified results. The data uploader uploads the to-be-classified images to a web-accessible address. The resource register registers or deregisters web services. The request recorder and loader are capable of restoring old sessions that users closed before. These submodules provide users an entry to correctly describe and adjust their requests. Since the client module has very limited data processing capabilities in a browser/server (B/S) environment, the major processing tasks are assigned to the service module. The message sender/receiver submodule sends user requests to the servlet module and receives the responses. Above all, the client module is responsible for helping users describe requests, sending requests, and displaying responses.
3.2 Servlet module

The servlet module exists because of the constraints of web services and protocol rules. Web services are usually stateless, but user sessions are stateful. A middleware servlet is needed to bridge the two by recording the states of user sessions. Besides, for security reasons, web pages in browsers are by default prohibited from visiting web services on any server with a different domain name. However, current web services are generally scattered across different domains, so a message transfer submodule is necessary to transfer user requests from the client module to web services.

In addition, as the only direct contact of the client module, this module supports many small but useful functions. For example, the file uploader is built to receive uploaded files from the client module, store the files under a web-accessible folder and return the URLs of the files. The service register wraps the metadata of candidate services into standardized requests and forwards them to web registries such as the OGC (Open Geospatial Consortium) Catalogue Service for the Web (CSW) [14] or Universal Description Discovery and Integration (UDDI) services [15]. Moreover, different web services, such as OGC WPS (Web Processing Service) [16] and SOAP (Simple Object Access Protocol) services [17, 18], may have different standard schemas for requests and responses. To smoothly invoke web services and decode the results, the request builder and response parser are established to create correct service inputs and parse the outputs according to the standard schemas. The workflow manager is responsible for deploying a workflow onto a workflow engine, executing the workflow and obtaining the results. The service monitor records the status of every service execution and makes the information available for the client module to query.
In summary, the servlet module serves as a message bridge between the client module and the service module, and also as a provider of small functions for processing simple routine tasks in a web system.
3.3 Service module

The image processing tasks in the supervised classification are actually performed by geoprocessing web services in this framework. To reuse existing resources and improve the generality and feasibility of this framework, the service module is designed to be able to adopt all the public and standardized web services in Cyberinfrastructure (CI) [19]. The supported interface standards include WSDL (Web Service Description Language), OGC WPS (Web Processing Service), WFS (Web Feature Service), WMS (Web Map Service), WCS (Web Coverage Service)
and WNS (Web Notification Service). REST (REpresentational State Transfer) services [20] are planned for future support. Other web services that do not follow these standards cannot be used in this framework. Specifically, in supervised classification the basic required functions for web services include:
– Image segmentation: input an image and output a segmented image composed of a number of image objects; within each object all pixels have the same value.
– Image classification: input an image and its segmented image; output a vector file of polygons in which every polygon has a class property and value.
– Post processing, which requires functions for: transforming geospatial data into formats supported by web browsers; combining the three bands of an RGB image into one band (the transformation should be reversible); vectorizing an image into geometry polygons; removing small polygons from a vector file; merging contiguous polygons of the same class; calculating the textural/spectral/spatial property values of polygons; evaluating the accuracy of classification results; and getting metadata such as the extent, projection, band size and property tables of raster or vector files.
4 Methodology

According to the framework, three major problems remain: 1) how to build robust geoprocessing web services to meet the function requirements of supervised classification; 2) how to create a flexible servlet to bridge stateful user sessions with stateless web services; 3) how to realize a web graphical user interface for supervised classification. In this section, we present a collection of solutions to these three problems.
4.1 Building robust geoprocessing web services

The loosely coupled character of web services leaves plenty of choices on how to build them [21, 22]. The services generated by a good service building schema will be robust and trustworthy. In this section we compare several service building schemas and discuss their benefits and drawbacks. Some algorithms and methods for developing and evaluating web services in supervised classification are also introduced.
4.1.1 Choosing a good schema for building web services A web service building schema briefly contains two steps: building the interface and developing the processor. The former means coding an interface program to receive standardized requests, create standardized responses and send them back to the requesters. Many open source libraries are available online to help build service interfaces, e.g. using 52°North to build OGC standard interfaces or Apache Axis2 [23] to build SOAP-WSDL interfaces.
Developing the processor, i.e., coding the core data processing programs, has three major methods, as shown in Fig. 3: 1) the software-command-based method, in which the service main program starts up software such as GRASS GIS [24–26] or ArcGIS and invokes its internal commands to do the data processing; 2) the single-command-based method, which directly uses installed system commands; 3) the internal-program-based method, which performs the whole function only with the service's internal programs. Generally, the latter two methods are better than the first for building a robust web service, because when facing multiple concurrent requests the first method will launch many instances of the software, usually causing resource conflicts and service collapse. The second method is suitable for complex data processing, since single commands can use most system resources without the constraints of service containers (web containers, like Apache Tomcat [27] and Oracle WebLogic [28], have constraints on the maximum memory they can use). The third method is used for small and simple tasks. In our framework, the second and third methods are recommended. A steady and concise data processor and a lightweight service interface make a web service robust.
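The single-command-based method can be sketched as a thin wrapper that launches one system command per request and captures its output and logs (a minimal illustration in Python; the command names passed in would be placeholders for the actual processing tools):

```python
import subprocess

def run_single_command(cmd, timeout=300):
    """Run one installed system command per request and capture its output
    and error logs; raise if the command fails so the caller can report it."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        raise RuntimeError("Command failed: " + result.stderr.strip())
    return result.stdout
```

Because each request runs an independent process, concurrent requests do not share mutable software state, which is the robustness advantage discussed above.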
4.1.2 Image segmentation algorithm

Many algorithms are available for image segmentation. We demonstrate the mean shift (MS) algorithm, as it has been validated as a robust algorithm [29] that satisfies the robustness requirement of web services. The MS algorithm builds a feature space and puts all the pixel points of an image into the space. Each pixel point in the space is iteratively shifted towards the local maximum peak point by a kernel density estimation function until it reaches a peak point. After all the pixel points have been shifted, a group of peak points is obtained. Each peak point represents the cluster of pixel points that reached it. If the value of every peak point is assigned to its subordinate pixel points, the corresponding image will be segmented into objects. Then the boundaries of each image object are delineated as polylines that are subsequently closed into polygons (see Fig. 4). Every polygon corresponds to a ground object. The polygons are used as inputs in the subsequent classification. Also, the uncertainties in the boundaries of segmented polygons are very important and need to be quantitatively measured and conveyed to users together with the segmented polygons. If the uncertainty of the segmentation quality is high, users should be alerted and made aware of the resulting errors.
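The shifting step can be illustrated on a single intensity band (a toy sketch with a flat kernel; a production segmenter like the one behind the service shifts points in a joint spatial and spectral feature space):

```python
import numpy as np

def mean_shift_modes(values, bandwidth=5.0, iters=30):
    """Shift every pixel value toward the mean of its neighbors within the
    bandwidth until it settles at a local density peak (flat kernel)."""
    pts = values.astype(float).copy()
    for _ in range(iters):
        for i, p in enumerate(pts):
            neighbors = values[np.abs(values - p) <= bandwidth]
            pts[i] = neighbors.mean()
    return np.round(pts).astype(int)
```

Pixels that converge to the same mode form one cluster; assigning each pixel its mode value yields the segmented image described above.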
4.1.3 Image classification algorithm

The K nearest neighbor (KNN) algorithm, a widely used method in supervised classification, is adopted to illustrate how to build a classification web service. The inputs of the KNN algorithm include an image
Fig. 3 Three kinds of processor developing methods for web services: a Software-command-based method (the service main program starts up software and calls its internal commands), b Single-command-based method (the service main program calls single commands directly), c Internal-program-based method
Fig. 4 Three example images and their segmented results (raw, segmented, and boundary)
and its segmented polygons, a class list, a group of sample polygons with known classes, a feature space and the value of K. Several feature properties, e.g., area, length, mean and standard deviation, are specified in the feature space. The properties of every polygon in the segmented polygons are calculated and combined into a feature vector. Then all the feature vectors are split into two groups: vectors of class-unknown polygons and vectors of sample polygons. For each class-unknown polygon, its nearest K sample polygons are collected by comparing the Euclidean distance between its vector and the vector of every sample polygon. Among the collected K sample polygons, the class that occurs most frequently is assigned to the class-unknown polygon. The output of the KNN algorithm is a file of class-known polygons.

The mean shift and KNN algorithms are used in this paper only for demonstration. The framework allows people to easily replace them with other segmentation and classification algorithms such as k-means, ISODATA, logistic regression, random forests, support vector machines and neural networks. Because the framework is based on web services in Cyberinfrastructure, newly proposed algorithms can be integrated quickly by implementing new web services for them.
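The voting step can be sketched as follows (the feature vectors and class names below are invented for illustration; the prototype derives them from polygon properties):

```python
import numpy as np

def knn_classify(unknown_vecs, sample_vecs, sample_labels, k=3):
    """For each class-unknown feature vector, find the k nearest sample
    vectors by Euclidean distance and assign the most frequent class."""
    sample_vecs = np.asarray(sample_vecs, dtype=float)
    sample_labels = np.asarray(sample_labels)
    assigned = []
    for v in np.asarray(unknown_vecs, dtype=float):
        dists = np.linalg.norm(sample_vecs - v, axis=1)
        nearest = np.argsort(dists)[:k]
        classes, counts = np.unique(sample_labels[nearest], return_counts=True)
        assigned.append(classes[np.argmax(counts)])
    return assigned
```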
4.1.5 Evaluation of classification results

Two indices based on the error matrix, overall accuracy (OA) and the kappa coefficient, are usually used to measure the accuracy of classification results [30]. Given n classes of polygons in a classification result, the error matrix is an n by n matrix, where k is the total number of polygons and k_ij is the number of polygons whose true class is C_i and whose computed class is C_j. Equations (3) and (4) show how to calculate k_i+ and k_+j. Equations (1) and (2) are the formulas of OA and the kappa coefficient, respectively; the higher the values are, the more accurate the classification results.

OA = ( Σ_{i=1}^{n} k_ii ) / k    (1)

kappa = ( k · Σ_{i=1}^{n} k_ii − Σ_{i=1}^{n} k_i+ · k_+i ) / ( k² − Σ_{i=1}^{n} k_i+ · k_+i )    (2)

k_i+ = Σ_{j=1}^{n} k_ij    (3)

k_+j = Σ_{i=1}^{n} k_ij    (4)

The inputs for calculating the two indices are a vector file containing class-known polygons and a list of validation sample polygons; the outputs are the values of the two indices.
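Equations (1) to (4) can be checked with a few lines of code (the error matrix below is a made-up example, not from the experiments):

```python
import numpy as np

def oa_and_kappa(error_matrix):
    """Compute overall accuracy and the kappa coefficient from an n-by-n
    error matrix whose entry [i][j] counts polygons with true class Ci
    and computed class Cj."""
    m = np.asarray(error_matrix, dtype=float)
    k = m.sum()                    # total number of polygons
    diag = np.trace(m)             # correctly classified polygons
    oa = diag / k
    chance = (m.sum(axis=1) * m.sum(axis=0)).sum()  # sum of k_i+ * k_+i
    kappa = (k * diag - chance) / (k * k - chance)
    return oa, kappa
```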
In fact, there are many other evaluation indices not listed here. The proposed framework can support new measures by building new web services for them. For instance, if a new measure such as the F-1 measure is required, a new web service, which takes classification results and validation samples as input and outputs an F-1 score, can be implemented and integrated into the framework.
4.1.6 Exception control mechanism

Remote callers of web services always want to know the particular reasons causing exceptions in service execution. To find out those reasons, a comprehensive exception control mechanism is required. Typically, for single-command-based web services, the exceptions are split into two categories: exceptions caused by service main programs and exceptions caused by single system commands. The former can be caught directly. The latter need to be found in the command execution logs. An exception usually contains an error type and an error message, which describes the location and reason. If the exception is a built-in exception, its error message will be full of technical details and lack intuitive logical reasons. During web service development it is strongly suggested to convert built-in exceptions into user-defined exceptions annotated with logical reasons, which are much easier for requesters to understand. Taking Java as an example, a built-in exception looks like: java.lang.NullPointerException: at edu.gmu.csiss.igfds.Test.main(Test.java:19). A user-defined exception looks like: edu.gmu.csiss.service.exception.InterfaceException: "Fail to parse the request. There is a format error in the request XML". Once an exception happens, its error message is packed into a fault response template and sent to the original requester. Based on the message, the requester can adjust his/her inputs and retry until correct responses are returned.
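The conversion from a built-in exception to a user-defined one can be sketched as follows (in Python for brevity; the paper's implementation is Java-based, and the class name here is illustrative):

```python
import xml.etree.ElementTree as ET

class InterfaceException(Exception):
    """User-defined exception carrying a logical reason for the requester."""

def parse_request(xml_text):
    """Parse a request body, converting the built-in parser exception into
    a user-defined exception with an intuitive message."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError as err:
        raise InterfaceException(
            "Fail to parse the request. There is a format error "
            "in the request XML (" + str(err) + ")") from err
```

Chaining with `from err` preserves the technical details for the service log while the requester sees only the logical reason.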
4.1.7 Performance enhancement

The performance information of a web service is required by both service clients and providers. Clients need to know the time, cost and quality of the service. Providers want to evaluate their services by comparing them with the services of other providers. The metrics of service performance include server throughput, server latency, client throughput, client latency, error rate, etc. Many factors, such as server processing capability, server load, algorithm complexity and network status, can impact service performance. To enhance performance, each impact factor needs to be unit tested to identify the bottleneck responsible for poor performance. If a factor is identified as the source of slowness, a targeted solution should be pursued to reduce its negative effects.
4.2 Constructing a bridge servlet

The term servlet comes from Java servlets, which are widely used to dynamically modify client pages and manage client sessions via the HTTP protocol. In the framework, the bridge servlet mainly fulfills three tasks: transferring messages between the client and web services, recording the statuses of user sessions, and executing and managing workflows of web services. The following sections describe the specific methods respectively.
4.2.1 Message transfer

Transferring a message requires an address and the message type. The address specifies the transfer destination and the message type decides the delivery method. For instance, if a message is an OGC WPS XML request, it is usually delivered via the HTTP POST method. Figure 5 displays the internal mechanism of message transfer. After forwarding the message content to the target service, the message transfer waits for the service response, either a correct response or an error message, and returns it to the client module instantly.
Fig. 5 The flow chart displaying the internal mechanism of message transfer: the message type from the client (e.g., SOAP or HTTP) determines the request method; the message content is forwarded to the target address of the web service and the service response is relayed back
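The transfer's core behavior, forwarding the body via HTTP POST and relaying whatever comes back, can be sketched as follows (a simplification; the target address and content type would come from the client's request):

```python
import urllib.request

def transfer_message(target_address, message_content, content_type="text/xml"):
    """Forward a request body (e.g., an OGC WPS XML request) to the target
    service via HTTP POST and relay the status code and response body."""
    req = urllib.request.Request(
        target_address,
        data=message_content.encode("utf-8"),
        headers={"Content-Type": content_type},
        method="POST")
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Because the servlet sits on the server side, this forwarding also sidesteps the browser's cross-domain restriction described in Section 3.2.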
4.2.2 Metadata registry

The servlet module uses a metadata registry to store the status of user sessions. The status information is usually shared over a series of requests. Through the status information, the servlet can remember where visitors left off and adjust its subsequent operation strategies. For example, if a user tries to repeat a process on the same image, the servlet module will check the existing status information and inform him/her of the previously generated results. The status information is closely related to web services and service-generated data, so the registry also records the metadata of geoprocessing web services and service-generated data, especially the derivative products generated by coarse-grained web services. The metadata registry can not only give a complete overview of user sessions, but also provide the provenance of the final classified results. If users are not satisfied with a result, the provenance information serves as firsthand material for troubleshooting. Technically, building such a registry needs a relational database in which the metadata can be recorded as table rows. Actions like inserting, deleting, querying and updating records are controlled by a database management system (DBMS). The servlet module uses a third-party library to access and operate on the database.
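A minimal registry can be sketched with SQLite (the schema below is illustrative, not the prototype's actual tables):

```python
import sqlite3

def create_registry(path=":memory:"):
    """Create a small metadata registry: one table for session status and
    one for the metadata of registered geoprocessing web services."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE session_status (
            session_id TEXT, step TEXT, input_url TEXT, output_url TEXT);
        CREATE TABLE service_metadata (
            service_id TEXT PRIMARY KEY, interface TEXT, address TEXT);
    """)
    return db
```

Walking the session_status rows for one session, in order, reconstructs the provenance chain of a classified result.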
4.2.3 Workflow management

The whole process of supervised classification includes many data processing steps, each of which is implemented as an independent web service. The services can be chained together in a workflow and automatically executed with a single click [31–33]. An automatic workflow is very attractive for users who are reluctant to be involved in technical details. To satisfy those users, a workflow management submodule is created to compose, deploy, execute and monitor workflows. First, users can use a workflow design tool (e.g., Oracle BPEL Designer [34] or Workflow Model Designer (WMD) [35, 36]) to design workflows. Secondly, this module deploys the workflow packages onto an online workflow engine (e.g., BPELPower [37]), executes the workflows with user inputs, monitors workflow status and returns the workflow outputs to users. In our implementation, the post-processes on the segmented results and classified results are completed by two such workflows.
4.3 Building a graphic web interface

A good user interface is supposed to be user-friendly and intuitive. The way to build a user-friendly interface is out of the scope of this paper. This section focuses on methods for solving the usual problems in developing the interface of a web-based supervised classification system.
4.3.1 Displaying geospatial datasets in formats supported by web browsers

From the perspective of RS/GIS web developers, data format is always a sensitive problem. Most raster and vector file formats, like GeoTiff, HDF, GeoJSON, ESRI Shapefile and KML, are not supported by web browsers. A format transformation service is required to transform those files into browser-supported formats such as PNG, JPG and BMP. The method we adopt here is to publish geospatial datasets through a WMS and obtain images in browser-supported formats through WMS requests. The method can conveniently combine multiple raster and vector files into a single image map that can be directly displayed by
browsers (Fig. 6). In addition, most commercial or noncommercial web map APIs (e.g., Google Maps, Bing Maps, OpenLayers) support loading WMS layers directly.
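A GetMap request like the one shown in Fig. 6 can be assembled programmatically (a sketch; the endpoint and layer names here are placeholders):

```python
from urllib.parse import urlencode

def wms_getmap_url(base_url, layers, bbox, width, height,
                   srs="EPSG:4326", fmt="image/png"):
    """Build a WMS 1.1.1 GetMap URL that renders the given layers into a
    single browser-displayable image."""
    params = {
        "SERVICE": "WMS", "VERSION": "1.1.1", "REQUEST": "GetMap",
        "LAYERS": ",".join(layers), "SRS": srs, "FORMAT": fmt,
        "TRANSPARENT": "true",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": str(width), "HEIGHT": str(height),
    }
    return base_url + "?" + urlencode(params)
```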
4.3.2 Enabling sample selection in a web page

This is one of the most difficult functions to develop in a supervised classification system. A flexible sample selection module is the key to the success of the whole system. In the design of the sample selector, the inputs include an image, a vector file and a class list. The output should be a sample table. A general method for building web sample selectors is composed of three procedures. First, a map is created by a web map API (application program interface). The map should be capable of displaying geospatial image and vector files. Then a class tree is added to display the class list. When the map and the tree are ready, some event triggers and listeners are set within them. For instance, if a polygon on the map is selected while a node in the class tree is checked, the polygon will be considered a sample of the class that the node represents and recorded into a sample table. If a selected feature is unselected on the map, the corresponding sample row will be removed from the sample table. A sample selector built on the OpenLayers API is shown in Fig. 7.
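The bookkeeping behind these select/unselect listeners can be sketched as follows (a simplification of the event-handler logic; identifiers are invented):

```python
def on_polygon_toggled(sample_table, polygon_id, checked_class, selected):
    """Record a (polygon, class) pair when a polygon is selected while a
    class node is checked; remove the row when the polygon is unselected."""
    row = (polygon_id, checked_class)
    if selected and row not in sample_table:
        sample_table.append(row)
    elif not selected and row in sample_table:
        sample_table.remove(row)
    return sample_table
```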
http://provenance.csiss.gmu.edu/cgi-bin/mapserv?MAP=/usr/local/apache-tomcat-7.0.39-9006/webapps/GeoprocessingWS/temp/config1400599628992.map&LAYERS=seabrook2_modified.tif,gis1400599628494&SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap&FORMAT=image/png&TRANSPARENT=true&SRS=epsg:4326&BBOX=-70.85585501719737,42.89681634584505,-70.84763648674154,42.90000375093675&WIDTH=941&HEIGHT=365
Fig. 6 An example WMS request and the response image

4.3.3 Using an asynchronous way to call web services

The duration of a service execution depends on various factors, like the size of the input images or the load status of the servers. It may take a long time to classify a big image, e.g., 1 h for a 2 GB image. In that situation, asynchronous requests are better than synchronous requests, which cause a long freeze on the web page. Ajax (Asynchronous JavaScript and XML) [38], the key technique of Web 2.0, provides a convenient object, XMLHttpRequest, to asynchronously call services. However, an XMLHttpRequest object will be forced to close server connections and release object resources if
Fig. 7 A sample selector and a sample table in an HTML page (the class list and the image and vector layers feed the sample selector, which generates the sample table)
no responses are received for a long time, even if the timeout of the connection is set to a very high value. Thus we use a substitute asynchronous request method to prevent the timeout problem. We make a few modifications on the message transfer so that it records the three statuses of a service execution: Running, Done and Failed. The client module first sends a synchronous request containing the service address, request content and service type to the message transfer. The transfer forwards the request content to the address and meanwhile generates a code for the service execution. The client receives the code and uses it to query the execution status from the status recorder. If the returned status is Running, a new query will be sent after a while. If the status is Done or Failed, the client will get a correct response or an error message at the same time. Figure 8 exhibits the workflow of the asynchronous request method.
Fig. 8 An asynchronous way for the client module to call web services (workflow among the client program, message transfer, status recorder and web service; the client interacts with the transfer and recorder multiple times, while the web service is invoked only once)
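The submit-then-poll workflow of Fig. 8 can be sketched as follows. The function and parameter names (executeAsync, submit, queryStatus) and the {status, response} shape are assumptions made for illustration, not the prototype's actual API; in practice, submit and queryStatus would issue HTTP requests to the message transfer and the status recorder.

```javascript
// Sketch of the asynchronous request pattern in Fig. 8:
// one quick submit returns an execution code, then the client polls the
// status recorder with that code until the execution finishes or fails.
async function executeAsync(submit, queryStatus, intervalMs) {
  // 1. A short synchronous-style request returns the execution code at once.
  const code = await submit();
  // 2. Poll until the execution leaves the Running state.
  for (;;) {
    const { status, response } = await queryStatus(code);
    if (status === "Done") return response;
    if (status === "Failed") throw new Error(response);
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Because each status query is itself a short request, no connection stays open long enough to hit the XMLHttpRequest timeout described above.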
4.3.4 Decreasing the communication traffic between the client and servlet modules

To reduce the burden on the servlet module and the negative effects of a poor network, the frequency of communication between the client module and the servlet module needs to be decreased, especially for the status queries in the asynchronous method described in Section 4.3.3. One optimization is to lengthen the interval between two sequential queries as the overall number of queries grows. For example, the interval between the 1st and 2nd queries is 10 s and increases to 1 min between the 10th and 11th queries. The changing pattern of the interval could be specified to follow an increasing curve such as a sigmoid or parabolic curve. Another method to reduce web traffic is to take advantage of browser cookies to temporarily cache information. When the information is requested again, the client module can extract it directly from the cookies rather than interacting with the servlet module.
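The growing query interval can be sketched with a logistic (sigmoid) schedule. The 10 s floor and 60 s ceiling come from the example above; the midpoint and steepness constants are assumptions chosen for illustration.

```javascript
// Sketch: a status-query interval that grows with the number of queries
// already sent, following a sigmoid (logistic) curve between a floor and
// a ceiling. The midpoint/steepness constants are illustrative choices.
function queryIntervalMs(n, minMs = 10000, maxMs = 60000) {
  const midpoint = 6;    // query count at which the interval is halfway up
  const steepness = 0.8; // how fast the curve rises around the midpoint
  const s = 1 / (1 + Math.exp(-steepness * (n - midpoint)));
  return Math.round(minMs + (maxMs - minMs) * s);
}
```

Early queries thus stay close to 10 s apart, keeping the page responsive, while long-running executions back off toward one query per minute, sparing the servlet module.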
5 Experiments and results

A prototype web-based supervised classification system has been developed to implement the proposed framework. The system adopts all the methods in Section 4 and integrates several state-of-the-art techniques such as HTML5 and CSS3. The applied third-party libraries and tools include GDAL, OpenLayers, MapServer, the jQuery JavaScript library, the Apache Axis web service toolkit, shell scripts and the MySQL database. The system is available online at http://www3.csiss.gmu.edu/igfds. It contains web pages for image segmentation (http://www3.csiss.gmu.edu/igfds/imageseg.jsp), image classification (http://www3.csiss.gmu.edu/igfds/ooc.jsp) and result evaluation (http://www3.csiss.gmu.edu/igfds/ooc_eval.jsp). To highlight the specific differences between the web-based and desktop-based solutions, a pure desktop-based prototype system that uses exactly the same algorithms and image processing code as the web-based prototype was also realized. The desktop-based prototype is command-line based with no graphical user interface. eCognition [39], one of the most widely used commercial supervised classification desktop software packages, is used for reference. The web-based prototype is deployed on a blade server with eight Intel(R) Xeon(R) 2.13 GHz CPUs, 8 GB memory and the Ubuntu 12.04 64-bit Linux operating system. The desktop-based prototype and eCognition are tested on the same machine with an Intel(R) Core(TM) i5-2450 2.50 GHz CPU, 6 GB memory and the 64-bit Windows 7 operating system. Several RS images have been tested on the three systems and three of them are selected here for demonstration. The two prototype systems use the MS algorithm for image segmentation and the KNN algorithm for image classification. eCognition uses a region growing algorithm for image segmentation and also KNN as the classification algorithm. The same class list and feature space are applied to each image in the three systems. Figure 9 shows the final classification results.
As the two prototypes use exactly the same programs, their classification results are identical. In Fig. 9 and Table 2, "the prototype" represents both the web-based and desktop-based prototype systems. Generally, the results of the prototypes and eCognition agree with each other on large and apparent objects. The discrepancies, which are reasonable because their segmentation algorithms and training samples are different, mainly exist in smaller objects. Comparing the results in Fig. 9, we find that the prototype systems have fairly good accuracy in classifying common geospatial objects such as buildings, water bodies and fields. The OA and kappa coefficient values of the prototype systems are calculated on manually selected samples from the results and are close to the values of eCognition (see Table 1). However, the accuracy comparison between the prototypes and eCognition may vary with differences in algorithms, parameters, images and operators. This
Fig. 9 Three RS images and their classification results by the two prototype systems and eCognition (the sizes of the three images from top to bottom are 3.27, 13.42 and 9.65 MB)
experiment shows that the proposed approach is able to complete the supervised classification of remote sensing images. The results demonstrate that if the adopted algorithms and parameter values are appropriate, the results of the web-based supervised classification solution may have the same level of accuracy as the results of traditional desktop-based software. Users have major concerns about the user experience of web-based systems. A number of tests were conducted in this experiment to compare the response speed of the prototypes and eCognition. Table 2 lists the duration of the two major steps of supervised classification on the web-based prototype, the desktop-based prototype and eCognition, respectively. The web-based prototype is a little slower in the segmentation step but faster in the classification step than eCognition. For all three images, the overall duration of the prototype is shorter than that of eCognition. On the other hand, the process in the desktop-based prototype is faster than in the web-based prototype. The difference
Table 1 The OA and kappa coefficient values of the results

                      The prototype         eCognition
                      OA       Kappa        OA       Kappa
  3.27 MB Image       0.92     0.88         0.91     0.85
  9.65 MB Image       0.91     0.89         0.92     0.87
  13.42 MB Image      0.89     0.83         0.90     0.86
is mainly caused by the time spent transferring data over the network and transforming the images into a browser-supported format in the web-based prototype. However, the results were obtained in a clean laboratory environment with no concurrent access or network conflicts. If the web-based prototype is invoked by multiple clients simultaneously, the network connectivity is poor, or the image is very large, the overall time consumption of the web-based prototype will certainly increase. Further studies are needed for the web-based approach to achieve the same response speed as desktop-based systems in an operational environment. Overall, the results prove that the web-based prototype can accomplish the supervised classification of RS images. Meanwhile, if the adopted algorithms and the values of the input parameters are appropriate, the accuracy of the web-based classification approach may reach the same level as that of traditional desktop-based systems. In terms of user experience, the web-based system has a fair or even better response speed in a clean web environment with no interference. However, the real Web is full of resource conflicts and unexpected situations such as poor network connectivity, concurrent access, memory crashes, and hacker attacks. A lot of work is still needed in the future to solve the problems caused by these interferences and maintain a good user experience in an operational environment.
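The OA and kappa values reported in Table 1 are standard summaries of a confusion matrix built from validation samples. A minimal sketch of the computation follows; the function name and the example matrix are made up for illustration, not the samples used in the experiment.

```javascript
// Sketch: computing overall accuracy (OA) and the kappa coefficient from a
// confusion matrix of validation samples (rows: reference class, columns:
// classified result). The example matrix is made-up data for illustration.
function accuracyMetrics(matrix) {
  const n = matrix.flat().reduce((a, b) => a + b, 0);          // total samples
  const agree = matrix.reduce((a, row, i) => a + row[i], 0);   // diagonal sum
  const oa = agree / n;
  // Chance agreement: sum over classes of (row total * column total) / n^2
  let pe = 0;
  for (let i = 0; i < matrix.length; i++) {
    const rowSum = matrix[i].reduce((a, b) => a + b, 0);
    const colSum = matrix.reduce((a, row) => a + row[i], 0);
    pe += (rowSum * colSum) / (n * n);
  }
  return { oa, kappa: (oa - pe) / (1 - pe) };
}

const { oa, kappa } = accuracyMetrics([
  [45, 5],
  [5, 45],
]);
```

Kappa discounts the agreement expected by chance, which is why it is always lower than OA for imperfect classifications, as in Table 1.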
Table 2 The average response time of the prototypes and eCognition for images of different sizes (unit: second)

                          Web-based    Desktop-based    eCognition
  Image segmentation
    3.27 MB Image         15           13               12
    9.65 MB Image         55           54               21
    13.42 MB Image        77           73               20
  Image classification
    3.27 MB Image         5            5                67
    9.65 MB Image         12           11               234
    13.42 MB Image        7            6                349
  Total time
    3.27 MB Image         20           18               79
    9.65 MB Image         67           65               255
    13.42 MB Image        84           79               369

6 Conclusion

This paper presents a novel framework with three modules for building web-based supervised classification systems. A series of methods is proposed to solve the major problems in the framework. A prototype web-based supervised classification system is implemented to validate and demonstrate the framework. A number of images are tested on the prototype. The experiment results show that the web-based prototype system can accomplish the supervised classification of RS
images. The classification results could be as accurate as those of desktop-based systems if appropriate algorithms and input parameter values are used and each step in the process is carefully tuned. In terms of user experience, the results also demonstrate that the web-based prototype has a very good response speed in a clean laboratory environment. The performance of the web-based prototype is mainly influenced by network status and the server's available processing capabilities. In application scenarios, the web-based supervised classification system could offer domain experts many conveniences. As the web-based approach requires no installation and hides all the technical details on the server side, users can save a lot of time and labor on installing and managing the system. It enables users to conduct supervised classification of RS images on all kinds of devices rather than only on workstation computers. Experts can access the supervised classification capability through web browsers anywhere and at any time. Due to the loosely coupled nature of the web service framework, it also becomes very easy for service providers to carry out system maintenance and improvements without disturbing clients. In conclusion, this paper discusses the possibility and advantages of web-based supervised classification, proposes a framework and implements a prototype to turn the possibility into a reality. The framework is set against the background of cyberinfrastructure and the Internet. The web-based supervised classification system can be considered an instance of online interoperable RS/GIS analysis services, and it could be integrated into web-service-based application systems in other domains such as drought monitoring, city planning, water management and agricultural yield estimation. This paper paves the way, in both theory and practice, for RS researchers and developers to build operational web-based supervised classification systems in the future.
In addition, the framework provides a practical solution for remedying the limited processing and storage capabilities of desktop-based systems. If the existing hardware configuration is not powerful enough to run some steps of the supervised classification, this framework provides an alternative. As the framework places no constraints on where the web services are physically located, researchers can deploy their services on cloud platforms such as Eucalyptus or Amazon EC2 so that cloud computing techniques can be used to augment the processing and storage capabilities. Furthermore, more image segmentation and classification algorithms can be wrapped as web services to serve the growing number of research groups in the future. Meanwhile, to improve the user experience of the web-based supervised classification system, more research is needed to better handle unexpected situations in an operational web environment.

Acknowledgments This research was partially supported by grants from the U.S. Department of Energy (Grant # DE-NA0001123, PI: Dr. Liping Di), U.S. National Science Foundation (Grant # ICER-1440294, PI: Dr. Liping Di), National Natural Science Foundation of China (91438203, 41271397 and 51277167) and Hubei Science and Technology Support Program (2014BAA087). The authors appreciate Ms. Julia Di of Columbia University for proofreading and improving the manuscript.
References

1. Malamas EN et al (2003) A survey on industrial vision systems, applications and tools. Image Vis Comput 21(2):171–188
2. Eastman JR (2001) Idrisi32/r2 guide to GIS and image processing volume 1. Clark Labs. http://www.researchgate.net/profile/Ronald_Eastman/publication/242377547_Guide_to_GIS_and_Image_Processing_Volume_2/links/5419a9d10cf25ebee9887ac2.pdf. Accessed 30 Sept 2014
3. Canty MJ (2014) Image analysis, classification and change detection in remote sensing: with algorithms for ENVI/IDL and Python, 3rd edn. CRC Press, Boca Raton
4. Long W III, Sriharan S (2004) Land cover classification of SSC image: unsupervised and supervised classification using ERDAS Imagine. In: Proceedings of IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2004, vol 4, 2707–2712
5. Flanders D, Hall-Beyer M, Pereverzoff J (2003) Preliminary evaluation of eCognition object-based software for cut block delineation and feature extraction. Can J Remote Sens 29(4):441–452
6. Di L (2004) Distributed geospatial information services-architectures, standards, and research issues. The International Archives of Photogrammetry, Remote Sensing, and Spatial Information Sciences, 35(Part 2)
7. Ferran A, Bernabe S, Rodriguez PG, Plaza A (2013) A web-based system for classification of remote sensing data. IEEE J Appl Earth Obs Remote Sens 6(4):1934–1948
8. Ma W, Mao K (2009) Development and application of online image processing system based on applet and JAI. Proc Int Conf Environ Sci Inf Appl Technol ESIAT 2009:382–385
9. Zhang D, Yu L, Deng C, Di L (2008) OGC WPS-based remote sensing image processing in web environment. J Zhejiang Univ (Eng Sci) 7:018
10. Banman C (2002) Supervised and unsupervised land use classification. Notes for the Advanced Image Processing class held by James S. Aber at Emporia State University, Emporia, Kansas, USA. http://academic.emporia.edu/aberjame/student/banman5/perry3.html. Accessed 25 Jan 2014
11. Blaschke T (2010) Object based image analysis for remote sensing. ISPRS J Photogramm Remote Sens 65(1):2–16
12. Karen LS, Prescott AP (1986) Issues in the use of kappa to estimate reliability. Med Care 24(8):733–741
13. Zhang H, Fritts JE, Goldman SA (2008) Image segmentation evaluation: a survey of unsupervised methods. Comput Vis Image Underst 110(2):260–280
14. Stock K (2009) OGC catalogue services-OWL application profile of CSW. OGC online document. https://portal.opengeospatial.org/modules/admin/license_agreement.php?suppressHeaders=0&access_license_id=3&target=http://portal.opengeospatial.org/files/%3fartifact_id=32620. Accessed 16 Nov 2013
15.
Graham S, Davis D, Simeonov S, Daniels G, Brittenham P, Nakamura Y, Fremantle P, König D, Zentner C (2004) Building web services with Java: making sense of XML, SOAP, WSDL, and UDDI. Sams Publishing
16. Schut P (ed) (2007) OpenGIS web processing service. OGC standard document. http://portal.opengeospatial.org/files/?artifact_id=24151. Accessed 10 Aug 2013
17. Gudgin M, Hadley M, Mendelsohn N, Moreau JJ, Nielsen HF, Karmarkar A, Lafon Y (2007) SOAP Version 1.2. W3C recommendation specification
18. Christensen E, Curbera F, Meredith G, Weerawarana S (2001) Web services description language (WSDL) 1.1. W3C working draft
19. Hey T, Trefethen AE (2005) Cyberinfrastructure for e-Science. Science 308(5723):817–821
20. Foerster T, Brühl A, Schäffer B (2011) RESTful web processing service. In: Proceedings of the 14th AGILE International Conference on Geographic Information Science, Utrecht, Netherlands
21. Di L (2005) The implementation of geospatial web services at GeoBrain. In: Proceedings of 2005 NASA Earth Science Technology Conference
22. Zhao P, Yu G, Di L (2006) Geospatial web services. In: Hilton B (ed) Emerging spatial information systems and applications. Idea Group Publishing, 1–33
23. Apache (2015) Apache Axis2 version 1.6.3. http://axis.apache.org/axis2/java/core/. Accessed 05 Sept 2015
24. Yue P, Gong J, Di L, Yuan J, Sun L, Sun Z, Wang Q (2010) GeoPW: laying blocks for the geospatial processing web. Trans GIS 14(6):755–772
25. Neteler M, Bowman MH, Landa M, Metz M (2012) GRASS GIS: a multi-purpose open source GIS. Environ Model Softw 31:124–130
26. Li X, Di L, Han W, Zhao P, Dadi U (2010) Sharing geoscience algorithms in a web service-oriented environment (GRASS GIS example). Comput Geosci 36(8):1060–1068
27. Chopra V, Li S, Genender J (2007) Professional Apache Tomcat 6. John Wiley & Sons
28. Munz F (2014) Oracle WebLogic Server 12c: distinctive recipes (architecture, administration and development), 2nd edn. munz & more publishing
29.
Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Mach Intell 24(5):603–619
30. Congalton RG (1991) A review of assessing the accuracy of classifications of remotely sensed data. Remote Sens Environ 37(1):35–46
31. Di L, Zhao P, Yang W, Yue P (2006) Ontology-driven automatic geospatial-processing modeling based on web-service chaining. In: Proceedings of the sixth annual NASA Earth Science Technology Conference, 27–29
32. Yue P, Di L, Yang W, Yu G, Zhao P (2007) Semantics-based automatic composition of geospatial web service chains. Comput Geosci 33(5):649–665
33. Yue P, Di L, Yang W, Yu G, Zhao P, Gong J (2009) Semantic web services-based process planning for earth science applications. Int J Geogr Inf Sci 23(9):1139–1163
34. Juric MB, Krizevnik M (2010) WS-BPEL 2.0 for SOA composite applications with Oracle SOA Suite 11g. Packt Publishing Ltd
35. Sun Z, Yue P (2010) The use of Web 2.0 and geoprocessing services to support geoscientific workflows. In: Proceedings of the 18th International Conference on Geoinformatics, 18–20 June 2010, Beijing, China, 1–5
36. Chen A, Di L, Wei Y, Bai Y, Liu Y (2009) Use of grid computing for modeling virtual geospatial products. Int J Geogr Inf Sci 23(5):581–604
37. Yu GE, Zhao P, Di L, Chen A, Deng M, Bai Y (2012) BPELPower - a BPEL execution engine for geospatial web services. Comput Geosci 47:87–101
38. Garrett JJ (2005) Ajax: a new approach to web applications. Adaptive Path. www.adaptivepath.com. Accessed 17 Aug 2015
39. Trimble Inc. (2016) eCognition. http://www.ecognition.com. Accessed 16 Feb 2016
Ziheng Sun received the B.S. degree in geographic information system and the Ph.D. degree in photogrammetry and remote sensing from Wuhan University in 2009 and 2015, respectively. He is now a Research Assistant Professor with the Center for Spatial Information Science and Systems, George Mason University, Fairfax, VA, USA. His research interests include geoprocessing workflow modeling and services, geospatial cyberinfrastructure, location-based system development, geospatial standard implementation and application, standards-based remote sensing data and information sharing, remote sensing image classification and knowledge extraction.
Hui Fang received the B.S. degree in geographic information system from Shandong Normal University, Jinan, China, in 2009 and the M.S. degree in geographic information system from Wuhan University, Wuhan, China, in 2011. Her research interests include object-based image analysis of remote sensing images, geospatial information recognition, and automatic knowledge extraction from high-resolution remote sensing images.
Liping Di received the B.Sc. degree in remote sensing from Zhejiang University, Hangzhou, China, in 1982; the M.S. degree in remote sensing/computer applications from the Chinese Academy of Sciences, Beijing, China, in 1985; and the Ph.D. degree in geography from the University of Nebraska-Lincoln, Lincoln, NE, USA, in 1991. He was a Research Scientist with the Chinese Academy of Sciences from 1985 to 1986 and the NOAA National Geophysical Data Center from 1991 to 1994. He served as a Principal Scientist from 1994 to 1997 and a Chief Scientist from 1997 to 2000 at Raytheon ITSS. Currently, he is a Professor of geographic information science and the Director of the Center for Spatial Information Science and Systems, George Mason University, Fairfax, VA, USA. His research interests include remote sensing, geographic information science and standards, spatial data infrastructure, global climate and environment changes, and advanced earth observation technology.
Peng Yue received the B.S. degree in geodesy and surveying engineering from Wuhan Technical University of Surveying and Mapping, Wuhan, China, in 2000, the M.S. degree in geodesy and survey engineering from the State Key Laboratory of Information Engineering in Surveying Mapping and Remote Sensing (LIESMARS), Wuhan University, Wuhan, in 2003, and the Ph.D. degree in geographic information system (GIS) from LIESMARS in 2007. From 2011 to 2013, he was a Research Associate Professor with George Mason University. He is now a Professor with LIESMARS. He also serves as the director at the Institute of Geospatial Information and Location Based Services (IGILBS), and associate chair of the Department of Geographic Information Engineering, School of Remote Sensing and Information Engineering, Wuhan University. His research interests are Earth science data and information systems, Web GIS and GIServices, and GIS software and engineering.
Xicheng Tan received the B.Sc. degree in surveying engineering from Taiyuan University of Science and Technology, Taiyuan, China, in 2002, the M.S. degree in geographical information systems from Wuhan University, Wuhan, China, in 2004, and the Ph.D. degree in photogrammetry and remote sensing from Wuhan University, in 2007. He is now an Associate Professor of Geographic Information Science with the International School of Software in Wuhan University. His research interests include geospatial web service, distributed computing, cloud computing, high performance computing, and 3-D GIS.
Yuqi Bai received the Ph.D. degree in cartography and GIS (geography) from the Institute of Remote Sensing Applications, Chinese Academy of Sciences, Beijing, in 2003. He is now an Associate Professor at the Center for Earth System Science at Tsinghua University, Beijing, China. His current research interests include geospatial semantics, spatial-temporal analysis of climate change data, and in-situ sensor videos.