Enhancing Uses of Spatial Data through Building Online Learning and Research Environment

Meixia Deng, Liping Di
Center for Spatial Information Science and Systems (CSISS), George Mason University (GMU)
[email protected], [email protected]

Abstract
GeoBrain, a NASA-funded project, builds an online learning and research environment with advanced capabilities in geospatial data publishing and access, information processing and retrieval, and knowledge building and sharing by adopting and developing the latest Web service and knowledge management technologies. This paper describes the key technologies used to establish this online environment, the methods the environment provides for discovering, accessing, visualizing, and analyzing geospatial data, and the ways in which it enhances uses of spatial data. The GeoBrain Data Download with interoperable, personalized, on-demand data access and services (IPODAS) and the GeoBrain Online Analysis System (GeOnAS) are introduced as major components of this online environment. The GeoBrain online learning and research environment is freely available to users worldwide and has great potential to benefit global geoscience higher education.

Keywords: spatial data, geoscience, Web service, online learning and research

1. INTRODUCTION
Enormous effort has been devoted to spatial data collection and use by international organizations and societies, national and regional agencies, state-run and private firms, and individuals all over the world, because spatial data are of significant importance in addressing social, economic, and environmental issues. The United States National Aeronautics and Space Administration (NASA) is one of the leading governmental agencies in these efforts. Its Earth Observing System (EOS) is “a coordinated series of polar-orbiting and low inclination satellites for long-term global observations of the land surface, biosphere, solid Earth, atmosphere, and oceans” (http://eospso.gsfc.nasa.gov/). The NASA EOS has collected huge amounts of geospatially referenced satellite image data. Those data are of great value in helping to “enable an improved understanding of the Earth as an integrated system” (http://eospso.gsfc.nasa.gov/), for scientists and the general public alike.

However, there are problems and barriers in using those data. Because of the huge volumes and the complexity of the EOS satellite image data (e.g., multiple sources and formats, different projections and resolutions, variable coverage and scales), using EOS data has never been a pleasant or easy experience. Although many efforts have been made to improve the use of EOS data, some offering great ease of use but limited flexibility (e.g., Google Earth) and others more local in nature but with better coverage (e.g., the AmericaView sites, http://www.americaview.org), many users interested in advanced scientific exploitation and problem-solving research with EOS data may still need to search for and order the data from NASA and affiliated centers through the EOS Data Gateway (EDG). The minimum size of an order in EDG is a data granule, which may cover a larger or smaller geographic area than desired. The data center takes the order and distributes the data to the user as whole granules in archival form. Services to transform data from archival form to a user-specified form are normally not provided. The user must spend a significant amount of time and computing resources on preprocessing so that the acquired data have the same format, map projection, spatial and temporal coverage, and spatial and temporal resolution as the other datasets used in the analysis. As a result, even a simple analysis may take a user several weeks, and many studies that require real-time or near-real-time data cannot be conducted. The user also needs adequate data preprocessing skills and computing resources to complete the task. This situation poses major obstacles to using EOS data.

To fulfill the EOS goals and take full advantage of the data collected by the EOS, the NASA Earth science program has put considerable effort into removing the barriers to using EOS data and into facilitating and enhancing their use [Scheweizer and Wei, 2003]. The GeoBrain project, funded by NASA in 2004, is one of these efforts. Its major goal is to enable easy access to, analysis of, and modeling with the large amount of NASA EOS data and their derived products by creating a unique data-enhanced online learning and research environment. This online environment especially facilitates the use of those data and resources for higher education and research in Earth system sciences.

This paper introduces the uses and functionalities of the environment by addressing its two major components: the GeoBrain Data Download with interoperable, personalized, on-demand data access and services (IPODAS) and the GeoBrain Online Analysis System (GeOnAS). Some important Web service technologies for building this online environment are introduced first to give readers a better understanding.
The GeoBrain Data Download with IPODAS and GeOnAS provide innovative methods for discovering, accessing, visualizing, and analyzing geospatial data, which significantly enhance uses of geospatial data and have great potential to benefit global geoscience higher education.

2. TECHNOLOGIES IN GEOBRAIN
The GeoBrain project builds an open, interoperable, Web-based, standards-compliant data and information system called GeoBrain by adopting and developing the latest Web service and knowledge management technologies and standards [Di, 2004]. The GeoBrain system has been implemented with advanced capabilities in geospatial data publishing and access, geospatial information processing and retrieval, and geospatial knowledge building and sharing, based on ISO, W3C, and Open Geospatial Consortium (OGC) Web service standards and specifications [Di, 2006]. The important technologies developed in GeoBrain for establishing the online learning and research environment include, but are not limited to, (1) OGC standards-based Web Coverage Service (WCS), Web Map Service (WMS), Web Feature Service (WFS), and Catalog Service-Web Profile (CSW) servers and clients; (2) OGC geoprocessing services, e.g., Web Coordinate Transformation Service (WCTS), Web Image Classification Service (WICS), Feature Cutting Service, and Reformatting Service; (3) value-added geospatial Web services; (4) the BPEL workflow engine BPELPOWER; and (5) Web service chaining and modeling tools. To help readers understand the technological foundations of the GeoBrain online environment, some details are given below.

The OGC WCS specification (http://www.opengeospatial.org/standards/wcs) supports the networked interchange of multi-dimensional and multi-temporal geospatial data as "coverages" and provides intact geospatial data products encoded in HDF-EOS, NetCDF, and GeoTIFF. The GeoBrain WCS server, based on OGC WCS, is fully implemented and supports coordinate transformation, domain subsetting, range subsetting, spatial scaling and resampling, data format encoding, and result file provision. It can handle multiple data formats and perform reprojection to satisfy different users’ requirements.

The OGC WFS specification (http://www.opengeospatial.org/standards/wfs) supports the networked interchange of geographic vector data as "features" encoded in Geography Markup Language (GML), a widely used XML-based language for representing and storing geographic vector data that meets the requirements of complex spatial analysis. The OGC WMS specification (http://www.opengeospatial.org/standards/wms) supports the networked interchange of geospatial data as "maps", which are generally rendered dynamically from real geographic data as spatially referenced pictorial images such as PNG, GIF, or JPEG. The GeoBrain WFS and WMS servers, based on OGC WFS and WMS, give users easy access to nationwide feature data and image maps, including states, counties, cities, roads, railroads, and rivers.

To help users discover data and services, the catalog service is a core component of GeoBrain. The OGC CSW (http://www.opengeospatial.org/standards/cat) defines a common interface that enables diverse applications to publish, discover, browse, and query collections of descriptive information (metadata) for data, services, and related information objects against distributed heterogeneous catalog servers. An innovative catalog federation based on OGC CSW has been developed in GeoBrain [Bai and Di, 2007]. The GeoBrain catalog service provides a single access point to two distributed catalogs: the CSISS CSW and the NASA Earth Observing System Clearinghouse (ECHO, http://www.echo.nasa.gov/). The CSISS CSW was developed to maintain metadata for data and services in the GeoBrain system.
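The idea of a single access point over several catalogs can be illustrated with a small sketch. The code below is not the GeoBrain federation and does not implement the CSW interface: the class names (`Record`, `InMemoryCatalog`, `FederatedCatalog`) and the keyword-only search are hypothetical stand-ins, and the two in-memory catalogs merely play the roles of a local catalog and a remote clearinghouse. It shows only the fan-out-and-merge pattern, in which a query is forwarded to every member catalog and the results are merged with duplicates dropped by identifier.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Record:
    """Minimal metadata record (hypothetical; real records carry many more fields)."""
    identifier: str
    title: str
    source: str  # name of the catalog that returned the record


class InMemoryCatalog:
    """Stand-in for a remote catalog; performs no network access."""

    def __init__(self, name: str, titles: Dict[str, str]) -> None:
        self.name = name
        self._titles = titles  # identifier -> title

    def search(self, keyword: str) -> List[Record]:
        needle = keyword.lower()
        return [
            Record(identifier, title, self.name)
            for identifier, title in self._titles.items()
            if needle in title.lower()
        ]


class FederatedCatalog:
    """Single access point that forwards a query to every member catalog."""

    def __init__(self, members: List[InMemoryCatalog]) -> None:
        self.members = members

    def search(self, keyword: str) -> List[Record]:
        seen: Set[str] = set()
        merged: List[Record] = []
        for member in self.members:
            for record in member.search(keyword):
                # Keep the first copy of a record that several catalogs hold.
                if record.identifier not in seen:
                    seen.add(record.identifier)
                    merged.append(record)
        return merged


# Usage: two toy catalogs that share one record.
local = InMemoryCatalog("local", {"a1": "MODIS NDVI", "b2": "Landsat scene"})
remote = InMemoryCatalog("remote", {"a1": "MODIS NDVI", "c3": "MODIS LST"})
hits = FederatedCatalog([local, remote]).search("modis")
```

In this sketch the order of the member list decides which copy of a duplicated record is kept; a real federation would also need to translate queries and metadata between the catalogs' differing schemas.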
NASA ECHO is a metadata clearinghouse and order broker being built by NASA’s Earth Science Data and Information System (ESDIS) to support efficient discovery of and access to Earth science data, especially those held at the NASA Distributed Active Archive Centers (DAACs).

In addition to the OGC standards-based geoprocessing services described above, GeoBrain has developed a large number of value-added geospatial Web services on top of GRASS (Geographic Resources Analysis Support System) and some GDAL (Geospatial Data Abstraction Library) functions. These services comply with OGC standards and specifications or with other national or international standards, and they provide an interoperable way to analyze raster and remotely sensed data. GRASS is an open source GIS software package with over 350 programs and tools for raster and vector data analysis. Together, the standards-compliant, Web-based geoprocessing services, data services, and data processing services in GeoBrain are used to generate customized scientific data products.

3. GEOBRAIN ONLINE ENVIRONMENT
Powered by the state-of-the-art Web service technologies described above, the GeoBrain online environment enhances uses of spatial data with innovative methods for publishing, discovering, accessing, visualizing, and analyzing geospatial data.

GeoBrain IPODAS and GeOnAS are the two main components of the online environment. Although their functionalities and usages are quite different, both provide users with easy access to NASA EOS data stored in the GeoBrain Data Repository or in the NASA EOS online data pools. The GeoBrain Data Repository is the data server of the GeoBrain system and is populated with about 20 TB of key NASA EOS data to provide the fastest and easiest access to the most popular satellite data. Four NASA EOS online data pools are in operation, at NASA GSFC, USGS EDC, NSIDC, and NASA LaRC. The total volume of EOS data in the pools is close to 600 TB. Machine-to-machine interfaces between GeoBrain and the NASA ECHO system have been established to give users transparent access to the NASA EOS online data in the data pools. By placing large amounts of NASA EOS data in its fast-access data repository and by querying and fetching data through ECHO, GeoBrain makes petabytes of NASA EOS data and information, especially those in the EOS Core System (ECS) data pools, as easily accessible as users’ local resources. The following subsections discuss the functionalities and usages of GeoBrain IPODAS and GeOnAS in some detail.

3.1 GeoBrain IPODAS
GeoBrain IPODAS is designed to make downloading data as convenient as possible for users. Users can obtain data in exactly the form they want through the GeoBrain data download Web portal (http://geobrain.laits.gmu.edu:8099/GeoDataDownload/), which is powered by IPODAS. In addition to discovering, subsetting, resampling, reformatting, georectifying, and reprojecting data, the portal offers several useful features: using Google Maps to specify a spatial area, clicking and dragging to define areas of interest, entering a country name to define the coordinates of the bounding box, multiple projections and multiple data formats, downloading data in compressed format, and asynchronous data download. Figure 1 is a snapshot of the GeoBrain data download Web portal (version 2.0). The portal provides an integrated, easy-to-use interface for users to obtain EOS data in the form they want (e.g., format, coverage, resolution, and projection).

Figure 1: Using GeoBrain Data Download Web Portal
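A request for data in a user-specified form maps naturally onto the parameters of an OGC WCS GetCoverage call. The sketch below assumes the WCS 1.0.0 key-value encoding; the paper does not state which WCS version the portal uses, and the endpoint `http://example.org/wcs` and the coverage name are placeholders rather than GeoBrain addresses. The helper name `build_getcoverage_url` is likewise hypothetical.

```python
from urllib.parse import urlencode


def build_getcoverage_url(endpoint, coverage, bbox, crs="EPSG:4326",
                          width=512, height=512, fmt="GeoTIFF"):
    """Build a WCS 1.0.0 GetCoverage URL (assumed version, see lead-in).

    bbox is (min_x, min_y, max_x, max_y) in the units of `crs`.
    width/height set the output grid size, i.e. the resampling.
    """
    min_x, min_y, max_x, max_y = bbox
    if min_x >= max_x or min_y >= max_y:
        raise ValueError("bbox must be (min_x, min_y, max_x, max_y)")
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,                    # which dataset
        "CRS": crs,                              # CRS of the bounding box
        "BBOX": ",".join(str(v) for v in bbox),  # spatial subset
        "WIDTH": width,                          # output columns
        "HEIGHT": height,                        # output rows
        "FORMAT": fmt,                           # output encoding
    }
    return endpoint + "?" + urlencode(params)


# Usage: a subset over the conterminous United States (placeholder names).
url = build_getcoverage_url(
    "http://example.org/wcs", "example_coverage", (-125.0, 24.0, -66.0, 50.0)
)
```

Each user choice in the portal (area of interest, projection, resolution, format) corresponds to one parameter here, which is why a standards-based server can deliver the customized product in a single request.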
3.2 GeOnAS
GeOnAS (http://geobrain.laits.gmu.edu:8099/OnAS/) is designed for online analysis of EOS and other online data. Built on a service-oriented architecture, the online analysis system provides multisource geospatial data discovery, heterogeneous geospatial data retrieval, simultaneous geospatial data visualization, and powerful geospatial data analysis, supported by a large number of interoperable geospatial Web services. It allows users to dynamically explore and preprocess any part of the petabytes of archived data and to get back customized information products rather than raw data. All of this can be done from a user’s desktop or laptop computer with a regular Internet connection. GeOnAS currently supports the Internet Explorer and Firefox browsers. Figure 2 is a snapshot of the GeOnAS Web interface.

Figure 2: GeOnAS Interface
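One way to picture how chained Web services turn archived data into a customized product is as a sequence of small steps, each taking a product description and returning a modified one. The sketch below is purely illustrative: the step names (`subset`, `reproject`, `reformat`), the dictionary-based product description, and the sample values are hypothetical, no real services are called, and it is not a model of GeOnAS internals.

```python
from functools import reduce
from typing import Callable, Dict, List, Tuple

Product = Dict[str, object]          # hypothetical description of a data product
Step = Callable[[Product], Product]  # stands in for one service invocation


def subset(bbox: Tuple[float, float, float, float]) -> Step:
    """Return a step that restricts the product to a bounding box."""
    return lambda product: {**product, "bbox": bbox}


def reproject(crs: str) -> Step:
    """Return a step that records a new coordinate reference system."""
    return lambda product: {**product, "crs": crs}


def reformat(fmt: str) -> Step:
    """Return a step that records a new output format."""
    return lambda product: {**product, "format": fmt}


def chain(steps: List[Step]) -> Step:
    """Compose steps left to right into a single callable."""
    return lambda product: reduce(lambda acc, step: step(acc), steps, product)


# Usage: describe an archived granule, then derive a customized product.
archived = {"name": "example_granule", "crs": "EPSG:4326", "format": "HDF-EOS"}
workflow = chain([
    subset((-80.0, 36.0, -75.0, 40.0)),
    reproject("EPSG:32618"),
    reformat("GeoTIFF"),
])
customized = workflow(archived)
```

Because every step returns a new description instead of modifying its input, the archived record stays unchanged, which mirrors the idea that users receive derived products while the archive itself is left as it is.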
4. SERVING HIGHER EDUCATION
GeoBrain is dedicated to serving Earth science higher education and removing the barriers to data use in education. Satellite imagery is the kind of data most commonly used in Earth science education. The most common barriers to using those data for education, as identified by the AccessData and Data Services Workshops (serc.carleton.edu/usingdata/accessdata/index.html), are discoverability (finding the data), data format problems, incomplete datasets, and poor documentation on the website [Lynds, et al, 2007]. GeoBrain addresses each of these problems. To better serve classroom teaching, learning, and research activities in higher education, faculty members at multiple universities have been funded as GeoBrain education partners to explore the use of GeoBrain in classroom teaching and student research. Their feedback has been used to guide the development of GeoBrain for the higher education user community. The GeoBrain online environment also aims to relieve universities of the concern of not having adequate software systems and sufficient resources for students to perform dynamic analysis of the large amounts of EOS data.

Higher education and research can use GeoBrain in four major ways: as an unlimited online source of global geodata, as an online data analysis system, as an online platform for geospatial processing modeling, and as an online platform for building and sharing geospatial knowledge. With GeoBrain, professors who use EOS data for classroom teaching no longer have to spend time obtaining the needed samples of EOS data and then georectifying, reprojecting, and reformatting the data into a form acceptable to their in-house analysis systems. Teaching tasks that were formerly very challenging or even impossible can become much easier or practical with the help of GeoBrain. The online system supports dynamic classroom demonstrations and problem-based learning in which students deal with real-world applications that involve issues of global climate and environmental change and that require data-intensive information. Using GeoBrain, each student can be trained to handle multiple terabytes of EOS and other geospatial data with a simple Internet-connected computer. Students can use these data for simulation and modeling to solve global-scale problems relevant to their own research interests. In this sense, the GeoBrain online environment plays a unique role in helping universities prepare students as the well-trained workforce demanded by changing societies.

The GeoBrain online learning and research environment is freely available to users worldwide. User statistics for 2006 and 2007 show significant month-by-month increases in the number of distinct users and in data download volume. For example, from October to November 2007 the number of distinct users increased by 999 and the volume of data downloaded increased by 10.5 GB. Because GeoBrain aims to provide customized data and information products instead of large raw data files, the absolute data volumes downloaded are not very meaningful; however, the monthly increases in distinct users and download volumes indicate that the GeoBrain online environment has significantly enhanced uses of spatial data and has great potential to benefit global geoscience higher education.

5. CONCLUSION
In conclusion, the GeoBrain online learning and research environment, powered by GeoBrain IPODAS and GeOnAS, serves Earth science higher education with innovative capabilities for discovering, accessing, visualizing, and analyzing geospatial data, and greatly enhances uses of spatial data. In this environment, users can easily access huge volumes of NASA EOS data as if the data were held in their local resources. Through their desktop computers, they can interactively explore answers to scientific questions by mining the petabytes of NASA EOS data and performing further analysis. Such an environment stimulates students’ curiosity about science and enables faculty members and students to carry out many studies that could not be done before. The GeoBrain online environment is changing global geoscience higher education in many respects.

REFERENCES
Chapters in Edited Volumes
Di, L. (2006). “The Open GIS Web Service Specifications for Interoperable Access and Services of NASA EOS Data”, in Qu, J. et al. (Eds.), Earth Science Satellite Remote Sensing, Springer-Verlag, pp. 254-268.
Web-based articles
Scheweizer, D. and Wei, M. (2003). NASA REASoN: Innovative Uses Of Earth Science Enterprise Data In Earth System Science Education, at http://gsa.confex.com/gsa/2003AM/finalprogram/abstract_65411.htm
Proceedings
Di, L. (2004). “GeoBrain-A Web Services based Geospatial Knowledge Building System”, Proceedings NASA Earth Science Technology Conference 2004, June 22-24, 2004, Palo Alto, CA, USA (8 pages, CD-ROM).

Copyright Notice
As a condition for inclusion in the GSDI 10 Conference Proceedings, this work is licensed under the Creative Commons Attribution 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/us/. Any use of the work in contradiction of the Creative Commons License requires express permission by the authors of the article.