EnviroInfo 2012: EnviroInfo Dessau 2012, Part 1: Core Application Areas Copyright 2012 Shaker Verlag Aachen, ISBN: 978-3-8440-1248-4
Monitoring of Environmental Status through Long Term Series: Data Management System in the EnvEurope Project

Alessandro Oggioni 1, Paola Carrara 1, Tomas Kliment 2, Johannes Peterseil 3 and Herbert Schentz 3

Abstract

Recent innovations in information science have improved the development of systems for the creation, collection, storage, processing, modelling, interpretation, display and dissemination of data and information in Environmental Science (Page and Wohlgemuth, 2010). International initiatives such as SEIS (Shared Environmental Information System), GMES (Global Monitoring for Environment and Security) and GEOSS (Global Earth Observation System of Systems); projects like Humboldt, NatureSDIplus, BioFresh and GIGAS; consortia such as GBIF (Global Biodiversity Information Facility), LifeWatch, DataONE (Data Observation Network for Earth) and OGC (Open Geospatial Consortium); and legal frameworks in Europe (INSPIRE, Directive 2007/2/EC) and in the United States (OMB Circular A-16, 2002) have stimulated the effective implementation of these information technology innovations. Their common feature is the development of an infrastructure that facilitates the discovery, evaluation and use of data, information and knowledge. Sharing large datasets can establish a much deeper understanding of both nature and society, open up many new avenues of research and assist policy-makers with environmental policies (AA.VV., 2011). This paper deals with issues related to establishing an architecture for data exchange within the Long Term Ecological Research domain in Europe and proposes solutions to them in order to provide an interoperable system.
1 Introduction

The largest site-based European network for long-term research in ecological science (LTER-Europe) is proposing a design for high-quality environmental monitoring sites and an exemplary establishment of a common set of parameters within the Life+ EnvEurope project (LIFE08 ENV/IT/000339). LTER is an international network (ILTER) with sites in the US, Japan and many other countries. LTER-Europe builds on existing research sites in European countries where long data series are collected. Many of the LTER sites actively participate in the EnvEurope project, where the objective of the data management action group is to design the architecture of the data management system and to provide tools and services for exchanging metadata and data among the partners. Moreover, the results gained from the resources collected by the involved sites should provide a case study for the whole LTER-Europe network. The proposed architecture combines methodologies currently used within the LTER-Europe community with practices adopted by international LTER communities and with related European and international recommendations and standards.

The initial phase of the EnvEurope project investigated the state of the art through questionnaires on the tools used for data management at the research sites. The results gave a first general overview of the data management approaches as well as of the expectations of the beneficiaries’ community. They showed a rather low technical and standardization level in data management so far. Therefore, a common data exchange system needs to be designed and implemented in order to meet user requirements and reach the expected level of interoperability. Two solutions have been proposed on the basis of this situation. First, as a “short-term” solution, the decision was taken to
1 Institute for Electromagnetic Sensing of the Environment, National Research Council (IREA-CNR), Via Bassini 15, Milano (Italy).
2 Institute for Marine Sciences, National Research Council (ISMAR-CNR), Arsenale - Tesa 104, Castello 2737/F, Venezia (Italy).
3 Ecosystem Research & Monitoring, Umweltbundesamt GmbH, Spittelauer Lände 5, Wien (Austria).
create both a web-based online metadata entry tool and a simple FTP file-based data exchange for the project's needs, with data uploaded to the same system. Second, a “long-term” solution aims to foster the creation of standard web services and OGC solutions at the site premises.

In order to design and implement the components of the proposed EnvEurope data management system, the following actions have been performed:

1. An EnvEurope community metadata profile for the dataset level has been developed, based on the EML (Ecological Metadata Language) specification. The current draft version 2.0 is available from the EnvEurope DEIMS portal. It also provides details about a metadata crosswalk between EML and ISO metadata profiles, which has been developed to ensure interoperability with INSPIRE metadata (INSPIRE, 2008).
2. An online metadata entry and management tool, called the EnvEurope Drupal Ecological Information Management System, has been developed. It is meant to collect the metadata and is implemented as an extension of the Drupal Ecological Information Management System (DEIMS) developed by US LTER. Its content types, views and modules support all EnvEurope partners throughout the lifecycle of ecological meta-information: collection, search, display of results and distribution of metadata. An indexed and controlled thesaurus for information retrieval has been developed as well, with the aim of defining a common list of ecological concepts. It is based conceptually on the terminology used within the LTER community and technically on semantic web standards.
3. An INSPIRE discovery service based on the GeoNetwork opensource software is being implemented in order to expose the metadata collected with the EnvEurope DEIMS to remote discovery clients (e.g. the INSPIRE Geoportal).
4. A GeoViewer component for geographical visualization has been developed. All research stations and EnvEurope sites where the data are actually collected can be portrayed on a reference map and linked with their metadata.
5. A data reporting format, based on the UNECE ICP Integrated Monitoring programme, has been developed for the collection of observation data (e.g. meteorology, air or water quality, abundance of species). Currently, the reported data can be uploaded directly by the partners and then accessed and downloaded through an FTP repository.
6. OGC Web Services (e.g. Sensor Observation Service, SOS) and the Linked Data approach are currently being considered as observation data access services over the Internet. A prototype of such a service-based solution will be designed, implemented and tested in the near future in order to evaluate whether it is feasible and whether it benefits both data providers and consumers.

The following sections provide more details about the points above, with the main focus on the components that have already been developed.
2 State of play

Ecological monitoring and the long-term study of ecological systems need a shared, scientifically sound basis and methodological harmonization at the European scale, in order to improve environmental management and to support the development of environmental policies and preservation planning through approaches that integrate objectives, resources and disciplines. The EnvEurope project (Mirtl and Krauze 2007, Mirtl et al. 2009, Mirtl 2010) proposes a design for high-quality environmental monitoring and long-term research sites and the exemplary establishment of a set of common parameters to be collected across the largest site-based network of Long-Term Ecosystem Research in Europe (LTER-Europe consists of about 400 research sites). LTER-Europe was established recently (2006) under the auspices of the FP6 Network of Excellence ALTER-Net. It builds on existing infrastructures and therefore provides many valuable data series. The project focuses on three types of ecosystems (terrestrial, continental water and marine) from 11 countries (http://www.enveurope.eu/). It aims at defining research and monitoring activities relevant to different levels/scales of investigation, with specific monitoring intensities and with methods adjusted to
the respective assessment intensity, implementing a multi-level and multi-functional approach to support future preservation planning and environmental policies (Los et al. 2010) and to obtain sufficient information on widespread ecological phenomena (Pullin and Salafsky 2010). The project has been conceived and planned in the conceptual and operative context of SEIS and will contribute to the development of the GMES initiative as well (Peterseil et al. 2011). The EnvEurope goals are to:

• Select and provide data, information and ecological indicators concerning the long-term quality trends of terrestrial, marine and freshwater ecosystems at the European scale, inside the monitoring network E-LTER (European Long Term Ecosystem Research network).
• Select and collect data able to provide information on environmental quality and drivers with respect to the indicators and methodologies shared and applied in the main European networks (LTER-Europe, EIONET, EU Forest Focus & ICPs of UNECE/CLRTAP/WGE, Natura2000, etc.).
• Reorganise the E-LTER network on the basis of suitable sites, reflecting the ecological, political and economic stratification of Europe. The reorganisation also aims to contribute to the development of the SEIS and GMES initiatives.

In particular, the aim of the EnvEurope data management action group (EnvEurope Action 1) is to provide a framework for sharing data within the project and to provide a case study and the resulting recommendations for SEIS. The main task is the collection of available datasets on environmental quality parameters for selected indicators and on the methodologies applied in the relevant European networks (LTER-Europe, EIONET, EU Forest Focus & ICPs of UNECE/CLRTAP/WGE, Natura2000, etc.), as well as on the environmental and societal pressures influencing ecosystem quality, as required by the analysis activity. Data coming from pre-existing databases, as well as data newly generated in an exemplary survey, will be stored and provided. Other relevant information related to LTER and UNECE ICP-IM sites, as well as Natura 2000 sites, will be requested and stored, at least at a descriptive level. The action aims to provide a case study for SEIS through the following points (Peterseil et al. 2011):

• Demonstrating the usability of the LTER network data for facilitating biodiversity and ecosystem monitoring in Europe;
• supporting the analysis action with the necessary data flows and interfaces, from the measurements at the sites to the calculation of the state and trends of ecosystem quality and of cause-effect relationships;
• carrying out a monitoring process to analyse to what extent the chosen approaches to structuring metadata and data, as well as the tools used, can contribute to the development of SEIS.

This action, like the whole EnvEurope project, touches issues within the scope of LTER-Europe. The action can be divided into the following sub-actions:

• Metadata collection
• Selection and, where needed, adaptation of existing tools for data management
• Metadata and data architecture (ontology)
• Collection of existing/available historical data and of new data from field monitoring
• Discovery and sharing inside the network and among LTER sites
• Analysis of the relevance for the European Shared Environmental Information System (SEIS)
The EnvEurope project did not start from scratch. Experience from different European and global projects has been used to support the solutions adopted. These include, on the one hand, European projects and initiatives (e.g. ALTER-Net information management, LTER-Europe, LifeWatch ICT infrastructure concepts for biodiversity research, INSPIRE) and, on the other hand, initiatives related to the international LTER or the US LTER network (San Gil et al. 2010, Michener et al. 2011).
3 EnvEurope data management system design and its components

3.1 Evaluation of the starting position

As a first step towards the design and implementation of an integrated data management system within the EnvEurope project, a set of questionnaires was distributed in order to understand the current situation of metadata and data management at the research sites actively involved in the project. The results revealed a significant lack of homogeneity among these stakeholders in the management of both data and related metadata workflows. More details about the results of the initial investigation are provided in Peterseil et al. (2011).
3.2 System architecture overview

Because long-term ecological research and monitoring is fragmented, in Europe but also at the global scale, a data management system has to take the following points into consideration: harmonized data models, common data management solutions, an interconnected distributed network of data providers, statistical analysis tools, and workflow systems. All of these points can support the monitoring of the complex ecosystem domain and advance knowledge within it (Reichman et al. 2011). As a first step towards meeting them within the EnvEurope project, we have designed and implemented a tool for metadata collection, discovery and viewing, based on the EnvEurope metadata specification described later. The second component is the GeoViewer, which portrays the EnvEurope research sites on a map. The geospatial position of the research sites is served with the GeoServer application and distributed as an OGC WMS service. On top of this service the EnvEurope Geoportal (see footnote 4) has been developed.

For the data, we decided on a short-term and a long-term solution. In the short term, a solution based on OGC services could not be proposed, since the information technology level of the beneficiaries was not sufficient to reach an adequate level of interoperability. For the long term, the choice between classic OGC services and a Linked Data architecture is still being considered. However, some proposals based on OGC services (for example SOS, WCS) and on the Linked Data approach are already in the phase of sample implementation and testing. In this design phase we propose a general system architecture for the EnvEurope project (Figure 1), intended initially for the project beneficiaries and later for the whole LTER-Europe community (about 400 research sites). Future processes will include automatic data loading from the LTER research sites, in order to make the data available through a web data portal interface as well as through standard web service interfaces.
4 http://geoportal.lteritalia.it/
Figure 1 General EnvEurope System Architecture
3.3 EnvEurope community metadata profile

The Ecological Metadata Language (EML) has been used as the reference metadata model for dataset description within the EnvEurope project. EML was developed as a metadata specification for the description of ecological datasets and is widely used within the ecological domain, in the ‘network of networks’ ILTER (since 2008) and in the US LTER network (Michener et al. 1997, Jones et al. 2001). It is based on prior work done by the Ecological Society of America and associated efforts. The application schema of EML is implemented as a series of XML schemas that can be used in a modular and extensible manner to document ecological data.

The domain specific community profile (DSMP) reflects the results of user requirements exercises performed within a representative group of stakeholders. The user requirements for the DSMP content were then compared with the INSPIRE and EML metadata specifications. In addition, many consultations with experts from the project advisory board, the involved beneficiaries and other related projects (e.g. EXPEER) supported the final definition of this metadata specification. So far, 20 metadata elements have been proposed and included in the DSMP (Figure 2) and taken into account in the development of the metadata tool.

Considering the metadata requirements of the INSPIRE directive, we have created a crosswalk between the DSMP (EML) and the INSPIRE metadata profile, which is based on the ISO 19115 metadata standard (ISO 2003). The first part of the crosswalk maps the DSMP to the ISO Core metadata and also takes into account the INSPIRE-specific constraints defined by the implementing rules (INSPIRE, 2008). The metadata crosswalk has been developed according to the methodology described in Nogueras-Iso et al. (2005). Its first results are INSPIRE-compliant metadata records translated, using XSLT, from a sample metadata record defined in the DSMP (Kliment and Oggioni 2011).
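The idea of a crosswalk, a table pairing each source element with its target element, can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a plain Python dictionary instead of the XSLT used in the project, the three element pairs are chosen for the example and are not taken from the EnvEurope crosswalk, and the sample record is a simplified, namespace-free stand-in for a real EML document.

```python
import xml.etree.ElementTree as ET

# Illustrative crosswalk: EML element path -> ISO 19115 (gmd) element path.
# These three pairs are assumptions made for this sketch; the EnvEurope
# crosswalk covers the whole DSMP and is implemented in XSLT.
CROSSWALK = {
    "dataset/title":
        "gmd:identificationInfo/gmd:MD_DataIdentification/"
        "gmd:citation/gmd:CI_Citation/gmd:title",
    "dataset/abstract/para":
        "gmd:identificationInfo/gmd:MD_DataIdentification/gmd:abstract",
    "dataset/creator/organizationName":
        "gmd:identificationInfo/gmd:MD_DataIdentification/"
        "gmd:pointOfContact/gmd:CI_ResponsibleParty/gmd:organisationName",
}

# Simplified sample record (hypothetical content, no XML namespaces).
SAMPLE_EML = """<eml>
  <dataset>
    <title>Lake water temperature, 1990-2010</title>
    <abstract><para>Monthly temperature profiles.</para></abstract>
    <creator><organizationName>Example LTER site</organizationName></creator>
  </dataset>
</eml>"""


def crosswalk(eml_xml):
    """Return {ISO element path: value} for every mapped EML element found."""
    root = ET.fromstring(eml_xml)
    result = {}
    for eml_path, iso_path in CROSSWALK.items():
        node = root.find(eml_path)
        if node is not None and node.text:
            result[iso_path] = node.text.strip()
    return result


iso_values = crosswalk(SAMPLE_EML)
for path, value in iso_values.items():
    print(path.rsplit("/", 1)[-1], "=", value)
```

A table-driven mapping of this kind makes missing correspondences easy to spot: any EML element without an entry in the table simply does not appear in the ISO output.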
Figure 2 EnvEurope metadata profile with all 20 metadata elements.
3.4 Drupal Ecological Information Management system - Metadata component

The collection of metadata for dataset-level description was defined as the first task of the data management group. To fulfil this task, a common online metadata entry and management tool has been developed. It is named the EnvEurope Drupal Ecological Information Management system (see footnote 5). The currently released version provides a GUI for collecting dataset metadata. It is implemented as an extension of the Drupal Ecological Information Management System (DEIMS) developed by the US LTER. The metadata editor is a component of DEIMS that allows users to create metadata for their datasets according to the specifications developed for the DSMP. DEIMS also provides an advanced search query builder for filtering the results to be returned.
3.5 Service-based solutions and GeoViewer component

Geo-referenced data are traditionally managed by means of Geographic Information Systems (GIS). The Internet is a powerful innovation engine in this field too, and a new type of online application, called Web-GIS, has recently spread. These are usually applications providing web mapping facilities, where users can visually inspect thematic maps, managed as overlaid layers, and perform simple operations like pan and zoom. Moreover, in typical Web-GIS applications each repository that stores the data is strictly associated with a client interface, so that different repositories must be accessed through different user interfaces.

A further advance has been introduced by the development of geo-services, i.e. Web services that serve geographic data and tools. They are the building blocks of the so-called Spatial Data Infrastructures (SDI), the ICT infrastructures for sharing and consuming spatial data in an interoperable way. Geo-services are based on standard interfaces and allow the functions of data serving and data accessing to be decoupled. As usual in a service oriented architecture (SOA), on one side a geo-data service can serve its content to multiple clients, provided they support the same standards; on the other side, a client can access the data of different, distributed standard geo-services at the same time. This approach has been adopted and recommended in the INSPIRE Directive. OGC and Linked Data technologies are widely used in the domain of Web services for geographic information. Some OGC Web Services are included in the INSPIRE recommendations as network services (WMS as a view service and WFS as a download service), while SOS is recommended by
5 http://enveurope.geocatalogue.ise.cnr.it/deims/
GEOSS, and Linked Data has been developed and adopted by the World Wide Web Consortium (W3C).

Within the EnvEurope project the data to be served by the beneficiaries are mainly observations/parameter values with a spatial (x, y, z) and a temporal (t) dimension. Only in some cases can this information be provided as maps, for example thematic maps from remote sensing observations or the result of spatial processing over the data from one or more point stations. Therefore WMS and WFS will probably have limited use in EnvEurope. SOS is a more promising standard, all the more so because in this approach data (observations) are not treated as files (like the layers included in map servers) but as database records, which allows more flexible analysis and easier management of data with high temporal granularity. In EnvEurope, serving data through Web services is a first step towards valuable goals: data interoperability and sharing across multiple independent data repositories, where the observations collected by the different beneficiary institutions can be stored and maintained without the need for huge centralized storage facilities. As a more advanced solution, Linked Data can interlink observations and resources across the World Wide Web, where a resource can be a single value, a word, a graphic element, a whole document, a picture, or anything else that can be identified by a URI and retrieved via the HTTP protocol.

Useful client applications are available to visualise data served by WMS and WFS, and also by SOS. The EnvEurope project now offers OGC Web Services (WMS, see footnote 6, and WFS, see footnote 7) for viewing and accessing resources. The map service allows the geographic boundaries of the LTER sites represented by EnvEurope stakeholders to be shared and published. A map service for the whole LTER-Europe network has also been developed, to achieve homogeneity between ecological research networks.
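The contrast drawn above between observations handled as database records and layers handled as files can be illustrated with a small sketch. It is not an SOS implementation and does not reflect any EnvEurope data model: the `Observation` class, its field names, the site identifiers and all values are hypothetical, introduced only to show how records carrying (x, y, z, t) can be filtered by parameter and time window without touching whole files.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Observation:
    """One observed value with its spatial (x, y, z) and temporal (t) position."""
    site: str          # hypothetical site identifier
    parameter: str     # observed property, e.g. "air_temperature"
    x: float           # longitude
    y: float           # latitude
    z: float           # elevation or depth
    t: datetime        # observation time
    value: float


def select(observations, parameter, start, end):
    """Return observations of one parameter with start <= t < end, ordered by time."""
    hits = [o for o in observations
            if o.parameter == parameter and start <= o.t < end]
    return sorted(hits, key=lambda o: o.t)


# Invented sample records, deliberately not in time order.
records = [
    Observation("site-A", "air_temperature", 9.19, 45.46, 120.0, datetime(2011, 1, 2, 12), 4.1),
    Observation("site-A", "air_temperature", 9.19, 45.46, 120.0, datetime(2011, 1, 1, 12), 3.5),
    Observation("site-A", "water_temperature", 9.19, 45.46, -2.0, datetime(2011, 1, 1, 12), 5.0),
    Observation("site-B", "air_temperature", 12.35, 45.43, 2.0, datetime(2011, 2, 1, 12), 6.2),
]

january = select(records, "air_temperature", datetime(2011, 1, 1), datetime(2011, 2, 1))
for obs in january:
    print(obs.site, obs.t.isoformat(), obs.value)
```

Because each observation is an individual record, a query for one parameter over one month returns only the matching values, which is the kind of fine temporal access that a file-per-layer approach makes awkward.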
The GeoViewer is a component of DEIMS and provides the geographical visualization implemented in the EnvEurope GeoPortal. In addition, the EnvEurope GeoPortal offers features such as metadata search and metadata creation through the functionality of DEIMS. The connection between the GeoPortal and the metadata editor allows users to see all geographic information for each LTER site and then to discover the metadata of previously collected datasets, or to create new metadata and modify existing records. Furthermore, an “add WMS” function has been introduced, which allows users to load layers published by other projects, initiatives, institutions or authorities directly into the EnvEurope GeoPortal map window.
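For readers unfamiliar with WMS, the requests that a map client such as the GeoViewer sends to a map server are ordinary HTTP GET requests with standard key-value parameters. The sketch below builds two such URLs for WMS version 1.1.1 without contacting any server. The endpoint is the one given in the paper's WMS footnote; the layer name `lter_sites` and the bounding box are assumptions made for the example, and whether the endpoint still responds is not assumed.

```python
from urllib.parse import urlencode

# Endpoint as given in the paper's WMS footnote.
WMS_ENDPOINT = "http://geoserver.lteritalia.it/enveurope/wms"


def get_capabilities_url(endpoint):
    """URL asking the server to describe its layers and supported formats."""
    params = {"SERVICE": "WMS", "VERSION": "1.1.1", "REQUEST": "GetCapabilities"}
    return endpoint + "?" + urlencode(params)


def get_map_url(endpoint, layer, bbox, width=600, height=400):
    """URL requesting a PNG map image of one layer.

    bbox is (min_lon, min_lat, max_lon, max_lat) in EPSG:4326, which is the
    axis order used by WMS 1.1.1.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",              # empty value selects the default style
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return endpoint + "?" + urlencode(params)


capabilities_url = get_capabilities_url(WMS_ENDPOINT)
# "lter_sites" is a hypothetical layer name; real names come from GetCapabilities.
map_url = get_map_url(WMS_ENDPOINT, "lter_sites", (6.0, 36.0, 19.0, 47.0))
print(capabilities_url)
print(map_url)
```

An “add WMS” function in a client typically starts from the GetCapabilities response of a remote server, lists the advertised layers, and then issues GetMap requests like the second URL for the layers the user selects.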
4 Conclusions

This paper has described the current situation within the EnvEurope project concerning the design proposals for the data management system and the first steps taken towards its implementation. The next steps, before the project concludes in 2013, will concern the creation of a real data infrastructure. The web services for data distribution will be extended to all beneficiaries of the project, considering Sensor Web Enablement and Linked Data, in the classic style of a Semantic Service Oriented Architecture. Along the same lines, metadata discovery will be enabled by integrating a Catalogue Service for the Web (CSW).
6 http://geoserver.lteritalia.it/enveurope/wms and http://geoserver.lteritalia.it/lter_europe/wms
7 http://geoserver.lteritalia.it/enveurope/wfs and http://geoserver.lteritalia.it/lter_europe/wfs
Bibliography

AA.VV. (2011): Challenges and opportunities. Introduction, in: Science, 331(6018): 692.
INSPIRE (2008): Commission Regulation (EC) No 1205/2008 of 3 December 2008 implementing Directive 2007/2/EC of the European Parliament and of the Council as regards Metadata.
ISO (2003): ISO 19115:2003 - Geographic information - Metadata. ISO, Switzerland.
Jones, M.B., Berkley, C., Bojilova, J., Schildhauer, M. (2001): Managing Scientific Metadata, in: IEEE Internet Computing: 59-68.
Kliment, T., Oggioni, A. (2011): Bringing Eco-Biological Metadata to the INSPIRE metainformation world, in: GI2011-X-border-SDI/GDI-Symposium, 2011, Bad Schandau, Germany.
Los, W., Goense, D., Pauwels, E. (2010): e-Infrastructures and Sensor Networks for Biodiversity Research, in: Maurer, L., Tochtermann, K. (eds.), Information and Communication Technologies for Biodiversity Conservation and Agriculture. Shaker Verlag, Aachen. pp. 35-47.
Michener, W.K., Brunt, J.W., Helly, J.J., Kirchner, T.B., Stafford, S.G. (1997): Non geospatial Metadata for the Ecological Sciences, in: Ecological Applications 7(1): 330-342.
Michener, W.K., Porter, J., Servilla, M., Vanderbilt, K. (2011): Long term ecological research and information management, in: Ecological Informatics, 6(1): 13-24.
Mirtl, M., Krauze, K. (2007): Developing a new strategy for environmental research, monitoring and management: The European Long-Term Ecological Research Network's (LTER-Europe) role and perspectives, in: Chmielewski, T.J. (ed.), Nature conservation management - From idea to practical results. ALTER-Net. Lublin-Lodz-Helsinki-Aarhus, ISBN 83-87414-98-0. pp. 36-52.
Mirtl, M. (2010): Introducing the next generation of ecosystem research in Europe: LTER-Europe’s multifunctional and multi-scale approach, in: Müller, F., Baessler, C., Schubert, H., Klotz, S. (eds), Long-term ecological research: between theory and application. Springer, Dordrecht. 456 pp. ISBN: 978-90-481-8781-2.
Mirtl, M., Boamrane, M., Braat, L., Furman, E., Krauze, K., Frenzel, M., Gaube, V., Groner, E., Hester, A., Klotz, S., Los, W., Mautz, I., Peterseil, J., Richter, A., Schentz, H., Schleidt, K., Schmid, M., Sier, A., Stadler, J., Uhel, R., Wildenberg, M., Zacharias, S. (2009): LTER-Europe Design and Implementation Report - Enabling “Next Generation Ecological Science”: Report on the design and implementation phase of LTER-Europe under ALTER-Net & management plan 2009/2010. Umweltbundesamt (Federal Environment Agency Austria). Vienna. 220 pp. ISBN 978-3-99004-031-7.
Nogueras-Iso, J., Zarazaga-Soria, F., Muro-Medrano, P.R. (2005): Geographic Information Metadata for Spatial Data Infrastructures: Resources, Interoperability and Information Retrieval. Springer.
Page, B., Wohlgemuth, V. (2010): Advances in environmental informatics: Integration of discrete event simulation methodology with ecological material flow analysis for modelling eco-efficient systems, in: Procedia Environmental Sciences, 2: 696-705, International Conference on Ecological Informatics and Ecosystem Conservation (ISEIS 2010).
Peterseil, J., Schentz, H., Carrara, P., Oggioni, A. (2011): Accessing long term monitoring data as ground truth - contribution from EnvEurope. Data flow from space to earth, International conference, 2011, Venice, Italy.
Pullin, A.S., Salafsky, N. (2010): Save the Whales? Save the Rainforest? Save the Data!, in: Conservation Biology 24(4): 915-917.
Reichman, O.J., Jones, M.B., Schildhauer, M.P. (2011): Challenges and opportunities of open data in ecology, in: Science, 331(6018): 703-705.
San Gil, I., White, M., Melendez, E., Vanderbilt, K. (2010): Case Studies of Ecological Integrative Information Systems: The Luquillo and Sevilleta Information Management Systems, in: Communications in Computer and Information Science 108: 18-35.
Internet resources

Shared Environmental Information System (SEIS): http://ec.europa.eu/environment/seis/
Global Monitoring for Environment and Security (GMES): http://www.gmes.info/
Global Earth Observation System of Systems (GEOSS): http://www.earthobservations.org/geoss.shtml
Humboldt project: http://www.esdi-humboldt.eu
NatureSDIplus project: http://www.nature-sdi.eu/
BioFresh project: http://www.freshwaterbiodiversity.eu/
GEOSS, INSPIRE and GMES an Action in Support (GIGAS) project: http://www.thegigasforum.eu
Global Biodiversity Information Facility (GBIF): http://www.gbif.org/
LifeWatch: http://www.lifewatch.eu/
DataONE: http://www.dataone.org/
Open Geospatial Consortium (OGC): http://www.opengeospatial.org/
Infrastructure for Spatial Information in Europe (INSPIRE): http://inspire.jrc.ec.europa.eu/
Circular No. A-16 Revised: http://www.whitehouse.gov/omb/circulars_a016_rev
European Long-Term Ecosystem Research Network (LTER-Europe): http://www.lter-europe.net/
EnvEurope (LTER-Europe) Metadata Specification for Dataset Level: http://enveurope.geocatalogue.ise.cnr.it/deims/sites/default/files/EnvEurope_MD_Specification_FinalDraft_v2.0.pdf
Acknowledgements

The work in Action 1 (Data Management) of EnvEurope is closely linked to information management initiatives, especially those of the US LTER and ILTER, as well as to projects at the European level. We therefore want to thank David Blankman for his input to the metadata profile and application and for providing the link to US LTER; Inigo San Gil for providing the prototype of the Drupal MD Editor for EML and for his help along the way; John Porter for the US LTER Controlled Vocabulary; Chau-Chin Lin for providing an insight into Linked Data in LTER Taiwan; and the colleagues from EXPEER for fruitful discussions on how to proceed, from a European perspective, in linking the data management approaches of long-term monitoring and long-term experimental sites. We also want to thank all beneficiaries for their contribution to the “boring” work of data and information management.