Conceptual Challenges and Practical Issues in Building The Global Lake Ecological Observatory Network

Sameer Tilak, Peter Arzberger, David Balsiger, Barbara Benson, Rohit Bhalerao, Kenneth Chiu, Tony Fountain, David Hamilton, Paul Hanson, Tim Kratz, Fang-Pang Lin, Tim Meinke, Luke Winslow

Abstract

Freshwater lakes provide a number of important ecosystem services such as supply of drinking water, support of biotic diversity, transportation of commercial goods, and opportunity for recreation. Wireless sensor networks allow continuous, fine-grained, in situ measurements of key variables such as water temperature, dissolved gases, pH, conductivity, and chlorophyll. Instrumenting lakes with sensors capable of sampling environmental variables is becoming standard practice. Furthermore, many limnologists around the world are interested in accessing and performing research on data collected from lakes around the globe to provide local, regional, and even global understanding of lake ecosystems. To that end, a number of limnologists, information technology experts, and engineers have joined forces to create a new, grassroots, international network, the Global Lake Ecological Observatory Network (GLEON). One of our goals is to build a global, scalable, persistent network of lake ecology observatories. However, designing and implementing technology that meets the requirements of a large-scale distributed observing system such as GLEON has, thus far, been challenging and instructive. In this paper, we describe several key conceptual challenges in building the GLEON network. We also describe several practical issues and lessons learned during operation of a typical GLEON site.

1. INTRODUCTION

Freshwater lakes provide a number of important ecosystem services such as supply of drinking water, support of biotic diversity, transportation of commercial goods, and opportunity for recreation. Unfortunately, statistics indicate that these services are increasingly coming under stress. For example, the Millennium Ecosystem Assessment project concluded that current use of freshwater for drinking, industry, and irrigation is unsustainable. Approximately 1.1 billion people lack access to clean drinking water, and inadequate water, sanitation, and hygiene conditions result in the deaths of about 1.7 million people per year. The situation is further exacerbated because global demand for freshwater, which doubled from 1960 to 2000, is expected to continue increasing [1].

These statistics motivate the importance of understanding how changes in land use, human population, and climate interact with lake dynamics at local, regional, continental, and global scales. Developing this understanding at such scales is a daunting challenge, in part because ecological systems are characterized by high spatial and temporal variability [2], non-linear dynamics [3], [4], and coupled physical/biological processes [5]. For example, ecologists are interested in gaining a deeper understanding of the causes of sudden and short-lived algal blooms, changes in the frequency of and response to disturbances such as mixing events caused by typhoons, sudden changes in rates of biogeochemical processes, and the wax and wane of fish stocks [6].

Researchers around the globe are using multiple approaches to understand this complexity and variability. Their effort primarily includes modeling, comparative analyses, and long-term observations. Central to all these approaches is high-quality, spatially and temporally dense, comprehensive, well-documented, and easily accessible data [7], [8]. To that end, advancements in sensors and wireless network technology are revolutionizing science by giving access to data at unprecedented spatial and temporal granularities. For example, wireless sensor networks allow continuous, fine-grained, in situ measurements of key variables such as water temperature, dissolved gases, pH, conductivity, and chlorophyll. Instrumenting lakes with sensors capable of sampling environmental variables is becoming standard practice. Furthermore, many limnologists around the world are interested in accessing and performing research on data collected from lakes around the globe to provide local, regional, and even global understanding of lake ecosystems [9].

A. The Global Lake Ecological Observatory Network

To respond to these challenges and to explore opportunities, a number of limnologists, information technology experts, and engineers have created a new, grassroots, international network, the Global Lake Ecological Observatory Network (GLEON, www.gleon.org).

(a) GLEON participants as of January 2007. The yellow dots indicate the sites at the inception of GLEON and the red dots indicate the sites that joined subsequently. Note that some of the sites do not have a buoy-based deployment, but indicate the presence of a participating institute, or support in terms of hardware, software, and personnel.

Fig. 1: GLEON network

One of our goals is to build a global, scalable, persistent network of lake ecology observatories. It is envisioned that the data from these observatories will allow us to better understand key processes including the effects of climate and land-use change on lake function, the role of episodic events such as typhoons in resetting lake dynamics, and carbon cycling within lakes. The observatories are envisioned to consist of instrumented platforms on lakes around the world capable of sensing key limnological variables and moving the data in near-real time to web-accessible databases. A common web portal is envisioned to allow easy access for researchers and the public. A series of web services supported by this portal is envisioned to allow computation of metrics based on the high-frequency data.

We now describe the tenets that have been instrumental in guiding GLEON's development [10]:

(1) Science is increasingly global in scale. Scientific issues critical to society transcend national boundaries. For example, change in the quality and quantity of freshwater resources and the importance of lakes and reservoirs in regional and global carbon balances are of global concern.
(2) Comparative lake studies are critically important. Comparative studies of lakes spanning multiple climatic, geologic, morphometric, and cultural characteristics can provide unprecedented insight into the dynamics of important lake processes, such as lake metabolism and atmospheric exchanges.
(3) A global network of instrumented research sites is attainable in the near future. The data gathered by these observatories can be made available in near-real time to the scientific community and the general public.
(4) Multidisciplinary partnerships are crucial. To realize the vision of a global network of lake observing systems, collaboration among lake scientists, engineers, computer scientists, educators, and information technology and management experts spanning multiple institutions around the world is critical.
(5) Scientific questions utilizing the new technology will drive ecosystems science forward.

The inaugural GLEON meeting occurred in March 2005, when limnologists, engineers, and information technology experts from ten countries met to discuss the scientific goals and information technology requirements of GLEON. Figure 1(a) shows the current state of the GLEON network. In the next section we describe key conceptual challenges that we encountered during the formation of such a large-scale grassroots network that spans multiple countries.

B. Relation with other observing systems

The Coral Reef Environmental Observatory Network (CREON) is a collaborating association of scientists and engineers from around the world striving to design and build marine sensor networks [11]. We believe that the lessons learned from designing and operating the GLEON network will be directly applicable to other international observing systems such as CREON.

2. CONCEPTUAL CHALLENGES

Since the inception of GLEON we have faced the following unique challenges.

• The resources are distributed, autonomous, and heterogeneous: The sites are geographically distributed and are administered locally, at the site level. In particular, each site designs its own architecture. The sites have a broad range of heterogeneous hardware, software resources, and information system support personnel and expertise.
• Different spoken languages are used to store data: Not all sites use the English language. Thus, in developing a common query language, the issue of a common language must be addressed.
• There are different administrative regimes that govern access to the resources.

In part, these traits reflect the international flavor of the activity, and the different funding streams and responsibilities that these bring (e.g., a mandate to make data publicly available in near-real time). These traits provide unique technological and policy challenges beyond those confronted by other large observing systems such as NEON [12], ORION [13], and EarthScope [14]. Because of the need for autonomous sites, our approach is to federate sites. As this process occurs, solutions, both organizational and technological, must be found to address the above considerations.

Although GLEON sites have a diverse set of hardware, software, and personnel resources, they do have some commonalities. We now briefly describe the setup at a typical GLEON site.

radios, one that bridges a serial connection and one that bridges an Ethernet connection.

On-buoy energy management strategy: Operation on buoys involves three main steps. First, the datalogger samples the attached sensors at regular intervals. Once the data is acquired from the sensors, it is logged at the datalogger. Finally, the datalogger uses a serial wireless radio to transmit the data to the field station. Although buoys have a renewable source of energy (solar energy, as mentioned earlier), short-term weather conditions and longer-term seasonality can lead to power shortages. It is therefore necessary to use energy in an intelligent fashion. On the buoys, power is stored in two separate batteries. One of these batteries is preferentially charged by the charge controller. The datalogger and other critical sensors are powered from the preferentially charged power bus. The radio and other non-critical system components are powered from the secondary battery. During times of limited charging capacity, the secondary battery is left uncharged while the primary battery, and thus all logging activity, is maintained. Once sufficient power is available, the secondary (radio) battery is charged, and the logged data is transmitted to the field station. This preferential charging of the logger battery prevents loss of data when there is not enough power to operate the wireless radio. Figure 2(a) shows variations of primary and secondary battery voltages for approximately four months during winter, when solar radiation is scarce. The low sun angle and short day length result in insufficient power to charge both the radio and datalogger batteries simultaneously, so preference is given to the datalogger battery. Importantly, when sufficient solar radiation returns, both batteries are charged and both the radio and the datalogger become operational.
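The preferential-charging policy described above can be illustrated with a minimal Python sketch. This is not the charge controller's actual firmware; the voltage thresholds (`FULL_V`, `RADIO_MIN_V`), the function name, and the `Buoy` class are hypothetical and chosen only to show the logic: the primary (datalogger) battery is always charged first, logging never stops, and logged samples are held back until the secondary (radio) battery can support transmission.

```python
from collections import deque

# Hypothetical thresholds, for illustration only (volts).
FULL_V = 13.0       # voltage at which a battery is treated as fully charged
RADIO_MIN_V = 11.5  # minimum secondary-battery voltage needed to run the radio


def choose_battery_to_charge(primary_v, secondary_v, charging):
    """Pick the battery that receives charge this cycle, primary first."""
    if not charging:
        return None
    if primary_v < FULL_V:
        return "primary"      # datalogger battery always has priority
    if secondary_v < FULL_V:
        return "secondary"    # radio battery is charged only with surplus power
    return None


class Buoy:
    """Logs every sample; transmits the backlog only when the radio has power."""

    def __init__(self):
        self.backlog = deque()  # samples logged but not yet transmitted

    def cycle(self, sample, secondary_v):
        self.backlog.append(sample)          # logging is never skipped
        if secondary_v < RADIO_MIN_V:
            return []                        # radio off: keep data on the logger
        sent = list(self.backlog)            # radio on: flush everything logged
        self.backlog.clear()
        return sent
```

In this sketch a winter period with a depleted radio battery simply grows the backlog; once the secondary battery recovers, the next cycle returns all accumulated samples in order.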
Operation and weekly visits: Regular field visits for maintenance are critical to buoy operation. Visits are made on a regular basis at all GLEON sites, but the frequency depends on the lake, as some lakes are more biologically productive than others and biofouling of equipment occurs more quickly. As a specific example, with the current sensor package at NTL, regular maintenance consists primarily of optical sensor window cleaning, collection of calibration samples, and overall buoy inspection every two weeks. Depending on the lake and weather, anchors are sometimes tightened or reset. All other monitoring activities are done remotely via the wireless connection.

Instrument management: A core activity in building and maintaining an observing system is bringing a new instrument or site into an existing network. The general problem encompasses coordination between field engineers and central network administrators to install a new device and ensure that the complete data/control path is functioning properly. Our current approach consists of a series of manual steps, starting with adding a new sensor and ending when real-time monitoring of data from the new sensor is set up. The goal is to automate this process as much as possible and thus reduce requirements on field

3. GLEON TESTBED

As a concrete example, we describe the setup at Trout Lake Station, which is part of the North Temperate Lakes (NTL) Long-Term Ecological Research (LTER) program in northern Wisconsin, USA. NTL is one of the first GLEON sites. At NTL, scientists have deployed instrumented buoys in lakes to monitor key limnological variables (see Figure 2(b)). Each buoy is solar-powered and hosts a datalogger. Typically, a digital or an analog sensor is connected to a datalogger device, which acts as a simple computer to drive the sensor, store the results, and create derived data streams as specified in the datalogger program. The datalogger is equipped with a wireless radio, allowing it to be periodically polled for data transfer. At NTL a datalogger is connected to approximately 20 sensors, and the buoy is instrumented with sensors to measure variables including dissolved oxygen (DO), water temperature, wind speed/direction, and precipitation. Please refer to Table 1 for details on the sensors deployed at NTL. The dataset is continuously growing, but as of 4-30-2007 more than 21,225,012 data points have been collected. The oldest dataset begins in 1989 and the newest in 2006. The data is available in the public domain at http://lter.limnology.wisc.edu/.

4. PRACTICAL ISSUES AND LESSONS LEARNED

Having described a typical site-level setup, we now describe challenges faced during the operational phase of a GLEON site. Although we use NTL as a vehicle to discuss challenges and our efforts to mitigate them, the described issues are applicable to other GLEON sites and to other similar environmental monitoring initiatives.

Solar panel orientation: As mentioned earlier, since buoys are solar powered, selection and orientation of the solar panels is an important issue. We use many different solar panels from different manufacturers. Our results with solar panels have varied little between manufacturers as long as the panel wattages were similar. However, orientation was found to be most important. Some sites have settled on a vertically oriented solar panel (see Figure 2(b)) for practical reasons. In the vertical orientation the panel does not collect snowfall during the winter or bird dung during the summer, reducing the amount of maintenance required in all seasons. Although the orientation is not optimal with respect to sun angle, the resulting reduction in panel output has not been a power issue.

Wireless radio technology: Our initial plan was to use an 802.11-based wireless radio to connect the buoy to the field station. However, our attempts at using 802.11 2.4 GHz radios were not successful, as they did not seem to penetrate foliage well or have sufficient range. Instead, our buoys use Freewave 900 MHz serial radios [15]. We have two types of

TABLE 1: SENSORS DEPLOYED AT NTL

Sensor Name                                         | Measured Variable
----------------------------------------------------+-----------------------------------------
Dissolved oxygen sensor                             | Dissolved oxygen (absolute, percentage)
Thermistor chain containing 9-20 thermistors        | Water temperature
Anemometer                                          | Wind speed
Wind vane                                           | Wind direction
Photosynthetically Active Radiation (PAR) sensor    | Radiation at 400-700 nm wavelengths
Precipitation sensor                                | Precipitation
Atmospheric pressure sensor                         | Atmospheric pressure
Relative humidity sensor                            | Relative humidity
Air temperature sensor                              | Air temperature
Acoustic Doppler Current Profiler (ADCP)            | Water velocity

(a) NTL battery and datalogger voltages.

(b) NTL Buoy. Note the vertical orientation of the solar panel to minimize snow loading during winter and fouling by bird dung during summer.

Fig. 2: NTL battery and datalogger voltages and a typical instrumented buoy.

Acquisition Reconfiguration System (AARS) is being developed as a prototype that aims to address sensor detection, configuration, and metadata representation with the help of a service-oriented architecture (see Figure 3(a)).

engineers and administrators. Automating the process requires maintaining the necessary metadata and allowing services to access it. The next section describes our preliminary effort to address this issue.

5. SENSOR AUTO-DETECTION

We now describe a scenario in which a technician receives a sensor that needs to be deployed on a buoy. First, the sensor is calibrated and checked for basic faults. This calibration and testing is performed in the laboratory, and the data is documented in an Excel spreadsheet. Then the technician stops the module responsible for receiving data, as well as the datalogger itself. These steps are necessary because the datalogger must be reprogrammed to detect the new sensor and acquire data from it. Next, the datalogger is reprogrammed with the new sensor configuration using the aforementioned calibration data. The database tables are then reconfigured so that the data can be deposited into the database. In this approach, the process of sensor detection, sensor configuration, and metadata representation involves multiple manual steps. A process with such a high level of manual intervention does not scale well with the number of sensors. To design a system that scales well, we need to develop a support system that replaces these manual steps with automated procedures. To that end, the Automatic

A. Configuration Wizard

The system includes a portal built around GridSphere [10]. One feature of this portal is a configuration wizard. Before deploying a new instrument, a field technician specifies the associated metadata (such as instrument ID, sampling frequency, etc.) using the configuration wizard. The data is stored in the Automatic Acquisition Reconfiguration System (AARS) Metadata Repository (Figure 3(b)). Once an instrument is deployed, the instrument agent, running on a laptop that is physically co-located with the datalogger, first auto-detects the new instrument and then queries this metadata using Web services, which we describe below.

B. AARS Instrument detection agent

The instrument agent is responsible for the auto-detection of sensors. The instrument detection routine works by polling all possible instrument addresses, noting when new instruments have been added. This routine runs in the datalogger itself and is uploaded to the datalogger by the instrument agent. This polling

(a) System Architecture for Instrument Management

(b) AARS Deployment Diagram for GLEON Instrument Management.

Fig. 3: GLEON Instrument Management System Prototype

Repository is to make the sensor metadata digitally available. We decided to use RDF [16] to represent sensor metadata to enable resource discovery. We have used Jena [8] to store and retrieve the metadata. Jena is a Java framework for building Semantic Web applications. It provides a set of APIs for working with ontology models, for manipulating RDF models, and for persistent storage of RDF data in relational databases. In our implementation, we have used MySQL as the backend database [17]. During our experimentation, we found that the response time of Jena remained in an acceptable range even as the size of the metadata increased.

routine polls the addresses by sending a send-identification instruction to each address. If a sensor is present at the address being polled, it responds with its sensor identifier. If a sensor is not present at the polled address, a value is sent that indicates the absence of a sensor at that address. The sensor identifiers and the values indicating absence of a sensor are stored as binary data in the final storage area of the datalogger. The instrument agent decodes the data from binary format to ASCII text format. For each newly added sensor, the instrument agent checks whether metadata for the sensor is present in its internal cache. If metadata about a sensor is not present, it is retrieved from the metadata repository. The program running on the datalogger is responsible for reading data from the connected sensors. These data values are also stored in the final storage area in binary format. They are decoded by the instrument agent and then stored in the database. The instrument agent internally maintains the state of the datalogger addresses and the sensors present at those addresses. The information that the instrument agent keeps consists of sensor IDs and their channels. The instrument agent periodically polls the datalogger to check the final storage for binary data. The instrument agent uses its internal state and the decoded binary data to check whether any existing sensors have been removed or any new sensors have been added.
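The detection logic described above (poll every address, compare against the agent's internal state, and fetch metadata only on a cache miss) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the AARS implementation: the datalogger bus is modeled as a plain dictionary, `ABSENT` stands in for the "no sensor" value, and the function names and the `fetch` callback (a stand-in for the Web service call) are hypothetical.

```python
ABSENT = None  # hypothetical marker for "no sensor at this address"


def poll_addresses(bus, addresses):
    """Ask every address to identify itself.

    `bus` is a stand-in for the datalogger: a dict mapping address -> sensor id.
    Returns a dict mapping each polled address to a sensor id or ABSENT.
    """
    return {addr: bus.get(addr, ABSENT) for addr in addresses}


def diff_state(known, polled):
    """Compare the agent's known state with a fresh poll.

    Returns (added, removed), each a dict of address -> sensor id.
    """
    added = {a: s for a, s in polled.items()
             if s is not ABSENT and known.get(a) != s}
    removed = {a: s for a, s in known.items()
               if polled.get(a, ABSENT) != s}
    return added, removed


def metadata_for(sensor_id, cache, fetch):
    """Return metadata from the local cache, fetching it only on a miss."""
    if sensor_id not in cache:
        cache[sensor_id] = fetch(sensor_id)  # e.g. a Web service lookup
    return cache[sensor_id]
```

A sensor that is swapped for a different one at the same address shows up in both results: the old identifier in `removed` and the new one in `added`.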

6. ONGOING ACTIVITIES

A. Data stream management

GLEON sites have instrumented lakes with a broad range of sensors to gain a deeper understanding of lake metabolism. Managing a broad suite of instruments and their data streams is a serious challenge. More specifically, management of real-time sensor data streams presents major processing, communication, and administrative challenges. Two main requirements arise in this process: (1) site-level sensor data acquisition and stream data management, and (2) data and resource sharing across multiple sites. To address these requirements we have taken an approach based on RBNB DataTurbine, an open-source streaming data middleware. RBNB DataTurbine was developed and is owned by Creare Inc. [18]. After several years of collaboration, executives at Creare Inc. have released RBNB DataTurbine as open source in collaboration with the San Diego Supercomputer Center (SDSC) [19]. An open-source RBNB DataTurbine is a tremendous asset to the observing systems community, and the open-source announcement has generated considerable interest. RBNB DataTurbine provides an excellent basis for developing robust streaming data middleware. The current RBNB DataTurbine streaming data middleware system satisfies a core set of critical infrastructure requirements including reliable data transport, the promotion of sensors and sensor streams to first-class objects, a framework for the integration of heterogeneous instruments, and a comprehensive suite of services

C. Instrument Management Web Service and AARS Metadata Repository

If the instrument agent detects the presence of a new sensor, it performs the following steps. First, the agent needs the metadata for the newly detected sensor, which it retrieves from the AARS Metadata Repository by invoking a Web service. A parameter to this Web service invocation is the identifier of the new sensor (obtained in the previous step). The Web services allow the user to store metadata and allow the instrument agent to retrieve the stored data and metadata. Using this metadata, the instrument detection agent generates the program to reprogram the datalogger and uploads the new program to the datalogger. After this step the instrument detection agent is ready to receive data and store it in the backend database. The goal of the AARS Metadata

for data management, routing, synchronization, monitoring, and visualization. Streaming data middleware provides scientists and system users with richer control over data streams, sources, and sinks. To that end, we have started deploying RBNB DataTurbine at individual GLEON sites [18]. This allows individual sites to perform sensor data acquisition, transport, and dissemination in a scalable and reliable fashion. Since GLEON participating sites are distributed across multiple continents, sharing both historical and real-time data among them poses a significant challenge. This is further exacerbated by the autonomous nature of individual sites, the heterogeneous set of software and hardware technologies, and the variability in personnel support levels at the participating sites. RBNB DataTurbine also enables networking of sites, allowing them to share real-time sensor data streams. As a concrete example, two sites in Wisconsin (USA) and the Lake Erken site (Sweden) are already using RBNB DataTurbine for site-level data acquisition, transport, and dissemination. In addition to site-level support, their RBNB deployments allow them to seamlessly share their datasets in real time with other sites and authorized users. We are planning to work with other GLEON sites to help them install and use the RBNB DataTurbine-based system.

are working to develop a set of sophisticated, context-dependent intelligent agents capable of using disparate data sources to learn, make decisions, and provide early warning of sensor malfunction, instrument drift, or novel relationships among sensor data.
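As a much simpler illustration of the kind of automated check a QA/QC agent might start from, the following Python sketch flags readings that fall outside a plausible physical range or that jump sharply away from recent values. It is not the intelligent-agent system described above; the function name, the flag labels, and the `window` and `max_jump` parameters are assumptions made for this example.

```python
from statistics import median


def flag_readings(values, lo, hi, window=5, max_jump=2.0):
    """Return one flag per reading: 'ok', 'range', or 'spike'.

    'range' : the reading lies outside the plausible interval [lo, hi].
    'spike' : the reading differs from the median of the preceding
              in-range readings (up to `window` of them) by more than
              `max_jump`.
    """
    flags = []
    for i, v in enumerate(values):
        if not (lo <= v <= hi):
            flags.append("range")
            continue
        # Only earlier, physically plausible readings form the baseline.
        recent = [x for x in values[max(0, i - window):i] if lo <= x <= hi]
        if recent and abs(v - median(recent)) > max_jump:
            flags.append("spike")
        else:
            flags.append("ok")
    return flags
```

Using the median rather than the mean keeps a single earlier spike from distorting the baseline for the readings that follow it. Thresholds would need to be chosen per variable and per lake.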

B. Integrating heterogeneous instruments

[1] "Millennium Ecosystem Assessment. Ecosystems and Human Well-being: Synthesis," 2005.
[2] T. Kratz, L. A. Deegan, M. E. Harmon, and W. K. Lauenroth, "Ecological variability in space and time: Insights gained from the US LTER program," 2003.
[3] S. Carpenter, D. Ludwig, and W. A. Brock, "Management of eutrophication for lakes subject to potentially irreversible change," 1999.
[4] M. S. Scheffer, S. R. Carpenter, J. A. Foley, C. Folke, and B. Walker, "Catastrophic shifts in ecosystems," 2001.
[5] D. Hamilton and S. Schladow, "Prediction of water quality in lakes and reservoirs. Part I - Model description," Ecological Modelling, 1997.
[6] S. R. Carpenter, "Regime shifts in lake ecosystems: Pattern and variation," 2003.
[7] National Research Council, "Ecological indicators for the nation," 2000.
[8] National Research Council, "Our common journey: A transition toward sustainability," 2000.
[9] J. Porter, P. Arzberger, H. Braun, P. Bryant, S. Gage, T. Hansen, P. Hanson, F. Lin, C. Lin, T. K. Kratz, W. Michener, S. Shapiro, and T. Williams, "Wireless sensor networks for ecology," in Bioscience, 2005.
[10] K. Timothy, P. Arzberger, B. Benson, C.-Y. Chiu, K. Chiu, L. Ding, T. Fountain, D. Hamilton, P. Hanson, Y. H. Hu, F.-P. Lin, D. McMullen, S. Tilak, and C. Wu, "Towards a global lake ecological observatory network," in Publications of the Karelian Institute, 2006.
[11] "CREON: The Coral Reef Environmental Observatory Network." [Online]. Available: http://www.coralreefeon.org/
[12] "The National Ecological Observatory Network." [Online]. Available: http://www.neoninc.org
[13] "The Ocean Research Interactive Observatory Network." [Online]. Available: http://www.orionprogram.org
[14] "The EarthScope project." [Online]. Available: http://earthscope.org
[15] "Freewave radios." [Online]. Available: http://www.freewave.com/products/industrial-radios.html
[16] "Resource Description Framework (RDF)." [Online]. Available: http://www.w3.org/RDF/
[17] "MySQL." [Online]. Available: mysql.org
[18] "Creare Incorporated." [Online]. Available: http://rbnb.creare.com/rbnb/WP/WebWP/rbnbwp.html
[19] "San Diego Supercomputer Center (SDSC)." [Online]. Available: www.sdsc.edu
[20] "National Instruments." [Online]. Available: www.ni.com/
[21] "Campbell Scientific Inc." [Online]. Available: http://www.campbellsci.com/index.cfm

7. CONCLUSION

Understanding the dynamics of important lake processes, such as lake metabolism and atmospheric exchanges, can benefit immensely from comparative studies of lakes around the globe that have different climatic, geologic, morphometric, and cultural characteristics. In this paper, we described GLEON, a grassroots network that strives to build a global, scalable, persistent network of lake ecology observatories. We described key conceptual challenges that we encountered because of the distributed, autonomous, and heterogeneous nature of GLEON sites. We also described several practical issues and lessons learned during operation of a typical GLEON site.

8. ACKNOWLEDGMENTS

This work was partially supported by a grant from the Gordon and Betty Moore Foundation and the following NSF grants: OISE 0314015, OCI 0722067, 0627026, 0446802, 0446017, 0446298, and DBI 0639229.

REFERENCES

Environmental observing systems incorporate instruments from across the spectrum of complexity, from temperature sensors to Acoustic Doppler Current Profilers (ADCPs) to streaming video cameras. Again, our RBNB DataTurbine-based approach is crucial in mitigating this challenge. More specifically, we have prototype versions of device drivers for National Instruments [20] and Campbell [21] dataloggers. Through our efforts, we have integrated numerous sensors with RBNB DataTurbine, including Apprise Templine thermistor chains, the Vaisala Weather Transmitter WXT510, the Vaisala Digital Barometric Pressure Sensor PTB210, Axis video cameras and Axis cameras on pan-tilt-zoom (PTZ) platforms, the Nikon 5700 digital camera, the Greenspan dissolved oxygen sensor, LabVIEW-based DAQs, and strain gages, string potentiometers, and linear variable displacement transducers (LVDTs). Based on past experience, we are confident that RBNB DataTurbine can accommodate the broad variety of instruments that a global-scale observing system such as GLEON would have. As we continue with RBNB DataTurbine deployments at GLEON sites, we will have more opportunities to integrate an even broader set of instruments.

C. Data Quality Assurance and Quality Control (QA/QC)

A crucial component of research based on data streaming from sensors is automation of quality assurance and control of the acquired data. Ideally, QA/QC agents should be able to monitor sensor operation continuously, spot abnormal data readings, and provide information so that appropriate actions can be taken to mitigate any potential problems. To that end, we