Refreshed Data System for Tropical Atmosphere Ocean (TAO) Buoy Array

Landry Bernard1, Kevin Kern1, Jing Zhou2, Chung-Chu Teng1

1 National Data Buoy Center
2 Science Applications International Corporation
Stennis Space Center, MS 39529 USA

Abstract:

The Tropical Atmosphere Ocean/Triangle Trans-Ocean Buoy Network (TAO/TRITON) moored buoy array is a central component of the El Niño-Southern Oscillation (ENSO) Observing System to support research and forecasting of El Niño and La Niña. The present composition of TAO/TRITON consists of 55 TAO legacy ATLAS moorings developed by NOAA Pacific Marine Environmental Laboratory (PMEL) and maintained by NOAA National Data Buoy Center (NDBC), 12 TRITON moorings maintained by Japan Agency for Marine-Earth Science and Technology (JAMSTEC), and 5 subsurface Acoustic Doppler Current Profiler (ADCP) moorings (4 maintained by NDBC and 1 by JAMSTEC).

During the transition of the TAO array from PMEL to NDBC, it was decided to refresh the TAO system by replacing the obsolete components to ensure ongoing continuity of the TAO array. The major refreshed components include the data logger, underwater conductivity/temperature (CT) sensors, and the compass for measurement of wind direction. Meanwhile, to increase the transmission frequency and transmitted data volume, NDBC decided to use the Iridium communication system for the refreshed TAO system so that high temporal resolution data could be transmitted to NDBC each hour in near real-time. Accordingly, the shore-side data system for data ingest, processing, quality assurance/quality control (QA/QC), and display had to be modified and enhanced, and NDBC decided to redesign the data system for the TAO buoy system. The refreshed data system works with both the legacy TAO buoy system (via Argos) and the refreshed TAO buoy system (via Iridium). This paper presents the refreshed IT architecture and design for both the legacy and refreshed TAO buoy systems. Details of NDBC data management services, which include data acquisition, data quality control, data storage and retrieval functionality, metadata management, and user interfaces for distribution to the public, are discussed.

1. TAO DATA FLOW OVERVIEW

The purpose of this paper is to explain the TAO data flow and describe NDBC data management services for the TAO moorings according to National Oceanic and Atmospheric Administration (NOAA) Global Earth Observation – Integrated Data Environment (GEO-IDE) guidance. The services constitute a comprehensive end-to-end process including movement of data and information from the observing system sensors to data users. This process includes the data acquisition, quality control, metadata cataloging, validation, storage, retrieval, dissemination, and archival of data.

1.1 INTRODUCTION

The Tropical Atmosphere Ocean/Triangle Trans-Ocean Buoy Network (TAO/TRITON) moored buoy array is a central component of the El Niño-Southern Oscillation (ENSO) Observing System to support research and forecasting of El Niño and La Niña. The present composition of TAO/TRITON consists of 55 TAO legacy moorings maintained by NDBC, 12 TRITON moorings maintained by the Japan Agency for Marine-Earth Science and Technology (JAMSTEC), and 5 subsurface Acoustic Doppler Current Profiler (ADCP) moorings (4 maintained by NDBC and 1 by JAMSTEC). The corresponding TAO array layout is shown in Figure 1.

Figure 1 TAO Array Location Map

Much of the technology transitioned from the Pacific Marine Environmental Laboratory (PMEL) is approaching 10 years of age, and an increasing number of components are being discontinued or are no longer supported by manufacturers. The legacy moorings, also referred to as Autonomous Temperature Line Acquisition System (ATLAS) buoys, use Service Argos for data transmission via the Polar Operational Environmental Satellites (POES), which allows only a couple of transmissions per day. The ocean temperatures are therefore averaged over 24 hours on board the ATLAS buoy and transmitted on a daily basis. To ensure ongoing continuity of the TAO array, NDBC decided to refresh the TAO system by replacing the obsolete components. The major refreshed components include the data logger, underwater Conductivity/Temperature (CT) sensors, and the compass for measurement of wind direction. Meanwhile, to increase the transmission frequency and transmitted data volume, NDBC decided to use the Iridium communication system for the refreshed TAO system so that high temporal resolution data could be transmitted to NDBC each hour in near real-time. Accordingly, the shore-side data system for data ingest, processing, quality assurance/quality control (QA/QC), and display was modified and enhanced [1]. Figure 2 shows a refreshed TAO buoy.

1.2 REAL-TIME DATA FLOW

The primary data transmitted from TAO legacy moorings in real-time are daily mean surface measurements (wind speed and direction, air temperature, relative humidity, and sea surface temperature) and subsurface temperatures. Optional enhanced measurements include precipitation, short and long wave radiation, barometric pressure, salinity, and ocean currents. Service Argos receives the legacy moorings’ data via the POES satellites and delivers the raw data to NDBC, as shown in the TAO Data System diagram (Figure 3). Service Argos also receives sensor calibration coefficients and release controls from NDBC and places converted data on the Global Telecommunication System (GTS). The legacy mooring data are processed daily at NDBC immediately after receipt. The first step in the daily processing is the application of calibration functions. Once the data are converted to engineering units, they are subjected to a series of automated quality control checks, which produce a daily status report to be viewed by Data Quality Analysts (DQAs). The released data are then pushed to the TAO web data display and data delivery systems for public access.

The TAO refreshed buoys transmit data to the Department of Defense (DOD) Iridium gateway via Iridium satellites. High temporal resolution (10-minute and hourly) measurements are available in the hourly transmitted Short Burst Data (SBD). The Iridium gateway delivers the SBD to the NDBC Real-time Data Ingest and Dissemination subsystem, where a decoder decodes the SBD messages and stores the results in the underlying MySQL database. As the data are already in engineering units, there is no need to apply calibration functions. Hourly or high resolution data quality control checks are performed and the results are integrated into the same daily status report used for the legacy moorings. Quality controlled data are then pushed to the TAO web data display and data delivery subsystems and to the NWSTG GTS.

Figure 2 Refreshed TAO Buoy

Figure 3 TAO Data System (diagram labels: Iridium SBD; Argos ADS; NWSTG GTS; Real-time Data Ingest and Dissemination; Automated QC and Alerts; Database and File Management; Delayed Mode Analysis; Console Interfaces; Public Web Presentation; DAC Analysts; Scientists; Public Users)

In the following sections, the data flow from both the 55 TAO legacy moorings and the refreshed moorings is covered, with emphasis on shore-side data processing.

1.3 DELAYED-MODE DATA FLOW

After every legacy mooring recovery, the internally recorded, high temporal resolution data are subjected to a series of quality checks similar to those for real-time data. Data are de-spiked as necessary, and additional analysis is performed, such as computation of spectra and histograms. In order to compare the high-resolution data with the daily means returned in real-time, daily means are re-computed from the high-resolution data. In general, the re-computed daily means from delayed mode data are considered to be of equal or higher quality. If DQAs determine that this is the case, the re-computed daily means replace the ones returned in real-time. The high-resolution data are also made available on the NDBC TAO website. For refreshed mooring recovery, the recovered data are to be used to fill any data gaps caused by real-time transmission outages.

Personnel on the TAO array service cruises also perform shipboard Conductivity Temperature Depth (CTD) casts and shipboard ADCP measurements. The collected CTD data are inspected and made available on the NDBC TAO website with metadata and correction information. The collected ADCP data are sent to the University of Hawaii.
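The re-computation of daily means from recovered high-resolution data can be sketched as follows. This is a minimal illustration and not the NDBC implementation: the function names, the (timestamp, value) sample layout, and the `approved_days` argument standing in for the DQA decision are all assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def daily_means(samples):
    """Average (timestamp, value) samples by calendar day.

    Samples whose value is None (missing) are skipped.
    """
    by_day = defaultdict(list)
    for timestamp, value in samples:
        if value is not None:
            by_day[timestamp.date()].append(value)
    return {day: fmean(values) for day, values in sorted(by_day.items())}


def merge_daily_means(realtime, recomputed, approved_days):
    """Replace real-time daily means with recomputed ones on approved days.

    `approved_days` is a hypothetical stand-in for the analyst decision
    that the delayed-mode value is of equal or higher quality.
    """
    merged = dict(realtime)
    for day, value in recomputed.items():
        if day in approved_days:
            merged[day] = value
    return merged
```

Days that were never received in real-time can be filled the same way by including them in `approved_days`.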

2. DATA MANAGEMENT SERVICES

NDBC TAO Data Management services include data acquisition, data quality control, data storage and retrieval functionality, metadata management, and user interfaces for distribution to the public.

2.1 INTERFACE TO THE OBSERVING SYSTEM

The real-time data ingest subsystem receives Service Argos’s Automated Distribution Service (ADS) messages from the TAO legacy moorings around 10:00 am GMT each day. The ADS messages are organized by Argos Platform Transmitter Terminal (PTT) numbers, and the data from one PTT are organized into four 16-word ATLAS buffers. As each PTT is associated with a unique buoy at sea, the real-time processing decodes the buffers, converts the raw data to engineering units, and stores the results in the MySQL database for the corresponding buoy deployment [2].

For the TAO refreshed moorings, the Advanced Modular Payload Systems (AMPS) are used along with Iridium modems to transmit Short Burst Data (SBD). The TAO AMPS buffer map is re-designed within the SBD format to carry high resolution measurements. The subsystem receives the hourly SBD messages, then processes and stores them in the database immediately [3].

Delayed mode data are recovered from both types of mooring. The data formats depend on instrument types and vendors as well as on the on-board payload systems. The valid daily averages computed from recovered data are eventually stored in the database as the results of delayed mode QA/QC processing.
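The organization of legacy data by PTT and by four 16-word buffers can be illustrated with a short sketch. It assumes the ADS message has already been parsed into a mapping from PTT number to a flat list of 64 words in buffer order; the actual ADS layout, the word size, and every name below are hypothetical.

```python
WORDS_PER_BUFFER = 16
BUFFERS_PER_PTT = 4


def split_atlas_buffers(words):
    """Split the words from one PTT into four 16-word buffers."""
    expected = WORDS_PER_BUFFER * BUFFERS_PER_PTT
    if len(words) != expected:
        raise ValueError(f"expected {expected} words, got {len(words)}")
    return [
        words[start:start + WORDS_PER_BUFFER]
        for start in range(0, expected, WORDS_PER_BUFFER)
    ]


def route_by_ptt(messages, ptt_to_deployment):
    """Group the buffers under the deployment that owns each PTT.

    `messages` maps PTT number -> list of words (assumed layout);
    `ptt_to_deployment` maps PTT number -> deployment name.
    """
    routed = {}
    for ptt, words in messages.items():
        deployment = ptt_to_deployment.get(ptt)
        if deployment is None:
            continue  # no buoy deployment is registered for this PTT
        routed[deployment] = split_atlas_buffers(words)
    return routed
```

Conversion of each buffer to engineering units would follow this step and depends on the buffer map, which is not reproduced here.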

2.2 DATA QUALITY CONTROL AND ASSEMBLY

2.2.1 REAL-TIME DATA MONITORING

Once the real-time data in engineering units are stored in the database, a series of quality checks is performed by automated procedures. General checks flag values that exceed pre-determined range limits, change suddenly according to time continuity thresholds, are constant, are missing, or are intermittent. Tests specific to certain data types include checking the water temperature gradient, density inversions, and wind data for properly functioning compass and vanes. Ancillary data, such as mooring location, depth of pressure sensors, and other indicators of the general condition of the mooring or instruments, are also checked.

The Data Management Console is the primary tool for DQAs to monitor the real-time data. When necessary, a DQA is also able to control the release statuses of individual sensors and to assign quality flags to individual data elements or to a range of data elements over a specified time period. Several monitoring reports are described in the following sections.
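The general checks listed above might look like the following sketch. The flag names, thresholds, and the `constant_run` window length are illustrative assumptions, and the intermittency check is omitted; the operational procedures are documented in [4].

```python
def general_checks(values, low, high, max_step, constant_run=6):
    """Return one set of flags per sample.

    Flags (hypothetical names):
      "missing"  - the sample is None
      "range"    - the value lies outside [low, high]
      "step"     - the jump from the previous valid sample exceeds max_step
      "constant" - the last `constant_run` valid samples are identical
    """
    flags = []
    previous = None
    recent = []
    for value in values:
        sample_flags = set()
        if value is None:
            sample_flags.add("missing")
            flags.append(sample_flags)
            continue
        if not low <= value <= high:
            sample_flags.add("range")
        if previous is not None and abs(value - previous) > max_step:
            sample_flags.add("step")
        # Keep only the most recent valid samples for the constancy test.
        recent = (recent + [value])[-constant_run:]
        if len(recent) == constant_run and len(set(recent)) == 1:
            sample_flags.add("constant")
        previous = value
        flags.append(sample_flags)
    return flags
```

Note that in this simple form the sample following a spike is also flagged "step", because it is compared against the spike itself.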

2.2.1.1 DAILY STATUS REPORT

A Daily Status Report for the entire TAO buoy array is generated each day at 12:00:00 GMT. The report is readily accessible from the front page of the Data Management Console and from the side menus. It analyzes the most recent data for each measurement type on a buoy and displays the relevant flags.

The graphic status of the TAO buoy array appears at the top of the report. Each of the 55 nominal sites in the TAO array is displayed as a colored dot. A green dot indicates that 80% to 100% of sensors are good, a yellow dot indicates that 60% to 79% of sensors are good, and a red dot indicates that fewer than 60% of sensors are good, that the buoy has drifted outside its data grid, or that it has a transmission outage. These dots also link to the sensor status reports for the corresponding sites.

The sensor statuses are calculated in two phases: a flag generation phase and a flag inspection phase. In general, low level automated quality checks are performed in the flag generation phase and high level automated quality checks are performed in the flag inspection phase. Low level quality check settings are specified as range limits for either a site or a region. A range limit can be configured to specify the upper and lower limits for hourly data or for daily average data. It can also be configured to specify the upper and lower limits for hourly data changes or for daily average changes. For change limits, the lower limit detects near-constant sensor output and the upper limit detects sudden changes, which is equivalent to time continuity checking. The Environmental Quality Control (EQC) flag associated with the range limit indicates the meaning and severity (hard or soft) of exceeding the range limit.

After the low level flag generation phase, certain related checks are performed. For example, if the water temperature at a given depth is missing or has failed, then the corresponding conductivity, salinity, and density at that depth are also failed. The water temperature gradient, density inversion, and air temperature vs. sea surface temperature checks are performed, and soft flags may be raised.

A special class of range limit checking is called TAO 90-day checking. The mean and standard deviation of water temperature and salinity at a given site and depth range are calculated over a 90-day period centered at the calculation time. The upper limits are set to the mean plus three standard deviations and the lower limits to the mean minus three standard deviations. The range limits are updated on a weekly basis over the corresponding 90-day time window.
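The two numeric rules in this subsection, the site dot color and the 90-day limits, can be sketched as below. Function names are hypothetical. The text leaves the treatment of values between 79% and 80% open, so the sketch assumes thresholds at exactly 80% and 60%; it also assumes the sample (rather than population) standard deviation and treats a site with no sensors as red.

```python
from statistics import fmean, stdev


def site_status_colour(good_sensors, total_sensors, drifting=False, outage=False):
    """Map the share of good sensors to the dot color in the status report."""
    if drifting or outage or total_sensors == 0:
        return "red"  # the zero-sensor case is an assumption
    percent_good = 100.0 * good_sensors / total_sensors
    if percent_good >= 80.0:
        return "green"
    if percent_good >= 60.0:
        return "yellow"
    return "red"


def ninety_day_limits(values, n_sigma=3.0):
    """Return (lower, upper) = mean -/+ n_sigma standard deviations.

    `values` holds the measurements inside the 90-day window for one
    site and depth range; at least two values are required.
    """
    mean = fmean(values)
    spread = stdev(values)
    return mean - n_sigma * spread, mean + n_sigma * spread
```

Re-running `ninety_day_limits` once a week over the shifted window would correspond to the weekly update described above.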

2.2.1.2 SITE STATUS REPORT

As part of the Daily Status Report, a Tabular Data Report of the most recent data for each site is provided via a hyperlink. When a DQA finds something in the Daily Status Report that needs further investigation, clicking on the tabular data link opens this report. For legacy buoys, daily average data and hourly data are displayed. For refreshed buoys, only hourly data are currently displayed; daily average data may be incorporated in the future.

2.2.1.3 DAILY QC PLOTS

A daily plot is generated for each measurement type on each sensor from an active deployment. For legacy buoys, the data plotted span 10 days. For refreshed buoys, the data plotted span 2 days. In addition, isotherm plots are provided (Figure 4).

2.2.1.4 DRIFT PLOTS

Another aspect of TAO data quality control involves the monitoring and plotting of buoys that have gone adrift. A processing script runs daily to check the position of each buoy deployed in the TAO array. If an active deployment has drifted more than 6 nautical miles from its mooring location, or the deployment drift flag is on, a drift plot is generated automatically for the day and placed in the Data Management Console. A drift plot image also links to the table of position measurements recorded from the buoy so that DQAs may see the actual position values. Drift plots are also made available to the public via the NDBC TAO website.
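The daily drift test can be sketched as a great-circle distance comparison. The haversine formula and the Earth radius constant are choices made for this illustration; the paper does not state how the processing script computes distance, and the function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_NM = 3440.065  # approximate mean Earth radius in nautical miles
DRIFT_THRESHOLD_NM = 6.0    # threshold stated in the text


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in nautical miles."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * asin(sqrt(a))


def needs_drift_plot(mooring, position, drift_flag=False):
    """True if a drift plot should be generated for the day.

    `mooring` and `position` are (latitude, longitude) pairs in degrees.
    """
    return drift_flag or distance_nm(*mooring, *position) > DRIFT_THRESHOLD_NM
```

For example, `needs_drift_plot((0.0, -140.0), (0.2, -140.0))` compares a reported position 0.2 degrees of latitude away from the mooring against the 6 nautical mile threshold.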

2.2.2 DELAYED MODE DATA ANALYSIS

Delayed mode data (raw data) recovered from legacy moorings are first processed using computer programs (and redesigned MATLAB tools) that apply pre-deployment calibrations and generate time series in engineering units. These programs also flag missing data and perform gross error checks for data that fall outside physically realistic ranges. A log of potential data problems is automatically generated as a result of these procedures.

Next, time series plots, spectral plots, and histograms are generated for all data. Statistics, including the mean, median, standard deviation, variance, minimum, and maximum, are calculated for each time series. Individual time series and statistical summaries are examined by DQAs. Data that have passed gross error checks but that are unusual relative to neighboring data in the time series, and/or that are statistical outliers, are examined on a case-by-case basis. Mooring deployment and recovery logs are searched for corroborating information such as battery failures, vandalism, damaged sensors, or incorrect clocks. Consistency with other variables is also checked. Data points that are ultimately judged to be erroneous are then flagged.

For some variables, additional post-processing after recovery is required to ensure maximum quality. These variables are rain rate, subsurface pressure, and salinity. Much of the output of the delayed mode processing consists of the recomputed daily means, which replace the corresponding daily means returned in real-time.

The real-time daily, weekly, and monthly QA/QC procedures can be found in the “DMAC Real-time Operating Procedures Manual.doc” [4]. The delayed mode QA/QC procedures can be found in the “DMAC Delayed Mode Operating Procedures manual.doc” [5].
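The per-series statistics and a first pass at outlier candidates can be sketched as follows. The three-standard-deviation criterion and the use of population statistics are assumptions made for illustration; the paper only says that statistical outliers are examined case by case by DQAs.

```python
from statistics import fmean, median, pstdev, pvariance


def summarize(series):
    """Summary statistics for one time series, ignoring missing samples."""
    valid = [v for v in series if v is not None]
    return {
        "count": len(valid),
        "mean": fmean(valid),
        "median": median(valid),
        "std": pstdev(valid),
        "variance": pvariance(valid),
        "min": min(valid),
        "max": max(valid),
    }


def outlier_candidates(series, n_sigma=3.0):
    """Indices of samples farther than n_sigma standard deviations from the mean.

    These are candidates for analyst review, not automatic rejections.
    """
    stats = summarize(series)
    if stats["std"] == 0:
        return []
    return [
        index
        for index, value in enumerate(series)
        if value is not None
        and abs(value - stats["mean"]) > n_sigma * stats["std"]
    ]
```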

Figure 4 Isotherm Plot

2.3 DATA ARCHIVE

NDBC does not have data archive responsibility. External government agencies, such as the National Oceanographic Data Center (NODC) and the National Climatic Data Center (NCDC), are responsible for the data archive. The exact interfaces to the data archive agencies are still to be defined at NDBC.

However, NDBC does store all real-time and recovered data, including CTD cast data, in databases and file systems, and quality controlled data are made available to the public via the NDBC TAO website [6]. The shipboard ADCP data samples are transmitted to the University of Hawaii daily via an email message generated by the shipboard ADCP system.

2.4 DATA DISCOVERY AND ON-LINE BROWSE

2.4.1 METADATA MANAGEMENT

Metadata are essential elements in advanced data analysis and scientific modeling. They are integral parts of the observation data and are stored in the underlying database to support the data discovery and on-line browse capabilities of the TAO website. The TAO metadata are organized around three core concepts (sensor, deployment, and site) that describe the observation data. The Data Management Console provides user interfaces for DQAs to manage all metadata in support of field services. The following sections describe the high level metadata entities supported for TAO.
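As one possible reading of the three core concepts, the sketch below models a site, a sensor with its linear corrections, and a deployment as plain data classes. Field names are loosely taken from the entity descriptions in Sections 2.4.1.1 to 2.4.1.3; the class layout, the half-open effective time window, and the "first matching correction wins" rule are assumptions and do not reflect the actual NDBC database schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Site:
    name: str
    wmo_number: str
    nominal_lat: float
    nominal_lon: float


@dataclass
class LinearCorrection:
    slope: float
    offset: float
    start: datetime
    end: Optional[datetime] = None  # None means still in effect

    def applies_at(self, when):
        return self.start <= when and (self.end is None or when < self.end)

    def apply(self, value):
        return self.slope * value + self.offset


@dataclass
class Sensor:
    property_number: str  # NDBC-assigned identifier
    serial_number: str    # vendor-assigned identifier
    sensor_type: str
    corrections: List[LinearCorrection] = field(default_factory=list)

    def corrected(self, value, when):
        """Apply the first correction in effect at `when`, if any."""
        for correction in self.corrections:
            if correction.applies_at(when):
                return correction.apply(value)
        return value


@dataclass
class Deployment:
    name: str
    site: Site
    start: datetime
    end: Optional[datetime] = None
    sensors: List[Sensor] = field(default_factory=list)
```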

2.4.1.1 SENSOR

Sensor
Once NDBC takes possession of a sensor, a unique identifier, the NDBC property number, is assigned to it. A sensor also has a unique serial number and a Sensor Type assigned by its vendor.

Sensor Type
The sensor type is derived from well known common names or model names from vendors, such as wind monitor, serial CT sensor, inductive CT sensor, etc.

Organization
The organization is an entity with a legal name, a physical address, and some contacts. When referenced in a Sensor Type, it indicates the vendor of the sensors with that Sensor Type. When referenced in a Site, it indicates the owner of the Site.

Measurement Type
The measurement type specifies the type of measurements from a sensor, such as air temperature, relative humidity, salinity, etc.

Sensor Specification
The sensor specification is factory or end-user configured for a given Sensor Type. A group of sensors with the same Sensor Type share the same specification. The specification provides information about the minimum/maximum values, resolution, engineering units, sampling frequency, sampling duration, etc.

Sensor Calibration
For each legacy sensor, calibration functions are applied to its raw data in order to convert them to the corresponding engineering units. A sensor calibration specifies the calibration time, the effective start/end time if any, and the calibration coefficients.

Sensor Correction
For each refreshed sensor, linear correction functions may be applied to the engineering units to achieve the equivalent of calibration. A sensor correction specifies the effective start/end time and the linear coefficients.

2.4.1.2 DEPLOYMENT

Deployment
The deployment specifies deployment name, start and end time, mooring latitude and longitude, water depth, watch circle, primary/secondary transmission identifiers (e.g. IMEI for Iridium and PTT for Argos), deployment stage (e.g. testing or deployed), buoy type, drift status, transmission status, and deployment status (primary deployment, co-located deployment, or recovered deployment). A deployment is always associated with a Site, where its nominal latitude and longitude, WMO number, owner, etc. can be found.

Deployed Sensor
The deployed sensor associates a Sensor with a Deployment. In addition, it specifies the sensor’s mooring depth, deployed start/end time, payload address, primary sensor flag, and backup sensor for the Sensor.

Sensor Measurement Type
The sensor measurement type reflects the actual configured Measurement Type of the deployed Sensor.

Event Type
The event type classifies events that could occur during a Deployment, such as drift events, transmission outage events, etc.

Deployment Note
The deployment note describes deployment events for a Deployment, classified by Event Type.

Deployed Sensor Note
The deployed sensor note describes sensor events for a specified deployed Sensor.

2.4.1.3 SITE

Site
The site specifies site name, site type, WMO number, nominal latitude and longitude, elevation, declination, and description. In addition, a site may belong to a Region and/or be owned by an Organization.

Region
The region consists of a group of Sites, such as the TAO region containing 55 sites. A Site can be associated with a Region.

Range Limits
The range limits specify minimum/maximum values and the hard or soft EQC flag to be raised when the limits are exceeded. The range limits are applied to either a Site or a Region with an applicable depth range. When a Deployment is placed in a Site or a Region, the deployed Sensors within the given depth range, if specified, are subject to automatic system checks according to the range limits.

2.4.1.4 OBSERVATION DATA

For each measurement type, a dedicated database table is allocated to store measurement values. This partitions the observation data well and allows quick access to them. Each measurement value is associated with an NDBC quality flag. There is also a release flag associated with each measurement value. Measurement values whose release flag is on are made available to the public; these may carry quality flags 2, 3, or 4, for questionable data, good data, and changed data respectively. As PMEL had different quality flags, NetCDF data products are delivered to the public in two formats, one with the NDBC standard quality flags and one with the PMEL traditional quality flags for backward compatibility.

2.4.2 DATA DISCOVERY

The NDBC TAO website (http://tao.noaa.gov/) provides the Data Display web page, in which the TAO Array graphic status is displayed. A site in the TAO Array is rendered as a location dot whose color reflects the corresponding deployment status. As in the Data Management Console, the colored site dots let public users explore a site's data by clicking on them. Once a dot is clicked, the corresponding active deployment page is displayed. The deployment page displays all sensors and the last received data. Public users are also able to explore a sensor by clicking on the desired sensor in the deployment page. Once it is clicked, the corresponding sensor page is displayed with the sensor’s data and metadata. All historical and co-located deployments are grouped by site and made available in the primary active deployment pages. Public users are able to explore historical and co-located deployments in the same way as the primary active deployments.

In addition, the TAO Data Delivery page lets public users select multiple sites, one measurement type, and a time period to download data. OPeNDAP NetCDF data files are also organized by site and made available for public download at http://dods.ndbc.noaa.gov/.

2.4.3 ON-LINE BROWSE

The NDBC TAO website also provides a rich set of graphics to public users. For each site, data plots are generated daily for all configured sensors with a selectable time range. For each sensor, plots with different time ranges are available when the mouse is placed over the desired small image icons. For sensor comparison plots, users may select up to four different sites and plot each site's corresponding sensor data. The plots can be grouped by measurement type or by site with the desired time ranges.

The Latitude-Longitude Plots under the Data Display option show the data across the entire TAO array and are also generated daily. These plots are created using the 5-day and monthly NetCDF files and feature the following variables: zonal wind, meridional wind, wind speed, sea surface temperature, depth-averaged temperature, air temperature, dynamic height, 20 degree C isotherm, heat content, and relative humidity. Figure 5 shows an example SST and wind latitude-longitude plot. Historic plots are kept so that trends in the data over time can be seen more easily. Date selection boxes are also provided.

Figure 5 Lat-Lon Plot

Section plots are yet another feature of the public TAO website. A section is determined by a selected latitude or longitude and either depth or time. Section plots are also created using the 5-day and monthly NetCDF files. The latitude-depth and longitude-depth section plots only allow the viewing of temperature data across the selected section. However, the latitude-time and longitude-time section plots are created using the same ten variables as the latitude-longitude plots. Section plots are also rendered daily, stored for historical purposes, and allow for selection of a date range.

Figure 6 Section Plot

2.5 INTEROPERABLE DATA ACCESS

Interoperable access to TAO data is achieved via three standard mechanisms at NDBC [6]:

GTS Data Dissemination: For legacy moorings, Service Argos converts a subset of ATLAS data to FM18 messages and puts them on GTS. For refreshed moorings, NDBC converts AMPS data to FM18 and/or FM64 and puts them on GTS.

NDBC TAO Web Data Delivery: The Data Delivery page is responsible for delivering raw ASCII files and NetCDF files to the public via standard HTTP protocols. In addition, NDBC submits certain TAO documents to NOAA’s Climate Diagnostics Bulletin monthly.

NDBC OPeNDAP Data Delivery: The NDBC OPeNDAP server also supports TAO data delivery to the public via OPeNDAP protocols.

2.6 OPERATIONS

The TAO data system is operational at the NDBC Stennis Space Center (SSC) facility. Another data system is also set up at Silver Spring, MD, to provide site-wide failover support. There is a master MySQL database that supports the Data Management Console. On public web servers, replicated MySQL databases are configured to support the public web interfaces. DQAs use the internal web interfaces of the Data Management Console via the HTTPS protocol. The Department of Commerce “IT Security Program Policy and Minimum Implementation Standards” are followed.

3. CONCLUDING REMARKS

As NDBC is responsible for all field services for the entire TAO buoy system, the TAO data system has been redesigned from the ground up to support both legacy and refreshed buoys. The IT architecture and design follow the NOAA Global Earth Observation – Integrated Data Environment (GEO-IDE) guidance and help NDBC management achieve the TAO array transition rationale:

• Make operations more cost effective
• Protect against changes in personnel
• Ensure continuity of the data streams

Many new features have been introduced to the TAO data system while existing ones are maintained for backward compatibility. Our phased implementation so far has been transparent to TAO data users.

Future enhancements of the TAO data system will focus on data and metadata discovery and representation to the public, in addition to providing stronger support to field services. The TAO data user communities and data archive centers will be consulted for preferred data delivery means and formats. Further, the NOAA Service-Oriented Architecture (SOA) will be considered to address two aspects of data management integration at NOAA: data sharing and application interoperability. The continuity of the TAO data streams will be better supported in the NOAA SOA environments.

REFERENCES

[1] Technology Refresh of NOAA’s Tropical Atmosphere Ocean (TAO) Buoy System and TAO Buoy Funded Requirements, NDBC
[2] ATLAS System User Manual, vol. 4.1, NDBC
[3] AMPS System User Manual, vol. 2.6, NDBC
[4] DMAC Real-time Operating Procedures Manual – TAO Array, vol. 1, NTSC/DMAC
[5] DMAC Delayed Mode Operating Procedures Manual – TAO Array, vol. 2, NTSC/DMAC
[6] TAO Refreshed IT Detail Architecture, NTSC/Data Systems