A best practices guide for managing sensor networks and data
A working group of practitioners experienced in the entire life cycle of streaming sensor data is developing a best practices guide for managing sensor networks and data. The best practices focus on establishing and managing a fixed environmental sensor network for on- or near-surface point measurements for purposes of long-term environmental data acquisition. The guide builds on the collective experience of working group members and is organized around the following six topics:
- Sensor, site and platform selection – selecting sites and acquisition systems
- Remote data acquisition – transmission from field to server
- Streaming data management middleware – available software systems
- Sensor management, tracking and documentation – life cycle events to document
- Sensor data quality – QA/QC procedures
- Sensor data archiving of large data streams – update frequency, accessibility, data quality level
The preliminary findings are available on the ESIP EnviroSensing Cluster (http://wiki.esipfed.org/index.php/EnviroSensing_Cluster). The session organizers will present an overview of this project and the intent to engage the community in ongoing forums and the use of “crowdsourcing” to maintain, broaden and continually improve the best practices content. A discussion is planned where participants will be able to share their personal experiences with sensor networks and practices. We might consider future forums or presentations for ongoing monthly teleconferences within the EnviroSensing Cluster.
Brian Wee from NEON (in for Jim Taylor)
No standards among data providers
What can we all do to help?
-Offer standardized approaches
-Make data and algorithms freely available to everyone
-Provide forums/workshops to focus on uncertainty
-Emphasize this in papers/classes/labs
-Help educate policy makers
NEON Design: Designed for 30 years
The overarching goal of NEON is to enable understanding and forecasting of the impacts of climate change, land use change and invasive species on continental-scale ecology by providing infrastructure to support research in these areas:
-Biodiversity, Biogeochemical cycles, climate change, ecohydrology, infectious disease, invasive species, land use.
NEON has 60 terrestrial sites over 20 nationwide domains
Propagation of uncertainty: graph of forecast lead time vs spatial scale showing increasing uncertainty w/ increasing forecast time and spatial scale.
How to manage this problem?
-Intergovernmental Panel on Climate Change Guidance Notes define a "Likelihood Scale"
VISION: Tracing policy back to science
NEON Data Portal is available to everyone w/ATBDs (Algorithm Theoretical Basis Documents) attached to each dataset
-----
Janet Fredericks, WHOI
Using SWE to bind MetaData to Observational Data enabling dynamic data quality assessment
Goal: data described well enough to be assessed along two paths - for its specified use and for a repurposed application
Data provider needs to communicate how the sensible properties were turned into observations
Q2O: QARTOD-to-OGC: Integration of sensor provenance & processing lineage aimed towards the ability to dynamically assess quality of observations
http://q2o.whoi.edu
Community-based development approach
Discussion of Five role-based categories of SensorML
Discussion of how this model enables dynamic quality assessment: Roles, Connections, Enabling Semantic Mappings & Encoding
Next Steps
-Encourage manufacturers & data managers to create meaningful vocabularies
-Update model to SWE 2.0
-Build better SensorML editors
-Provide tools and opportunities for domain experts to create and register ontologies
-Harmonize standards (ISO) and other technologies with this model
-----
A best practices guide for managing sensor networks and data
Don Henshaw, H.J. Andrews Experimental Forest LTER
Sensor, site and platform selection
-Selection of sites, science platforms and support systems are interacting planning processes
-Data quality and longevity are the ultimate goal
-Optimal siting for science objectives can be impeded
Data acquisition and transmission
-Manual downloads of sensor data
-Remote data acquisition considerations
Discussion of Sensor management, tracking and documentation
Streaming data management middleware
-Middleware/software - Proprietary options
-Open Source Environments for Streaming Data
--GCE Data toolbox
--CUAHSI HIS
--Open Source DataTurbine Initiative
Diagram: Streaming data management workflow
Sensor data quality assurance and quality control (QA/QC)
-Preventative QA measures in the field
-Automated QC is necessary
-Manual methods are unavoidable
-Data management considerations for QC system
QA - Preventative Measures
-Routine calibration and maintenance
-Regular human inspection and evaluation of the sensor network
-Sensor redundancy
QC on Streaming Data: QC checks in near real-time
-Timestamp integrity
-Range checks
-Internal (plausibility) checks
-Variance checks
-Persistence checks
-Spatial checks
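As an illustration of how three of the checks above (timestamp integrity, range, persistence) might be applied to a stream, here is a minimal Python sketch. It is not taken from the talk or from any of the middleware packages named earlier: the function name `qc_flags`, the single-letter qualifier codes "T", "R" and "P", and all thresholds are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def qc_flags(times, values, lo, hi, step, max_repeats):
    """Return one list of qualifier codes per record.

    Hypothetical codes used in this sketch:
      "T" - timestamp does not follow the previous one by exactly `step`
      "R" - value outside the plausible range [lo, hi]
      "P" - value unchanged for more than `max_repeats` consecutive records
    """
    flags = [[] for _ in values]
    run = 1  # length of the current run of identical values
    for i, (t, v) in enumerate(zip(times, values)):
        # Timestamp integrity: records should arrive at a fixed interval.
        if i > 0 and t - times[i - 1] != step:
            flags[i].append("T")
        # Range check: compare against sensor or climatological limits.
        if not lo <= v <= hi:
            flags[i].append("R")
        # Persistence check: a stuck sensor repeats the same value.
        run = run + 1 if i > 0 and v == values[i - 1] else 1
        if run > max_repeats:
            flags[i].append("P")
    return flags


# Example: 10-minute air temperature records with one missing interval,
# one spike, and a short run of repeated values (made-up numbers).
t0 = datetime(2013, 1, 1)
times = [t0 + timedelta(minutes=m) for m in (0, 10, 20, 40, 50, 60)]
values = [12.1, 12.3, 99.0, 12.4, 12.4, 12.4]
for t, v, f in zip(times, values,
                   qc_flags(times, values, -40.0, 60.0,
                            timedelta(minutes=10), 2)):
    print(t.isoformat(), v, "".join(f))
```

The sketch only attaches flags and leaves the raw values untouched, so a later manual review can still see what the sensor reported.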
QC on Streaming Data: Data Qualifiers
QC on Streaming Data: Quality Levels
-Level 0: Raw streaming data
-Level 1: QC applied, qualifiers added
-Level 2: Gap-filled or estimated data
-Level 3+: Highly derived data
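The step from Level 1 to Level 2 could look roughly like the Python sketch below, which fills interior gaps and marks every filled value with a qualifier. Linear interpolation, the `QualityLevel` enum, the `gap_fill` function and the "E" (estimated) code are all assumptions made for illustration; the notes do not say which gap-filling method or qualifier vocabulary the guide recommends.

```python
from enum import IntEnum


class QualityLevel(IntEnum):
    """Quality levels as listed above (names are hypothetical)."""
    RAW = 0          # Level 0: raw streaming data
    QC_APPLIED = 1   # Level 1: QC applied, qualifiers added
    GAP_FILLED = 2   # Level 2: gap-filled or estimated data
    DERIVED = 3      # Level 3+: highly derived data


def gap_fill(values):
    """Fill interior gaps (None) by linear interpolation.

    Returns (filled values, per-record qualifiers, quality level).
    Filled records get the hypothetical qualifier "E" so estimated
    values stay distinguishable from measured ones. Leading and
    trailing gaps are left as None because they have only one neighbour.
    """
    filled = list(values)
    qualifiers = ["" for _ in values]
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            filled[i] = values[left] + frac * (values[right] - values[left])
            qualifiers[i] = "E"
    return filled, qualifiers, QualityLevel.GAP_FILLED


filled, quals, level = gap_fill([1.0, None, 3.0, None, None, 6.0])
print(level.name, filled, quals)
```

Keeping the qualifier alongside each value means the Level 1 record can always be recovered from the Level 2 product by dropping the records marked as estimated.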
Sensor data archiving
-Archiving strategies
-Partner w/cross institution supported archives
-Best practices