Interdisciplinary Data Curation for Socio-Environmental Research
In the Earth Sciences, many researchers are striving to address large environmental challenges. From understanding our changing climate to studying the surface processes behind water issues, deforestation, and biodiversity loss, these earth science questions can be framed at spatial scales ranging from local to global. These complex earth systems are best understood not through any one disciplinary approach, but through an interdisciplinary lens.
Research on these complex, systems science problems, then, is often better organized around the site (geographic location) instead of by discipline. This has implications for:
- Data collection and management: sampling sites must be well documented and contextualized, and data must be collected in a way that is usable and interpretable for researchers in a range of fields
- Data sharing and archiving: repositories must take steps to avoid becoming disciplinary silos, and researchers must take additional steps in their metadata creation
- Data analysis: researchers must be careful when scaling up knowledge from site-specific to regional or global levels
This is especially true when considering that many environmental challenges within the earth sciences have a human/social component, often bringing social-science research (economics, psychology, decision-making) into the interdisciplinary fold. There are organizations, such as SESYNC, addressing social-environmental problems via case studies and interdisciplinary data analytics, and this Session is intended to bring the ESIP community into the conversation.
This Session will look at specific case studies and discuss ways in which the ESIP community might be able to offer insights into the data management and analytics necessary for addressing interdisciplinary research more broadly.
- Yellowstone: geobiology at Mammoth Springs
- Los Angeles, California: paleontology and paleoecology research at the La Brea Tar Pits
- Vermont: agricultural sites and ecosystem services (climate, water quality, and farmer livelihoods)
- Vermont Monitoring Cooperative (present brief series of case studies or general overview)
Jim Duncan, Vermont Monitoring Cooperative
Steve Posner, works with COMPASS - http://www.compassonline.org/staff/StephenPosner
Hopeful Collaborations with ESIP Clusters:
- Data Stewardship
- Data Analytics
1. Andrea Thomer, University of Illinois at Urbana-Champaign
Scientifically significant sites have unique data curation needs
-resource managers and curators typically need information closer to “metadata”
-researchers need datasets collected with certain data points in order to reuse data and assess fitness-for-use
-both need info about unique natural phenomena, info about methods and collecting events
Can be a long time lag between data collection and data publication
-Apply standards at point of data collection?
Is this feasible? A good idea?
Phases of iterative collection and refinement that make observations into data
Many decision points
Potential problem of using standards early on: risks removing the scientists from key decisions about study design
Challenge: balancing idiosyncrasy (researchers’ priorities, often) and normativity (curators’ priorities, often)
Case study: Geobiology at Yellowstone NP
-Challenges to geobiology data integration at YNP: interdisciplinarity, heterogeneity, sociotechnicality
Two proposed solutions:
Minimum information framework for geobiology research: set of recommended parameters for data collection that could inform future standards work
Research process modeling: document workflows to make data curation easier, and document data provenance with PROV
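As one illustration of the minimum information framework idea above, a curator-side check might flag records that are missing recommended parameters. This is only a minimal Python sketch: the field names and function names are hypothetical placeholders, not parameters proposed in the talk.

```python
# Hypothetical minimum-information checklist; the field names are illustrative
# assumptions, not parameters recommended in the session.
REQUIRED_FIELDS = ("site_id", "latitude", "longitude", "collected_on", "method")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in a record."""
    return [f for f in required if record.get(f) in (None, "")]


def partition_records(records, required=REQUIRED_FIELDS):
    """Split records into (complete, incomplete) lists.

    Each incomplete entry is a (record, missing_field_names) pair, so a
    curator can see what to ask the collector for.
    """
    complete, incomplete = [], []
    for record in records:
        gaps = missing_fields(record, required)
        if gaps:
            incomplete.append((record, gaps))
        else:
            complete.append(record)
    return complete, incomplete
```

Reporting the gaps rather than rejecting a record outright keeps the researcher in the loop, which fits the concern noted earlier about standards removing scientists from key decisions.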
Case study: Paleontology at Rancho La Brea
-Challenges: idiosyncrasy, standards creep (e.g., different people apply the same standard differently), sociotechnicality (lack of adequate computing infrastructure, hesitation to share data, etc.)
From site-based to socioenvironmental
-similar challenges to data integration around sites
-site-based perspective could inform integration and curation efforts
2. Jim Duncan, Vermont Monitoring Cooperative
Interdisciplinary data for environmental monitoring
Each agency has its own mandate in terms of archiving data → big advantage for the VMC
VMC houses data related to forest ecosystem condition
Lessons in working with interdisciplinary data
-heterogeneity - the reason and the challenge
-real-time sensor data
-remote sensing data
-less traditional forms of data (land ownership change data, etc.) -- how to represent these effectively
→ need flexibility in standards, and flexibility in user interface (in what is displayed and having ability to drill down into the metadata)
-importance of linking to other catalogs - creating a complementary space for scientists to connect
-drivers of access and control concerns
-beyond the usual suspects (control, quality, use, and pubs) there can be discipline-specific types of concerns
-political concerns, locations of threatened/endangered species, plot location privacy concerns (e.g. FIA plots)
-embargo periods, designating things as sensitive, etc.
-improving communication between the data world and the science world
-how can we articulate the value of semantic or metadata-driven data systems for improving science and credit in a way that is meaningful to the broader research and monitoring community?
-there are a lot of possible avenues for data science to pursue that could make data sharing “better” for scientific discovery and testing - how can we help cross-disciplinary repositories prioritize those?
-increasing ecological and social perspectives in earth science data sharing
→ why might this be hard?
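The access and control concerns listed above (embargo periods, sensitive designations, plot location privacy) could be expressed as a simple release filter. The following Python sketch is an assumption-laden illustration, not how the VMC handles this: the record keys and function names are hypothetical, and rounding coordinates is only a crude stand-in for a real location-masking policy.

```python
from datetime import date


def coarsen(value, decimals=1):
    """Round a coordinate to reduce precision (0.1 degree of latitude is roughly 11 km)."""
    return round(value, decimals)


def public_view(record, today):
    """Return the publicly shareable version of a record, or None if embargoed.

    Assumed (hypothetical) record keys: 'embargo_until' (date or None),
    'sensitive' (bool), 'latitude', 'longitude'.
    """
    embargo = record.get("embargo_until")
    if embargo is not None and today < embargo:
        return None  # still under embargo: nothing is released
    view = dict(record)  # copy so the archived record is not modified
    if record.get("sensitive", False):
        view["latitude"] = coarsen(record["latitude"])
        view["longitude"] = coarsen(record["longitude"])
    return view
```

Keeping the full-precision record in the archive and deriving a reduced public view lets the repository honor privacy concerns without losing the original data.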
3. Lindsay “Bar” Barbieri, University of Vermont
Earth Science Data
Description of research project to look at treatments to ag fields → GHG emissions
Economics → biophysical data (ag management) → ecosystem services → scale up to adoption of ag practices? Livelihoods? Policy? Leverage points? Etc.
-interdisciplinary work is important to address societal and env. challenges
-data management plan → researchers: “what does that mean?”
-synthesis? Lack of scaffolding to make this possible
-importance of scale: individual research project vs “big picture” synthesis
4. Kelly Hondula, SESYNC
Synthesize data, ideas, theories to understand coupled social and biophysical systems; feedbacks and nonlinear interactions common; emergent properties and surprises common
Description of SESYNC and what they do/how they work
Mismatch of data across regions, scales
Data and knowledge gaps: many related to data and data access
-knowing what data “is” and what data exist outside one's domain expertise
-need to revise the research question based on data
-linking biophysical data to ecosystem services
-differing scales and extent of data
The conversation continues! Contact Lindsay Barbieri at [email protected] if you want to participate!
-distributed groups working in short bursts (this is often linked to funding)
-authorship norms across disciplines