Is it time for a Data Decadal Survey?




“The NRC conducts decadal surveys at the request of government agencies on key questions posed by them.  It is a broad study providing community consensus that looks forward 10 years or more.  The study is used by agencies to prioritize research areas, observations, and national missions to make those observations.” [1]  For example, the NRC completed its first decadal survey for Earth science in January 2007 at the request of NASA, NOAA, and the USGS.  More recently, this August the NRC released “Solar and Space Physics: A Science for a Technological Society” [2], which was requested by NASA and the NSF.

Data stewards face many issues and difficulties, and with the growing interest in repurposing data and using it across domains and far from its point of collection, problems around data management and stewardship abound.  The mismanagement of data could have tremendous consequences.

However, there is also much opportunity.  Understanding, technologies, and practices around data are improving.   Could a Decadal Survey for Data hasten and improve that process?

This session would discuss the possibility of a decadal survey focusing on data.   What would be the scope, effort involved, cost, benefits, possible impacts, resources, barriers, players, audience, engagement, etc.?   Might ESIP be a venue for incubating a Committee for a Decadal Survey for Data?  


[1]  [Paraphrased to remove NASA focus]



Notes on Data Decadal session

Anne began introductions and introduced Paul Uhlir from BRDI.

Anne began with a brief presentation.  She was inspired by the Decadal Survey for Solar and Space Physics.

First she gave background on the National Academy.

She then moved on to the NRC, which is affiliated with the NA.  It is an operating arm and not an actual individual academy member.

NRC conducts research based on requests and funding.

The NRC has a number of divisions within it, and different boards.  One is the Division on Earth and Physical Sciences.

BRDI, the Board on Research Data, is one section of the NRC.  It is in the Division on Policy and Global ???

Anne showed a list of publications they have created, including cross-domain research.

CODATA and CODATA related publications before BRDI.

An example is the Decadal Study Anne mentioned earlier (on Solar and Space Physics).

She gave an overview of the funding process and costs.

Example of a recent report – Ensuring the Integrity; Ethical Foundations of Science Collaboration.

Report recommendations - from the summary they identified three principles: researchers should design products to ensure the integrity of data, etc.  Anne read through some of the guidelines and principles.  Finally, data stewardship.  The Committee on Science, Engineering, and Public Policy (COSEPUP) paid for this publication to be developed.  It is a high-level committee.

Data is a national investment but then so is software.

List of data issues not addressed in the 2009 report.  These were ideas that Anne thought of off the top of her head but they are still important.

Current efforts – including NSF requiring data management plans.  Includes CODATA and EarthCube etc.

Why a decadal survey?  During the panel there were discussions on cross-agency collaborations, but they were ad hoc.  Perhaps if agencies pooled resources for a survey like this, it might be easier in the long run than individual agencies working alone.

There are 5 phases in developing one of these reports.  This is part of the larger process from the National Academy on how they develop the study.

Paul Y. discussed more of the details of the review process for how these reports get approved – more stringent than peer review.

Possible outcomes from a Data Decadal survey – cross agency vision for data management.  Anne outlined possible ideas and outcomes for this study.  What about a resource of places to go to get advice?  Curt mentioned that this is ESIP but Anne meant a funded program and not a volunteer group.

Anne ended her presentation with a list of questions – who would sponsor this, which board would conduct the survey? What would be the scope?  - Just earth science or others as well? Etc.  Also a discussion on software and not just data.

Denise mentioned that this really aligns with our meeting yesterday on bringing awareness to the existence of data collections.  She brought up the idea of creating our own cluster to handle this process.

Ruth mentioned the idea of funding and that we have different agency representatives here.

Rama had a few opinions on what this group could do under the data stewardship committee – define the problems and the scope.  Defining the scope to make it more useful and defining it as Earth Science data will be useful coming out of ESIP.  Also forward-looking management of data.  Even defining scenarios by experts would be a really useful thing.  This can lead to other things like what kind of data solutions do we need to have 10 years from now.  Looking at the data decadal report that Anne mentioned, it detailed how to define missions and gave some specific examples.  We need to think of similar examples for data.

Mark mentioned that these decadal surveys are strategies for someone’s mission.  So what mission are we trying to create a strategy for?

Rama discussed another example of a study that was useful (did not catch name). But it was project specific and not for the nation.

Mark asked if ESIP can be a sponsor for this project?  Outside of just this committee?

Curt said we can create white papers; maybe we are not the group to do this, but we have a role and there can be grassroots efforts.  Curt is currently working on a large-scale interagency project that has produced a number of reports.  Approaching them with ideas is not as valuable as approaching them with a white paper that outlines what they should do – questions that should be addressed that will create a forum for marshaling agency funds and funding reports like this.  Examples of such reports come from the commission on geoscience and environmental resources, the committee on environmental data, and other committees in the 80s and 90s that talked about the problems of data for climate change.  That might be an avenue to look at.

Peter – Channeling Ray Walker from UCLA – the 1982 CODMAC summary findings and recommendations are all still true.  He mentioned that we have half a dozen European and worldwide committees on data, and recommended being wary of this becoming a self-serving exercise.  The best way to elevate the importance of data across sciences is to get data elevated within the different decadal surveys – a different approach to these things.

Paul - discussed further the idea of data management and interoperability as a wider thing instead of as something specific to one science.  Across the sciences or maybe as something for the federal government and academics.  Limiting to just Earth Science is not going to yield as many results as if it was interdisciplinary.

Anne asked if a decadal survey is the right target or should it be something smaller?

Paul said it was possible but you have to get buy in from these different organizations.

Mark - discussed what he has seen in the past with other decadal surveys.  There are a lot of academy reports with recommendations without follow-through.  He thinks that we have to be careful of making it so diffuse that it does not give a path.

Curt talked about provenance - when you move up within ideas you have to be very general, but both are needed.  How should we address these general areas across disciplines, and what are the specific concerns for earth sciences and the way we approach these issues, in ways that are compatible and interoperable at a certain level?

Someone suggested a recommendation to focus on data (???) Speaker 1

Denise mentioned a tiered approach where you would look at data broadly and focus on a specific subdomain as examples.  Applications within it for other subdomains.

Rama - Report on digital data: Harnessing the power of digital data for ? and society. They also had a workshop which wrote a report about taking the next steps.  So there are things out there, but if we are going to make progress we need to focus it and find out what the next steps are.

Denise pointed out that this is limited to digital data and ignores the needs of physical data and data that is not yet available digitally.

Jeane mentioned that the report was not focused only on digital data.

Peter - There is no synthesis going on to see what the commonalities are.

Laura - only people who were on these committees seem to know they exist.  And that is why we need a larger review.

Peter - something well scoped that includes the whole world.  Triage of these publications.  Like a literature survey.

Rama - the key thing is, this is a very actionable report.

Jeane - the decadal survey on earth science had data mentioned.

Rama - across agencies gets diluted.  

Paul - there are more surveys in different disciplines.  Maybe the thing to do is link it to have different panels that are grounded in these different decadal surveys?  A cross-cutting one in the final report with each panel doing a report itself.  This would be expensive but maybe the most effective.

Mark - maybe do it in stages?  With recommendation reviews and a second stage to push that down to implementation stages in different focuses?

Rama - should that sort of synthesis be done at a large committee level or at the grassroots level within ESIP?

Jeane - are you looking for science data or for components of things that enable science?  In the report she was talking about there were VERY specific details for sensor data in an experiment.  

Paul - if you want to look at it at the data level - the management and technical issues/social cultural issues, the way data are managed rather than answering questions in science.

Jeane - so include addressing organizations that cannot get their materials online?

Mark - also graduate students needing advanced training.

Paul - the institutional relationships across sectors.

Anne - we need to manage the software well too, along with the data.

Peter mentioned that this is in some of these reports.  

Mark suggested that this is why we need to review what has been done, but do we need a high level panel that has been blessed to do this or can we just do this?  What is better?

Rama suggested reading all the reports and issuing a report?

Mark said a review paper is really important.

Rama mentioned well, some of these already are review papers :)

Anne asked what would be the next steps in the next 6 months?

Peter reminded us of the very specific steps and approach Anne outlined - we need something like this as well so that it does not fall apart.  Having a formal committee - they know these processes and will make sure it gets done.  So get people who have done this before.  Staffers and research assistants who are trained in these processes and can do some of the legwork.

Anne asked if this is a chicken or egg thing?

Peter said thinking that way might hinder you from getting started.

Denise - this gives us a good dose of reality.

Peter - gathering agency support and going to the academy can be a long process.  If you want to start this now, you can form a cluster.  You do not need approval, you just inform the VP.

Erin asked if this is something that already falls under the existing committees.  She also said she can help us set it up if needed.  She added that she worked on something similar to this in the past as a graduate student, and that it should not be a limiting step.  Also, the energy group has a student from RPI who is prototyping, and it was not hard to find students to work on these things.  If it takes off she can help us find money.

Peter - going through a committee or working group you can get funding.

Erin - which suggests that staying under the preservation committee might be a better idea because we can get funding.

Denise - It is a good home for this to start and if it grows we can split off in the future.

Erin - it can be cross grouped and not silo specific and it can get funding these ways.

Anne - what is the next step?

Laura - what do you want to do next?  Do you want to do a synthesis or survey of what has been done?  It seems like you are not yet at the level to talk to agencies?

Paul - you have to have a well-developed plan.

Erin - you should have a white paper that is shopped around, that is what energy did.

Ruth - the point of the white paper is to define the goals and what these things would be.

Mark - 6 month timeline - for the next ESIP meeting.  That gives you 4-5 telecons.  Do you think you could get to the point of proposing an approach?  Where the results of these meetings would be??? Getting a graduate student to do a literature review, what the agencies would do?

Erin - we can have interagency meetings in the summer.

Mark - it would be easier to get them in DC and we can have the summer to prepare and work on this for next winter?

Ruth - best of both worlds under stewardship - go off and have own meetings.

Mark - yeah, go off and have own telecons.

Denise - 5-10 minutes in the telecon to recap what we have been talking about in a smaller group.  Start out there, get footing and then grow.  Do a where are we during the summer meeting and then approach agencies during the winter meeting?

Anne asked Paul if he had any final thoughts?

Paul thought it was a useful exercise, though it is not clear how it will end up and how it will be realized.  But having a small group try to define what would be useful is certainly a good exercise.  He is supportive of that and could be involved.

Anne - so if we have questions we know where to find you?

Paul - I have done some work in the Earth Science area - and I see this as an analog to that, but the question is whether you want to go beyond and into other fields.  It is not clear if it is something you want to do, but the disconnect is a real one (in a specific study).  And this can be a way to bridge some gaps.  It is easy to talk to people who know the issues, but that might not add much value.  Going beyond the earth sciences adds some complexity as well.

Wilson, A.; Is it time for a Data Decadal Survey?; Winter Meeting 2013. ESIP Commons, October 2012