Preservation and Stewardship Committee Telecon 2012-11-12
- Maintaining the ESIP data management training modules that are beginning to be published (Ruth)
- Collaborative efforts with Info Quality Cluster (Brent)
- Updates on Winter meeting sessions
Ruth - To review - a year and a half ago ESIP got a small grant from NOAA for training modules. The team made a few 5-10 minute modules aimed at scientists, to be published as PowerPoint slides and video. Since then, more than 50 of these have been created. All have been peer reviewed and they are starting to be published formally. A contractor is writing scripts and recording the presentations professionally. 20 or more are going up now on the ESIP commons. We hope to have almost all of the peer-reviewed ones online by the end of the month. The original grant is now over, and the original team is disbanding. But as everyone knows, the world of data stewardship is changing rapidly and these materials will need to be maintained over time. Each is being given a DOI and version number. If any ESIP group might be interested in maintaining these over the long term, it would be this group. Ruth would like to bring this to our attention and see if there is interest in moving this forward and taking over maintenance of these.
Bob Downs thinks this is the logical place to house these. Curt asked Ruth if there was a planned schedule for maintenance. Ruth said that we as a group would need to decide how to handle this - there is not currently a set of standards for how things uploaded to the commons are managed. Some of these will need to be updated or reviewed over time as things change, whereas some might be static. Ruth thinks it would be up to us to develop a plan for frequency of review. Curt also suggested tracking comments and questions. Ruth does not recall the address, but there is an email address to direct questions to. Curt suggested we could base changes on user requests instead of a review process.
The question is - does this group agree to manage these? Curt asked for any objections. The group appeared to support the idea. Curt asked if these had been presented or will be. Ruth said they have been in the past but she does not see plans to do so in the future. There are, however, opportunities for members of this community to look for funding to support workshops and training, and maybe even material updates. NSF does support curriculum development.
Bob thought that many of these could work as standalone training modules. They would be ideal for an individual to select in order to get up to speed on an area. Ruth agreed; that was the original intent.
The group agreed to support this as a mandate - but we do need to decide what that means for the long term (periodic updates, review, etc.). Perhaps this could be a topic for future meetings to brainstorm. This will be added to the ESIP winter meeting during the committees breakout session.
Brent - (Information Quality Cluster). Brent is working on building this cluster back up. The cluster has had a 6-9 month break, and Brent is here to ask if anyone has efforts from the previous version of the cluster that they would like to see continued, as well as other suggestions for what they would like to see worked on in the future. Basically, collaboration efforts. Curt asked if there was a website for the cluster.
http://wiki.esipfed.org/index.php/Information_Quality
Brent - Efforts identified - terms and definitions for data terms, and a data clearinghouse to review the uses of these terms. NASA will be a work case to look at use of terms through the data lifecycle and data management and analysis. Defining information quality so that people at different levels can communicate. There have been a few other collaborations - semantic web and documentation. Curt suggested that the purpose of the group be pulled out from some of the topics and used for recruitment on the ESIP-ALL list. Brent thinks this is a good idea and it is on their list of things to work on. Brent said that data quality is a much talked about topic and there is a need for data metrics to help with data management. This cluster could be a focal point for this issue. It is a collaborative concept.
Ruth - mentioned that NSF held a data quality workshop where they asked for white papers on data quality and data quality metrics, and on setting up a research agenda on data quality. Ruth has the final report from that and can share it (To download the report, go to http://datacuration.web.unc.edu/ and click "Go to Final Report."). Brent said there is much discussion on this issue in Europe. This includes how to set something up that could be used outside a specific organization. Specifically, they are working on the scope of the framework.
Curt suggested this is like data provenance - domain specific and across sciences. Mark Parsons mentioned looking at data use and not data quality, as quality is in the eye of the user. Brent mentioned one question would be deciding how to preserve a data set for use 20 years in the future, and how to prevent problems from the past - being unable to access data from the past now. This group has been working on this extensively. Mark - preservation is best done with a specific community in mind. Building from the OAIS model - designated communities. That might be defined broadly but it is like quality; there are no absolutes to it.
Brent discussed a framework for how to save data and what associated materials. Mark mentioned we have documentation on this which Brent might like to look at (we as a committee).
Mark mentioned tracking use is more promising than trying to define quality, and our work with identifiers goes towards this goal. Quality is an endless question. The idea is: is the data valuable and useful? Use might give you some insight into this.
Curt agreed, we have focused on use as a group. And developing commonalities so that people can know where to look for materials. It is currently not done in a consistent or uniform way.
Brent mentioned that it is important to discuss how the data is being used, who is using it, and how well they are using it. So it is important to look at use as well.
Curt talked about uncertainty. This is a major challenge. Mark liked that focus - on uncertainty rather than quality. Curt mentioned an orthogonal look at compliance, identifiers and other aspects that help improve the quality of the set as a whole - things related to the quality of the presentation.
Curt added a list of sessions for the winter meeting - those who have selected preservation as their section. Of course our business meeting. Curt also put in one on GCIS. There are some EarthCube sessions and a few other related sessions.
Ruth mentioned that one is missing from this list - but it is showing up on the Google Docs and the winter meeting list - the session on preservation of physical and analog materials. Sarah is not sure she tagged these correctly and will look into it.
Ruth is concerned that with the number of meetings we might conflict with ourselves for scheduling purposes.
Curt asked if we had any specific questions. These were talked about a bit on the email list.
Curt also mentioned the final two agenda items - our long term goals. In a number of different contexts data citation has come up and been supported. We need to work on pushing editors to encourage best practices across the community as a whole. He is open to ideas as to how we can push this along.
Brent asked if anyone was leading this effort or if it has an open chair. Curt said there is a need for it, but no champion has emerged, as Mark phrased it. Curt thinks that people are busy at the moment. Brent would be interested in working on this but would like to work on it with someone else.
Ruth mentioned we have the authors' version of this which was approved by ESIP. Brent asked if there should be two different documents or one document with two sections. Mark said maybe the latter, but we were talking about more of a campaign than a document. An article in EOS (?) and letters to the editor etc. We need another push to tell editors and reviewers that this is what we need them to do. Mark is interested in this but does not have the time to dedicate to it.
Brent asked what the next move would be for this project. Mark thought that it has multiple levels. We need to encourage that it be used in reviewer instructions, but the ideas are very basic. If you use data, cite it. A letter-writing champion would be one way to go - to the top journals in our field on behalf of ESIP. We need to define our strategies.
Curt posted some early brainstorming on the Webex. Ruth agrees with Mark - if someone takes the lead she will be willing to act as a helper. Curt will also look on as things move forward. Brent is really interested in this topic. Curt thinks it just needs an organizer. Someone to lead discussions and be the straw man for this topic. Mark suggested a draft letter to editors or an EOS (?) article. AGU has a data policy that encourages data citations but this is not being done by their journals. This can be discussed at the winter planning meeting as well. Bob said he would be willing to pitch in as well.
Curt mentioned the DOI landing pages - we had a flurry of discussion earlier on this topic and we had examples from data centers. A lot of people were interested in seeing what other people were doing. He thinks it is worth continuing to discuss. Having materials available and connecting back to preservation materials is important. This is something that is moving along but could use a champion to push it. DataCite is working on it. This is something we might work on at the winter meeting. Mark is interested in hearing more about machine-readable vs. human-readable landing pages (machine-interpreted pages). This is not being well explored. Bob suggested that a few sample scenarios or examples would help in discussing the needs. Curt said he would try to write up his - provenance is difficult to deal with right now.
Our next meeting is scheduled for 12/10 - after AGU. A few weeks after that will be the ESIP winter meeting. We can do final discussions for the winter meeting at that telecon.