Preservation and Stewardship Committee Telecon 2013-02-11
Curt started with a discussion on the winter meeting. Curt updated us on efforts concerning landing pages. A survey on current practices on DOI landing pages will be conducted by NASA, similar to efforts we made previously. Curt believes the use of DOIs for data sets and getting DOIs assigned to different things, and then capturing the information which would go on these pages is an interesting activity we should work on.
Preservation ontology - provenance extension for earth sciences has been approved as a NASA ESWG - we should look at what is different between this and the ESIP guidelines.
Curt asked if anyone had any updates. Ruth thought it would be good to go through the list and determine next steps and priorities, along with the relationships between tasks.
Ruth - Data management training (With Nancy - not on call). This is a first step and there are 100 different topics they felt were important. About 50 were created. They required a lot of hard work. Does ESIP or another group wish to go out and find funding to take this further or not? If not, we will not be able to develop more of these and must work on maintaining the current versions - which can be deferred for a while, either to the summer or next winter. The question is, is there enough interest in the community to find more funding? Are there people in this group who would be interested in finding funding? If no one here today is interested we can revisit this later in the year or at the end of the year.
Anne has a question - in terms of resources needed to complete the other 50 modules - how much would that be? What did it take to get the first 50 done? And given that they were done on the side, was it not a whole lot of resources? Ruth - the biggest issue - finding volunteers to write modules was difficult. And people were burnt out by the process - it is very time consuming. To do a 5-10 minute module is a many day effort. And the editorial review of these is a major effort. So to do it and do it well would require significant funding.
Bob Downs - Completely agree, this is a really big undertaking and would have to be funded.
Anne asked if it was a week or two per module when you include everyone’s time. Ruth agreed, that is correct.
Curt - you really need someone who understands these concepts as well.
Carol - it took a lot more effort than any of us expected. But we know of other groups who would potentially like to collaborate - maybe a great community effort could be undertaken so that no single person is unduly burdened.
Bob Downs - in addition to the time you have already identified, there was a lot of time the coordinators put in in addition to creation and revisions. Lots of emails went around on this topic. Ruth - I learned a lot about the editorial process. Agrees with Bob.
Anne - what funding possibilities might there be? Ruth - that is a good question, NSF is increasingly becoming interested in these topics. Even EarthCube and those sorts of programs are interested in training on data management; it is one of the topics brought up by scientists at these workshops. David agreed - EarthCube would be a good place to go, and the current calls they have out include RCNs for workshops; not a lot of funding but could be one start. Or the test governance call - this is education outreach. Ruth - right, to move that forward would require a small group of people to look at the funding landscape. David suggested contacting Lee Allison (from the governance steering committee). Anne - I think those proposals are due in March (March 26th to be specific) but they are still in draft. Ruth - are there people who are interested in moving forward on this? Bob and Anne volunteered to help (but Anne does need to scope out her time a bit). Denise agreed, very interested in this topic but does have a finite amount of time. Anne said she could directly use these modules now and thinks it is important to have them.
Ruth is not sure what to do next, but suggested that she could email Lee Allison and outline what we are working on and see if there is any option on collaboration. David suggested including a link to the current versions. Also contacting someone directly from NSF - Barbara Rancin (?) or Eva (?). Carol said if it was part of an RCN that could relate back to governance that could be useful. In terms of approach. David - This is data management for the sciences. But it has to go to academic institutions. Barbara or Eva could talk about if this fits. They want to work with groups that are not heavily cyber enabled. What are the standards that would be used cross domain? This could be part of an RCN to focus on data management across a few domains.
Curt - sounds like we have a few different avenues and some interest. If you are interested, step up and let Nancy and Ruth know, and keep the mailing list updated as well. Ruth will email Lee and Barbara.
Data Stewardship principles were published a little over a year ago - there was some discussion on new language on physical object stewardship. Ruth and Rama took the lead in developing these to begin with. Are others interested in editing these? Denise - I would like to read through them and review that. Have not had a chance as of yet.
Use case activities - the provenance incubator has a nice example. We talked a good bit about these at the meeting. This would be good to move forward on.
The DOI landing pages - the ESWG is going to work on that next. And some of the stuff from last year needs to be organized and put up on the website. Other kinds of identifiers we have talked about, Curt is interested but not sure he has time to address these topics. It was a topic at AGU this year.
Data citations - we need work from people to adopt and cite the data they are using. Curt mentioned the draft of the National Climate Assessment. It has several thousand references, and identifying the data using those references has not been a trivial task. And NCBC is a big supplier of data to that report. Being able to connect back from figures to the data sets, and the best practices for data citations, have not yet permeated this domain. ESIP can push these standards into wider visibility, and some of what we have talked about (letters to reviewers and editors) is sorely needed. Curt posted some questions to the page, and Brent and Marc are both interested in this (not on the call currently). There is room here to start pushing these standards.
Ruth asked about the identifiers paper - Curt is focusing on the landing pages but can’t focus on the rest. Ruth asked whether someone else is interested in pursuing these areas. If there was a significant set of people, we could do that this summer. If we leave it much longer than that, it will not be useful. Curt said that if someone organized the activity, he could help with that but cannot lead it. Ruth asked if we could have time (30 minutes) on the next call to organize who would work on what. Bob volunteered to help if Ruth leads it. Ruth felt this was a valuable task, but we need to break it down into parts, which would be one way to accomplish this.
PCCS - Rama encourages everyone to look at the current document and see if it needs changes. Curt mentioned an AGU paper - is it time for that or do we need more about it first? An opinion article motivating the need for this type of standard? Rama - agrees, would like to be part of an article on that. Ruth said Mark was interested in that but is currently in a job transition. We should wait to see if he is ready. Rama said they had written a few IGR papers but is not sure if we have had any AGU presentations. He thinks it is a good idea but is not sure the thinking is mature enough for it yet. Curt - going back to the use cases could build toward these things, the PCCS and DOIs, and could tell a better story.
Hook - The preservation ontology and PROV ES. We started talking about this a few years ago. We need a formal provenance for improving trust and pedigree of the data products. Gave a few examples. But there is not a formal interoperable standard. Wants to provide a content model for these pedigrees and a track recommendation for coding and representing this information in an interoperable way. Want to start this week as a NASA ESWG called PROV ES. Produce, consume, and process some of these things. How should this NASA effort be different from the ESIP efforts here? Curt - it might be part of the same process, where one develops and the other reviews it and creates a W3C type note. Hook mentioned that this would dovetail with the PCCS and the data identifiers. We can make this work together. Gave examples on how they would connect. Curt - might be worth a high level overview white paper with the need and use for all these activities. It would tell the story and how they work together. Overall data management.
Brent is not on the call but information quality is important.
Denise - physical object update. We have not done much yet but it does dovetail with the work being done by Anne with the DDS. Would love to have more help with this. Is there any group with best practices already in place relating to earth science (or other fields) that we can modify/use as a starting point? Someone suggested IGSN (Kerstin Lehnert’s research). We can plan another telecon around this topic. This might get back to use cases as well.
In the next teleconferences, we are going to pick some of these topics and look at them in depth. Curt will propose a list of topics and we can discuss ordering.
DDS - that is moving along. They have a dedicated mailing list and teleconferences as well. Seems to be quite a bit of work on that already. Anne - join on up! She has found that it is an interesting topic and others seem interested. We need people to help with work. We are starting with a literature review. We need people to review the literature. We have picked a data lifecycle model. We are working on figuring out what we already know and we hope this data lifecycle model will help us scope the problem.
Curt will send a list of topics and specific questions out on these.
Curt - Provenance incubator - They have a nice process for creating use cases. We can create a more formal way of creating use cases and use them as a driver and organizer of the different elements we are interested in, if we do a good job of developing them. One of their earliest activities includes defining how they think about provenance. These could help inform ours even though we include more topics. So the ES Stewardship dimensions. Things that can go into a hierarchy to help us understand what has gone on before, not just in a mailing list. Take a look at these links and go through some of the things that are of concern to them; these will be of interest to us. From those dimensions, they have a formal template. We might want to adopt a formal template as well. Many of the ones listed here might be relevant to us. Curt went through the list and discussed a few of them as examples.
They include an owner of the content, and some use case scenarios. Working through this would be useful for many of our other activities. Curt encourages us to look at a few examples and understand how they work. We can use a future telecon to discuss how we can address use cases for data stewardship and use those as a starting point. This could be a valuable exercise for us.
David asked about spatial references - is there a use case on this list that addresses this topic? Curt said these were supposed to be broadly applicable but perhaps that is something we could address. These were based on who was part of the process. Part of our process can be addressing similar dimensions but with earth science perspectives. We could bring our earth science domain to the use cases even if they use the same terminology and structure.
Curt will work with Sarah to plan out the telecon topics and agendas. Will do that soon and send it around for discussion to get it nailed down sooner.