Preservation and Stewardship Committee Telecon 2013-11-06

Abstract/Agenda: 
  • Quick brainstorm on purpose and content for a Data Management training brochure to hand out at AGU & other places

  • Possible topics to cover in a paper reporting and expanding upon the Identifiers Testbed project (e.g., current status of DOI assignment to datasets)

  • Status of Data Citation Principles: http://www.force11.org/datacitation

  • RDA reports

  • Data Preservation working group

  • DFT working group activities

  • PIT (persistent identifier types, Anne)

  • Others?

  • PCCS paper for science

  • Sessions for the January meeting

  • AOB?

Notes: 

Strategic plans

Ruth went over a draft of the strategic plan (this is the first year for formal plans like this). Erin put something together for each active collaboration area. It is not meant to be heavy, but to help figure out key milestones and to help her and Carol make connections across the community. The first part is informational. The next page has the stewardship mission, followed by the budget and the active subgroups; this is an area that could be fleshed out. Next come the deliverables and the nuts and bolts needed to fulfill these activities: budget, staff needs, partnerships, etc. The last part is outreach: metrics and how people will know what we are doing, including other clusters/collaborations and outreach beyond ESIP (for example, RDA and DataONE). It might be good to flesh out this area as well. The plan is not binding but will help us better understand goals and directions for 2014. It is a framework that will evolve as we use it.

Stewardship 2014 Plan: https://docs.google.com/a/esipfed.org/document/d/17bCKlcvzwtR65uK6cwk-2qu-OqKSU2ZYt8PTVShA6Pw/edit

The group has been requested to fill this out. Ruth asked if this could take place at the planning meeting in January.  Erin mentioned a cluster/committee fair at the meeting to let people know what each group does, and how to connect.  If we want to table it for the winter meeting that would be OK, but she would like something to share with the general ESIP population for the January meeting.  Ruth suggested we could create a draft based on other documentation, but there may be significant changes based on that meeting.

Sessions for the January meeting

We had some sessions proposed at the last meeting, by Bruce and a few others, but they still need to be submitted. Sarah will post Bruce’s session for him along with her own.

Bruce asked if we were going to table the quality section because the group was not active. Ruth asked if someone should try to spin it back up; earlier attempts have not been successful, and this group is not actively working on it at this time.

Ruth said we have statuses for each working group and will cut and paste them into the plan for January.  At that time, she can send out an email with the link to the document for people to review and see if they agree.

Data Management brochure

Nancy - We have put in a request for funding to create a brochure on the data management modules (based on a similar concept from NASA) to hand out at places like AGU. If you would like to see an example, she has an image. While we have not heard whether we got the funding (though it is fairly likely), it makes sense to talk about what we would like to do with it. Nancy thought it would be good to talk in a general sense about what we would put in the brochure and what to do with it: educating people in general about the training modules and where they are; alternatively, giving an overall view of data management and how the modules might address some of these questions; etc. Ideas?

Ruth thought this was a good idea, especially highlighting the training modules. She mentioned an example of someone doing a workshop who may use these materials to help address gaps in her community. Devoting a good chunk of the brochure to what the modules are and where to find them would be a good idea. Nancy asked about including the topics and titles, or perhaps the questions these might address. Ruth suggested making one page of the fold a list of the titles/topics. Nancy asked whether that made it more of a reference than marketing. Ruth suggested both: what is actually available and how to get it, plus some marketing. Bob agreed. Rama asked for clarification and mentioned that information in the proposal itself could be used for this as well, along with mentioning other partners like DataONE. Nancy asked if the brochure would list other places beyond ESIP where people might find data management materials on these topics. Rama said yes, no matter what the source is we should propagate data management in the earth sciences. Nancy asked whether Kerstin Lehnert’s work (IEDA) and the Sloan proposal would be good sources.

The hope is to include attractive, colorful material beyond just text.  Ruth suggested looking at some of the presentations for images, which have already been vetted for public availability.

If you have any other ideas, send them to Nancy.  She will try to get a first cut to share out.  We need to move quickly because we want the brochure ready for AGU; the hope is to have a draft out in a week or two.  Hopefully members can take and distribute the brochures at their booths etc.  A few individuals volunteered.

Topics to include in a paper - metadata test bed

Nancy and a few others have been working on a report on work done related to the metadata test bed. Curt thought perhaps it could be used/extended and published someplace. Besides the current status of DOI assignments, what other topics could be included in a paper like that? The report basically covers the additional paper, recommendations, categories and criteria, testing, a report on each, and a little on the actual process of implementing, coming up with recommendations for each one, and drawing all that up to a conclusion. But there has been activity since the last report on this. Ruth mentioned a few groups which might be doing similar work and thought that might be an easy extension. Bob mentioned that they are doing DOIs now. Rama asked if there was enough data there to do an update on the paper. The information about the test bed could be a paper with the updates, or an update could go on to the Duerr et al. paper. Ruth asked how much material is in the report - Nancy said it has the final recommendations for 7-9 ideas and abstract views on them. It is more than a few paragraphs; the first page or two includes information from the paper. Ruth suggested letting people look at it once the report is sent out. Nancy would also like to know who at these other organizations would be the right person to talk to. Bruce mentioned something about collection structures which might relate to this topic: what are the stable entities and entity types, and how do you put them in an archive inventory, recognizing that archives and accounting have different labeling processes? Nancy thought this might be a good way to discuss UUIDs.

Data citations principles

http://www.force11.org/datacitation

There is a data synthesis group. A wide variety of organizations have come up with DOI standards and principles, which are not all the same. Paul Uhlir tried to get the groups together to come up with a common set. There are about 20 organizations involved, along with CODATA and ESIP. They have a draft, but it has not been published yet. The principles are at a much higher level than the data management guidelines, and more specific than the ESIP guidelines. It might be worth this community's while to review these and see if there are any comments ESIP can add. Ruth would like to take these high-level principles to ESIP for approval at a high level, as something that would work across all disciplines. Ruth asked if this was a reasonable plan. Rama said yes, this would be at a greater level of detail, and we just need to make sure that the two (the ESIP guidelines and this set) play together. Carol asked if Ruth had planned a mapping exercise of the two. Mark said he did not think they conflicted with the ESIP guidelines. Carol thought a mapping exercise might help it move forward through ESIP at the highest levels. Mark added that the next step for this group is how the principles are implemented, and ESIP might be a use case.

Mark thought these were a step back from the work we have conducted.  So look them over and see if these meet our goals.  Carol mentioned thinking about these principles and how we are applying them.

RDA update

Ruth and others are tracking activities at RDA.  She asked if those who are working with different groups could update the list.  Ruth took over as co-chair of the preservation group, but they are having a hard time scheduling monthly calls.  The data foundations terminology group is off and running.  They are working on getting agreed-upon definitions even if they are not strict.  Are there others on the call who are working with an RDA group?

Natalie Myers said they were participating in some of the RDA working groups, mostly with a focus on data citation and data management.  They recently received a Sloan/ORCID grant to integrate researcher IDs.  Most of the data is from high energy physics.

Carol said there would be a session or a couple of sessions on the community capability model (by Microsoft UK); they are bringing it to ESIP to see how a community works well together.  They are looking to do a gap analysis of communities, with a poster and a descriptive session at the ESIP meeting.  Our community is being used as a higher level study.

PCCS paper

It is just about done (but we are running out of time on the call).  The authors said they would send it to the list for a one-week review - Rama said they would try to send it out this Friday and have it reviewed by the following Friday.  They would like to get it submitted this month.

At the next call Anne will update us on her RDA experiences.

Those of you who have a session, submit them to the commons!

Citation:
Preservation and Stewardship Committee Telecon 2013-11-06; Telecon Minutes. ESIP Commons, September 2014