Preservation and Stewardship Committee Telecon 2014-03-14

Abstract/Agenda: 
  • Joint Declaration of Data Citation Principles

  • ESIP summer meeting plans  

  • RDA data citation liaison discussion

  • AGU session planning

  • Reports from all the other activities (if there is time)

Notes: 

Joint Declaration of Data Citation Principles

Ruth is hoping someone from the RDA data citation group will join us, if his schedule permits.  We will switch topics when he arrives.

The Joint Declaration of Data Citation Principles is out for endorsement.  Ruth brought up to the executive group whether or not ESIP should endorse these principles.  They can be seen online: http://force11.org/datacitation.  There are 8 of them, and Ruth asked everyone to read them.  Some people have already endorsed them, but ESIP as a whole has not.  At the excom meeting it was discussed whether there were any downsides to ESIP endorsing these.  Ruth is bringing this question to the group: are there any issues with ESIP endorsing this?

Rama pointed out that we have our own set of principles that was created a few years ago.  Ruth said those are at a lower level than this, and Rama asked if there were any inconsistencies between the two.  Ruth said she and Joe H worked on making them consistent.  Mark agreed and said he worked on this as well.  He said that ESIP is beyond this, but that the declaration brings many communities together at a high level while recognizing that different communities will implement the principles in different ways.  So if ESIP wants to support data citations, this is one way to do that.  Rama said he did not think a detailed look was warranted based on these recommendations.  Anne agreed with Rama.  Rama said that ESIP should endorse these then.

Erin asked if it would be useful to get a list of other domains that have detailed citation guidelines, so that the higher level list can link back to the specific domains.  It would be nice to have a back and forth.  Ruth said that was a good point, and that this was a collaboration between many different groups (20-25).  But this particular work group is now disbanded and is being replaced by two new ones, which are soliciting people who want to be members.  One is related to dissemination: a group led by Paul Uhlir trying to get university presidents, repositories etc. to sign up.  The other is a data citation implementors group, which would make recommendations to groups who have not gotten as far as others.  Ruth said they would probably have a place on the website for Erin's idea.  Mark agreed with Erin that this might be a good issue to push early on, so that ESIP's endorsement shows that ESIP has thought hard about these issues and has worked on them.  Erin felt it would also bring broader exposure to the work we have done.

Mark said that having the various data identifier groups come together removes some of the competition, but there is still some pushback.  They have to keep insisting that these are joint principles and not those of one single group.

Rama asked what form the endorsement takes?  How does that work?  Ruth said, for an organization, there is a button and you upload the name of the organization and a logo.  There is not much more than that.  Bob said as an individual he clicked something and it added him.

Rama asked if the endorsement is just a yes or no, or whether it can include more, such as being able to point to the ESIP principles.  Ruth said they do not currently have that, but she would ask about it.  Rama said that pointing to our own material would be another mechanism for endorsement as well.  Ruth will bring that up with this group.

Erin asked what it would mean for ESIP to endorse this; as an organization we have only endorsed a few things.  Does the endorsement cover individual members, organization members, the assembly etc.?  It does both: ESIP would endorse, but the members could as well.  There are still some questions about how that would work, and they are thinking it over.

Ruth asked again for any downsides.  Ruth said she would send out an email today to say that ESIP is interested in endorsing this but has concerns about linking to our own guidelines.  Ruth guessed that there are not many others outside of maybe DCC.  Mark said Dataverse.  Rama said if we were allowed to link to our guidelines, that might spur others to do their own at their domain level.

Mark suggested this might be one way to look for commonalities as well.  He mentioned the digital curation meeting last month, where they discussed what might be included on landing pages and websites.  Some of these things might work across disciplines and domains.  Rama said there is a lot of work on DOIs and landing pages at a specific group (???), looking at what different groups and different users need on landing pages.

Mark mentioned, off topic, that all of the examples in our existing guidelines are FOO and John Doe.  The guidelines have been out for a few years, so it would be really nice if this effort had real live examples that illustrate the point and the breadth of the adoption.  Rama mentioned a few groups they could pull from.  Mark suggested this could be a future project.

ESIP summer meeting plans

The cutoff is April 13th for the general list of topics and sessions needed for the summer session.

Ruth wanted to have a discussion now to see what sessions we wanted to have this summer.  She suggested our normal planning session.  Rama suggested a GCIS session, which Curt agreed on.  Curt is no longer involved in the project, but several members from that group will attend the summer meeting.  Ruth asked who would submit the session.  Curt said he would work with someone.

Rama suggested a session on PCCS and other similar topics: a reporting session, different from the planning session.  Denise mentioned wrapping it into a single session, which is better than several sessions, and maybe including some of Bruce’s work on collection structure.  Bruce said he does not have the time to spend on the book right now and would have to withdraw the topic.  He also mentioned he has not gotten feedback on the notes he posted to the wiki.  Denise felt a reporting session would be useful because we tend to take up the planning session with reports.  Ruth agreed.  Denise mentioned a few topics and said others could be included: use cases, PCCS and physical objects.  Rama said he could report things from the NASA domain related to PCCS if he had the time.  Ruth said it sounds like we have plenty of material to fill the time, and Denise said she would work on the session if Rama takes the lead.  Denise suggested Sarah might take part as well.

Ruth thought the list looks good but if others have ideas we should speak up on the list over the next week.

Suggestions:

  1. normal planning session (Ruth)

  2. GCIS implementation (Curt)

  3. Reporting session (Different from the planning session) (Rama)

RDA data citation liaison discussion

Ruth discussed an RDA working group on making data citable, which is looking for a formal liaison from ESIP.  This had been a discussion at the last summer meeting.  What they would potentially like to get out of us: picking our brains for expertise and having us provide scenarios they might not have thought about.  They have a variety of approaches to data citations, but they are focused on databases.  Ruth said she could send out information to the group about the scenarios, along with PDFs of solutions and a case statement.

Ruth asked: Do we think a liaison is still a good idea, and is there a pilot that ESIP could do to demonstrate those kinds of citations?  They are interested in streaming data, or time data.  Ruth was thinking we could come up with a small experiment, a test bed proposal to see what could come out of it.  And she had invited someone from RDA to join us but he (Name? Andreas) was unable to make it.

Anne was contacted by Andreas, and he sent her a proposal that she found interesting: citations for dynamic data sets.  She can give that as background; it is the case statement proposal for the working group.  She felt the topic was important and needs to be addressed, and she might be able to contribute to it.

Mark said he could provide some background.  An RDA working group goes through a significant review before coming into existence and only exists for 18 months.  In that time it must create or adopt something, and implement it or show how it would work in practice.  He is not sure what the requested case for this working group is.  The RDA council was discussing this working group and liked it, but someone brought up that they should connect with ESIP.  He thinks the committee should think hard about whether and how we want to participate: not just in an advisory role, but maybe by doing something on the test bed.  Mark said this is a way to amplify the work of ESIP and bring it to a broader audience of practitioners (particularly in Europe).  If the liaison is Anne, could she do a demonstration of its application with a set of specific data, to show ESIP's abilities?

Anne asked what we might try to implement.  She thinks she understands, but asked if it would encompass capturing a query ID for the query that creates the dynamic data.  Mark discussed some of the ideas behind data citations and the ecosystem they create.  One way is capturing a query that can be repeated.  He does not know the particulars of their approach, but it is like that model, along with seeing the differences between different subsets.  Anne said it could be transformations that might occur on demand.  Mark is not sure how tightly they scoped it.  It is meant to be little building blocks, and Andy can speak more to this.

Ruth said that one of the things they are interested in is gathering use cases, and the use case Anne just mentioned would be useful to write up and add to their collection.  Anne said these are things they do now, and she would like to see them citable in a rigorous way.

Ruth mentioned an example she felt could be done in a test bed which would be based on metadata but ...

Bruce said they should consider scaling.  It is something they should be concerned with up front; just building small blocks will not address scaling issues, or be realistic for large things if machines are coupled.

Ruth mentioned MODIS.  Lots of products use versioning, and you might have to make lots of recordings a day to make that reproducible, even with the query recording.

Rama asked whether the idea is to create a citation and then check, by using it, whether a human or a machine could get the data intended.  Yes, Ruth said: what would you have to do to your data systems to make that possible?  Anne asked if that meant generating the citation and following it back, and Rama said yes.  Anne suggested capturing, in an automated way, the data about how the request was made and the version of the server if there is dynamic reformatting; that could be a component as well.  She asked whether Rama is considering following it back too.  Rama said if you do not follow it back, what use is it?  Anne said that amounts to evaluating the effectiveness of the citation.  Yes.
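
The round trip discussed above (record how the request was made, then follow the citation back and check what comes out) could be sketched roughly as follows.  This is only an illustration under assumptions, not the RDA working group's approach, whose particulars were not known on the call.  All names here (DynamicDataCitation, make_citation, follow_back, toy_server, the query string and the sample data) are hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DynamicDataCitation:
    """Hypothetical record of how a dynamic subset was requested."""
    query: str           # the exact request that produced the subset
    executed_at: str     # when the request was made (ISO 8601 text)
    server_version: str  # version of the software doing any on-demand reformatting
    result_sha256: str   # fingerprint of the bytes that were delivered


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest of the delivered bytes."""
    return hashlib.sha256(data).hexdigest()


def make_citation(query, executed_at, server_version, result):
    """Capture the request details and a fingerprint of its result."""
    return DynamicDataCitation(query, executed_at, server_version,
                               fingerprint(result))


def follow_back(citation, run_query):
    """Re-run the recorded query; True if the same bytes come back."""
    return fingerprint(run_query(citation.query)) == citation.result_sha256


# A toy stand-in for a data system (assumption: queries map to fixed bytes).
ARCHIVE = {"station=42&year=2013": b"2013-01-01,3.1\n2013-01-02,2.8\n"}


def toy_server(query: str) -> bytes:
    return ARCHIVE[query]


citation = make_citation("station=42&year=2013", "2014-03-14T15:00:00Z",
                         "1.0", toy_server("station=42&year=2013"))
print(json.dumps(asdict(citation), indent=2))   # machine-readable citation record
print(follow_back(citation, toy_server))        # does the citation still resolve?
```

If the archive were reprocessed so that the same query returned different bytes, follow_back would report a mismatch.  That is the situation the versioning and scaling concerns above are about.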

Bruce had recently tried to reconstruct something from a program he wrote 20 years ago, to help Nancy H on the test bed.  He had some trouble.  Realistically, he said, people will need to recreate the understanding of the algorithm, not just the environment etc., which required him to double the size of the code.  He felt this will be a problem in the future, particularly with changes to Java and databases.  Ruth said this is why she felt these tests would be valuable, even if they only did one of the products from MODIS, and even if there was not an immediate solution.  She encouraged Anne to write up her use case, and Mark said she should join the forum too.  Anne said she would, and would to some degree serve as a liaison.  She is not going to Dublin.  Mark asked if anyone else was.  Ruth is going.  Mark suggested she attend this session.  She said she would if she did not have other obligations.  Mark asked if she is in the long tail or brokering groups, as they conflict.  He thought it would be nice to have someone from ESIP attend if anyone could.  Ruth said she has to attend the brokering event.  Ruth said we could bring this up as an activity to be reported on in the future.

AGU session planning

Do we want to submit anything related to data stewardship?  Last year there were about 60 session proposals for ESSI, and a large number were combined.  ESIP is a place to discuss the stewardship-related ones, to see if we as a group could go in on a set instead of as individuals whose sessions are combined later.  Mark asked whether the group that organized the stewardship session last year (Cynthia etc.) was going to bump it to a union session.  Ruth asked whether ESIP being involved would help or make no difference.  Mark said it would make no difference; it would be about showing that the topic appealed at the union level.  And there is a group working on that now.

Anne asked what a union session was.  Mark said sessions at AGU are divided between union and focus groups such as ESSI, which is not a section.  But they also have union sessions that are meant to be broad topics that reach across disciplines, and these are not co-listed.  It is everything, and it is high profile to the broader community.  A few years ago they had one related to libraries.  There have been one or two others related to informatics.  Ruth asked if data citations would be of interest.  Mark said that this would be an idea, as it is getting more of a buzz.  The library session was in relation to AGU’s existing policies, so if you get someone from the publication group, this could be useful.  Ruth said they have published a new policy: if the data has been published, it must use the ESIP guidelines.  That was why she was thinking about this.  Mark suggested reaching out to that group and maybe Leslie, Cindy and Dawn to see what they think, and including people from this group.

Ruth asked if there are other topics this group would like to see as sessions at AGU.  Someone mentioned provenance.  Rama mentioned data rescue and a few others that were combined into one session, which was a success.  Steve agreed it was a success; he and Bob thought it was a great session and that it should be repeated.  Ruth said if you did a session like that you would get a lot of different input, and that sounds reasonable to her.  Denise agreed and said she would be happy to be involved with a session of this variety.  She suggested moving this conversation to the listserv as we are out of time.

Ruth agreed and closed the meeting.  Beforehand, Denise asked if we should have a mid-month meeting to talk about the summer meeting and AGU, as both are due before our next monthly telecon.  Ruth suggested handling them by email and seeing how things go.

 

Anne asked about the data management leaflets.  Ruth has them.

Citation:
Preservation and Stewardship Committee Telecon 2014-03-14; Telecon Minutes. ESIP Commons, September 2014