Preservation September 2012

Notes: 

ESIP Preservation and Stewardship Cluster Monthly Telecon 2012-09-10

Attendees: Bruce Vollmer, Denise Hills, Hook Hua, Jason Cooper, John Moses, Mark Parsons, Pam Michael, Rama, Steve Aulenbach, Steve Morris, Tyler Stevens, Curt Tilmes, Sarah Ramdeen, Jeff DLB, Carol Meyer, Bruce Barkstrom, Linda Musser, Anne Wilson, Heather Brown

Notes:
There was a question about DOIs and whether they are limited to pages from the Commons or can be used on other outcomes, such as software.  Carol mentioned that they are currently assigned manually and that there is a need for a test bed where they could be assigned automatically.  This is different from what Nancy has been working on.

Nancy is working on a test bed for some of the recommendations to ESIP, and another effort has been testing DOIs and things that would be on the Commons.  Hook asked about things that would broaden the use of DOIs.  Carol said no: our contract expressly prohibits that.  DOIs can be assigned only to things that are in the ESIP Commons.

Hook mentioned having a session on the topic, particularly on DOIs and provenance, but it is not yet a well-developed idea.  There are still some misconceptions, particularly about identifiers, so doing something to broaden education on these topics would be good.  Curt said we would need to develop something more concrete in order to ask for budget funds.

Rama mentioned that DOIs would be a good topic for a talk at IGARSS, if travel support could be obtained for one person to present the ESIP ideas on them.  Curt asked if Rama and Mark could write up a request, which could be run past the preservation mailing list to see what other people think about it.  Mark is busy this week but Rama may have some time.  Mark mentioned that he is going to a conference next week with a similar topic and can solicit someone there, but he does not think he would be able to attend the event we are looking for someone to speak at.  Rama would like to have the idea finalized by Sept 21st to have time to work out the details.

Curt said we are a bit behind on the budget discussions, but anyone who has other ideas should send them to the mailing list.

Curt has started a wiki page, based on summer and list discussions, on Data Citation Guidelines for Editors and Reviewers.  This could also be used for data providers.  We had divisions between users and providers early on in this discussion.  This can help our advocacy and push standards from the other side.  Journal editors have a great deal of impact on research, and reviewers have an opportunity to push good citation practices into the community.  The goal is a set of guidelines we can share with the community.  Curt would like to solicit input and encourage people to take part in this activity.  Discussion can go through the mailing list but should be consolidated on the wiki page.  Hopefully this will move rather quickly, and we as a committee can produce a document or committee view that we can take to the assembly.  Perhaps this can turn into more articles and build on the things in the data preservation guidelines.

Mark asked whether there is any agency or society guidance or mandate on the need to track back to the original source.  Curt said a number of journals have them.  Mark said that input from the science community would reinforce the "why" question.  Mark mentioned AGU and another source where the data has to be adequately cited.  Curt mentioned the NSF open letter as well.  We need participation from this group to drive these guidelines.  Mark was not at the summer meeting and asked for clarification: are these guidelines for editors to use in reviewing papers?  What is the end result?

Curt thought we should write up a general document that describes how to do it and references the data citation guidelines, but what we are looking for is some proposed language that can be given to submitters to encourage them to include data citations.  Curt believes putting the ESIP stamp on it would encourage more editors to support this.  Editors can reject a submission if the submitters have not followed the basic recommendations.  This can be refined or changed.  It would be part of the reviewing process for an article, and people would think of citations up front.  Rama mentioned that editors should tell reviewers to consider citations when evaluating an article.  We should create a set of language to express this.

Bruce mentioned work breakdown scenarios.  He mentioned a set of references (findable through Google, for example) on standards for task types.  How are citations used and what do people need in citations?  Looking at scenarios for interacting in a user interface would help.  How do people use the data citations?  Curt mentioned giving the reasons why it is important to have good data citations, which Bruce agreed is something we need to discuss with the science community.  Curt mentioned we touched on this a bit with the guidelines for data providers.  Mark mentioned this as well.  The next step is: now that we have convinced you that it is a good idea, how do we do it?  And finally, after the why and how, here are specific recommendations for the journal publication process.

Bruce talked about the page citation system in the humanities and asked whether there is something similar for scientific data.  He also asked how changes to data can be referenced from a citation.  Mark said this would be a long, ongoing topic.

Mark noted specifically that space and time are often very useful but do not solve this.  He suggested that we create scenarios to provide to editors to explain this issue of changes in datasets (how a new product was created).  Some of this can be described in the methods section as well.  But we need to be able to get the reader back to the data set, through a DOI etc.

Linda introduced herself.  She is a librarian with interests in this topic.  Curt mentioned that we are interested in what library science has to add to this conversation.

Curt mentioned that what Bruce discussed brings us back to the discussion on levels of granularity, and that there is still more work to be done.

Linda mentioned that trying to convince the editors that this is important will be hard work.

Curt asked for thoughts on the wiki page he created on this topic. Anyone can get access to edit this page.

Linda said that she had seen pushback from journal editors: either they did not have guidelines or they just wanted data mentioned but not cited.  She suggested asking the editors to create or adopt a policy concerning data citations, under which data should also be included in the citation list and not just acknowledged.  Curt mentioned that this is what we would like to work on, something targeted.  Linda mentioned that this was a while ago and the time is right to try it again.

Mark mentioned that this is why it is important to have it in the words of the scientists.  But we also have to follow up with targeted places where we think we can make a difference.  AGU was used as an example.

An additional concern in the past was a worry that authors were only trying to increase their citation counts.  Curt mentioned the NSF open letter.  Mark said it is about reference and making good science.  Secondary things like author credit should be just that: secondary.

Someone asked if they would be presenting this idea at the GSA meeting in November.  Curt mentioned that it would be a good idea to send a notice to the mailing list about this meeting.

Carol mentioned that a couple of ESIP folks are chairing a session that might be relevant to this, with Arizona Survey staff and Kerstin Lenhar given as examples.

Curt suggested people look at the wiki page and bring topics up on the mailing list.

DOI landing pages: there is currently a lot of discussion on the mailing list.  We started a DOI landing page for this issue.  Curt raised the question: when you have a page about a data set, what should be on it?  Other questions are how things are organized and how the DOI itself should be constructed.  Curt grabbed a few comments from the mailing list to capture that discussion.  We have links to different examples as well.

Comments about this page?  Mark asked if this will lead to revisions of our data citation guidelines.  The DOI opacity question is one topic that is a bit controversial.  John mentioned that he was surprised that this topic came up; there are a lot of good things to learn about landing pages, but that is a separate issue from the construction of the DOI name.

Mark suggested best practices on the identifier.  Curt said it might be two different activities, with guidelines for each of these independently.  John said it probably makes sense to report on the range of ways the community is using DOIs.

Curt mentioned creating a list of what is out there, comparing the items to our guidelines, coming up with criteria for that, and seeing what we find.

John - Include an example citation on the landing page.

Jeff DLB - NOAA Data Citation procedural directive encouraging users to cite data; looking for ESIP advice.  Considering a check digit for opaque DOIs.  Human-readable page as the target for the DOI; considering content negotiation; could use an ISO XML page styled as human-readable HTML.
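
To illustrate the check digit idea raised above, here is a minimal sketch of one way a check character could be appended to an opaque DOI suffix.  The minutes do not say which scheme was being considered; the Luhn mod N algorithm, the 36-character alphabet, and the sample suffix below are all assumptions chosen for illustration only.

```python
# Illustrative only: the alphabet and the choice of Luhn mod N are assumptions,
# not something recorded in these minutes.  An even alphabet size (36) keeps
# the doubling step one-to-one, so any single wrong character is detected.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"
N = len(ALPHABET)


def _luhn_sum(text: str, first_factor: int) -> int:
    """Sum the Luhn mod N addends of text, working right to left."""
    total = 0
    factor = first_factor
    for ch in reversed(text):
        # ALPHABET.index raises ValueError for characters outside the alphabet.
        addend = factor * ALPHABET.index(ch)
        # Fold a two-"digit" base-N product back into a single value.
        total += addend // N + addend % N
        factor = 1 if factor == 2 else 2
    return total


def check_character(suffix: str) -> str:
    """Return the check character to append to an opaque suffix."""
    remainder = _luhn_sum(suffix, first_factor=2) % N
    return ALPHABET[(N - remainder) % N]


def is_valid(suffix_with_check: str) -> bool:
    """Return True if the trailing check character matches the rest."""
    return _luhn_sum(suffix_with_check, first_factor=1) % N == 0


if __name__ == "__main__":
    suffix = "a1b2c3"  # made-up opaque suffix
    full = suffix + check_character(suffix)
    print(full, is_valid(full))
```

A check character of this kind lets a resolver or landing page reject a mistyped identifier before any lookup, without making the suffix itself meaningful.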
 

Citation:
Tilmes, C.; Preservation September 2012; Telecon Minutes. ESIP Commons, October 2012