Data Stewardship Committee Reporting Out and Planning Session

Abstract/Agenda: 

Participants in the Data Stewardship Committee will report out on their relevant activities over the past six months and will discuss future directions.

Notes: 
  1. Justin provided an overview of the agenda.

    1. The agenda and the notes are recorded as follows.

    2. Please see also “State of the Federation” for more information about our 2016 activities.

 

Timeframe:

  • Since the summer meeting.

 

All topics to be discussed from the vantage points of:

  1. Developments since the ESIP 2015 Summer Meeting

  2. The upcoming six (6) months until the 2016 summer meeting.

 

Listing of discussion topics (5-7 minutes apiece):

  1. Data Management Training - Nancy and Sophie

    1. Link to Data Management Short Course for Scientists: http://commons.esipfed.org/datamanagementshortcourse

    2. Link to Data Management Training Wiki: http://wiki.esipfed.org/index.php/Data_Management_Training

    3. Nancy -

      1. The data management training sub-group re-initiated activities in 2015 after the original 35 modules were produced between 2011 and 2013.

      2. One of the key activities that we completed last year was the “Data Management Training Resource Survey” - a FUNding Friday project.

        1. The project was completed in fall 2015.

        2. The project results were presented during a breakout session and the poster session at the ESIP Winter Meeting 2016.

          1. Breakout Session: Data Management Training Resources Survey and Clearinghouse Project Report (http://commons.esipfed.org/node/8755)

          2. Poster Session: Data Management Training Modules: An Initial Survey and Comparison Result (Funding Friday) (http://commons.esipfed.org/node/8763)

      3. The data management training sub-group is also working to build a clearinghouse of data management training resources.

        1. A proposal is being submitted to USGS to fund the project; it is due at the end of January 2016.

        2. It will be important to ensure the sustainability of the clearinghouse.

      4. A dedicated listserv has been set up for the data management training sub-group.

        1. Link to sign up for the data management training listserv: http://lists.esipfed.org/mailman/listinfo/esip_dmtraining

        2. Please also feel welcome to contact Nancy (nhoebel AT kmotifs DOT com) or Sophie (hou AT illinois DOT edu) if there are questions or if further information is needed.

 

  2. iSamples - Denise and Sarah

    1. An EarthCube RCN.

    2. Sarah is a co-chair of the effort to collect use cases.

    3. There are also additional working groups, including best practices.

    4. Sarah will be able to provide additional updates after attending an upcoming workshop.

    5. Sarah completed a FUNding Friday project, and the result is available as a poster.

      1. Stewardship of physical data: Use case and community engagement (Funding Friday) - http://commons.esipfed.org/node/8771

    6. An interest group is also being developed within RDA.

    7. Physical samples need not be restricted to geological samples; they could be any physical objects.

    8. Potential Action: Contact Gao Chen at NASA Langley.

 

  3. NASA ESDSWG Involvement - Rama

    1. Provenance, stewardship, data quality, and citation are the ESDSWG working groups that would be of interest to the Data Stewardship Committee.

    2. PROV-ES is the Provenance working group’s key activity.

    3. The ESIP Information Quality Cluster is working on related issues in data quality, and the Cluster is considered to be part of the Data Stewardship Committee.

      1. The co-chairs for the Cluster are Rama, David Moroni, and Ge Peng.

    4. Assigning identifiers for objects other than datasets is the key activity for the Citation group.

      1. An update on ESDIS DOI progress was presented (the presentation file is attached in the section below).

      2. There are 8 steps involved in the ESDIS DOI workflow (the first 4 steps are for the Data Provider and the last 4 steps are for the ESDIS DOI team).

      3. Guidance is available on the fields that should appear on the DOI landing page.

        1. Please see the wiki’s main page: https://wiki.earthdata.nasa.gov/display/DOIsforEOSDIS/ESDIS+DOI+Process

          1. DOI Landing Page: https://wiki.earthdata.nasa.gov/display/DOIsforEOSDIS/DOI+Landing+Page

 

  4. PCCS - Rama

    1. Link to Provenance and Context Content Standard Wiki: http://wiki.esipfed.org/index.php/Provenance_and_Context_Content_Standard

    2. PCCS = Provenance and Context Content Standard

    3. The key goal is to let future users review the lineage/history of the data through documentation that uses a standard format and standard categories, so that they can understand the data in the correct context.

    4. In relation to ISO 19165: that standard is a framework that states principles and rules rather than going into the details of the content that should be preserved.

      1. It will take more work to integrate the PCCS into the ISO standard.

    5. Planned work for this upcoming year?

 

  5. Data maturity matrices and similar exercises - Rama and Peng (virtual)

    1. Link to Data Maturity Matrix Wiki: http://wiki.esipfed.org/index.php/Data_Maturity_Matrix

    2. Peng will provide a 10-minute update on the DSMM during the Information Quality Cluster breakout session - http://commons.esipfed.org/node/8781

    3. Sophie provided a review of the work that was performed during 2015.

      1. A poster for the work completed was presented at the AGU Fall Meeting 2015 - https://agu.confex.com/agu/fm15/meetingapp.cgi/Paper/71072

    4. Planned work for this upcoming year?

 

  6. Data Policy involvement - Matt and Ruth

    1. American Meteorological Society (AMS) Data Archiving and Citation - http://www.ametsoc.org/PubsDataPolicy

    2. One possible extension of this Data Policy could be to consider how the policy could be applied to other object types, like software.

    3. Matt will attend additional workshops this year and can help in providing further updates.

    4. A related effort from RDA (the recommendation is currently out for comments) provides guidance on how to cite datasets (and their subsets) programmatically instead of manually.

      1. https://rd-alliance.org/group/data-citation-wg/outcomes/data-citation-recommendation.html

 

  7. “Identify everything” paper (work to begin in earnest in Jan. 2016) - Justin

    1. Current items to consider for the paper:

      1. Scope:

        1. Using the PCCS to structure the paper, so that we could start prioritizing the next areas that might need identifiers the most?

        2. Identifying gaps where identifiers are needed?

        3. Identifying gaps, but also reviewing the types of specifications that are available for assigning identifiers?

        4. Is the scope more about the “identifiers” themselves or the “implementation” of the identifiers?

      2. Value:

        1. Certain types of objects have been assigned identifiers for longer than others. Evaluating the objects that have not had identifiers assigned and understanding the rationale could be helpful?

    2. Next steps:

      1. The potential authors will meet to identify the tasks and to plan the schedule for the paper.

      2. Justin will send out a Doodle Poll.

 

  8. Other directions

    1. Semantic Web?

    2. Others: EarthCube, DataOne, DOIs?

 

Aims for next year - papers?  Next steps?  Funding activities?

 

Housekeeping:

  1. No January or February monthly telecon. Side-telecons will continue to meet.

 

Publications since Last Summer:

 

1 - Mayernik, M. S., Ramamurthy, M. K., & Rauber, R. M. (2015). Data archiving and citation within AMS journals. Mon. Wea. Rev., 143, 993–994. doi: 10.1175/2015MWR2222.1

2 - Downs, R. R., Duerr, R., Hills, D. J., & Ramapriyan, H. K. (2015). Data stewardship in the Earth sciences. D-Lib, 21(7/8). doi: 10.1045/july2015-downs

3 - Special Issue: Rescuing Legacy Data for Future Science

  • Hills, D. (2015). Let’s make it easy: A workflow for physical sample metadata rescue. GeoResJ, 6, 1-8. doi: 10.1016/j.grj.2015.02.007

  • Ramdeen, S. (2015). Preservation challenges for geological data at state geological surveys. GeoResJ, 6, 213-220. doi: 10.1016/j.grj.2015.04.002

4 - Hills, D., Downs, R. R., Duerr, R., Goldstein, J. C., Parsons, M. A., & Ramapriyan, H. K. (2015). The importance of data set provenance for science. EOS, 96. doi: 10.1029/2015EO040557

5 - Peng, G., Privette, J. L., Kearns, E. J., Ritchey, N. A., & Ansari, S. (2015). A unified framework for measuring stewardship practices applied to digital environmental datasets. Data Science Journal, 13, 231-253. doi: 10.2481/dsj.14-049

6 - Hou, C.-Y. (2015). Meeting the needs of data management training: The Federation of Earth Science Information Partners (ESIP) Data Management for Scientists Short Course. Issues in Science and Technology Librarianship, 80. doi: 10.5062/F42805MM

 
Attachments/Presentations: 
Citation:
Goldstein, J.; Data Stewardship Committee Reporting Out and Planning Session; Winter Meeting 2016. ESIP Commons, October 2015