Data Stewardship Maturity Matrix – Use Case Study

Abstract/Agenda: 

Assessing the current stewardship maturity state of individual datasets or collections of datasets is an important part of ensuring and improving the ways those datasets are archived, stewarded, and disseminated to users. It is a critical step toward meeting the U.S. federal regulations and user requirements that apply to datasets funded or produced by U.S. government agencies.

Stewardship maturity assessment models provide a uniform framework for consistent assessment, whether of data management in organizations and portfolios or of the stewardship of individual datasets.

In this session, use case studies will be used to demonstrate the utility of various stewardship maturity assessment models and to give people in the ESIP community an opportunity to discover the capabilities of stewardship maturity models, to provide feedback, and to discuss a consistent way to implement them across the various tiers of scientific data stewardship.

Agenda

Ge Peng: An overview of the data stewardship maturity matrix and tiers of maturity assessment

Ruth Duerr: An overview of the ESIP Data Stewardship Committee’s activity on use case studies of the data stewardship maturity matrix, and a use case study of the long-tail datasets maturity

Sophie Hou: A use case study of stewardship maturity of NCAR model reanalysis data and DataOne member node ecological data

Lorri Peltz-Lewis: An overview of the portfolio maturity matrix and a use case study of OMB A-16 portfolio

All: Open floor questions and discussions

Notes: 
  • Peng:
    • Provided the background and the overview of the Data Stewardship Maturity Matrix (DSMM).
    • Descriptions of the maturity matrix categories were also discussed.
    • Collaborators of the DSMM included: ESIP Data Stewardship Committee, NOAA NCEI Data Stewardship Division, NOAA Climate Data Record (CDR) Program, NOAA Office of Systems Development (OSD), and NOAA Technology Planning and Integration Office.
  • Tiers of Maturity Assessment within Context of Scientific Data Stewardship:
    • Repository Trustworthiness Maturity (e.g. ISO 16363:2012).
    • Asset Management Maturity (e.g. National Geospatial Data Asset).
    • Stewardship Practice Maturity (e.g. NCEI/CICS-NC matrix).
  • Three stages of individual-dataset maturity assessment: Create/Evaluate/Obtain (product) --> Maintain/Preserve/Access (stewardship) --> Use/User Service (service).
  • Questions: 
    • Why were so many maturity matrix categories developed, and do they overlap? Why different metadata categories? Could these categories be simplified or condensed?
      • The current matrix categories are the result of balancing practices across functional areas that are quasi-independent against the number of degrees of freedom we want to deal with. Metadata is assessed within each maturity key component because metadata maturity in one component, such as preservability, does not imply maturity in another, such as usability. Refining the categories' definitions is certainly possible, but the improvement would ideally be based on feedback from use case studies and from the ESIP community. (A minimal sketch of recording per-component ratings appears after these notes.)
  • Ruth:
    • ACADIS, Analytic Potential, and Stewardship Maturity.
    • Maturity assessment, when done across different organizations, can help people compare stewardship practices.
    • ACADIS partners: NSF, NSIDC, UCAR, NCAR
    • Many of the datasets are long-tail datasets.
    • Specific definitions from Curation Stack Model (e.g. Curation, Preservation, Archiving, and Storage) were used in parallel with the maturity matrix.
    • Details of the evaluation results for each maturity matrix category using ACADIS datasets were presented.
    • Overall, the experience showed that the maturity matrix was helpful, but additional adjustment or refinement of the maturity matrix categories could improve the evaluation process further.
    • Other observations:
      • “Data” can take many different forms beyond the “standard” sense, such as photos.
      • The definition of “dataset” will need to be clarified as well.
    • Questions:
      • Is there a mechanism for merging duplicate datasets?
        • Different individuals and organizations are exploring different possibilities, including the NCAR Library (Matt Mayernik).
      • Is there a need to separate out the evaluation of different types of metadata, such as discovery, preservation, and descriptive metadata?
        • Separating the evaluations of different metadata types would be helpful.
        • This would allow the evaluators to understand the strengths and the weaknesses of each dataset.
  • Sophie:
    • Presented use case experience with the DSMM and lessons learned using 2 different datasets from the National Center for Atmospheric Research (NCAR) Research Data Archive (RDA) and Santa Barbara Coastal (SBC) Long Term Ecological Research (LTER).
  • Lorri:
    • NGDA Lifecycle Maturity Assessment.
    • Background: OMB Circular A-16.
    • Different stages of the Dashboard Metrics: Define, Inventory/Evaluate, Obtain, Access, Maintain, Use/Evaluate, and Archive.
    • The dashboard enables assessment based on 19 questions.
    • A survey has been developed as a Word document as a starting point; it is structured in parallel with the stages of the Dashboard Metrics.
    • A specific timeline is given to the data manager to complete the survey for each dataset.
    • Once the survey has been submitted, a summary with scores will be provided for the maturity level of each stage (see the scoring sketch after these notes).
    • The definition of “Best Management Practices” might not be universal, so it is helpful to understand and share how each group defines and propagates its best practices.
    • Additional analysis will also be generated based on the survey results.
    • Possible next steps:
      • Crosswalks among the different maturity methodologies?
    • Questions:
      • How is the information from the survey results shared?
        • This is still being evaluated because there are different ways that the information can be structured.
      • Can the results of the survey be included in the metadata for the dataset?
        • Yes, but the method and location for storing this in the current metadata formats would need to be determined.
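
As a concrete illustration of the per-component rating discussed in the DSMM question above, here is a minimal sketch in Python. The component names and the five-level (1 Ad Hoc to 5 Optimal) scale follow the published DSMM; the class and helper names themselves are hypothetical, not part of any NOAA/NCEI tooling.

from dataclasses import dataclass, field

# The nine DSMM key components, each rated independently on a 1-5 scale.
DSMM_COMPONENTS = (
    "preservability",
    "accessibility",
    "usability",
    "production_sustainability",
    "data_quality_assurance",
    "data_quality_control_monitoring",
    "data_quality_assessment",
    "transparency_traceability",
    "data_integrity",
)

@dataclass
class DSMMAssessment:
    dataset_id: str
    ratings: dict = field(default_factory=dict)  # component -> level (1-5)

    def rate(self, component: str, level: int) -> None:
        # Metadata maturity is assessed within each component rather than
        # as a single category (see the Q&A above).
        if component not in DSMM_COMPONENTS:
            raise ValueError(f"unknown DSMM component: {component}")
        if not 1 <= level <= 5:
            raise ValueError("DSMM levels run from 1 (Ad Hoc) to 5 (Optimal)")
        self.ratings[component] = level

    def summary(self) -> dict:
        # Report components separately; averaging would hide exactly the
        # per-component differences the matrix is designed to expose.
        return {
            "dataset": self.dataset_id,
            "ratings": dict(self.ratings),
            "unassessed": [c for c in DSMM_COMPONENTS if c not in self.ratings],
        }

# Example: a dataset strong on preservation but weak on usability.
assessment = DSMMAssessment("example-dataset-v1")
assessment.rate("preservability", 4)
assessment.rate("usability", 2)
print(assessment.summary())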
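
Similarly, to make the NGDA dashboard scoring concrete, the sketch below summarizes survey answers into per-stage scores and writes them to a JSON sidecar file, which is one possible answer to the metadata question above. The seven lifecycle stages come from the notes; the question-to-stage mapping, the yes/no answer model, and the sidecar format are illustrative assumptions, not the actual OMB A-16 dashboard logic.

import json

STAGES = ("Define", "Inventory/Evaluate", "Obtain", "Access",
          "Maintain", "Use/Evaluate", "Archive")

# Hypothetical subset of the 19 survey questions: id -> (stage, answered yes?).
answers = {
    "Q1": ("Define", True),
    "Q2": ("Define", True),
    "Q3": ("Inventory/Evaluate", False),
    "Q4": ("Access", True),
    # ... the remaining questions are omitted for brevity
}

def stage_scores(survey: dict) -> dict:
    """Fraction of 'yes' answers per lifecycle stage (None if none asked)."""
    totals = {s: [0, 0] for s in STAGES}  # stage -> [yes count, asked count]
    for stage, yes in survey.values():
        totals[stage][1] += 1
        if yes:
            totals[stage][0] += 1
    return {s: (y / n if n else None) for s, (y, n) in totals.items()}

scores = stage_scores(answers)
print(scores)

# One way the results could be included "in the metadata": until a slot in
# the standard metadata formats is agreed on, carry the scores as a JSON
# sidecar next to the dataset's metadata record.
with open("example-dataset.maturity.json", "w") as f:
    json.dump({"dataset": "example-dataset", "stage_scores": scores}, f, indent=2)
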
Citation:
Peng, G.; Duerr, R. Data Stewardship Maturity Matrix – Use Case Study. Summer Meeting 2015. ESIP Commons, April 2015.
