AGU Data Management Maturity (DMM) Workshop

Abstract/Agenda: 

The AGU Data Management Maturity (DMM) Workshop is an opportunity to learn more about the process areas included in the DMM reference model as it applies to the Earth and space sciences.  We will discuss important process inter-dependencies, and how to characterize the level of data management capability for your organization.  

You will learn about common data management weaknesses and specific steps you can take to improve your own data management practices as soon as you get home. 

This is a great workshop for any data curation team, repository of any size, or anyone working with data.

Notes: 
  • Shelley’s full presentation is attached.
  • AGU’s position is to focus on work with data, with particular emphasis on the value of data as world heritage.
  • Data management challenges include the fact that different organizations/projects/disciplines have different contexts.
    • Example of the challenges associated with implementing open data policy:
      • How to enforce the policy; i.e. how to validate and show evidence of compliance.
      • How to maintain ongoing financial support.
      • Clarifying the roles and responsibilities that need to be involved throughout the data life cycle.
      • Not all data can receive the same attention even when they are all equally important.
      • There are people with data management skills who lack sufficient domain skills. As a result, it is not always easy to convince management to hire additional data management staff.
      • Funding is only sufficient to manage a limited number of datasets.
      • Determination of an authoritative/trusted source for data, such as for deposit and for retrieval.
      • Many younger researchers do not have sufficient training in methods and practices of data quality assurance and control.
        • This includes the quality of metadata that is associated with the datasets.
      • Making data accessible is a complex process because it is not just making the data available but also preparing and presenting the data so that it is appropriate and understandable for the users who would want to use the data.
      • Determination of what data is actually being used, so that informed decisions can be made about which steps are necessary to manage the data for optimal impact.
  • The Data Management Maturity (DMM) model developed by the CMMI Institute is being adapted for the management of data.
    • A process improvement and capability maturity model for the management of an organization’s data assets.
    • There are 6 major components of the DMM structure: data management strategy, data quality, data operations, supporting processes, platform & architecture, and data governance.
    • Capability vs. maturity; i.e. what we can do vs. how we can prove it.
      • 5 capability levels: performed (lowest), managed, defined, measured, and optimized (highest).
        • The requirements from the previous levels need to be achieved before getting to the next levels.
        • “Defined” (level 3) is recommended as the standard target.
      • Question: Does the experience level of the user base affect the capability level that an organization can achieve?
        • Answer: It is possible, but the DMM assessment process will allow the assessor to differentiate the factors that the organization has control over versus those that the organization does not.
      • Self-assessment is possible; however, a solid understanding of the DMM is vital to optimize the effectiveness of the assessment process and the corresponding results.
  • Highlights of the Data Management Strategy Process Areas:
    • Data Management Strategy – definition of vision, goals and objectives.
    • Communications – ensure related policies, progress announcements, etc. are published, enacted, understood, and adjusted as needed.
    • Data Management Function – guidance and leadership.
    • Grant Strategy/Business Case
    • Funding – adequate and sustainable.
    • Governance Management – ownership, stewardship and operational structure.
    • Vocabulary/Glossary – support for a common understanding of terms and definitions.
    • Metadata Management – establish the processes and infrastructure for specifying and extending clear and organized information about the structured and unstructured data assets.
    • The following four areas are sometimes considered as the same area by an organization:
      • Data Quality Strategy – defines an integrated, organization-wide strategy to achieve and maintain the level of data quality per the organization’s goals and objectives.
      • Data Profiling – develops an understanding of the content, quality, and rules of a specified set of data.
      • Data Quality Assessment – provides a systematic approach to measure and evaluate data quality.
      • Data Cleansing and Curation – defines the mechanism, rules, processes, and methods to validate and correct data and metadata.
    • Data Requirements Definition – ensure the data submitted and accessed by the scientific community will satisfy organizational objectives and is understood by all relevant stakeholders.
    • Data Lifecycle Management – ensure the organization understands, maps, inventories and controls its data flows through processes.
    • Additional areas that are also part of the Data Management Strategy Process Areas (these areas were not covered in detail due to session time constraints; please refer to Shelley’s presentation for further details):
      • Contribution/Provider Management
      • Architectural Approach
      • Architectural Standards
      • Data Management Platform
      • Data Integration
      • Data Archiving and Preservation
      • Measurement and Analysis
      • Process Management
      • Process Quality Assurance
      • Risk Management
      • Configuration Management
    • Overall, if users follow a natural process that is logical and reasonable for the specified task but differs from the process currently defined by the organization, it is important to review how the defined process might be updated to meet the users’ needs instead of forcing the users to change their behavior to fit the process.
    • These areas are not intended to be sequential.
  • A data management assessment establishes a baseline of capability and maturity and is conducted by a trained and certified Enterprise Data Management Expert (EDME) to ensure consistency of the method.
  • The assessment takes about 2+ weeks for the preparation phase, 3-5 days for the assessment phase, and 2+ weeks for the conclusion phase.
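
The cumulative nature of the five capability levels noted above (each level requires the previous ones, with "Defined" as the recommended target) can be illustrated with a minimal Python sketch. This is only an illustration of the ordering rule, not part of the DMM assessment method: the names `CapabilityLevel`, `achieved_level`, and `meets_target`, and the idea of passing in a set of satisfied levels, are assumptions made for this example.

```python
from enum import IntEnum


class CapabilityLevel(IntEnum):
    """The five capability levels named in the notes, lowest to highest."""
    PERFORMED = 1
    MANAGED = 2
    DEFINED = 3
    MEASURED = 4
    OPTIMIZED = 5


def achieved_level(levels_met):
    """Return the highest level reached without skipping a lower one.

    `levels_met` is a hypothetical input: the set of levels whose
    requirements were judged to be satisfied. Returns None if even
    PERFORMED is not satisfied.
    """
    achieved = None
    for level in CapabilityLevel:  # iterates in definition order, 1 to 5
        if level not in levels_met:
            break  # a gap stops the climb: higher levels do not count
        achieved = level
    return achieved


def meets_target(levels_met, target=CapabilityLevel.DEFINED):
    """Check the achieved level against the target (default: Defined)."""
    achieved = achieved_level(levels_met)
    return achieved is not None and achieved >= target
```

For example, an organization that satisfies the requirements for Performed, Managed, and Measured, but not Defined, would be rated Managed under this sketch, because the gap at Defined prevents the higher level from counting.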
Attachments/Presentations: 
Citation:
Stall, S.; AGU Data Management Maturity (DMM) Workshop; 2016 ESIP Summer Meeting. ESIP Commons, February 2016