Information Quality and Stewardship

Abstract/Agenda: 

The purpose of this session is to restart discussions in the Information Quality cluster and strategize its future activities.

The objective of the IQ cluster is to bring together people from various disciplines to assess aspects of the quality of remote sensing data. We will learn and share best practices with the goal of building a framework for the consistent capture, harmonization, and presentation of data quality for the purposes of climate change studies, Earth science, and applications. The efforts and goals of this cluster are not predefined and are motivated by its participants, so new ideas and participants are always welcome.

The IQ cluster met during the ESIP Meeting of Summer 2014, but has been dormant since then. In the meantime, there have been significant activities related to this cluster’s goals at NASA (through its Data Quality Working Group, one of the Earth Science Data System Working Groups – ESDSWG) and at NOAA (with its Data Stewardship Maturity Matrix). We will discuss progress in these two areas and chart a course for future collaboration to meet the objectives of the IQ cluster.

Agenda:

  • Introduction – H. Ramapriyan (Rama) – 5 min.
  • NASA ESDSWG DQWG Overview – David Moroni – 15 min.
  • DQWG’s Usability Subgroup – Bob Downs – 5 min.
  • Roles and responsibilities of stewards and stakeholders in ensuring and improving data quality and usability – Ge Peng – 20 min.
  • Discussion – All – 45 min.
Notes: 
  • Objectives:

    • Mainly to bring together people from different disciplines to assess aspects of quality of Earth science data.

      • To provide guidelines and best practices for data quality.

  • There are several relevant activities that the Information Quality group will be leveraging.

  • NASA ESDSWG Data Quality Working Group Overview:

    • Mission: Discover/assess the existing data quality standards and practices in the inter-agency and international arena to improve upon existing recommendations relevant to ESDIS, DAACs, and NASA data providers.

    • A flow chart was presented showing key milestones of the DQWG. Twelve major milestones were completed between 2014 and 2015.

    • A recommendation report has been submitted to the ESDIS Project. The final recommendations are targeted for publication by 2016.

    • 16 use cases were submitted and evaluated to arrive at the recommendations:

      • 4 major categories: accuracy/precision/uncertainty, applicability, distinguishability, usability.

      • Separating the use cases into these categories allows each group to be reviewed in more detail.

    • Data Quality Management (i.e., management of information on data quality) is separated into four phases:

      • Phase 1: Capturing

      • Phase 2: Describing

      • Phase 3: Facilitating discovery

      • Phase 4: Enabling use

    • Recommendations were developed for the phases and split into 2 domains:

      • Data system and science

    • Each domain has recommendations specified under the following Recommendation Categories (7 in total):

      • General, Standard Documents & Processes, Quality of Input Datasets used in Generating Products, Quality Flags & Indicators, Metadata Consistency Checking, Publicizing Quality Issues, and Dataset Recommendations.

    • Samples of Key Findings:

      • Users commonly lack the means to distinguish between datasets.

      • Data producers lack a standard set of instructions for generating and representing the requisite data quality information.

      • Tools/services for assessing data quality are not consistently available for every dataset.

  • The Usability Subgroup of the NASA Earth Science Data Systems Working Group (ESDSWG) on Data Quality:

    • Mission: Improve capabilities of Earth science data users to discover/assess/use Earth science data by accessing/understanding/evaluating quality aspects of the data.

    • Data systems/centers and science teams have different focuses, which is why it was important to separate the recommendations.

  • A New Paradigm for Ensuring and Improving Data Quality and Usability:

    • Data Quality = how good or bad a data product is.

      • Includes product quality and stewardship quality.

        • 3 types of stewards: data, scientific, and tech.

    • Data Usability = how easy or hard a data product is to understand and use.

    • Data quality and usability are now influenced by a team of experts, not just an individual.

    • One analogy given was how food quality is measured.

      • However, it is important to explain the different grades of quality clearly, so that consumers can understand them and choose the appropriate grade.

    • Data producers, data archives/centers, and data users do not always share the same context and language. This makes it challenging to ensure and maintain consistent data quality.

    • Roles that need to collaborate to ensure high data quality: data producer, data steward, tech steward, scientific steward, data provider/publisher, and stakeholder.

  • Questions/Comments:

    • What if the PIs are not supportive of the level of metadata/data quality that is recommended?

      • This is an issue in which data center/archive management could become involved, either resolving it on a case-by-case basis or treating it as an opportunity for outreach.

    • Metadata/data quality should be understood as valuable to all parties involved, including the data providers.

    • Among the suggested activities for the Information Quality cluster, obtaining management support to allocate the necessary resources could be crucial to the success of the data quality program. This activity may need to be prioritized as well.
      • It is also important to discuss a way to link quality information with the data and make the information public.
  • High priority activities:
    • Engage data producers/providers, data managers, and data user communities as resources to improve our standards and best practices.
    • Establish and provide community-wide guidance on the roles and responsibilities of key players and stakeholders, including users and management.
    • Coordinate use case studies with broad and diverse applications, collaborating with the ESIP Data Stewardship Committee and various national and international programs.
  • Additional suggestions:
    • Provide education on information quality to different communities, such as science students and librarians.
Citation:
K., H.; Information Quality and Stewardship; Summer Meeting 2015. ESIP Commons, March 2015